DPDK patches and discussions
From: "Burakov, Anatoly" <anatoly.burakov@intel.com>
To: Thomas Monjalon <thomas@monjalon.net>
Cc: David Marchand <david.marchand@redhat.com>, dev <dev@dpdk.org>,
	John McNamara <john.mcnamara@intel.com>,
	Marko Kovacevic <marko.kovacevic@intel.com>,
	iain.barker@oracle.com, edwin.leung@oracle.com,
	maxime.coquelin@redhat.com
Subject: Re: [dpdk-dev] [PATCH] eal: add option to not store segment fd's
Date: Fri, 29 Mar 2019 12:05:32 +0000
Message-ID: <b6ce21eb-dae1-7858-a03a-6a5c1b6a35eb@intel.com>
Message-ID: <20190329120532.R13OFlr7GC6bkYio1qNcm2lrOrce24Vp60B09DkD7Pw@z>
In-Reply-To: <1682850.JO3elT0QtZ@xps>

On 29-Mar-19 11:34 AM, Thomas Monjalon wrote:
> 29/03/2019 11:33, Burakov, Anatoly:
>> On 29-Mar-19 9:50 AM, David Marchand wrote:
>>> On Fri, Feb 22, 2019 at 6:12 PM Anatoly Burakov
>>> <anatoly.burakov@intel.com> wrote:
>>>
>>>      Due to internal glibc limitations [1], DPDK may exhaust internal
>>>      file descriptor limits when using smaller page sizes, which results
>>>      in inability to use system calls such as select() by user
>>>      applications.
>>>
>>>      While the problem can be worked around using --single-file-segments
>>>      option, it does not work if --legacy-mem mode is also used. Add a
>>>      (yet another) EAL flag to disable storing fd's internally. This
>>>      will sacrifice compatibility with Virtio with vhost-backend, but
>>>      at least select() and friends will work.
>>>
>>>      [1] https://mails.dpdk.org/archives/dev/2019-February/124386.html
>>>
>>>
>>> Sorry, I am a bit lost and I never took the time to look in the new
>>> memory allocation system.
>>> This gives the impression that we are accumulating workarounds, between
>>> legacy-mem, single-file-segments, now no-seg-fds.
>>
>> Yep. I don't like this any more than you do, but I think there are users
>> of all of these, so we can't just drop them willy-nilly. My great hope
>> was that by now everyone would move on to use VFIO so legacy mem
>> wouldn't be needed (the only reason it exists is to provide
>> compatibility for use cases where lots of IOVA-contiguous memory is
>> required, and VFIO cannot be used), but apparently that is too much to
>> ask :/
>>
>>>
>>> Iiuc, everything revolves around the need for per page locks.
>>> Can you summarize why we need them?
>>
>> The short answer is multiprocess. We have to be able to map and unmap
>> pages individually, and for that we need to be sure that we can, in
>> fact, remove a page because no one else uses it. We also need to store
>> fd's because virtio with vhost-user backend needs them to work, because
>> it relies on sharing memory between processes using fd's.
> 
> It's a pity to add an option to work around a limitation of a corner case.
> It adds complexity that we will have to support forever,
> and it's not even perfect because of vhost.
> 
> Might there be another solution?
> 

If there is one, I'm all ears. I don't see any solutions aside from 
adding limitations.

For example, we could drop the single/multi file segments mode and just 
make single file segments a default and the only available mode, but 
this has certain risks because older kernels do not support fallocate() 
on hugetlbfs.
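
To make the fallocate() risk concrete, here is a minimal sketch of a 
runtime probe. It is not code from DPDK: probe_fallocate() is a 
hypothetical helper, and the idea is simply to try growing a scratch 
file and then punching a hole in it, which are the two operations a 
single-file layout would rely on.

```c
#define _GNU_SOURCE	/* exposes fallocate() and FALLOC_FL_* in glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not a DPDK API: create a scratch file at `path`,
 * allocate `len` bytes, then try to punch them out again.
 * Returns 0 if both fallocate() calls succeed, -errno otherwise
 * (typically -EOPNOTSUPP when the filesystem lacks support).
 */
static int probe_fallocate(const char *path, off_t len)
{
	int fd, ret = 0;

	assert(path != NULL && len > 0);

	fd = open(path, O_CREAT | O_RDWR, 0600);
	if (fd < 0)
		return -errno;

	/* grow the file: what adding one more page to the segment needs */
	if (fallocate(fd, 0, 0, len) < 0)
		ret = -errno;
	/* release the range again: what freeing that page needs */
	else if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
			0, len) < 0)
		ret = -errno;

	/* ret was captured before cleanup, so errno changes here are harmless */
	close(fd);
	unlink(path);
	return ret;
}
```

On hugetlbfs the path would have to sit under a hugetlbfs mount (the 
mount point is system-specific) and len would have to be a multiple of 
the hugepage size. A non-zero result would mean falling back to one file 
per page.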

We could further draw a line in the sand, and say that, for example, 
19.11 (or 20.11) will not have legacy mem mode: everyone should be using 
VFIO by now, and if you aren't, it's your own fault.

We could also cut down on the number of fd's we use in single-file 
segments mode by not using locks and simply deleting pages in the 
primary, but yanking out hugepages from under secondaries' feet makes me 
feel uneasy, even if technically by the time that happens, they're not 
supposed to be used anyway. This could mean that the patch is no longer 
necessary because we don't use that many fd's any more.
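
For readers unfamiliar with the locking being discussed, the general 
pattern can be sketched with flock(). This is an illustration under 
assumptions, not the actual EAL code: the page_* helpers are hypothetical 
names, and the sketch assumes each process keeps one open fd per page.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical helper: a process mapping a page takes a shared lock. */
static int page_mark_used(int fd)
{
	assert(fd >= 0);
	return flock(fd, LOCK_SH | LOCK_NB) < 0 ? -errno : 0;
}

/* Hypothetical helper: drop the lock when the page is unmapped. */
static int page_mark_unused(int fd)
{
	assert(fd >= 0);
	return flock(fd, LOCK_UN) < 0 ? -errno : 0;
}

/*
 * Hypothetical helper: before removing a page, try to take an exclusive
 * lock without blocking.
 * Returns 1 if nobody else holds a lock (the caller now holds it
 * exclusively and may remove the page), 0 if the page is still in use,
 * -errno on any other failure.
 */
static int page_try_claim(int fd)
{
	assert(fd >= 0);
	if (flock(fd, LOCK_EX | LOCK_NB) == 0)
		return 1;
	return errno == EWOULDBLOCK ? 0 : -errno;
}
```

In this sketch every lock costs an open fd per page in every process, 
which is where the fd count comes from. Dropping the locks saves the fd's 
but also removes the check in page_try_claim().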

However, if we are to support all that we support now, the only option 
here is to pile on more workarounds.

-- 
Thanks,
Anatoly
