DPDK patches and discussions
From: Thomas Monjalon <thomas@monjalon.net>
To: "Burakov, Anatoly" <anatoly.burakov@intel.com>
Cc: David Marchand <david.marchand@redhat.com>, dev <dev@dpdk.org>,
	John McNamara <john.mcnamara@intel.com>,
	Marko Kovacevic <marko.kovacevic@intel.com>,
	iain.barker@oracle.com, edwin.leung@oracle.com,
	maxime.coquelin@redhat.com
Subject: Re: [dpdk-dev] [PATCH] eal: add option to not store segment fd's
Date: Fri, 29 Mar 2019 13:40:14 +0100
Message-ID: <3255576.YcZt162MTL@xps>
In-Reply-To: <b6ce21eb-dae1-7858-a03a-6a5c1b6a35eb@intel.com>

29/03/2019 13:05, Burakov, Anatoly:
> On 29-Mar-19 11:34 AM, Thomas Monjalon wrote:
> > 29/03/2019 11:33, Burakov, Anatoly:
> >> On 29-Mar-19 9:50 AM, David Marchand wrote:
> >>> On Fri, Feb 22, 2019 at 6:12 PM Anatoly Burakov
> >>> <anatoly.burakov@intel.com> wrote:
> >>>
> >>>      Due to internal glibc limitations [1], DPDK may exhaust internal
> >>>      file descriptor limits when using smaller page sizes, which results
> >>>      in inability to use system calls such as select() by user
> >>>      applications.
> >>>
> >>>      While the problem can be worked around using --single-file-segments
> >>>      option, it does not work if --legacy-mem mode is also used. Add
> >>>      (yet another) EAL flag to disable storing fd's internally. This
> >>>      will sacrifice compatibility with Virtio with vhost-user backend, but
> >>>      at least select() and friends will work.
> >>>
> >>>      [1] https://mails.dpdk.org/archives/dev/2019-February/124386.html
> >>>
> >>>
> >>> Sorry, I am a bit lost and I never took the time to look into the new
> >>> memory allocation system.
> >>> This gives the impression that we are accumulating workarounds, between
> >>> legacy-mem, single-file-segments, now no-seg-fds.
> >>
> >> Yep. I don't like this any more than you do, but I think there are users
> >> of all of these, so we can't just drop them willy-nilly. My great hope
> >> was that by now everyone would move on to use VFIO so legacy mem
> >> wouldn't be needed (the only reason it exists is to provide
> >> compatibility for use cases where lots of IOVA-contiguous memory is
> >> required, and VFIO cannot be used), but apparently that is too much to
> >> ask :/
> >>
> >>>
> >>> IIUC, everything revolves around the need for per-page locks.
> >>> Can you summarize why we need them?
> >>
> >> The short answer is multiprocess. We have to be able to map and unmap
> >> pages individually, and for that we need to be sure that we can, in
> >> fact, remove a page because no one else uses it. We also need to store
> >> fd's because virtio with vhost-user backend needs them to work, because
> >> it relies on sharing memory between processes using fd's.
> > 
> > It's a pity to add an option to work around a limitation in a corner case.
> > It adds complexity that we will have to support forever,
> > and it's not even perfect because of vhost.
> > 
> > Might there be another solution?
> > 
> 
> If there is one, I'm all ears. I don't see any solutions aside from
> adding limitations.
> 
> For example, we could drop the single/multi file segments mode and just 
> make single file segments a default and the only available mode, but 
> this has certain risks because older kernels do not support fallocate() 
> on hugetlbfs.
> 
> We could further draw a line in the sand, and say that, for example, 
> 19.11 (or 20.11) will not have legacy mem mode, and everyone should use 
> VFIO by now and if you don't it's your own fault.
> 
> We could also cut down on the number of fd's we use in single-file 
> segments mode by not using locks and simply deleting pages in the 
> primary, but yanking out hugepages from under secondaries' feet makes me 
> feel uneasy, even if technically by the time that happens, they're not 
> supposed to be used anyway. This could mean that the patch is no longer 
> necessary because we don't use that many fd's any more.

This last option is interesting. Is it realistic?
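
For scale, the fd pressure described in the quoted patch description can be estimated with some simple arithmetic. The sketch below is illustrative only: it assumes one stored fd per hugepage (the situation the patch description refers to when --single-file-segments is not in effect), glibc's FD_SETSIZE of 1024, and a hypothetical 4 GiB allocation. The function names are made up for this example and are not DPDK APIs.

```python
# Illustrative estimate, not DPDK code: how many per-page fds a hugepage
# allocation would hold, and whether the next fd an application opens
# would still fit in the fd_set used by select().

FD_SETSIZE = 1024  # glibc's fixed fd_set capacity on Linux

MIB = 1 << 20
GIB = 1 << 30


def segment_fds(mem_bytes: int, page_bytes: int) -> int:
    """Number of fds held if one fd is stored per page (ceiling division)."""
    return -(-mem_bytes // page_bytes)


def next_fd_fits_select(held_fds: int, other_open_fds: int = 3) -> bool:
    """True if the next fd number would be below FD_SETSIZE.

    Assumes fd numbers are handed out densely from 0, with
    `other_open_fds` already taken (stdin, stdout, stderr by default).
    """
    return held_fds + other_open_fds < FD_SETSIZE


if __name__ == "__main__":
    # Hypothetical 4 GiB of hugepage memory with two common page sizes.
    for page in (2 * MIB, GIB):
        n = segment_fds(4 * GIB, page)
        print(f"page size {page // MIB} MiB: {n} fds, "
              f"select() usable afterwards: {next_fd_fits_select(n)}")
```

Under these assumptions, 2 MiB pages already need twice as many fds as an fd_set can describe, while 1 GiB pages need only a handful, which is why the problem shows up with smaller page sizes.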
