DPDK patches and discussions
From: "Wiles, Keith" <keith.wiles@intel.com>
To: Iain Barker <iain.barker@oracle.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>,
	"edwin.leung@oracle.com" <edwin.leung@oracle.com>
Subject: Re: [dpdk-dev] Question about DPDK hugepage fd change
Date: Tue, 5 Feb 2019 20:29:12 +0000
Message-ID: <ACD02F93-6C71-4DEB-95B8-001F9AAD9E21@intel.com>
In-Reply-To: <da423cbe-2ff7-4d3f-8479-9ec0a501b689@default>



> On Feb 5, 2019, at 12:56 PM, Iain Barker <iain.barker@oracle.com> wrote:
> 
> Hi everyone,
> 
> We just updated our application from DPDK 17.11.4 (LTS) to DPDK 18.11 (LTS) and we noticed a regression.
> 
> Our host platform provides 2MB huge pages, so an 8GB reservation means roughly 4000 pages (4096, to be exact) are allocated.
> 
> This worked fine in the prior LTS, but after upgrading DPDK we are seeing select() on an fd fail.
> 
> select() works fine when the process starts up, but does not work after DPDK has been initialized.
> 
> We investigated and found that, in the DPDK patches linked below, the hugepage tracking mechanism was changed from mmap to an array of file descriptors, and the rlimit for fds is raised from the default to allow more fds to be open.
> 
> https://mails.dpdk.org/archives/dev/2018-September/110890.html
> https://mails.dpdk.org/archives/dev/2018-September/110889.html
> 
> The problem is that the GNU C library (glibc) has a limit on the maximum fd that can be passed to select(). The limit is hard-coded at 1024 in the POSIX header file and in libc (and, as a result, probably in many other OS libraries too).
> 
> Once the rlimit is raised, fds can be numbered 1024 or higher, and using such an fd with select() has undefined results, per the manpage:
> 
> http://man7.org/linux/man-pages/man2/select.2.html
> An fd_set is a fixed size buffer.  Executing FD_CLR() or FD_SET()
> with a value of fd that is negative or is equal to or larger than
> FD_SETSIZE will result in undefined behavior.  Moreover, POSIX
> requires fd to be a valid file descriptor.
> 
> The Linux kernel allows file descriptor sets of arbitrary size,
> determining the length of the sets to be checked from the value of
> nfds.  However, in the glibc implementation, the fd_set type is fixed
> in size.
> 
> Specifically, libc's header include/sys/select.h declares the fd_set storage as a bit array with room for exactly FD_SETSIZE fds:
> __fd_mask fds_bits[__FD_SETSIZE / __NFDBITS];
> 
> and /usr/include/linux/posix_types.h is hard-coded with
> #define __FD_SETSIZE  1024
> 
> As this define and array are in libc, they are used by many libraries on a Linux system. So using an FD_SETSIZE above 1024 means either recompiling the OS libraries and every other package that uses fds, or ensuring that no library used by the application ever calls select() on an fd set. That seems an unreasonable burden...
> 
> Any thoughts?

Would poll work here instead?
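
To make the suggestion concrete, here is a minimal sketch of waiting on a single fd with poll() instead of select(). It is not taken from DPDK or from the application discussed above; wait_readable() and fd_fits_in_fd_set() are hypothetical helper names used only for illustration.

```c
#include <errno.h>
#include <poll.h>
#include <sys/select.h>   /* only needed for the FD_SETSIZE constant */

/* Hypothetical helper: nonzero if fd may legally be used with FD_SET(). */
static int fd_fits_in_fd_set(int fd)
{
    return fd >= 0 && fd < FD_SETSIZE;
}

/*
 * Hypothetical helper: wait until fd is readable or timeout_ms elapses.
 * Returns 1 if a read() would not block, 0 on timeout, and -1 on error
 * with errno set. A timeout_ms of -1 waits indefinitely.
 */
static int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int rc;

    /* Retry if a signal interrupts the call. For simplicity the full
     * timeout is restarted on each retry. */
    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc <= 0)
        return rc;            /* 0 = timeout, -1 = error */

    if (pfd.revents & POLLNVAL) {
        errno = EBADF;        /* fd was not an open descriptor */
        return -1;
    }
    return 1;                 /* POLLIN, POLLHUP or POLLERR is set */
}
```

poll() takes an array of struct pollfd entries rather than a bitmap indexed by fd number, so the numeric value of the fd is not bounded by FD_SETSIZE. This only helps in code you control; a check like fd_fits_in_fd_set() could guard any remaining select() call sites, but it does nothing for third-party libraries that call select() internally.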
> 
> thanks,
> Iain

Regards,
Keith

