From: Iain Barker
Date: Tue, 5 Feb 2019 18:56:55 +0000 (UTC)
To: dev@dpdk.org
Cc: edwin.leung@oracle.com
Subject: [dpdk-dev] Question about DPDK hugepage fd change

Hi everyone,

We just updated our application from DPDK 17.11.4 (LTS) to DPDK 18.11 (LTS) and we noticed a regression.

Our host platform provides 2MB huge pages, so an 8GB reservation means about 4000 pages are allocated.

This worked fine in the prior LTS, but after upgrading DPDK we are seeing select() fail on an fd. select() works fine when the process starts up, but stops working once DPDK has been initialized.

We did some investigation and found that in the DPDK patches linked below, the hugepage tracking mechanism was changed from mmap to an array of file descriptors, and the rlimit for fds is raised from the default to allow more fds to be open.

https://mails.dpdk.org/archives/dev/2018-September/110890.html
https://mails.dpdk.org/archives/dev/2018-September/110889.html

The problem is that the GNU C library (glibc) has a limit on the maximum fd that can be passed to select(). That limit is hard-coded at 1024 in the POSIX header file and in libc (and, as a result, probably in many other OS libraries too).

Raising the rlimit so that fds can exceed 1024 leads to undefined results once such an fd is used with select(), per the manpage:

http://man7.org/linux/man-pages/man2/select.2.html

    An fd_set is a fixed size buffer. Executing FD_CLR() or FD_SET()
    with a value of fd that is negative or is equal to or larger than
    FD_SETSIZE will result in undefined behavior. Moreover, POSIX
    requires fd to be a valid file descriptor.

    The Linux kernel allows file descriptor sets of arbitrary size,
    determining the length of the sets to be checked from the value of
    nfds. However, in the glibc implementation, the fd_set type is
    fixed in size.

Specifically, libc's header include/sys/select.h declares a bitmask array sized to hold FD_SETSIZE fds:

    __fd_mask fds_bits[__FD_SETSIZE / __NFDBITS];

and /usr/include/linux/posix_types.h is hard-coded with:

    #define __FD_SETSIZE 1024

As this define and array are in libc, they are used in many libraries on a Linux system. So using an FD_SETSIZE greater than 1024 means recompiling the OS libraries and any other package that needs to use fds, or ensuring that no library used by the application ever calls select() on an fd set. That seems an unreasonable burden...

Any thoughts?

thanks,
Iain
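
To make the constraint above concrete, here is a minimal sketch in C. It is only an illustration, not code from DPDK or glibc: the helper names hugepages_needed(), fd_fits_in_fd_set() and checked_fd_set() are hypothetical, and it assumes one open fd per hugepage, which is how we read the linked patches. It shows the page-count arithmetic and the range check that any caller of select() would need before touching an fd_set.

```c
#include <sys/select.h>

/*
 * Hypothetical helper: number of hugepages needed to back a reservation,
 * rounding up. With one fd per page (an assumption), this is also roughly
 * how many extra fds the process holds open.
 */
static unsigned long long hugepages_needed(unsigned long long total_bytes,
                                           unsigned long long page_bytes)
{
    return (total_bytes + page_bytes - 1) / page_bytes;
}

/* Returns 1 if fd may legally be stored in an fd_set, 0 otherwise. */
static int fd_fits_in_fd_set(int fd)
{
    return fd >= 0 && fd < FD_SETSIZE;
}

/*
 * Hypothetical wrapper around FD_SET(): refuses fds outside
 * [0, FD_SETSIZE) instead of invoking undefined behavior.
 * Returns 0 on success, -1 if the fd cannot be represented.
 */
static int checked_fd_set(int fd, fd_set *set)
{
    if (!fd_fits_in_fd_set(fd))
        return -1;
    FD_SET(fd, set);
    return 0;
}
```

With 2MB pages (taken as 2 MiB) and an 8GB reservation (taken as 8 GiB), hugepages_needed(8ULL << 30, 2ULL << 20) gives 4096, so under the one-fd-per-page assumption any fd the application opens after DPDK initialization is likely to be numbered above FD_SETSIZE, and checked_fd_set() would reject it.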