From: Anatoly Burakov
To: dev@dpdk.org
Cc: tiwei.bie@intel.com, ray.kinsella@intel.com, zhihong.wang@intel.com,
 maxime.coquelin@redhat.com, kuralamudhan.ramakrishnan@intel.com
Date: Tue, 4 Sep 2018 16:01:53 +0100
Subject: [dpdk-dev] [PATCH v2 0/9] Improve running DPDK without hugetlbfs mountpoint

This patchset further improves DPDK support for running without
hugetlbfs mountpoints.

First of all, it enables using memfd-created hugepages in in-memory mode.
This way, instead of anonymous hugepages, we can get proper fd's for
each page (or for the entire segment, if we're using single-file
segments). Memfd will be used automatically if support for it was
compiled in and is available at runtime; otherwise, DPDK will fall back
to using anonymous hugepages.

The other thing this patchset does is expose segment fd's through an
external API. There is a lot of ugliness in the current virtio/vhost
code that deals with finding hugepage files through procfs, while all
virtio really needs are fd's referring to the pages, and their offsets.
Using this API, virtio will be able to access segment fd's directly,
without the procfs magic.

As a bonus, because we enabled use of memfd (provided a sufficiently
recent kernel is used), once virtio support for getting segment fd's
using the new API is implemented, virtio will also be able to work
without hugetlbfs mountpoints.

Virtio support is not provided in this patchset; coordinating and
implementing it is up to the virtio maintainers. Once virtio support for
this is in place, DPDK will have one fewer barrier to adoption in the
container space.
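
For illustration only, here is a minimal C sketch of the two ideas
described above: try a memfd-backed mapping first and fall back to an
anonymous one, then let a consumer reach the same memory knowing only an
fd and an offset. This is not the code from the patches. The names
struct seg, seg_alloc(), seg_map_from_fd() and seg_free() are
hypothetical, and the hugepage flags are passed in by the caller
(an assumption made so the sketch also runs on a machine with no
hugepages reserved).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallback for older libc headers; value taken from the Linux UAPI. */
#ifndef MFD_HUGETLB
#define MFD_HUGETLB 0x0004U
#endif

/* Hypothetical descriptor for one mapped segment (not a DPDK type). */
struct seg {
	void *addr;  /* mapping address, MAP_FAILED if allocation failed */
	size_t len;  /* mapping length in bytes */
	int fd;      /* backing memfd, or -1 for an anonymous mapping */
};

/*
 * Try a memfd-backed shared mapping first; if any step fails, fall back
 * to an anonymous private mapping. Returns 0 on success, -1 on failure.
 */
static int
seg_alloc(struct seg *s, size_t len, unsigned int memfd_flags, int anon_flags)
{
	int fd = (int)syscall(SYS_memfd_create, "seg", memfd_flags);

	s->len = len;
	if (fd >= 0) {
		if (ftruncate(fd, (off_t)len) == 0) {
			s->addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
					MAP_SHARED, fd, 0);
			if (s->addr != MAP_FAILED) {
				s->fd = fd;
				return 0;
			}
		}
		close(fd);
	}
	/* memfd path unavailable: anonymous memory, no fd to hand out */
	s->fd = -1;
	s->addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
			MAP_PRIVATE | MAP_ANONYMOUS | anon_flags, -1, 0);
	return s->addr == MAP_FAILED ? -1 : 0;
}

/* What a consumer holding only (fd, offset, len) does to reach the memory. */
static void *
seg_map_from_fd(int fd, off_t offset, size_t len)
{
	return mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, offset);
}

/* Unmap the segment and close its fd, if it has one. */
static void
seg_free(struct seg *s)
{
	if (s->addr != MAP_FAILED)
		munmap(s->addr, s->len);
	if (s->fd >= 0)
		close(s->fd);
	s->addr = MAP_FAILED;
	s->fd = -1;
}
```

Calling seg_alloc(&s, len, MFD_HUGETLB, MAP_HUGETLB) would request
hugepages on both paths, with len a multiple of the hugepage size;
passing 0 for both flags exercises the same logic with regular pages.
When the fallback path is taken, s.fd stays -1, so there is nothing to
share with another process.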

v1->v2:
- Added a new API to retrieve segment offset into its fd

Anatoly Burakov (9):
  fbarray: fix detach in noshconf mode
  eal: don't allow legacy mode with in-memory mode
  mem: raise maximum fd limit unconditionally
  memalloc: rename lock list to fd list
  memalloc: track page fd's in non-single file mode
  memalloc: add EAL-internal API to get and set segment fd's
  mem: add external API to retrieve page fd from EAL
  mem: allow querying offset into segment fd
  mem: support using memfd segments for in-memory mode

 lib/librte_eal/bsdapp/eal/eal_memalloc.c   |  19 +
 lib/librte_eal/common/eal_common_fbarray.c |   4 +
 lib/librte_eal/common/eal_common_memory.c  | 107 ++++-
 lib/librte_eal/common/eal_common_options.c |  12 +-
 lib/librte_eal/common/eal_memalloc.h       |  11 +
 lib/librte_eal/common/include/rte_memory.h |  97 +++++
 lib/librte_eal/linuxapp/eal/eal_memalloc.c | 449 +++++++++++++++++----
 lib/librte_eal/linuxapp/eal/eal_memory.c   |  64 ++-
 lib/librte_eal/rte_eal_version.map         |   4 +
 9 files changed, 669 insertions(+), 98 deletions(-)

-- 
2.17.1