Date: Wed, 28 Nov 2018 12:57:53 +0800
From: Tiwei Bie
To: Anatoly Burakov
Cc: dev@dpdk.org, przemyslawx.lal@intel.com, kuralamudhan.ramakrishnan@intel.com, ivan.coughlan@intel.com, ray.kinsella@intel.com
Message-ID: <20181128045753.GA26256@debian>
References: <6b5e1de44f42e18a6a9326a06eb800c6bb8ddfdf.1542130721.git.anatoly.burakov@intel.com>
In-Reply-To: <6b5e1de44f42e18a6a9326a06eb800c6bb8ddfdf.1542130721.git.anatoly.burakov@intel.com>
Subject: Re: [dpdk-dev] [PATCH 19.02 2/2] mem: use memfd for no-huge mode
List-Id: DPDK patches and discussions

On Tue, Nov 13, 2018 at 05:54:48PM +0000, Anatoly Burakov wrote:
> When running in no-huge mode, we anonymously allocate our memory.
> While this works for regular NICs and vdev's, it's not suitable
> for memory sharing scenarios such as virtio with vhost_user
> backend.
>
> To fix this, allocate no-huge memory using memfd, and register
> it with memalloc just like any other memseg fd. This will enable
> using rte_memseg_get_fd() API with --no-huge EAL flag.
>
> Signed-off-by: Anatoly Burakov
> ---
>  lib/librte_eal/linuxapp/eal/eal_memory.c | 46 ++++++++++++++++++++++--
>  1 file changed, 44 insertions(+), 2 deletions(-)
>
> diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
> index 48b23ce19..8feac2c56 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_memory.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
> @@ -25,6 +25,7 @@
>  #include
>  #include
>  #include
> +#include
>  #ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
>  #include
>  #include
> @@ -1345,12 +1346,15 @@ eal_legacy_hugepage_init(void)
>  	/* hugetlbfs can be disabled */
>  	if (internal_config.no_hugetlbfs) {
>  		struct rte_memseg_list *msl;
> +		int n_segs, cur_seg, fd, memfd, flags;
>  		uint64_t page_sz;
> -		int n_segs, cur_seg;
>
>  		/* nohuge mode is legacy mode */
>  		internal_config.legacy_mem = 1;
>
> +		/* nohuge mode is single-file segments mode */
> +		internal_config.single_file_segments = 1;
> +
>  		/* create a memseg list */
>  		msl = &mcfg->memsegs[0];
>
> @@ -1363,8 +1367,36 @@ eal_legacy_hugepage_init(void)
>  			return -1;
>  		}
>
> +		/* set up parameters for anonymous mmap */
> +		fd = -1;
> +		flags = MAP_PRIVATE | MAP_ANONYMOUS;
> +
> +		/* create a memfd and store it in the segment fd table */
> +		memfd = memfd_create("nohuge", 0);
> +		if (memfd < 0) {
> +			RTE_LOG(ERR, EAL, "Cannot create memfd: %s\n",
> +					strerror(errno));
> +			RTE_LOG(ERR, EAL, "Falling back to anonymous map\n");
> +		} else {
> +			/* we got an fd - now resize it */
> +			if (ftruncate(memfd, internal_config.memory) < 0) {
> +				RTE_LOG(ERR, EAL, "Cannot resize memfd: %s\n",
> +						strerror(errno));
> +				RTE_LOG(ERR, EAL, "Falling back to anonymous map\n");
> +				close(memfd);
> +			} else {
> +				/* creating memfd-backed file was successful.
> +				 * we want changes to memfd to be visible to
> +				 * other processes (such as vhost backend), so
> +				 * map it as shared memory.
> +				 */
> +				RTE_LOG(DEBUG, EAL, "Using memfd for anonymous memory\n");
> +				fd = memfd;
> +				flags = MAP_SHARED;
> +			}
> +		}
>  		addr = mmap(NULL, internal_config.memory, PROT_READ | PROT_WRITE,
> -				MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> +				flags, fd, 0);
>  		if (addr == MAP_FAILED) {
>  			RTE_LOG(ERR, EAL, "%s: mmap() failed: %s\n", __func__,
>  					strerror(errno));
> @@ -1375,6 +1407,16 @@ eal_legacy_hugepage_init(void)
>  		msl->socket_id = 0;
>  		msl->len = internal_config.memory;
>
> +		/* we're in single-file segments mode, so only the segment list
> +		 * fd needs to be set up.
> +		 */
> +		if (fd != -1) {
> +			if (eal_memalloc_set_seg_list_fd(0, fd) < 0) {
> +				RTE_LOG(ERR, EAL, "Cannot set up segment list fd\n");
> +				/* not a serious error, proceed */
> +			}
> +		}

Hi Anatoly,

Thanks for the work!

It seems that support for getting the fd offset is missing in no-huge
mode. I got the error below in virtio-user while trying this series with
--no-huge:

update_memory_region(): Failed to get offset, ms=0x10002e000 rte_errno=19

Thanks

> +
>  		/* populate memsegs. each memseg is one page long */
>  		for (cur_seg = 0; cur_seg < n_segs; cur_seg++) {
>  			arr = &msl->memseg_arr;
> --
> 2.17.1
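
For reference, here is a minimal standalone sketch of the allocation strategy the quoted patch describes: try a memfd, size it, map it MAP_SHARED, and fall back to a private anonymous mapping if either step fails. It assumes Linux with memfd support. The names map_nohuge_region(), region_is_shared() and region_offset() are hypothetical and are not DPDK APIs. The patch calls memfd_create() directly; the sketch goes through the raw syscall only so that it also builds where the C library lacks the wrapper. The offset helper reflects an assumption about this sketch only (one fd backing one contiguous mapping), not a statement of how DPDK computes segment offsets.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper (not a DPDK API): map `len` bytes, preferring a
 * memfd-backed shared mapping and falling back to anonymous memory.
 * On success *out_fd holds the memfd, or -1 if the fallback was taken.
 * Returns NULL if mmap() itself fails.
 */
void *
map_nohuge_region(size_t len, int *out_fd)
{
	int fd = -1;
	int flags = MAP_PRIVATE | MAP_ANONYMOUS;
	int memfd;
	void *addr;

	assert(len > 0 && out_fd != NULL);

	/* raw syscall instead of the glibc memfd_create() wrapper */
	memfd = (int)syscall(SYS_memfd_create, "nohuge", 0);
	if (memfd >= 0) {
		if (ftruncate(memfd, (off_t)len) == 0) {
			/* fd is usable: share the pages with other mappers */
			fd = memfd;
			flags = MAP_SHARED;
		} else {
			/* cannot resize: keep the anonymous parameters */
			close(memfd);
		}
	}

	addr = mmap(NULL, len, PROT_READ | PROT_WRITE, flags, fd, 0);
	if (addr == MAP_FAILED) {
		if (fd >= 0)
			close(fd);
		*out_fd = -1;
		return NULL;
	}
	*out_fd = fd;
	return addr;
}

/* Returns 1 if a write through `addr` is visible through a second
 * mapping of `fd`, 0 otherwise (including when there is no fd).
 */
int
region_is_shared(void *addr, size_t len, int fd)
{
	void *alias;
	int same;

	if (fd < 0)
		return 0;
	alias = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
	if (alias == MAP_FAILED)
		return 0;
	memcpy(addr, "dpdk", 5);
	same = memcmp(alias, "dpdk", 5) == 0;
	munmap(alias, len);
	return same;
}

/* In this sketch the whole region is one mapping of one fd starting at
 * file offset 0, so an address's offset into the fd is its distance
 * from the base of the mapping.
 */
size_t
region_offset(const void *base, const void *addr)
{
	return (size_t)((const char *)addr - (const char *)base);
}
```

A consumer such as a vhost-user frontend needs both the fd and the offset of each region within it, which is why the failed offset lookup reported above matters: with only the fd registered, the peer cannot tell where in the file a given segment lives.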