From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by dpdk.org (Postfix) with ESMTP id 686D758FA for ; Fri, 1 Jun 2018 13:00:33 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by fmsmga102.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 01 Jun 2018 04:00:32 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.49,465,1520924400"; d="scan'208";a="60412288" Received: from aburakov-mobl.ger.corp.intel.com (HELO [10.237.220.101]) ([10.237.220.101]) by fmsmga001.fm.intel.com with ESMTP; 01 Jun 2018 04:00:31 -0700 To: Dariusz Stojaczyk , dev@dpdk.org References: <1527857469-159391-1-git-send-email-dariuszx.stojaczyk@intel.com> From: "Burakov, Anatoly" Message-ID: <4ad00805-bf12-7426-4040-6a44c51226f2@intel.com> Date: Fri, 1 Jun 2018 12:00:29 +0100 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.7.0 MIME-Version: 1.0 In-Reply-To: <1527857469-159391-1-git-send-email-dariuszx.stojaczyk@intel.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit Subject: Re: [dpdk-dev] [PATCH 1/2] memalloc: do not leave unmapped holes in EAL virtual memory area X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Jun 2018 11:00:33 -0000 On 01-Jun-18 1:51 PM, Dariusz Stojaczyk wrote: > EAL reserves a huge area in virtual address space > to provide virtual address contiguity for e.g. > future memory extensions (memory hotplug). During > memory hotplug, if the hugepage mmap succeeds but > doesn't suffice EAL's requiriments, the EAL would > unmap this mapping straight away, leaving a hole in > its virtual memory area and making it available > to everyone. As EAL still thinks it owns the entire > region, it may try to mmap it later with MAP_FIXED, > possibly overriding a user's mapping that was made > in the meantime. > > This patch ensures each hole is mapped back by EAL, > so that it won't be available to anyone else. > > Signed-off-by: Dariusz Stojaczyk > --- Generally, if the commit is a bugfix, reference to the original commit that introduces the issue should be part of the commit message. See DPDK contribution guidelines [1] and "git fixline" [2]. Also, this bug is present in 18.05, so you should also Cc: stable@dpdk.org (same applies for all of your other patches for memalloc). [1] http://dpdk.org/doc/guides/contributing/patches.html [2] http://dpdk.org/dev > lib/librte_eal/linuxapp/eal/eal_memalloc.c | 17 +++++++++++++++++ > 1 file changed, 17 insertions(+) > > diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c > index 8c11f98..b959ea5 100644 > --- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c > +++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c > @@ -39,6 +39,7 @@ > #include "eal_filesystem.h" > #include "eal_internal_cfg.h" > #include "eal_memalloc.h" > +#include "eal_private.h" > > /* > * not all kernel version support fallocate on hugetlbfs, so fall back to > @@ -490,6 +491,8 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id, > int ret = 0; > int fd; > size_t alloc_sz; > + int flags; > + void *new_addr; > > /* takes out a read lock on segment or segment list */ > fd = get_seg_fd(path, sizeof(path), hi, list_idx, seg_idx); > @@ -585,6 +588,20 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id, > > mapped: > munmap(addr, alloc_sz); > + flags = MAP_FIXED; > +#ifdef RTE_ARCH_PPC_64 > + flags |= MAP_HUGETLB; > +#endif > + new_addr = eal_get_virtual_area(addr, &alloc_sz, alloc_sz, 0, flags); Page size shouldn't be zero, should be alloc_sz. > + if (new_addr != addr) { > + if (new_addr != NULL) { > + munmap(new_addr, alloc_sz); > + } > + /* We're leaving a hole in our virtual address space. If > + * somebody else maps this hole now, we might accidentally > + * override it in the future. */ > + rte_panic("can't mmap holes in our virtual address space"); I don't think rte_panic() here makes sense. First of all, we're now striving to not panic inside DPDK libraries, especially once initialization has already finished. But more importantly, when you reach this panic, you're deep in the memory allocation process, which means you hold a write lock to DPDK memory hotplug. If you crash now, the lock will never be released and subsequent memory hotplug requests will just deadlock. What's worse is you may be inside a secondary process, synchronizing the memory map with that of a primary, which means that even if you release the lock here, you're basically releasing someone else's lock, so behavior will be undefined at that point. I think an error message with the highest possible log level should suffice (CRIT?), as there's really nothing more we can do here. That said, how likely is this scenario? > + } > resized: > if (internal_config.single_file_segments) { > resize_hugefile(fd, path, list_idx, seg_idx, map_offset, > Generally, if the above is fixed, i'm OK with the patch. -- Thanks, Anatoly