From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by dpdk.org (Postfix) with ESMTP id 5CDB737B0 for ; Tue, 12 Mar 2019 12:02:12 +0100 (CET) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by fmsmga107.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 12 Mar 2019 04:02:11 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.58,471,1544515200"; d="scan'208";a="130930482" Received: from aburakov-mobl1.ger.corp.intel.com (HELO [10.237.220.66]) ([10.237.220.66]) by fmsmga008.fm.intel.com with ESMTP; 12 Mar 2019 04:02:10 -0700 To: "Lilijun (Jerry, Cloud Networking)" , "dev@dpdk.org" Cc: "jerry.zhang@intel.com" , "ian.stokes@intel.com" References: <20190308053859.19980-1-jerry.lilijun@huawei.com> <5d2df2f7-76a7-4fb3-2911-af6a431a80a1@intel.com> <40280F65B1B0B44E8089ED31C01616EBA43F936C@dggeml529-mbx.china.huawei.com> From: "Burakov, Anatoly" Message-ID: <58a8100f-4ea1-88e5-c8dd-76ae57c09b12@intel.com> Date: Tue, 12 Mar 2019 11:02:09 +0000 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:60.0) Gecko/20100101 Thunderbird/60.5.3 MIME-Version: 1.0 In-Reply-To: <40280F65B1B0B44E8089ED31C01616EBA43F936C@dggeml529-mbx.china.huawei.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit Subject: Re: [dpdk-dev] [PATCH] eal: unmap unneed dpdk VA spaces for legacy mem X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 12 Mar 2019 11:02:12 -0000 On 12-Mar-19 1:47 AM, Lilijun (Jerry, Cloud Networking) wrote: > Hi Anatoly, > >> -----Original Message----- >> From: Burakov, Anatoly [mailto:anatoly.burakov@intel.com] >> Sent: Friday, March 08, 2019 5:38 PM >> To: Lilijun (Jerry, Cloud Networking) ; >> dev@dpdk.org >> Cc: jerry.zhang@intel.com; ian.stokes@intel.com >> Subject: Re: [dpdk-dev] [PATCH] eal: unmap unneed dpdk VA spaces for >> legacy mem >> >> On 08-Mar-19 5:38 AM, Lilijun wrote: >>> Comparing dpdk VA spaces to dpdk 16.11, the dpdk app process's VA >> spaces increase to above 30G. >>> Here we can unmap the unneed VA spaces in rte_memseg_list. >>> >>> Signed-off-by: Lilijun >>> --- >>> lib/librte_eal/linuxapp/eal/eal_memory.c | 13 ++++++++++++- >>> 1 file changed, 12 insertions(+), 1 deletion(-) >>> >>> diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c >>> b/lib/librte_eal/linuxapp/eal/eal_memory.c >>> index 32feb41..56abdd2 100644 >>> --- a/lib/librte_eal/linuxapp/eal/eal_memory.c >>> +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c >>> @@ -1626,8 +1626,19 @@ void numa_error(char *where) >>> if (msl->base_va == NULL) >>> continue; >>> /* skip lists where there is at least one page allocated */ >>> - if (msl->memseg_arr.count > 0) >>> + if (msl->memseg_arr.count > 0) { >>> + if (internal_config.legacy_mem) { >>> + struct rte_fbarray *arr = &msl->memseg_arr; >>> + int idx = rte_fbarray_find_next_free(arr, 0); >>> + >>> + while (idx >= 0) { >>> + void *va = (void*)((char*)msl- >>> base_va + idx * msl->page_sz); >>> + munmap(va, msl->page_sz); >>> + idx = rte_fbarray_find_next_free(arr, >> idx + 1); >>> + } >> >> I am not entirely convinced this change is safe to do. Technically, this space is >> marked as free, so correctly written code should not attempt to access it, >> however it is still potentially dangerous to have memory area that is >> supposed to be allocated (according to data structures' >> parameters), but isn't. >> >> If you are deallocating the VA space, ideally you should also resize the >> memseg list (as in, change its length), because that leftover memory area is >> no longer valid. However, this then presents us with a mismatch between >> (va_start + len) and (va_start + page_sz * memseg_arr.len), which may >> break things further. > > Yes, you're right, here we need resize the memseg length. I will update it if this patch is needed. Resizing memseg list is not the best course of action because fbarray itself doesn't support resizing, so you'll end up with a mismatch between length of memory and length of fbarray backing the memseg list. See below suggestion for implementation. >> >> May i ask what is the purpose of this change? I mean, i understand the part >> about unused VA space sitting there, but what is the consequence of that? >> This isn't 32-bit codepath, and in 64-bit there's plenty of address space to go >> around, and this memory doesn't take up any system resources anyway >> because it is read-only anonymous memory, and is therefore backed by zero >> page instead of real pages. So, what's wrong with just leaving it there? > > This change will cause a issues: when dpdk apps crashed, the coredump file will become too large. > > Thanks. You must have different default coredump settings than i do, because i haven't seen Linux attempting to dump the entire address space before (i have seen FreeBSD do that, mind you...). > >> >> I don't see any advantage of this change, and i see plenty of disadvantages, >> so for now i'm inclined to NACK this particular patch. >> >> _However_, i should note that if you feel this is very important feature to >> have and would still like to implement it, my advise would be to look at how >> 32-bit code works, and model the 64-bit implementation after that, because >> 32-bit codepath does exactly what you propose, and doesn't leave unused >> address space. The above is the way to go as far as implementing this particular feature goes: this has to be done at memseg list allocation time, not post-factum, when memseg lists are already allocated. >> >>> + } > continue; >>> + } >>> /* this is an unused list, deallocate it */ >>> mem_sz = msl->len; >>> munmap(msl->base_va, mem_sz); >>> >> >> >> -- >> Thanks, >> Anatoly -- Thanks, Anatoly