From: Ilya Maximets
To: "Burakov, Anatoly", Asaf Sinai, "dev@dpdk.org", Thomas Monjalon
Date: Mon, 26 Nov 2018 16:16:55 +0300
Message-ID: <6ce31b20-19ea-ccaf-17d4-f36ab3959710@samsung.com>
In-Reply-To: <12283bd1-ea0d-38d1-f64d-508596e48cd9@intel.com>
References: <2b09cec8-0883-2ed2-0264-aeef871ea6a9@intel.com> <518f9333-8d80-0fa2-d391-b4c8df181508@intel.com> <12283bd1-ea0d-38d1-f64d-508596e48cd9@intel.com>
Subject: Re: [dpdk-dev] CONFIG_RTE_EAL_NUMA_AWARE_HUGEPAGES: no difference in memory pool allocations, when enabling/disabling this configuration
List-Id: DPDK patches and discussions

On 26.11.2018 15:50, Burakov, Anatoly wrote:
> On 26-Nov-18 11:43 AM, Burakov, Anatoly wrote:
>> On 26-Nov-18 11:33 AM, Asaf Sinai wrote:
>>> Hi Anatoly,
>>>
>>> We did not check it with "testpmd", only with our application.
>>> From the beginning, we did not enable this configuration (look at attached files), and everything works fine.
>>> Of course we rebuild DPDK, when we change configuration.
>>> Please note that we use DPDK 17.11.3, maybe this is why it works fine?
>>
>> Just tested with DPDK 17.11, and yes, it does work the way you are describing. This is not intended behavior. I will look into it.
>>
>
> +CC author of commit introducing CONFIG_RTE_EAL_NUMA_AWARE_HUGEPAGES.
>
> Looking at the code, I think this config option needs to be reworked and we should clarify what we mean by this option. It appears that I've misunderstood what this option was actually intended to do, and I also think its naming could be improved because it's confusing and misleading.
>
> In 17.11, this option does *not* prevent EAL from using NUMA - it merely disables using libnuma to perform memory allocation. This looks like intended (if counter-intuitive) behavior - disabling this option will simply revert DPDK to working as it did before this option was introduced (i.e. best-effort allocation). This is why your code still works - because EAL still does allocate memory on socket 1, and *knows* that it's socket 1 memory. It still supports NUMA.
>
> The commit message for these changes states that the actual purpose of this option is to enable "balanced" hugepage allocation. In case of cgroups limitations, previously, DPDK would've exhausted all hugepages on master core's socket before attempting to allocate from other sockets, but by the time we've reached cgroups limits on numbers of hugepages, we might not have reached socket 1 and thus missed out on the pages we could've allocated, but didn't. Using libnuma solves this issue, because now we can allocate pages on sockets we want, instead of hoping we won't run out of hugepages before we get the memory we need.
>
> In 18.05 onwards, this option works differently (and arguably wrongly). More specifically, it disallows allocations on sockets other than 0, and it also makes it so that EAL does not check which socket the memory *actually* came from. So, not only is allocating memory from socket 1 disabled, but allocating from socket 0 may even get you memory from socket 1! I'd consider this a bug.
>
> +CC Thomas
>
> The CONFIG_RTE_EAL_NUMA_AWARE_HUGEPAGES option is a misnomer, because it makes it seem like this option disables NUMA support, which is not the case.
>
> I would also argue that it is not relevant to 18.05+ memory subsystem, and should only work in legacy mode, because it is *impossible* to make it work right in the new memory subsystem, and here's why:
>
> Without libnuma, we have no way of "asking" the kernel to allocate a hugepage on a specific socket - instead, any allocation will most likely happen on the socket from which the allocation request came. For example, if user program's lcore is on socket 1, allocation on socket 0 will actually allocate a page on socket 1.
>
> If we don't check for page's NUMA node affinity (which is what currently happens) - we get performance degradation because we may unintentionally allocate memory on wrong NUMA node. If we do check for this - then allocation of memory on socket 1 from lcore on socket 0 will almost never succeed, because kernel will always give us pages on socket 0.
>
> Put simply, there is no sane way to make this option work for the new memory subsystem - IMO it should be dropped, and libnuma should be made a hard dependency on Linux.

I agree that the new memory model cannot work without libnuma, i.e. it will lead to unpredictable memory allocations with no respect for the requested socket_ids.

I also agree that CONFIG_RTE_EAL_NUMA_AWARE_HUGEPAGES is only sane for the legacy memory model.

It looks like we have no choice but to drop the option and make the code unconditional, i.e. have a hard dependency on libnuma.

Best regards, Ilya Maximets.
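
For illustration only, here is a toy model of the cgroups scenario described in the quoted text. It is not DPDK or libnuma code: the function names, the page counts and the cgroup limit are all made up, and real hugepage allocation goes through the kernel rather than a dictionary. It only shows why draining the master core's socket first can leave socket 1 with nothing, while per-socket placement (which libnuma makes possible) does not.

```python
def best_effort(free_per_socket, cgroup_limit, master_socket=0):
    """Hypothetical model of best-effort allocation: drain the master
    core's socket first, then the others, until the cgroup limit is hit."""
    others = [s for s in sorted(free_per_socket) if s != master_socket]
    got = {s: 0 for s in free_per_socket}
    budget = cgroup_limit
    for socket in [master_socket] + others:
        take = min(free_per_socket[socket], budget)
        got[socket] = take
        budget -= take
    return got


def balanced(free_per_socket, cgroup_limit, wanted):
    """Hypothetical model of per-socket placement: take only what was
    requested on each socket, still bounded by the cgroup limit."""
    got = {s: 0 for s in free_per_socket}
    budget = cgroup_limit
    for socket, pages in wanted.items():
        take = min(pages, free_per_socket[socket], budget)
        got[socket] = take
        budget -= take
    return got


free = {0: 8, 1: 8}   # free hugepages per socket (made-up numbers)
limit = 6             # cgroup limit on hugepages (made-up number)

print(best_effort(free, limit))                # {0: 6, 1: 0}
print(balanced(free, limit, {0: 3, 1: 3}))     # {0: 3, 1: 3}
```

In the first call the whole budget is spent on socket 0 before socket 1 is ever reached, which is the "missed out on the pages we could've allocated" case from the quoted explanation.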