From: Ilya Maximets
To: "Burakov, Anatoly", Asaf Sinai, "dev@dpdk.org", Thomas Monjalon
Date: Mon, 26 Nov 2018 16:20:23 +0300
Message-ID: <5a045a4f-4037-cebf-ea02-7018c28918d0@samsung.com>
In-Reply-To: <6ce31b20-19ea-ccaf-17d4-f36ab3959710@samsung.com>
References: <2b09cec8-0883-2ed2-0264-aeef871ea6a9@intel.com>
 <518f9333-8d80-0fa2-d391-b4c8df181508@intel.com>
 <12283bd1-ea0d-38d1-f64d-508596e48cd9@intel.com>
 <6ce31b20-19ea-ccaf-17d4-f36ab3959710@samsung.com>
Subject: Re: [dpdk-dev] CONFIG_RTE_EAL_NUMA_AWARE_HUGEPAGES: no difference
 in memory pool allocations, when enabling/disabling this configuration
List-Id: DPDK patches and discussions

On 26.11.2018 16:16, Ilya Maximets wrote:
> On 26.11.2018 15:50, Burakov, Anatoly wrote:
>> On 26-Nov-18 11:43 AM, Burakov, Anatoly wrote:
>>> On 26-Nov-18 11:33 AM, Asaf Sinai wrote:
>>>> Hi Anatoly,
>>>>
>>>> We did not check it with "testpmd", only with our application.
>>>> From the beginning, we did not enable this configuration (look at
>>>> attached files), and everything works fine.
>>>> Of course we rebuild DPDK when we change the configuration.
>>>> Please note that we use DPDK 17.11.3, maybe this is why it works fine?
>>>
>>> Just tested with DPDK 17.11, and yes, it does work the way you are
>>> describing. This is not intended behavior. I will look into it.
>>>
>>
>> +CC author of commit introducing CONFIG_RTE_EAL_NUMA_AWARE_HUGEPAGES.
>>
>> Looking at the code, I think this config option needs to be reworked
>> and we should clarify what we mean by this option. It appears that
>> I've misunderstood what this option was actually intended to do, and I
>> also think its naming could be improved because it's confusing and
>> misleading.
>>
>> In 17.11, this option does *not* prevent EAL from using NUMA - it
>> merely disables using libnuma to perform memory allocation. This looks
>> like intended (if counter-intuitive) behavior - disabling this option
>> will simply revert DPDK to working as it did before this option was
>> introduced (i.e. best-effort allocation). This is why your code still
>> works - because EAL still does allocate memory on socket 1, and
>> *knows* that it's socket 1 memory. It still supports NUMA.
>>
>> The commit message for these changes states that the actual purpose of
>> this option is to enable "balanced" hugepage allocation. In case of
>> cgroups limitations, previously, DPDK would've exhausted all hugepages
>> on the master core's socket before attempting to allocate from other
>> sockets, but by the time we've reached the cgroups limit on the number
>> of hugepages, we might not have reached socket 1 and thus missed out
>> on the pages we could've allocated, but didn't. Using libnuma solves
>> this issue, because now we can allocate pages on the sockets we want,
>> instead of hoping we won't run out of hugepages before we get the
>> memory we need.
>>
>> From 18.05 onwards, this option works differently (and arguably
>> wrongly). More specifically, it disallows allocations on sockets other
>> than 0, and it also makes it so that EAL does not check which socket
>> the memory *actually* came from.
>> So, not only is allocating memory from socket 1 disabled, but
>> allocating from socket 0 may even get you memory from socket 1!
>
> I'd consider this a bug.
>
>>
>> +CC Thomas
>>
>> The CONFIG_RTE_EAL_NUMA_AWARE_HUGEPAGES option is a misnomer, because
>> it makes it seem like this option disables NUMA support, which is not
>> the case.
>>
>> I would also argue that it is not relevant to the 18.05+ memory
>> subsystem, and should only work in legacy mode, because it is
>> *impossible* to make it work right in the new memory subsystem, and
>> here's why:
>>
>> Without libnuma, we have no way of "asking" the kernel to allocate a
>> hugepage on a specific socket - instead, any allocation will most
>> likely happen on the socket from which the allocation request came.
>> For example, if the user program's lcore is on socket 1, an allocation
>> on socket 0 will actually allocate a page on socket 1.
>>
>> If we don't check for the page's NUMA node affinity (which is what
>> currently happens), we get performance degradation because we may
>> unintentionally allocate memory on the wrong NUMA node. If we do check
>> for this, then allocation of memory on socket 1 from an lcore on
>> socket 0 will almost never succeed, because the kernel will always
>> give us pages on socket 0.
>>
>> Put simply, there is no sane way to make this option work for the new
>> memory subsystem - IMO it should be dropped, and libnuma should be
>> made a hard dependency on Linux.
>
> I agree that the new memory model could not work without libnuma, i.e.
> it will lead to unpredictable memory allocations with no respect for
> the requested socket_id's. I also agree that
> CONFIG_RTE_EAL_NUMA_AWARE_HUGEPAGES is only sane for a legacy memory
> model.
> It looks like we have no other choice than to just drop the option and
> make the code unconditional, i.e. have a hard dependency on libnuma.
>

We could probably compile this code and have the hard dependency only
for platforms with 'RTE_MAX_NUMA_NODES > 1'.

> Best regards, Ilya Maximets.
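
The allocation argument quoted above can be illustrated with a toy model.
This is only a sketch under stated assumptions, not DPDK or kernel code:
every function name below is hypothetical, and the "kernel" is reduced to
a dictionary of free pages per socket that prefers the caller's own
socket, which is the "most likely" behaviour described in the thread.

```python
# Toy model of the three strategies discussed in the thread.
# All names are hypothetical; nothing here is a DPDK or libnuma API.

def kernel_alloc(caller_socket, free_pages):
    """Assumed kernel behaviour without libnuma: take a page from the
    caller's own socket if one is free, otherwise from any other socket.
    Returns the socket the page actually came from, or None."""
    others = sorted(s for s in free_pages if s != caller_socket)
    for sock in [caller_socket] + others:
        if free_pages.get(sock, 0) > 0:
            free_pages[sock] -= 1
            return sock
    return None


def alloc_unchecked(requested, caller_socket, free_pages):
    """No affinity check: the page is labelled with the requested socket,
    whatever it really is. Returns (label, actual_socket)."""
    actual = kernel_alloc(caller_socket, free_pages)
    return (requested, actual)


def alloc_checked(requested, caller_socket, free_pages):
    """With an affinity check: a page from the wrong socket is given back
    and the allocation fails."""
    actual = kernel_alloc(caller_socket, free_pages)
    if actual is None:
        return None
    if actual != requested:
        free_pages[actual] += 1  # release the page we cannot use
        return None
    return actual


def alloc_with_binding(requested, free_pages):
    """libnuma-style model: the request names the socket explicitly, so
    the caller's location no longer matters."""
    if free_pages.get(requested, 0) > 0:
        free_pages[requested] -= 1
        return requested
    return None


if __name__ == "__main__":
    pages = {0: 2, 1: 2}
    # lcore on socket 1 asks for socket 0 memory, no check
    print(alloc_unchecked(0, 1, pages))
    # lcore on socket 0 asks for socket 1 memory, with check
    print(alloc_checked(1, 0, pages))
    # explicit binding to socket 1
    print(alloc_with_binding(1, pages))
```

In this model the unchecked path hands back a page labelled socket 0
that really lives on socket 1, the checked path fails for any remote
socket, and only the explicit binding gets the requested socket. That is
the dilemma described above for the new memory subsystem without libnuma.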