From: Ilya Maximets <i.maximets@samsung.com>
To: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>,
Thomas Monjalon <thomas.monjalon@6wind.com>
Cc: dev@dpdk.org, David Marchand <david.marchand@6wind.com>,
Heetae Ahn <heetae82.ahn@samsung.com>,
Yuanhan Liu <yuanhan.liu@linux.intel.com>,
Jianfeng Tan <jianfeng.tan@intel.com>,
Neil Horman <nhorman@tuxdriver.com>,
Yulong Pei <yulong.pei@intel.com>,
stable@dpdk.org, Bruce Richardson <bruce.richardson@intel.com>
Subject: Re: [dpdk-dev] [PATCH] mem: balanced allocation of hugepages
Date: Mon, 10 Apr 2017 11:05:56 +0300
Message-ID: <b4d7e98b-773e-9927-ce5c-b3807b9a4b94@samsung.com>
In-Reply-To: <b9291962-ceb3-06e3-445d-5089fb0868c0@intel.com>
On 10.04.2017 10:51, Sergio Gonzalez Monroy wrote:
> On 10/04/2017 08:11, Ilya Maximets wrote:
>> On 07.04.2017 18:44, Thomas Monjalon wrote:
>>> 2017-04-07 18:14, Ilya Maximets:
>>>> Hi All.
>>>>
>>>> I wanted to ask (just to clarify the current status):
>>>> Will this patch be included in the current release (it is acked by the
>>>> maintainer), after which I upgrade it to the hybrid logic, or should I
>>>> just prepare a v3 with the hybrid logic for 17.08?
>>> What is your preferred option Ilya?
>> I have no strong opinion on this. One thought is that it would be
>> nice if someone else could test this functionality with the current
>> release before it is enabled by default in 17.08.
>>
>> Tomorrow I'm going on vacation, so I'll post a rebased version today
>> (there are a few fuzzes against current master) and you and Sergio can
>> decide what to do.
>>
>> Best regards, Ilya Maximets.
>>
>>> Sergio?
>
> I would be inclined towards v3 targeting v17.08. IMHO it would be cleaner this way.
OK.
I've sent a rebased version just in case.
>
> Sergio
>
>>>
>>>> On 27.03.2017 17:43, Ilya Maximets wrote:
>>>>> On 27.03.2017 16:01, Sergio Gonzalez Monroy wrote:
>>>>>> On 09/03/2017 12:57, Ilya Maximets wrote:
>>>>>>> On 08.03.2017 16:46, Sergio Gonzalez Monroy wrote:
>>>>>>>> Hi Ilya,
>>>>>>>>
>>>>>>>> I have done similar tests and, as you already pointed out, 'numactl --interleave' does not seem to work as expected.
>>>>>>>> I have also checked that the issue can be reproduced with a quota limit on the hugetlbfs mount point.
>>>>>>>>
>>>>>>>> I would be inclined towards *adding libnuma as dependency* to DPDK to make memory allocation a bit more reliable.
>>>>>>>>
>>>>>>>> Currently at a high level regarding hugepages per numa node:
>>>>>>>> 1) Try to map all free hugepages. The total number of mapped hugepages depends on whether there are any limits, such as cgroups or a quota on the mount point.
>>>>>>>> 2) Find out numa node of each hugepage.
>>>>>>>> 3) Check if we have enough hugepages for requested memory in each numa socket/node.
>>>>>>>>
>>>>>>>> Using libnuma we could try to allocate hugepages per numa:
>>>>>>>> 1) Try to map as many hugepages as possible from numa 0.
>>>>>>>> 2) Check if we have enough hugepages for requested memory in numa 0.
>>>>>>>> 3) Try to map as many hugepages as possible from numa 1.
>>>>>>>> 4) Check if we have enough hugepages for requested memory in numa 1.
>>>>>>>>
>>>>>>>> This approach would improve failing scenarios caused by limits, but it would still not fix issues regarding non-contiguous hugepages (in the worst case each hugepage is a memseg).
>>>>>>>> The non-contiguous hugepages issues are not as critical now that mempools can span over multiple memsegs/hugepages, but it is still a problem for any other library requiring big chunks of memory.
>>>>>>>>
>>>>>>>> Potentially if we were to add an option such as 'iommu-only' when all devices are bound to vfio-pci, we could have a reliable way to allocate hugepages by just requesting the number of pages from each numa.
>>>>>>>>
>>>>>>>> Thoughts?
>>>>>>> Hi Sergio,
>>>>>>>
>>>>>>> Thanks for your attention to this.
>>>>>>>
>>>>>>> For now, as we have some issues with non-contiguous
>>>>>>> hugepages, I'm thinking about the following hybrid scheme:
>>>>>>> 1) Allocate essential hugepages:
>>>>>>>    1.1) Allocate only as many hugepages from numa N as
>>>>>>>         needed to fit the memory requested for this numa.
>>>>>>>    1.2) Repeat 1.1 for all numa nodes.
>>>>>>> 2) Try to map all remaining free hugepages in a round-robin
>>>>>>>    fashion like in this patch.
>>>>>>> 3) Sort pages and choose the most suitable.
>>>>>>>
>>>>>>> This solution should decrease the number of issues connected with
>>>>>>> non-contiguous memory.
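
Steps 1 and 2 of the hybrid scheme quoted above can be illustrated with a small model. This is only a sketch under stated assumptions, not the EAL code: `struct hp_state`, `map_from()` and `hybrid_map()` are hypothetical names, the kernel is reduced to per-node counters of free pages plus a single cap on mapped pages, step 1 is modelled as strict per-node mapping, and step 3 (sorting pages and choosing the most suitable) is left out because it needs physical addresses.

```c
#include <assert.h>

#define MAX_NODES 8

/* Hypothetical model of the system: not a DPDK or kernel structure. */
struct hp_state {
    unsigned int nb_nodes;
    unsigned int free_pages[MAX_NODES]; /* free hugepages per node */
    unsigned int requested[MAX_NODES];  /* pages requested per node */
    unsigned int limit;                 /* cap on pages we may still map */
    unsigned int mapped[MAX_NODES];     /* pages mapped so far per node */
};

/* Try to map one page strictly from `node`; returns 1 on success. */
static int map_from(struct hp_state *s, unsigned int node)
{
    if (s->limit == 0 || s->free_pages[node] == 0)
        return 0;
    s->free_pages[node]--;
    s->limit--;
    s->mapped[node]++;
    return 1;
}

/* Returns 1 if every node ends up with at least its requested pages. */
static int hybrid_map(struct hp_state *s)
{
    unsigned int node, idle = 0;

    assert(s->nb_nodes > 0 && s->nb_nodes <= MAX_NODES);

    /* Step 1: essential pages, just enough for each node's request. */
    for (node = 0; node < s->nb_nodes; node++)
        while (s->mapped[node] < s->requested[node] && map_from(s, node))
            ;

    /* Step 2: remaining pages, round-robin over the nodes. Stop once a
     * full pass over all nodes maps nothing. */
    node = 0;
    while (idle < s->nb_nodes) {
        idle = map_from(s, node) ? 0 : idle + 1;
        node = (node + 1) % s->nb_nodes;
    }

    /* Step 3 (sort pages and choose the most suitable) is omitted. */
    for (node = 0; node < s->nb_nodes; node++)
        if (s->mapped[node] < s->requested[node])
            return 0;
    return 1;
}
```

With two nodes holding 45 free pages each (an assumed split), a cap of 32 pages and 4 pages requested per node, the model maps 16 pages on each node. An uneven request of 20 and 4 pages under a 24-page cap is also satisfied in this model, because step 1 runs before any round-robin mapping.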
>>>>>> Sorry for late reply, I was hoping for more comments from the community.
>>>>>>
>>>>>> IMHO this should be the default behavior, which means no config option and libnuma as an EAL dependency.
>>>>>> I think your proposal is good. Could you consider implementing this approach for the next release?
>>>>> Sure, I can implement this for 17.08 release.
>>>>>
>>>>>>>> On 06/03/2017 09:34, Ilya Maximets wrote:
>>>>>>>>> Hi all.
>>>>>>>>>
>>>>>>>>> So, what about this change?
>>>>>>>>>
>>>>>>>>> Best regards, Ilya Maximets.
>>>>>>>>>
>>>>>>>>> On 16.02.2017 16:01, Ilya Maximets wrote:
>>>>>>>>>> Currently EAL allocates hugepages one by one without paying
>>>>>>>>>> attention to which NUMA node each allocation comes from.
>>>>>>>>>>
>>>>>>>>>> Such behaviour leads to allocation failure if the number of
>>>>>>>>>> hugepages available to the application is limited by cgroups
>>>>>>>>>> or hugetlbfs and memory is requested not only from the first
>>>>>>>>>> socket.
>>>>>>>>>>
>>>>>>>>>> Example:
>>>>>>>>>> # 90 x 1GB hugepages available in a system
>>>>>>>>>>
>>>>>>>>>> cgcreate -g hugetlb:/test
>>>>>>>>>> # Limit to 32GB of hugepages
>>>>>>>>>> cgset -r hugetlb.1GB.limit_in_bytes=34359738368 test
>>>>>>>>>> # Request 4GB from each of 2 sockets
>>>>>>>>>> cgexec -g hugetlb:test testpmd --socket-mem=4096,4096 ...
>>>>>>>>>>
>>>>>>>>>> EAL: SIGBUS: Cannot mmap more hugepages of size 1024 MB
>>>>>>>>>> EAL: 32 not 90 hugepages of size 1024 MB allocated
>>>>>>>>>> EAL: Not enough memory available on socket 1!
>>>>>>>>>> Requested: 4096MB, available: 0MB
>>>>>>>>>> PANIC in rte_eal_init():
>>>>>>>>>> Cannot init memory
>>>>>>>>>>
>>>>>>>>>> This happens because all allocated pages are
>>>>>>>>>> on socket 0.
>>>>>>>>>>
>>>>>>>>>> Fix this issue by setting mempolicy MPOL_PREFERRED for each
>>>>>>>>>> hugepage to one of requested nodes in a round-robin fashion.
>>>>>>>>>> In this case all allocated pages will be fairly distributed
>>>>>>>>>> between all requested nodes.
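
The effect of the round-robin policy described above can be shown with a toy model. This is a sketch only and does not call `set_mempolicy()`: `struct hp_model`, `map_one()` and `map_all()` are hypothetical names, the 45/45 split of the 90 pages across two nodes is an assumption, the fallback to another node only loosely mimics MPOL_PREFERRED, and the model rotates over all nodes rather than only the requested ones.

```c
#include <assert.h>

#define MAX_NODES 8

/* Hypothetical model of the system: not a DPDK or kernel structure. */
struct hp_model {
    unsigned int nb_nodes;
    unsigned int free_pages[MAX_NODES]; /* free hugepages per node */
    unsigned int limit;                 /* cap on pages we may map */
};

/* Map one page, preferring `node` but falling back to the lowest-numbered
 * node that still has free pages. Returns the node used, or -1 when the
 * cap is reached or no pages are left. */
static int map_one(struct hp_model *m, unsigned int node)
{
    unsigned int i;

    if (m->limit == 0)
        return -1;
    if (m->free_pages[node] == 0) {
        for (i = 0; i < m->nb_nodes && m->free_pages[i] == 0; i++)
            ;
        if (i == m->nb_nodes)
            return -1;
        node = i;
    }
    m->free_pages[node]--;
    m->limit--;
    return (int)node;
}

/* Map pages until the cap or exhaustion. With round_robin == 0 every
 * request prefers node 0; otherwise the preferred node rotates.
 * The model is passed by value so the caller's copy is not modified. */
static void map_all(struct hp_model m, int round_robin,
                    unsigned int got[MAX_NODES])
{
    unsigned int next = 0;
    int node;

    assert(m.nb_nodes > 0 && m.nb_nodes <= MAX_NODES);
    for (unsigned int i = 0; i < MAX_NODES; i++)
        got[i] = 0;
    while ((node = map_one(&m, next)) >= 0) {
        got[node]++;
        if (round_robin)
            next = (next + 1) % m.nb_nodes;
    }
}
```

With two nodes of 45 free pages each and a cap of 32 pages, the model without rotation takes all 32 pages from node 0, which matches the "available: 0MB" on socket 1 in the log above. With rotation it takes 16 pages from each node.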
>>>>>>>>>>
>>>>>>>>>> New config option RTE_LIBRTE_EAL_NUMA_AWARE_HUGEPAGES is
>>>>>>>>>> introduced and disabled by default because of the external
>>>>>>>>>> dependency on libnuma.
>>>>>>>>>>
>>>>>>>>>> Cc: <stable@dpdk.org>
>>>>>>>>>> Fixes: 77988fc08dc5 ("mem: fix allocating all free hugepages")
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
>>>>>>>>>> ---
>>>>>>>>>> config/common_base | 1 +
>>>>>>>>>> lib/librte_eal/Makefile | 4 ++
>>>>>>>>>> lib/librte_eal/linuxapp/eal/eal_memory.c | 66 ++++++++++++++++++++++++++++++++
>>>>>>>>>> mk/rte.app.mk | 3 ++
>>>>>>>>>> 4 files changed, 74 insertions(+)
>>>>>> Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
>>>>> Thanks.
>>>>>
>>>>> Best regards, Ilya Maximets.
>>>>>
>>>
>>>
>>>
>>>
>
>
>
>