DPDK patches and discussions
From: "Kumar, Ravi1" <Ravi1.Kumar@amd.com>
To: "Burakov, Anatoly" <anatoly.burakov@intel.com>,
	"dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] DPDK 18.05 only works with up to 4 NUMAs systems
Date: Fri, 5 Oct 2018 08:56:01 +0000
Message-ID: <MW2PR12MB2570E0A780C17A79A57785C7AEEB0@MW2PR12MB2570.namprd12.prod.outlook.com>
In-Reply-To: <1d3910a0-7673-4be7-27f9-41f75a4a4cf6@intel.com>

>On 24-Jul-18 10:39 AM, Kumar, Ravi1 wrote:
>>>
>>>
>>> -----Original Message-----
>>> From: Burakov, Anatoly <anatoly.burakov@intel.com>
>>> Sent: Tuesday, July 24, 2018 2:33 PM
>>> To: Kumar, Ravi1 <Ravi1.Kumar@amd.com>; dev@dpdk.org
>>> Subject: Re: [dpdk-dev] DPDK 18.05 only works with up to 4 NUMAs 
>>> systems
>>>
>>> On 24-Jul-18 9:09 AM, Kumar, Ravi1 wrote:
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: Burakov, Anatoly <anatoly.burakov@intel.com>
>>>>> Sent: Monday, July 16, 2018 4:05 PM
>>>>> To: Kumar, Ravi1 <Ravi1.Kumar@amd.com>; dev@dpdk.org
>>>>> Subject: Re: [dpdk-dev] DPDK 18.05 only works with up to 4 NUMAs 
>>>>> systems
>>>>>
>>>>> On 14-Jul-18 10:44 AM, Kumar, Ravi1 wrote:
>>>>>>
>>>>>> Memory setup with 2M pages works with the default configuration.
>>>>>> With the default configuration and 2M hugepages:
>>>>>>
>>>>>> 1. Total amount of memory for each NUMA zone does not
>>>>>> exceed 128G (CONFIG_RTE_MAX_MEM_MB_PER_TYPE).
>>>>>>
>>>>>> 2. Total number of segments per NUMA zone is limited to
>>>>>> 32768 (CONFIG_RTE_MAX_MEMSEG_PER_TYPE). This limit is reached for
>>>>>> each NUMA zone and is the limiting factor for memory per NUMA zone
>>>>>> with 2M hugepages and the default configuration.
>>>>>>
>>>>>> 3. The data structures are capable of supporting 64G of
>>>>>> memory for each NUMA zone (32768 segments * 2M hugepagesize).
>>>>>>
>>>>>> 4. 8 NUMA zones * 64G = 512G. Therefore the total for all
>>>>>> NUMA zones does not exceed 512G (CONFIG_RTE_MAX_MEM_MB).
>>>>>>
>>>>>> 5. Resources are capable of allocating up to 64G per NUMA
>>>>>> zone. Things will work as long as there are enough 2M hugepages
>>>>>> to cover the memory needs of the DPDK applications AND no NUMA
>>>>>> zone needs more than 64G.
>>>>>>
>>>>>> With the default configuration and 1G hugepages:
>>>>>>
>>>>>> 1. Total amount of memory for each NUMA zone is limited to
>>>>>> 128G (CONFIG_RTE_MAX_MEM_MB_PER_TYPE). This constraint is hit for
>>>>>> each NUMA zone. This is the limiting factor for memory per NUMA zone.
>>>>>>
>>>>>> 2. Total number of segments per NUMA zone (128) does not exceed
>>>>>> 32768 (CONFIG_RTE_MAX_MEMSEG_PER_TYPE).
>>>>>>
>>>>>> 3. The data structures are capable of supporting 128G of
>>>>>> memory for each NUMA zone (128 segments * 1G hugepagesize).
>>>>>> However, only the first four NUMA zones get initialized before we
>>>>>> hit CONFIG_RTE_MAX_MEM_MB (512G).
>>>>>>
>>>>>> 4. The total for all NUMA zones is limited to 512G
>>>>>> (CONFIG_RTE_MAX_MEM_MB). This limit is hit after configuring the
>>>>>> first four NUMA zones (4 x 128G = 512G). The rest of the NUMA zones
>>>>>> cannot allocate memory.
>>>>>>
>>>>>> Apparently, the default configuration is intended to support up to
>>>>>> 8 NUMA zones (CONFIG_RTE_MAX_NUMA_NODES=8), but when 1G hugepages
>>>>>> are used, it can only support up to 4.
>>>>>>
>>>>>> Possible workarounds when using 1G hugepages:
>>>>>>
>>>>>> 1. Decrease CONFIG_RTE_MAX_MEM_MB_PER_TYPE to 65536 (limit
>>>>>> of 64G per NUMA zone). This is probably the best option unless
>>>>>> you need a lot of memory in any given NUMA zone.
>>>>>>
>>>>>> 2. Or, increase CONFIG_RTE_MAX_MEM_MB to 1048576 (1T in total).
>>>>>
>>>>> Hi Ravi,
>>>>>
>>>>> OK this makes it much clearer, thanks!
>>>>>
>>>>> I think the first one should be done. I think 64G per NUMA node is 
>>>>> still a reasonable amount of memory and it makes the default work 
>>>>> (I think we can go as far as reducing this limit to 32G per type!), 
>>>>> and whoever has issues with it can change 
>>>>> CONFIG_RTE_MAX_MEM_MB_PER_TYPE or CONFIG_RTE_MAX_MEM_MB for their 
>>>>> use case. That's what these options are there for :)
>>>>>
>>>>> --
>>>>> Thanks,
>>>>> Anatoly
>>>>>
>>>>
>>>> Hi Anatoly,
>>>>
>>>> Thanks a lot. Will the next release include this change?
>>>>
>>>> Regards,
>>>> Ravi
>>>>
>>>
>>> No one has submitted a patch for this, so not at this moment. I will
>>> do so now, but I cannot guarantee it getting merged in 18.08 since
>>> it's almost RC2 time, and introducing such a change may be too big a
>>> risk.
>>>
>>> --
>>> Thanks,
>>> Anatoly
>> Thanks Anatoly. I understand.
>> 
>> Regards,
>> Ravi
>> 
>
>Hi Ravi,
>
>In addition to predefined limits that aren't suitable for platforms with
>high numbers of sockets, the calculation method itself has had numerous
>bugs that prevented it from working if limits were changed. I've come up
>with a patch that should fix the issue without the need to change the
>config:
>
>http://patches.dpdk.org/patch/46112/
>
>It would be great if you could test it!
>
>--
>Thanks,
>Anatoly

Hi Anatoly,

Thank you very much. This looks really good. I will test this and get back to you. 

Regards,
Ravi
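
For reference, the arithmetic from the quoted 14-Jul message can be sketched as below. This is only a simplified model of the limits as they are described in this thread, not DPDK's actual memseg list calculation. The function name `numa_nodes_covered` is hypothetical, and the model assumes that every NUMA zone reserves its full per-type maximum and that zones are filled in order until CONFIG_RTE_MAX_MEM_MB is used up.

```python
# Simplified model of the per-NUMA memory limits discussed in this thread.
# Assumption: each NUMA zone reserves min(per-type memory cap, segment cap *
# page size), and zones are filled in order until the global cap is used up.
# The defaults mirror the config values quoted above (in MB / segment counts).

def numa_nodes_covered(page_mb,
                       max_mem_mb=524288,           # CONFIG_RTE_MAX_MEM_MB (512G)
                       max_mem_mb_per_type=131072,  # CONFIG_RTE_MAX_MEM_MB_PER_TYPE (128G)
                       max_memseg_per_type=32768,   # CONFIG_RTE_MAX_MEMSEG_PER_TYPE
                       max_numa_nodes=8):           # CONFIG_RTE_MAX_NUMA_NODES
    """Return (memory per NUMA zone in MB, number of zones fully covered)."""
    # Per-zone memory is bounded by both the memory cap and the segment cap.
    per_node_mb = min(max_mem_mb_per_type, max_memseg_per_type * page_mb)
    # The global cap determines how many zones can reserve that much.
    covered = min(max_numa_nodes, max_mem_mb // per_node_mb)
    return per_node_mb, covered


if __name__ == "__main__":
    cases = [
        ("2M pages, defaults", dict(page_mb=2)),
        ("1G pages, defaults", dict(page_mb=1024)),
        ("1G pages, MAX_MEM_MB_PER_TYPE=65536",
         dict(page_mb=1024, max_mem_mb_per_type=65536)),
        ("1G pages, MAX_MEM_MB=1048576",
         dict(page_mb=1024, max_mem_mb=1048576)),
    ]
    for label, kwargs in cases:
        per_node_mb, covered = numa_nodes_covered(**kwargs)
        print(f"{label}: {per_node_mb // 1024}G per zone, {covered} zones")
```

Under these assumptions the model gives 64G per zone across all 8 zones for 2M pages, 128G per zone across only 4 zones for 1G pages, and 8 zones again with either of the two workarounds, which matches the figures in the quoted message.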

Thread overview: 13+ messages
2018-06-22 16:37 Kumar, Ravi1
2018-06-25 16:16 ` Burakov, Anatoly
2018-06-28  7:03   ` Kumar, Ravi1
2018-06-28  8:42     ` Burakov, Anatoly
2018-07-14  9:44       ` Kumar, Ravi1
2018-07-16 10:35         ` Burakov, Anatoly
2018-07-24  8:09           ` Kumar, Ravi1
2018-07-24  9:03             ` Burakov, Anatoly
2018-07-24  9:39               ` Kumar, Ravi1
2018-10-05  8:32                 ` Burakov, Anatoly
2018-10-05  8:56                   ` Kumar, Ravi1 [this message]
2018-10-04 17:07 Sandeep Raman
2018-10-05 15:31 ` Burakov, Anatoly
