DPDK patches and discussions
From: "Burakov, Anatoly" <anatoly.burakov@intel.com>
To: Kevin Traynor <ktraynor@redhat.com>,
	Thomas Monjalon <thomas@monjalon.net>
Cc: dev@dpdk.org, Ravi1.Kumar@amd.com,
	jerin.jacob@caviumnetworks.com, hemant.agrawal@nxp.com,
	yskoh@mellanox.com, arybchenko@solarflare.com,
	damarion@cisco.com, stephen@networkplumber.org,
	olivier.matz@6wind.com, christian.ehrhardt@canonical.com,
	bluca@debian.org
Subject: Re: [dpdk-dev] [PATCH] config: reduce memory requirements for DPDK
Date: Thu, 26 Jul 2018 10:51:54 +0100
Message-ID: <3615ec97-73d4-78a8-9ddf-a8cabb0b1797@intel.com>
In-Reply-To: <0cde5b6f-558a-08cf-0a03-29eeb7772618@redhat.com>

On 25-Jul-18 6:43 PM, Kevin Traynor wrote:
> On 07/24/2018 01:03 PM, Thomas Monjalon wrote:
>> 24/07/2018 13:04, Burakov, Anatoly:
>>> On 24-Jul-18 11:23 AM, Thomas Monjalon wrote:
>>>> 24/07/2018 12:03, Anatoly Burakov:
>>>>> It has been reported that the current memory limitations do not work
>>>>> well on 8-socket machines in the default configuration when big
>>>>> page sizes are used [1].
>>>>>
>>>>> Fix it by reducing the amount of memory reserved by DPDK by default
>>>>> to 32G per page size per NUMA node. This allows 32G per page size
>>>>> per NUMA node to be reserved even on 8 nodes with two page sizes.
>>>>>
>>>>> [1] https://mails.dpdk.org/archives/dev/2018-July/108071.html
>>>>>
>>>>> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
>>>>> ---
>>>>>
>>>>> Notes:
>>>>>       We could have increased CONFIG_RTE_MAX_MEM_MB but this would've
>>>>>       brought other potential problems due to increased memory
>>>>>       preallocation, and secondary process initialization is flaky
>>>>>       enough as it is. I am willing to bet that 32G per page size is
>>>>>       more than enough for the majority of use cases, and any
>>>>>       application with bigger requirements could adjust config options
>>>>>       itself.
>>>> [...]
>>>>> -CONFIG_RTE_MAX_MEMSEG_PER_TYPE=32768
>>>>> -CONFIG_RTE_MAX_MEM_MB_PER_TYPE=131072
>>>>> +CONFIG_RTE_MAX_MEMSEG_PER_TYPE=16384
>>>>> +CONFIG_RTE_MAX_MEM_MB_PER_TYPE=32768
>>>>
>>>> Ideally, it should be a run-time option.
>>>>
>>>
>>> It can be, yes, and this can be worked on for the next release. However,
>>> we also need to have good default values that work across all supported
>>> platforms.
>>
>> Yes, sure, we can wait until the next release for a run-time option.
>>
>> How can we be sure these default values are good enough?
> 
> Why add a new limitation? Why not take the other suggested approach of
> increasing the maximum possible memory?

The commit notes explain that :) Basically, increasing the total amount of 
allocatable memory increases the risk of secondary processes not working 
due to an inability to map segments at the same addresses. Granted, the 
"usual" case of running DPDK on a 1- or 2-socket machine with one or two 
page sizes would not be affected by an increase in the total amount of 
memory, so things would stay as they are.

However, reducing the memory requirements will reduce VA space 
consumption for what I perceive to be the most common case (under 32G per 
page size per NUMA node), thereby improving the secondary process 
experience, while still allowing 8 NUMA nodes with two page sizes to work 
with the default settings.
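
To put rough numbers on that (this is just the worst case from the commit 
message spelled out, not something measured on a real system): with 8 NUMA 
nodes and two page sizes there are 16 "memory types", so the per-type limit 
sets the following upper bounds on preallocated VA space, with the global 
CONFIG_RTE_MAX_MEM_MB cap still applying on top of it:

  # 8 NUMA nodes * 2 page sizes = 16 memory types
  #
  # new default: 32G per type  -> at most 16 * 32G  = 512G of VA space
  CONFIG_RTE_MAX_MEM_MB_PER_TYPE=32768
  #
  # old default: 128G per type -> at most 16 * 128G = 2T of VA space
  #CONFIG_RTE_MAX_MEM_MB_PER_TYPE=131072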

> 
> If there are new limitations or backwards-compatibility issues with the
> default settings compared with before the large memory management
> rework, then it would be good to have that made clear in the docs at a
> high level for users who want to update.

Agreed, I will follow this patch up with doc updates.

> 
> It would also help a lot to add what the implications and limits of
> changing the most important defines are - will it be slower? Will it not
> work above X? etc.

The main impact is on the amount of VA-contiguous memory you can have in 
DPDK, and on the total amount of memory you can have in DPDK. How that 
memory performs (slower, faster, etc.) is not affected.

So, for example, if you really needed a single VA-contiguous memzone of 
20 gigabytes - yes, this change would affect you. However, I suspect that 
is not a common case, and if you have gone that far, increasing the memory 
limits yourself would not be a big deal anyway.
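
For completeness, an application that really does need such a memzone could 
restore the pre-patch values before building - a sketch, assuming the 
build-time options stay where they are today (config/common_base) and using 
the old values from the diff above:

  # restore the old per-type limits to allow very large VA-contiguous
  # allocations, at the cost of more VA space preallocated per process
  CONFIG_RTE_MAX_MEMSEG_PER_TYPE=32768
  CONFIG_RTE_MAX_MEM_MB_PER_TYPE=131072

The trade-off is exactly the one described above: more VA space reserved up 
front, and a correspondingly higher chance of secondary processes failing 
to map segments at the same addresses.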

> 
>> It would be good to have several acks from various projects or companies.


-- 
Thanks,
Anatoly

Thread overview: 6+ messages
2018-07-24 10:03 Anatoly Burakov
2018-07-24 10:23 ` Thomas Monjalon
2018-07-24 11:04   ` Burakov, Anatoly
2018-07-24 12:03     ` Thomas Monjalon
2018-07-25 17:43       ` Kevin Traynor
2018-07-26  9:51         ` Burakov, Anatoly [this message]
