DPDK patches and discussions
From: Newman Poborsky <newman555p@gmail.com>
To: Stephen Hemminger <stephen@networkplumber.org>
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] rte_mempool_create fails with ENOMEM
Date: Thu, 8 Jan 2015 09:19:39 +0100
Message-ID: <CAHW=9Pvwze9RJ2-Km6-HRq7QjxeYkq+tagT7g-w73k_DaVT1FQ@mail.gmail.com>
In-Reply-To: <CAHW=9PtiuHN=d5J1aMbp_T9YUVMw3Bu8s7zS_83TY4J0LE=VUQ@mail.gmail.com>

I finally found the time to try this, and I noticed that on a server
with 1 NUMA node this works, but if the server has 2 NUMA nodes then,
with the default memory policy, the reserved hugepages are split
between the two nodes and the DPDK test app again fails for the reason
already mentioned. I found out that the 'solution' is to deallocate
the hugepages on node1 (after boot) and leave them only on node0:
echo 0 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages

Could someone please explain what changes when there are hugepages on
both nodes? Does this cause some memory fragmentation so that there
aren't enough contiguous segments? If so, how?
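
In the meantime, one way to look at this directly is to walk the table
returned by rte_eal_get_physmem_layout() after rte_eal_init() and print
the size and NUMA socket of every physically contiguous segment EAL
assembled. Below is a rough, untested sketch against the memseg layout
of this DPDK release; it shows the segments EAL grabbed, not how much
of each is still free, but if every segment is only 2MB the
fragmentation is visible directly:

#include <stdio.h>
#include <inttypes.h>
#include <rte_config.h>
#include <rte_memory.h>

/* Print each physically contiguous segment EAL assembled, with its
 * physical address, size and NUMA socket (field names as in this
 * era's rte_memory.h). */
static void
dump_memsegs(void)
{
        const struct rte_memseg *ms = rte_eal_get_physmem_layout();
        unsigned i;

        for (i = 0; i < RTE_MAX_MEMSEG; i++) {
                if (ms[i].addr == NULL)
                        break;
                printf("seg %u: phys=0x%" PRIx64 " len=%zu MB socket=%d\n",
                       i, (uint64_t)ms[i].phys_addr,
                       ms[i].len / (1024 * 1024), (int)ms[i].socket_id);
        }
}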

Thanks!

Newman
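
For reference, the packet pool in the test app presumably follows the
usual pktmbuf pattern of this DPDK release, along the lines of the
sketch below (the exact call isn't shown in this thread, so the pool
name, mbuf count and sizes are made up). The relevant point is that
rte_mempool_create() carves the whole pool out of a single memzone,
i.e. one physically contiguous chunk, which is what comes back as
ENOMEM:

#include <stdio.h>
#include <rte_mempool.h>
#include <rte_mbuf.h>
#include <rte_lcore.h>
#include <rte_errno.h>

#define NB_MBUF   4096                          /* illustrative only */
#define MBUF_SIZE (2048 + sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM)

static struct rte_mempool *
create_pkt_pool(void)
{
        /* on the order of the 8MB pool discussed below; the whole pool
         * is backed by one physically contiguous memzone reserved
         * internally by rte_mempool_create() */
        struct rte_mempool *mp = rte_mempool_create("pkt_pool",
                NB_MBUF, MBUF_SIZE, 32,
                sizeof(struct rte_pktmbuf_pool_private),
                rte_pktmbuf_pool_init, NULL,
                rte_pktmbuf_init, NULL,
                rte_socket_id(), 0);

        if (mp == NULL)
                printf("mempool create failed: %s\n",
                       rte_strerror(rte_errno));
        return mp;
}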

On Mon, Dec 22, 2014 at 11:48 AM, Newman Poborsky <newman555p@gmail.com> wrote:
> On Sat, Dec 20, 2014 at 2:34 AM, Stephen Hemminger
> <stephen@networkplumber.org> wrote:
>> You can reserve hugepages on the kernel cmdline (GRUB).
>
> Great, thanks, I'll try that!
>
> Newman
>
>>
>> On Fri, Dec 19, 2014 at 12:13 PM, Newman Poborsky <newman555p@gmail.com>
>> wrote:
>>>
>>> On Thu, Dec 18, 2014 at 9:03 PM, Ananyev, Konstantin <
>>> konstantin.ananyev@intel.com> wrote:
>>>
>>> >
>>> >
>>> > > -----Original Message-----
>>> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ananyev,
>>> > > Konstantin
>>> > > Sent: Thursday, December 18, 2014 5:43 PM
>>> > > To: Newman Poborsky; dev@dpdk.org
>>> > > Subject: Re: [dpdk-dev] rte_mempool_create fails with ENOMEM
>>> > >
>>> > > Hi
>>> > >
>>> > > > -----Original Message-----
>>> > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Newman Poborsky
>>> > > > Sent: Thursday, December 18, 2014 1:26 PM
>>> > > > To: dev@dpdk.org
>>> > > > Subject: [dpdk-dev] rte_mempool_create fails with ENOMEM
>>> > > >
>>> > > > Hi,
>>> > > >
>>> > > > could someone please provide any explanation why sometimes mempool
>>> > > > creation fails with ENOMEM?
>>> > > >
>>> > > > I run my test app several times without any problems and then I start
>>> > > > getting an ENOMEM error when creating the mempools that are used for
>>> > > > packets. I try to delete everything from /mnt/huge, I increase the
>>> > > > number of hugepages, I remount /mnt/huge, but nothing helps.
>>> > > >
>>> > > > There is more than enough memory on the server. I tried to debug the
>>> > > > rte_mempool_create() call and it seems that after the server is
>>> > > > restarted the free mem segments are bigger than 2MB, but after running
>>> > > > the test app several times, all free mem segments have a size of 2MB,
>>> > > > and since I am requesting 8MB for my packet mempool, this fails. I'm
>>> > > > not really sure that this conclusion is correct.
>>> > >
>>> > > Yes, rte_mempool_create() uses rte_memzone_reserve() to allocate a
>>> > > single physically contiguous chunk of memory.
>>> > > If no such chunk exists, then it fails.
>>> > > Why physically contiguous?
>>> > > Main reason - to make things easier for us, as in that case we don't
>>> > > have to worry about the situation where an mbuf crosses a page boundary.
>>> > > So you can overcome that problem like this:
>>> > > Allocate the maximum amount of memory you would need to hold all mbufs
>>> > > in the worst case (all pages physically disjoint) using rte_malloc().
>>> >
>>> > Actually my mistake: rte_malloc() wouldn't help you here.
>>> > You probably need to allocate some external (not managed by EAL) memory
>>> > in that case, maybe mmap() with MAP_HUGETLB, or something similar.
>>> >
>>> > > Figure out its physical mappings.
>>> > > Call rte_mempool_xmem_create().
>>> > > You can look at app/test-pmd/mempool_anon.c as a reference.
>>> > > It uses the same approach to create a mempool over 4K pages.
>>> > >
>>> > > We will probably add a similar function to the mempool API
>>> > > (create_scatter_mempool or something) or just add a new flag
>>> > > (USE_SCATTER_MEM) to rte_mempool_create(). Though right now it is
>>> > > not there.
>>> > >
>>> > > Another quick alternative - use 1G pages.
>>> > >
>>> > > Konstantin
>>> >
>>>
>>>
>>> Ok, thanks for the explanation. I understand that this is probably an OS
>>> question more than a DPDK one, but is there a way to get contiguous memory
>>> again for the n-th run of my test app? It seems that the hugepages get
>>> divided/separated into individual 2MB hugepages. Shouldn't the OS's memory
>>> management try to group those hugepages back into one contiguous chunk
>>> once my app/process is done? Again, I know very little about Linux memory
>>> management and hugepages, so forgive me if this is a stupid question.
>>> Is rebooting the OS the only way to deal with this problem? Or should I
>>> just try to use 1GB hugepages?
>>>
>>> p.s. Konstantin, sorry for the double reply, I accidentally forgot to
>>> include the dev list in my first reply  :)
>>>
>>> Newman
>>>
>>> >
>>> > > >
>>> > > > Does anybody have any idea what to check and how running my test app
>>> > > > several times affects hugepages?
>>> > > >
>>> > > > This doesn't make any sense to me, because after the test app exits,
>>> > > > its resources should be freed, right?
>>> > > >
>>> > > > This has been driving me crazy for days now. I tried reading a bit
>>> > > > more theory about hugepages, but didn't find anything that could
>>> > > > help me. Maybe it's something else and completely trivial, but I
>>> > > > can't figure it out, so any help is appreciated.
>>> > > >
>>> > > > Thank you!
>>> > > >
>>> > > > BR,
>>> > > > Newman P.
>>> >
>>
>>
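
To put Konstantin's suggestion from the quoted thread into code: take
hugepage memory that is not managed by EAL (an anonymous MAP_HUGETLB
mmap), work out its physical layout, and build the packet pool over it
with rte_mempool_xmem_create(), so that no single large contiguous
memzone is needed. The following is a rough, untested sketch written
against the rte_mempool.h prototypes of this release; the
physical-address lookup is left as a stub, and app/test-pmd/mempool_anon.c
is the reference for filling it in via /proc/self/pagemap.

#define _GNU_SOURCE
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <rte_mempool.h>
#include <rte_mbuf.h>
#include <rte_memory.h>
#include <rte_lcore.h>

#define NB_MBUF   4096                          /* illustrative only */
#define MBUF_SIZE (2048 + sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM)

static struct rte_mempool *
create_anon_pkt_pool(void)
{
        uint32_t pg_shift = 21;                 /* 2MB hugepages */
        size_t pg_sz = (size_t)1 << pg_shift;

        /* worst-case footprint when the pool's pages are physically disjoint */
        ssize_t total = rte_mempool_xmem_size(NB_MBUF, MBUF_SIZE, pg_shift);
        uint32_t pg_num = (total + pg_sz - 1) / pg_sz;

        void *vaddr = mmap(NULL, (size_t)pg_num * pg_sz,
                           PROT_READ | PROT_WRITE,
                           MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
        if (vaddr == MAP_FAILED)
                return NULL;

        phys_addr_t *paddr = calloc(pg_num, sizeof(*paddr));
        if (paddr == NULL)
                return NULL;

        /* TODO: fill paddr[0..pg_num-1] with the physical address of each
         * page, e.g. by reading /proc/self/pagemap the way mempool_anon.c
         * does. */

        /* prototype as in this release's rte_mempool.h; double-check the
         * argument order against your headers */
        return rte_mempool_xmem_create("pkt_pool_anon",
                NB_MBUF, MBUF_SIZE, 32,
                sizeof(struct rte_pktmbuf_pool_private),
                rte_pktmbuf_pool_init, NULL,
                rte_pktmbuf_init, NULL,
                rte_socket_id(), 0,
                vaddr, paddr, pg_num, pg_shift);
}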

Thread overview: 9+ messages
2014-12-18 13:25 Newman Poborsky
2014-12-18 14:21 ` Alex Markuze
2014-12-18 17:42 ` Ananyev, Konstantin
2014-12-18 20:03   ` Ananyev, Konstantin
2014-12-19 20:13     ` Newman Poborsky
2014-12-20  1:34       ` Stephen Hemminger
2014-12-22 10:48         ` Newman Poborsky
2015-01-08  8:19           ` Newman Poborsky [this message]
2015-01-10 19:26             ` Liran Zvibel
