From: Newman Poborsky
To: Stephen Hemminger
Cc: "dev@dpdk.org"
Date: Thu, 8 Jan 2015 09:19:39 +0100
Subject: Re: [dpdk-dev] rte_mempool_create fails with ENOMEM

I finally found the time to try this, and I noticed that on a server with
1 NUMA node it works, but if the server has 2 NUMA nodes then, with the
default memory policy, the reserved hugepages are split between the two
nodes and the DPDK test app again fails for the reason already mentioned.

I found out that the 'solution' for this is to deallocate the hugepages on
node1 (after boot) and leave them only on node0:

echo 0 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages

Could someone please explain what changes when there are hugepages on both
nodes? Does this cause some memory fragmentation so that there aren't
enough contiguous segments? If so, how?

Thanks!

Newman

On Mon, Dec 22, 2014 at 11:48 AM, Newman Poborsky wrote:
> On Sat, Dec 20, 2014 at 2:34 AM, Stephen Hemminger wrote:
>> You can reserve hugepages on the kernel cmdline (GRUB).
>
> Great, thanks, I'll try that!
>
> Newman
>
>> On Fri, Dec 19, 2014 at 12:13 PM, Newman Poborsky wrote:
>>>
>>> On Thu, Dec 18, 2014 at 9:03 PM, Ananyev, Konstantin
>>> <konstantin.ananyev@intel.com> wrote:
>>>
>>> > > -----Original Message-----
>>> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ananyev, Konstantin
>>> > > Sent: Thursday, December 18, 2014 5:43 PM
>>> > > To: Newman Poborsky; dev@dpdk.org
>>> > > Subject: Re: [dpdk-dev] rte_mempool_create fails with ENOMEM
>>> > >
>>> > > Hi
>>> > >
>>> > > > -----Original Message-----
>>> > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Newman Poborsky
>>> > > > Sent: Thursday, December 18, 2014 1:26 PM
>>> > > > To: dev@dpdk.org
>>> > > > Subject: [dpdk-dev] rte_mempool_create fails with ENOMEM
>>> > > >
>>> > > > Hi,
>>> > > >
>>> > > > could someone please provide any explanation why sometimes mempool
>>> > > > creation fails with ENOMEM?
>>> > > >
>>> > > > I run my test app several times without any problems and then I
>>> > > > start getting an ENOMEM error when creating the mempools that are
>>> > > > used for packets. I try to delete everything from /mnt/huge, I
>>> > > > increase the number of hugepages and remount /mnt/huge, but
>>> > > > nothing helps.
>>> > > >
>>> > > > There is more than enough memory on the server. I tried to debug
>>> > > > the rte_mempool_create() call and it seems that after the server
>>> > > > is restarted the free memory segments are bigger than 2 MB, but
>>> > > > after running the test app several times all free memory segments
>>> > > > have a size of 2 MB, and since I am requesting 8 MB for my packet
>>> > > > mempool, this fails. I'm not really sure that this conclusion is
>>> > > > correct.
>>> > >
>>> > > Yes, rte_mempool_create() uses rte_memzone_reserve() to allocate a
>>> > > single physically contiguous chunk of memory. If no such chunk
>>> > > exists, it will fail.
>>> > > Why physically contiguous? The main reason is to make things easier
>>> > > for us: that way we don't have to worry about the situation where
>>> > > an mbuf crosses a page boundary.
>>> > > So you can overcome that problem like this:
>>> > > Allocate the maximum amount of memory you would need to hold all
>>> > > mbufs in the worst case (all pages physically disjoint) using
>>> > > rte_malloc().
>>> >
>>> > Actually, my mistake: rte_malloc() wouldn't help you here. You
>>> > probably need to allocate some external memory (not managed by the
>>> > EAL) in that case, maybe with mmap() and MAP_HUGETLB, or something
>>> > similar.
>>> >
>>> > > Figure out its physical mappings.
>>> > > Call rte_mempool_xmem_create().
>>> > > You can look at app/test-pmd/mempool_anon.c as a reference; it uses
>>> > > the same approach to create a mempool over 4K pages.
>>> > >
>>> > > We will probably add a similar function to the mempool API
>>> > > (create_scatter_mempool or something) or just add a new flag
>>> > > (USE_SCATTER_MEM) to rte_mempool_create(). Though right now it is
>>> > > not there.
>>> > >
>>> > > Another quick alternative - use 1G pages.
>>> > >
>>> > > Konstantin
>>>
>>> Ok, thanks for the explanation. I understand that this is probably more
>>> of an OS question than a DPDK one, but is there a way to again allocate
>>> contiguous memory for the n-th run of my test app? It seems that the
>>> hugepages get divided/separated into individual 2 MB hugepages.
>>> Shouldn't the OS's memory management system try to group those
>>> hugepages back into one contiguous chunk once my app/process is done?
>>> Again, I know very little about Linux memory management and hugepages,
>>> so forgive me if this is a stupid question. Is rebooting the OS the
>>> only way to deal with this problem? Or should I just try to use 1 GB
>>> hugepages?
>>>
>>> p.s. Konstantin, sorry for the double reply, I accidentally forgot to
>>> include the dev list in my first reply :)
>>>
>>> Newman
>>>
>>> > > >
>>> > > > Does anybody have any idea what to check and how running my test
>>> > > > app several times affects the hugepages?
>>> > > >
>>> > > > For me, this doesn't make any sense, because after the test app
>>> > > > exits, resources should be freed, right?
>>> > > >
>>> > > > This has been driving me crazy for days now. I tried reading a
>>> > > > bit more theory about hugepages, but didn't find anything that
>>> > > > could help me.
>>> > > > Maybe it's something else and completely trivial, but I can't
>>> > > > figure it out, so any help is appreciated.
>>> > > >
>>> > > > Thank you!
>>> > > >
>>> > > > BR,
>>> > > > Newman P.
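
For reference, below is a rough, untested sketch of the external-memory
approach Konstantin describes above: mmap() a block of hugepages outside
the EAL, collect the physical address of each page, and pass them to
rte_mempool_xmem_create(), loosely modelled on app/test-pmd/mempool_anon.c
and the DPDK 1.8-era API. NB_MBUF, MBUF_SIZE, the cache size and the
worst-case length estimate are only illustrative assumptions, not values
taken from the thread.

/*
 * Sketch: build a pktmbuf pool on top of hugepage memory that is not
 * managed by the EAL, so the pool does not need one large physically
 * contiguous memzone.
 */
#define _GNU_SOURCE
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>

#include <rte_common.h>
#include <rte_memory.h>
#include <rte_mempool.h>
#include <rte_mbuf.h>

#define NB_MBUF   8192
#define MBUF_SIZE (2048 + sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM)
#define PG_SHIFT  21                        /* 2 MB hugepages */
#define PG_SIZE   (1UL << PG_SHIFT)

static struct rte_mempool *
xmem_pktmbuf_pool_create(void)
{
	size_t len;
	uint32_t pg_num, i;
	void *vaddr;
	phys_addr_t *paddr;
	struct rte_mempool *mp;

	/* Crude worst-case over-estimate (all pages physically disjoint);
	 * rte_mempool_xmem_size() should give the exact figure. */
	len = RTE_ALIGN_CEIL((size_t)NB_MBUF * MBUF_SIZE * 2, PG_SIZE);
	pg_num = (uint32_t)(len >> PG_SHIFT);

	/* External hugepage memory, not managed by the EAL. MAP_LOCKED and
	 * MAP_POPULATE fault the pages in so their physical addresses are
	 * valid and stable. */
	vaddr = mmap(NULL, len, PROT_READ | PROT_WRITE,
		     MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB |
		     MAP_LOCKED | MAP_POPULATE, -1, 0);
	if (vaddr == MAP_FAILED)
		return NULL;

	/* Physical address of each page; the pages do not have to be
	 * physically contiguous, which is the whole point. */
	paddr = calloc(pg_num, sizeof(*paddr));
	if (paddr == NULL) {
		munmap(vaddr, len);
		return NULL;
	}
	for (i = 0; i < pg_num; i++)
		paddr[i] = rte_mem_virt2phy((char *)vaddr + i * PG_SIZE);

	mp = rte_mempool_xmem_create("xmem_pool", NB_MBUF, MBUF_SIZE,
			32, sizeof(struct rte_pktmbuf_pool_private),
			rte_pktmbuf_pool_init, NULL,
			rte_pktmbuf_init, NULL,
			SOCKET_ID_ANY, 0,
			vaddr, paddr, pg_num, PG_SHIFT);
	if (mp == NULL) {
		free(paddr);
		munmap(vaddr, len);
		return NULL;
	}
	/* The mapping (and, to be safe, the paddr table) is kept for the
	 * lifetime of the pool. */
	return mp;
}

Since the pages backing the pool come from the mmap() call rather than
from the EAL's hugepage segments, the pool no longer depends on finding a
single physically contiguous 8 MB chunk after the segments have become
fragmented.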