patches for DPDK stable branches
From: Ilya Maximets <i.maximets@samsung.com>
To: "Tan, Jianfeng" <jianfeng.tan@intel.com>,
	"dev@dpdk.org" <dev@dpdk.org>,
	David Marchand <david.marchand@6wind.com>,
	"Gonzalez Monroy, Sergio" <sergio.gonzalez.monroy@intel.com>
Cc: Heetae Ahn <heetae82.ahn@samsung.com>,
	Yuanhan Liu <yuanhan.liu@linux.intel.com>,
	Neil Horman <nhorman@tuxdriver.com>,
	"Pei, Yulong" <yulong.pei@intel.com>,
	"stable@dpdk.org" <stable@dpdk.org>
Subject: Re: [dpdk-stable] [PATCH] mem: balanced allocation of hugepages
Date: Thu, 16 Feb 2017 16:57:36 +0300
Message-ID: <52ccca5d-6b83-b8b0-66e0-bdc6ac000798@samsung.com>
In-Reply-To: <920faebb-7042-2223-26ac-84e6dd02b13e@samsung.com>



On 16.02.2017 16:55, Ilya Maximets wrote:
> Hi,
> 
> On 16.02.2017 16:26, Tan, Jianfeng wrote:
>> Hi,
>>
>>> -----Original Message-----
>>> From: Ilya Maximets [mailto:i.maximets@samsung.com]
>>> Sent: Thursday, February 16, 2017 9:01 PM
>>> To: dev@dpdk.org; David Marchand; Gonzalez Monroy, Sergio
>>> Cc: Heetae Ahn; Yuanhan Liu; Tan, Jianfeng; Neil Horman; Pei, Yulong; Ilya
>>> Maximets; stable@dpdk.org
>>> Subject: [PATCH] mem: balanced allocation of hugepages
>>>
>>> Currently, EAL allocates hugepages one by one, paying no
>>> attention to which NUMA node each allocation comes from.
>>>
>>> Such behaviour leads to allocation failures if the number of
>>> hugepages available to the application is limited by cgroups
>>> or hugetlbfs and memory is requested from more than just the
>>> first socket.
>>>
>>> Example:
>>> 	# 90 x 1GB hugepages available in the system
>>>
>>> 	cgcreate -g hugetlb:/test
>>> 	# Limit to 32GB of hugepages
>>> 	cgset -r hugetlb.1GB.limit_in_bytes=34359738368 test
>>> 	# Request 4GB from each of 2 sockets
>>> 	cgexec -g hugetlb:test testpmd --socket-mem=4096,4096 ...
>>>
>>> 	EAL: SIGBUS: Cannot mmap more hugepages of size 1024 MB
>>> 	EAL: 32 not 90 hugepages of size 1024 MB allocated
>>> 	EAL: Not enough memory available on socket 1!
>>> 	     Requested: 4096MB, available: 0MB
>>> 	PANIC in rte_eal_init():
>>> 	Cannot init memory
>>>
>>> 	This happens because all of the allocated pages end up
>>> 	on socket 0.
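
For reference, the NUMA node that a faulted hugepage actually landed on
can be queried with get_mempolicy(). A minimal sketch, not the actual
EAL code (the hugetlbfs path and page size are illustrative; build
with -lnuma):

/* Query which NUMA node backs a faulted hugepage. */
#include <fcntl.h>
#include <numaif.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define HUGEPAGE_SZ (1UL << 30)	/* 1GB page, as in the example above */

int main(void)
{
	int fd = open("/dev/hugepages/rtemap_0", O_CREAT | O_RDWR, 0600);
	void *va = mmap(NULL, HUGEPAGE_SZ, PROT_READ | PROT_WRITE,
			MAP_SHARED, fd, 0);
	int node = -1;

	if (fd < 0 || va == MAP_FAILED)
		return 1;
	memset(va, 0, HUGEPAGE_SZ);	/* fault the page in */

	/* MPOL_F_NODE | MPOL_F_ADDR returns the backing node in 'node'. */
	if (get_mempolicy(&node, NULL, 0, va, MPOL_F_NODE | MPOL_F_ADDR) == 0)
		printf("hugepage is on socket %d\n", node);

	munmap(va, HUGEPAGE_SZ);
	close(fd);
	return 0;
}
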
>>
>> For such a use case, why not just use "numactl --interleave=0,1 <DPDK app> xxx"?
> 
> Unfortunately, the interleave policy doesn't work for me. I suspect the
> kernel configuration blocks it, or I'm missing something in the kernel
> internals. I'm using the 3.10 RT kernel from RHEL 7.
> 
> I tried to set up MPOL_INTERLEAVE in code, and it doesn't work for me. Your
> example with numactl doesn't work either:
> 
> # Limited to 8GB of hugepages
> cgexec -g hugetlb:test testpmd --socket-mem=4096,4096

Sorry, the command should have read:
cgexec -g hugetlb:test numactl --interleave=0,1 ./testpmd --socket-mem=4096,4096 ..


> 
> EAL: Setting up physically contiguous memory...
> EAL: SIGBUS: Cannot mmap more hugepages of size 1024 MB
> EAL: 8 not 90 hugepages of size 1024 MB allocated
> EAL: Hugepage /dev/hugepages/rtemap_0 is on socket 0
> EAL: Hugepage /dev/hugepages/rtemap_1 is on socket 0
> EAL: Hugepage /dev/hugepages/rtemap_2 is on socket 0
> EAL: Hugepage /dev/hugepages/rtemap_3 is on socket 0
> EAL: Hugepage /dev/hugepages/rtemap_4 is on socket 0
> EAL: Hugepage /dev/hugepages/rtemap_5 is on socket 0
> EAL: Hugepage /dev/hugepages/rtemap_6 is on socket 0
> EAL: Hugepage /dev/hugepages/rtemap_7 is on socket 0
> EAL: Not enough memory available on socket 1! Requested: 4096MB, available: 0MB
> PANIC in rte_eal_init():
> Cannot init memory
> 
> Also, using numactl affects all allocations in the application, which may
> cause additional, unexpected issues.
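
For illustration, here is a minimal sketch of scoping MPOL_INTERLEAVE to
the hugepage mapping alone, rather than to every allocation in the
process as numactl does. This is a sketch under that assumption, not the
code that was actually tried; build with -lnuma:

#include <numaif.h>
#include <sys/mman.h>

static void *map_hugepage_interleaved(int fd, size_t size)
{
	unsigned long nodemask = 0x3;	/* NUMA nodes 0 and 1 */
	void *va;

	/* Apply the interleave policy only around the first fault. */
	set_mempolicy(MPOL_INTERLEAVE, &nodemask, sizeof(nodemask) * 8);
	va = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (va != MAP_FAILED)
		*(volatile char *)va = 0;	/* fault under the policy */
	set_mempolicy(MPOL_DEFAULT, NULL, 0);	/* restore the default */
	return va;
}

(mbind(2) on the mapped range before touching it would be the more
targeted alternative to flipping the process policy back and forth.)
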
> 
>>
>> Do you see a use case like --socket-mem=2048,1024 where only three 1GB hugepages are allowed?
> 
> That case will work with my patch, but the opposite one,
> '--socket-mem=1024,2048', will fail.
> To be clear, we would need to first allocate all of the required memory
> from each NUMA node and only then allocate the remaining available pages
> in round-robin fashion (a sketch of that idea follows below). But such a
> solution looks a little ugly.
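
A rough sketch of that two-pass idea; alloc_page_on_node() is a
hypothetical helper that maps a single hugepage bound to the given node
(e.g. via mbind(MPOL_BIND) before faulting) and returns 0 on success:

/* Hypothetical helper: map one hugepage bound to 'node', 0 on success. */
static int alloc_page_on_node(unsigned node);

static unsigned balanced_alloc(unsigned requested[], unsigned nb_nodes,
			       unsigned nb_pages_total)
{
	unsigned node, failed = 0, done = 0;

	/* Pass 1: give every socket its --socket-mem amount first. */
	for (node = 0; node < nb_nodes; node++) {
		while (requested[node] > 0 && done < nb_pages_total) {
			if (alloc_page_on_node(node) < 0)
				break;	/* this node is exhausted */
			requested[node]--;
			done++;
		}
	}

	/* Pass 2: allocate the remaining pages round-robin, stopping
	 * once every node has failed in a row. */
	for (node = 0; done < nb_pages_total && failed < nb_nodes;
	     node = (node + 1) % nb_nodes) {
		if (alloc_page_on_node(node) == 0) {
			done++;
			failed = 0;
		} else {
			failed++;
		}
	}
	return done;	/* caller compares this with nb_pages_total */
}
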
> 
> What do you think?
> 
> Best regards, Ilya Maximets.
> 
> 

Thread overview: 15+ messages
     [not found] <CGME20170216130139eucas1p2512567d6f5db9eaac5ee840b56bf920a@eucas1p2.samsung.com>
2017-02-16 13:01 ` Ilya Maximets
2017-02-16 13:26   ` Tan, Jianfeng
2017-02-16 13:55     ` Ilya Maximets
2017-02-16 13:57       ` Ilya Maximets [this message]
2017-02-16 13:31   ` [dpdk-stable] [dpdk-dev] " Bruce Richardson
2017-03-06  9:34   ` [dpdk-stable] " Ilya Maximets
2017-03-08 13:46     ` Sergio Gonzalez Monroy
2017-03-09 12:57       ` Ilya Maximets
2017-03-27 13:01         ` Sergio Gonzalez Monroy
2017-03-27 14:43           ` Ilya Maximets
2017-04-07 15:14             ` Ilya Maximets
2017-04-07 15:44               ` Thomas Monjalon
2017-04-10  7:11                 ` Ilya Maximets
2017-04-10  7:51                   ` Sergio Gonzalez Monroy
2017-04-10  8:05                     ` Ilya Maximets
