From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <sergio.gonzalez.monroy@intel.com>
Received: from mga07.intel.com (mga07.intel.com [134.134.136.100])
 by dpdk.org (Postfix) with ESMTP id 8E4072BE1
 for <dev@dpdk.org>; Wed, 28 Jun 2017 12:30:36 +0200 (CEST)
Received: from orsmga002.jf.intel.com ([10.7.209.21])
 by orsmga105.jf.intel.com with ESMTP; 28 Jun 2017 03:30:35 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.40,274,1496127600"; d="scan'208";a="104579056"
Received: from smonroyx-mobl.ger.corp.intel.com (HELO [10.237.221.32])
 ([10.237.221.32])
 by orsmga002.jf.intel.com with ESMTP; 28 Jun 2017 03:30:32 -0700
To: Ilya Maximets <i.maximets@samsung.com>, dev@dpdk.org,
 David Marchand <david.marchand@6wind.com>,
 Thomas Monjalon <thomas@monjalon.net>
References: <1498553186-24541-1-git-send-email-i.maximets@samsung.com>
 <1498559080-27331-1-git-send-email-i.maximets@samsung.com>
 <CGME20170627102451eucas1p2254d8679f70e261b9db9d2123aa80091@eucas1p2.samsung.com>
 <1498559080-27331-2-git-send-email-i.maximets@samsung.com>
Cc: Heetae Ahn <heetae82.ahn@samsung.com>, Yuanhan Liu
 <yliu@fridaylinux.org>, Jianfeng Tan <jianfeng.tan@intel.com>,
 Neil Horman <nhorman@tuxdriver.com>, Yulong Pei <yulong.pei@intel.com>,
 Bruce Richardson <bruce.richardson@intel.com>,
 Jerin Jacob <jerin.jacob@caviumnetworks.com>
From: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Message-ID: <d48e114d-f32d-a65b-9423-fb335ffe9f1a@intel.com>
Date: Wed, 28 Jun 2017 11:30:31 +0100
User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64; rv:45.0) Gecko/20100101
 Thunderbird/45.1.1
MIME-Version: 1.0
In-Reply-To: <1498559080-27331-2-git-send-email-i.maximets@samsung.com>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [dpdk-dev] [PATCH v9 1/2] mem: balanced allocation of hugepages
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Wed, 28 Jun 2017 10:30:37 -0000

On 27/06/2017 11:24, Ilya Maximets wrote:
> Currently EAL allocates hugepages one by one, paying no attention to
> which NUMA node each allocation comes from.
>
> Such behaviour leads to allocation failure if the number of hugepages
> available to the application is limited by cgroups or hugetlbfs and
> memory is requested from more than just the first socket.
>
> Example:
> 	# 90 x 1GB hugepages available in a system
>
> 	cgcreate -g hugetlb:/test
> 	# Limit to 32GB of hugepages
> 	cgset -r hugetlb.1GB.limit_in_bytes=34359738368 test
> 	# Request 4GB from each of 2 sockets
> 	cgexec -g hugetlb:test testpmd --socket-mem=4096,4096 ...
>
> 	EAL: SIGBUS: Cannot mmap more hugepages of size 1024 MB
> 	EAL: 32 not 90 hugepages of size 1024 MB allocated
> 	EAL: Not enough memory available on socket 1!
> 	     Requested: 4096MB, available: 0MB
> 	PANIC in rte_eal_init():
> 	Cannot init memory
>
> 	This happens because all allocated pages are
> 	on socket 0.
>
> Fix this issue by setting mempolicy MPOL_PREFERRED for each hugepage
> to one of the requested nodes, using the following scheme:
>
> 	1) Allocate essential hugepages:
> 		1.1) Allocate only as many hugepages from NUMA node N
> 		     as needed to fit the memory requested for that node.
> 		1.2) Repeat 1.1 for all NUMA nodes.
> 	2) Try to map all remaining free hugepages in a round-robin
> 	   fashion.
> 	3) Sort pages and choose the most suitable.
>
> In this case all essential memory will be allocated and all remaining
> pages will be fairly distributed between all requested nodes.
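
For anyone following the thread, here is a minimal sketch of steps 1 and 2
of the schema above as a pure simulation. It is not the patch code:
balanced_alloc() and its parameters are hypothetical names, the per-node
free page counts are assumed inputs, and step 3 (sorting) is not modelled.
The real change steers each mapping with MPOL_PREFERRED, which this sketch
does not touch.

```c
#include <assert.h>

/*
 * Hypothetical helper (not a DPDK function): simulate steps 1 and 2 of the
 * allocation scheme.
 *
 * essential[i]  - pages needed to satisfy the request on node i
 * free_pages[i] - free hugepages on node i (assumed input)
 * limit         - total pages the process may map (e.g. a cgroup limit)
 * got[i]        - output: pages taken from node i
 */
static void
balanced_alloc(const unsigned *essential, const unsigned *free_pages,
	       unsigned n_nodes, unsigned limit, unsigned *got)
{
	unsigned left[64];
	unsigned total = 0;
	unsigned i;
	int progress;

	assert(n_nodes <= 64);

	for (i = 0; i < n_nodes; i++) {
		left[i] = free_pages[i];
		got[i] = 0;
	}

	/* Step 1: take only the essential pages, node by node. */
	for (i = 0; i < n_nodes; i++) {
		while (got[i] < essential[i] && left[i] > 0 && total < limit) {
			got[i]++;
			left[i]--;
			total++;
		}
	}

	/* Step 2: hand out the remaining pages round-robin, one page per
	 * requested node per pass, until the limit or the pages run out. */
	do {
		progress = 0;
		for (i = 0; i < n_nodes; i++) {
			if (essential[i] == 0 || left[i] == 0 || total >= limit)
				continue;
			got[i]++;
			left[i]--;
			total++;
			progress = 1;
		}
	} while (progress);
}
```

With two nodes holding 45 free pages each (the split is my assumption, the
log only gives 90 in total), a limit of 32 pages and 4 essential pages per
node, this ends with 16 pages on each node. Filling node 0 first would use
up the whole limit there and leave node 1 with nothing, which matches the
failure in the log.
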
>
> A new config option, RTE_EAL_NUMA_AWARE_HUGEPAGES, is introduced and
> enabled by default for linuxapp, except on armv7 and dpaa2.
> Enabling this option adds libnuma as a dependency of EAL.
>
> Fixes: 77988fc08dc5 ("mem: fix allocating all free hugepages")
>
> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
> ---
>   config/common_base                        |   1 +
>   config/common_linuxapp                    |   1 +
>   config/defconfig_arm-armv7a-linuxapp-gcc  |   3 +
>   config/defconfig_arm64-dpaa2-linuxapp-gcc |   3 +
>   lib/librte_eal/linuxapp/eal/Makefile      |   3 +
>   lib/librte_eal/linuxapp/eal/eal_memory.c  | 120 ++++++++++++++++++++++++++++--
>   mk/rte.app.mk                             |   3 +
>   7 files changed, 126 insertions(+), 8 deletions(-)

Good stuff Ilya!

Hemant, Jerin, could you also ack the patch if you are happy with it? 
Thanks.

Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>