patches for DPDK stable branches
* [dpdk-stable] [PATCH 0/3] eal: clean up mapping hugepages in secondary process for ppc64le
@ 2018-05-23 12:53 Gowrishankar
  2018-06-21  5:27 ` Gowrishankar
                   ` (4 more replies)
  0 siblings, 5 replies; 7+ messages in thread
From: Gowrishankar @ 2018-05-23 12:53 UTC
  To: pradeep, Chao Zhu; +Cc: Gowrishankar Muthukrishnan, stable

From: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>

Earlier, the powerpc arch hit an issue where a secondary process could
not map hugepages in the same VA range as mapped by the primary process.
The fix proposed at the time was to use nr_overcommit_hugepages from
the kernel and to mmap with the MAP_HUGETLB|MAP_ANONYMOUS flags. Though
that made mmap calls respect address hints, it also limited the maximum
VA space that the primary process in DPDK can create on hugepages to
(almost) the physical RAM size.

This patch set removes this limitation by

 1. reverting the previous patch, so that the virtual address space
    range is not a constraint (as on other archs).

 2. reverse-indexing the hugepage files as the secondary process
    mmaps them. The reversed addressing sequence makes this mandatory.

 3. slightly moving where munmap() is called on the zero-mapped VA
    block that the secondary process attaches.

All these changes have also been verified on x86 (and I request
other arch maintainers to test this too and give feedback).

Fixes: 284ae3e9ff ("eal/ppc: fix mmap for memory initialization")
Cc: stable@dpdk.org

Signed-off-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>


IBM Internal:
=============
Refer to the bug below to understand the powerpc memory slice operation:
 * https://bugzilla.linux.ibm.com/show_bug.cgi?id=141628#c2

This patch set overcomes multiple issues we have been facing until
now:

 * EAL: Cannot get a virtual area: Cannot allocate memory - warning misleading end users.
 * Multiple memsegs leading to heap corruption for rte_malloc (https://dpdk.org/ml/archives/dev/2018-May/100085.html)
 * Bug 166908 - net/mlx5: RXQ allocation failure when sysfs vm.nr_overcommit_hugepages enabled
 * Bug 166960 - lpm: memory allocation failed as applications start when nr_overcommit_hugepages set

I am doing more validation on the patches, as initial tests for these known issues pass.

 * http://pastebin.hursley.ibm.com/12493

Gowrishankar Muthukrishnan (3):
  eal: access hugepage_file in reverse order for powerpc
  eal: reorder calling munmap on zero-mapped memory
  eal: reverse powerpc changes done for hugepage overcommit

 doc/guides/linux_gsg/sys_reqs.rst        |  6 ------
 lib/librte_eal/linuxapp/eal/eal_memory.c | 22 +++++++++-------------
 2 files changed, 9 insertions(+), 19 deletions(-)

-- 
1.9.1

* [dpdk-stable] [PATCH 0/3] eal: clean up mapping hugepages in secondary process for ppc64le
  2018-05-23 12:53 [dpdk-stable] [PATCH 0/3] eal: clean up mapping hugepages in secondary process for ppc64le Gowrishankar
@ 2018-06-21  5:27 ` Gowrishankar
  2018-06-21  5:27 ` [dpdk-stable] [PATCH 1/3] eal: access hugepage_file in reverse order for powerpc Gowrishankar
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: Gowrishankar @ 2018-06-21  5:27 UTC
  To: Sergio Gonzalez Monroy, Bruce Richardson, Konstantin Ananyev
  Cc: stable, Chao Zhu, Gowrishankar Muthukrishnan

From: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>

Earlier, the powerpc arch hit an issue where a secondary process could
not map hugepages in the same VA range as mapped by the primary process.
The fix proposed at the time was to use nr_overcommit_hugepages from
the kernel and to mmap with the MAP_HUGETLB|MAP_ANONYMOUS flags. Though
that made mmap calls respect address hints, it also limited the maximum
VA space that the primary process in DPDK can create on hugepages to
(almost) the physical RAM size.

This patch set removes this limitation by

 1. reverting the previous patch, so that the virtual address space
    range is not a constraint (as on other archs).

 2. reverse-indexing the hugepage files as the secondary process
    mmaps them. The reversed addressing sequence makes this mandatory.

 3. slightly moving where munmap() is called on the zero-mapped VA
    block that the secondary process attaches.

All these changes have also been verified on x86 (and I request
other arch maintainers to test this too and give feedback).

Fixes: 284ae3e9ff ("eal/ppc: fix mmap for memory initialization")
Cc: stable@dpdk.org

Signed-off-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>


Gowrishankar Muthukrishnan (3):
  eal: access hugepage_file in reverse order for powerpc
  eal: reorder calling munmap on zero-mapped memory
  eal: reverse powerpc changes done for hugepage overcommit

 doc/guides/linux_gsg/sys_reqs.rst        |  6 ------
 lib/librte_eal/linuxapp/eal/eal_memory.c | 22 +++++++++-------------
 2 files changed, 9 insertions(+), 19 deletions(-)

-- 
1.9.1

* [dpdk-stable] [PATCH 1/3] eal: access hugepage_file in reverse order for powerpc
  2018-05-23 12:53 [dpdk-stable] [PATCH 0/3] eal: clean up mapping hugepages in secondary process for ppc64le Gowrishankar
  2018-06-21  5:27 ` Gowrishankar
@ 2018-06-21  5:27 ` Gowrishankar
  2018-06-21  5:27 ` [dpdk-stable] [PATCH 2/3] eal: reorder calling munmap on zero-mapped memory Gowrishankar
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: Gowrishankar @ 2018-06-21  5:27 UTC
  To: Sergio Gonzalez Monroy, Bruce Richardson, Konstantin Ananyev
  Cc: stable, Chao Zhu, Gowrishankar Muthukrishnan

From: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>

On powerpc, the secondary process has to read the hugepage_file
entries in reverse index order, so that the correct VA is read
from hp for each hugepage mapping.

Fixes: 284ae3e9ff ("eal/ppc: fix mmap for memory initialization")
Cc: stable@dpdk.org

Signed-off-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
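Note (illustrative only, not part of the patch): the reversed loop in
the hunk below keeps the `i < num_hp` test and appears to rely on
unsigned wraparound to terminate. A minimal sketch of that idiom,
assuming `i` and `num_hp` are unsigned; `struct hp_entry` and
`collect_reverse()` are hypothetical names used only for this sketch:

```c
#include <stddef.h>

/* Hypothetical stand-in for a hugepage_file entry; only the field
 * used by the loop is kept. */
struct hp_entry {
	int memseg_id;
};

/*
 * Visit entries from the last index down to 0 and record, in visit
 * order, the indices that belong to segment `seg`. Returns how many
 * indices were recorded in `out`.
 */
unsigned
collect_reverse(const struct hp_entry *hp, unsigned num_hp, int seg,
		unsigned *out)
{
	unsigned i, n = 0;

	/* i is unsigned: after the i == 0 iteration, i-- wraps to
	 * UINT_MAX, so the test i < num_hp fails and the loop stops.
	 * With num_hp == 0 the start value already wraps and the body
	 * never runs. */
	for (i = num_hp - 1; i < num_hp; i--) {
		if (hp[i].memseg_id == seg)
			out[n++] = i;
	}
	return n;
}
```

With a signed index the same condition would not stop at -1, so the
idiom only holds while the index type stays unsigned.
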
 lib/librte_eal/linuxapp/eal/eal_memory.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 16a181c..2dcb3b5 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -1455,7 +1455,11 @@ void numa_error(char *where)
 		/* find the hugepages for this segment and map them
 		 * we don't need to worry about order, as the server sorted the
 		 * entries before it did the second mmap of them */
+#ifdef RTE_ARCH_PPC_64
+		for (i = num_hp-1; i < num_hp && offset < mcfg->memseg[s].len; i--){
+#else
 		for (i = 0; i < num_hp && offset < mcfg->memseg[s].len; i++){
+#endif
 			if (hp[i].memseg_id == (int)s){
 				fd = open(hp[i].filepath, O_RDWR);
 				if (fd < 0) {
-- 
1.9.1

* [dpdk-stable] [PATCH 2/3] eal: reorder calling munmap on zero-mapped memory
  2018-05-23 12:53 [dpdk-stable] [PATCH 0/3] eal: clean up mapping hugepages in secondary process for ppc64le Gowrishankar
  2018-06-21  5:27 ` Gowrishankar
  2018-06-21  5:27 ` [dpdk-stable] [PATCH 1/3] eal: access hugepage_file in reverse order for powerpc Gowrishankar
@ 2018-06-21  5:27 ` Gowrishankar
  2018-06-21  5:27 ` [dpdk-stable] [PATCH 3/3] eal: reverse powerpc changes done for hugepage overcommit Gowrishankar
  2018-06-21  8:50 ` [dpdk-stable] [PATCH 0/3] eal: clean up mapping hugepages in secondary process for ppc64le Luca Boccassi
  4 siblings, 0 replies; 7+ messages in thread
From: Gowrishankar @ 2018-06-21  5:27 UTC
  To: Sergio Gonzalez Monroy, Bruce Richardson, Konstantin Ananyev
  Cc: stable, Chao Zhu, Gowrishankar Muthukrishnan

From: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>

In the current code path, where a secondary process attaches to the
memory area that the primary process mapped, the zero-mapped memory in
VA is released one rte_memseg at a time, each just before that segment
is re-mapped. The PowerPC mm cannot respect the address hint when the
secondary process then asks to mmap hugepages into this freed area:
for the hint to be respected, the entire underlying memory slice
(about 1 TB of VA) must be freed before the re-mapping is requested.
So move this munmap() slightly earlier, so that the entire zero-mapped
area is already freed by the time the hugepages are mapped.

Fixes: 284ae3e9ff ("eal/ppc: fix mmap for memory initialization")
Cc: stable@dpdk.org

Signed-off-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_eal/linuxapp/eal/eal_memory.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 2dcb3b5..3dcd6c2 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -1428,6 +1428,11 @@ void numa_error(char *where)
 			}
 			goto error;
 		}
+		/*
+		 * free previously mapped memory so we can map the
+		 * hugepages into this space shortly
+		 */
+		munmap(base_addr, mcfg->memseg[s].len);
 	}
 
 	size = getFileSize(fd_hugepage);
@@ -1445,12 +1450,7 @@ void numa_error(char *where)
 		void *addr, *base_addr;
 		uintptr_t offset = 0;
 		size_t mapping_size;
-		/*
-		 * free previously mapped memory so we can map the
-		 * hugepages into the space
-		 */
 		base_addr = mcfg->memseg[s].addr;
-		munmap(base_addr, mcfg->memseg[s].len);
 
 		/* find the hugepages for this segment and map them
 		 * we don't need to worry about order, as the server sorted the
-- 
1.9.1

* [dpdk-stable] [PATCH 3/3] eal: reverse powerpc changes done for hugepage overcommit
  2018-05-23 12:53 [dpdk-stable] [PATCH 0/3] eal: clean up mapping hugepages in secondary process for ppc64le Gowrishankar
                   ` (2 preceding siblings ...)
  2018-06-21  5:27 ` [dpdk-stable] [PATCH 2/3] eal: reorder calling munmap on zero-mapped memory Gowrishankar
@ 2018-06-21  5:27 ` Gowrishankar
  2018-06-21  8:50 ` [dpdk-stable] [PATCH 0/3] eal: clean up mapping hugepages in secondary process for ppc64le Luca Boccassi
  4 siblings, 0 replies; 7+ messages in thread
From: Gowrishankar @ 2018-06-21  5:27 UTC
  To: Sergio Gonzalez Monroy, Bruce Richardson, Konstantin Ananyev
  Cc: stable, Chao Zhu, Gowrishankar Muthukrishnan

From: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>

Revert the previous changes made for multiprocess support, as that is
now addressed without relying on nr_overcommit_hugepages on powerpc.

Fixes: 284ae3e9ff ("eal/ppc: fix mmap for memory initialization")
Cc: stable@dpdk.org

Signed-off-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
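Note (illustrative only, not part of the patch): with the PPC-specific
flags removed, the reservation visible in the first eal_memory.c hunk
below is a read-only MAP_PRIVATE mapping of a file descriptor, retried
with a smaller size on failure. A minimal sketch of that
shrink-and-retry pattern, assuming /dev/zero as the backing file;
`reserve_va()` is a hypothetical name and the underflow guard is an
addition of this sketch:

```c
#include <stddef.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>

/*
 * Try to reserve *size + page_sz bytes of address space backed by
 * /dev/zero. On failure, shrink *size by one page_sz and retry.
 * Returns the mapped address, or NULL if nothing could be reserved.
 * On success *size holds the size that worked; the caller releases
 * the area with munmap(addr, *size + page_sz).
 */
void *
reserve_va(size_t *size, size_t page_sz)
{
	void *addr;
	int fd = open("/dev/zero", O_RDONLY);

	if (fd < 0)
		return NULL;

	do {
		addr = mmap(NULL, *size + page_sz, PROT_READ,
			    MAP_PRIVATE, fd, 0);
		if (addr == MAP_FAILED)
			/* Guard against size_t underflow. */
			*size = *size > page_sz ? *size - page_sz : 0;
	} while (addr == MAP_FAILED && *size > 0);

	/* The mapping stays valid after the descriptor is closed. */
	close(fd);
	return addr == MAP_FAILED ? NULL : addr;
}
```

Such a mapping only reserves address space, so its size is not tied to
the number of hugepages available.
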
 doc/guides/linux_gsg/sys_reqs.rst        | 6 ------
 lib/librte_eal/linuxapp/eal/eal_memory.c | 8 --------
 2 files changed, 14 deletions(-)

diff --git a/doc/guides/linux_gsg/sys_reqs.rst b/doc/guides/linux_gsg/sys_reqs.rst
index 3e7fe63..ee69c1a 100644
--- a/doc/guides/linux_gsg/sys_reqs.rst
+++ b/doc/guides/linux_gsg/sys_reqs.rst
@@ -207,12 +207,6 @@ On a NUMA machine, pages should be allocated explicitly on separate nodes::
 
     For 1G pages, it is not possible to reserve the hugepage memory after the system has booted.
 
-    On IBM POWER system, the nr_overcommit_hugepages should be set to the same value as nr_hugepages.
-    For example, if the required page number is 128, the following commands are used::
-
-        echo 128 > /sys/kernel/mm/hugepages/hugepages-16384kB/nr_hugepages
-        echo 128 > /sys/kernel/mm/hugepages/hugepages-16384kB/nr_overcommit_hugepages
-
 Using Hugepages with the DPDK
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 3dcd6c2..56515cc 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -280,11 +280,7 @@
 	do {
 		addr = mmap(addr,
 				(*size) + hugepage_sz, PROT_READ,
-#ifdef RTE_ARCH_PPC_64
-				MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB,
-#else
 				MAP_PRIVATE,
-#endif
 				fd, 0);
 		if (addr == MAP_FAILED)
 			*size -= hugepage_sz;
@@ -1397,11 +1393,7 @@ void numa_error(char *where)
 		 */
 		base_addr = mmap(mcfg->memseg[s].addr, mcfg->memseg[s].len,
 				 PROT_READ,
-#ifdef RTE_ARCH_PPC_64
-				 MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB,
-#else
 				 MAP_PRIVATE,
-#endif
 				 fd_zero, 0);
 		if (base_addr == MAP_FAILED ||
 		    base_addr != mcfg->memseg[s].addr) {
-- 
1.9.1

* Re: [dpdk-stable] [PATCH 0/3] eal: clean up mapping hugepages in secondary process for ppc64le
  2018-05-23 12:53 [dpdk-stable] [PATCH 0/3] eal: clean up mapping hugepages in secondary process for ppc64le Gowrishankar
                   ` (3 preceding siblings ...)
  2018-06-21  5:27 ` [dpdk-stable] [PATCH 3/3] eal: reverse powerpc changes done for hugepage overcommit Gowrishankar
@ 2018-06-21  8:50 ` Luca Boccassi
  2018-06-21  8:54   ` gowrishankar muthukrishnan
  4 siblings, 1 reply; 7+ messages in thread
From: Luca Boccassi @ 2018-06-21  8:50 UTC
  To: Gowrishankar, Sergio Gonzalez Monroy, Bruce Richardson,
	Konstantin Ananyev
  Cc: stable, Chao Zhu

Hi,

which stable releases are these patches aimed at?

In the future, please consider using git send-email with
--subject-prefix 'PATCH xx.yy' so that it's included in the subject.

-- 
Kind regards,
Luca Boccassi

* Re: [dpdk-stable] [PATCH 0/3] eal: clean up mapping hugepages in secondary process for ppc64le
  2018-06-21  8:50 ` [dpdk-stable] [PATCH 0/3] eal: clean up mapping hugepages in secondary process for ppc64le Luca Boccassi
@ 2018-06-21  8:54   ` gowrishankar muthukrishnan
  0 siblings, 0 replies; 7+ messages in thread
From: gowrishankar muthukrishnan @ 2018-06-21  8:54 UTC
  To: Luca Boccassi, Sergio Gonzalez Monroy, Bruce Richardson,
	Konstantin Ananyev
  Cc: stable, Chao Zhu

On Thursday 21 June 2018 02:20 PM, Luca Boccassi wrote:
> Hi,
>
> which stable releases are these patches aimed at?
>
> In the future, please consider using git send-email with
> --subject-prefix 'PATCH xx.yy' so that it's included in the subject.
>
Sure, Luca. This patch set is for the v17.11-based LTS.

Thanks,
Gowrishankar
