* [dpdk-stable] [PATCH 0/3] eal: clean up mapping hugepages in secondary process for ppc64le
2018-05-23 12:53 [dpdk-stable] [PATCH 0/3] eal: clean up mapping hugepages in secondary process for ppc64le Gowrishankar
@ 2018-06-21 5:27 ` Gowrishankar
2018-06-21 5:27 ` [dpdk-stable] [PATCH 1/3] eal: access hugepage_file in reverse order for powerpc Gowrishankar
` (3 subsequent siblings)
4 siblings, 0 replies; 7+ messages in thread
From: Gowrishankar @ 2018-06-21 5:27 UTC (permalink / raw)
To: Sergio Gonzalez Monroy, Bruce Richardson, Konstantin Ananyev
Cc: stable, Chao Zhu, Gowrishankar Muthukrishnan
From: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
Earlier, the powerpc arch hit an issue where a secondary process could
not map hugepages in the same VA range as mapped by the primary process.
The fix proposed at the time was to use nr_overcommit_hugepages from
the kernel and to mmap with the MAP_HUGETLB|MAP_ANONYMOUS flags. Although
that made mmap calls respect address hints, it also limited the maximum
VA space that the primary process in DPDK can create on hugepages to
(almost) the physical RAM size.
This patch set removes that limitation by:
1. reverting the previous patch, so that the virtual address space
range is not a constraint (as on other archs).
2. reverse-indexing the hugepage files as the secondary
process mmaps them. The reversed addressing sequence makes
this mandatory.
3. slightly moving where munmap() is called on the zero-mapped VA
block as the secondary process attaches it.
All these changes have also been verified on x86 (other arch
maintainers are requested to test this too and give feedback).
Fixes: 284ae3e9ff ("eal/ppc: fix mmap for memory initialization")
Cc: stable@dpdk.org
Signed-off-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
Gowrishankar Muthukrishnan (3):
eal: access hugepage_file in reverse order for powerpc
eal: reorder calling munmap on zero-mapped memory
eal: reverse powerpc changes done for hugepage overcommit
doc/guides/linux_gsg/sys_reqs.rst | 6 ------
lib/librte_eal/linuxapp/eal/eal_memory.c | 22 +++++++++-------------
2 files changed, 9 insertions(+), 19 deletions(-)
--
1.9.1
* [dpdk-stable] [PATCH 1/3] eal: access hugepage_file in reverse order for powerpc
2018-05-23 12:53 [dpdk-stable] [PATCH 0/3] eal: clean up mapping hugepages in secondary process for ppc64le Gowrishankar
2018-06-21 5:27 ` Gowrishankar
@ 2018-06-21 5:27 ` Gowrishankar
2018-06-21 5:27 ` [dpdk-stable] [PATCH 2/3] eal: reorder calling munmap on zero-mapped memory Gowrishankar
` (2 subsequent siblings)
4 siblings, 0 replies; 7+ messages in thread
From: Gowrishankar @ 2018-06-21 5:27 UTC (permalink / raw)
To: Sergio Gonzalez Monroy, Bruce Richardson, Konstantin Ananyev
Cc: stable, Chao Zhu, Gowrishankar Muthukrishnan
From: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
On powerpc, the secondary process has to read the hugepage_file
entries in reverse index order, so that the correct VA is
read from hp for each hugepage mapping.
Fixes: 284ae3e9ff ("eal/ppc: fix mmap for memory initialization")
Cc: stable@dpdk.org
Signed-off-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
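The reversed loop in this patch counts down from num_hp-1 but keeps the
`i < num_hp` condition, so it ends through unsigned wrap-around rather
than an `i >= 0` test. The block below is a minimal sketch of that idiom,
assuming the index is unsigned as the loop condition implies.
`visit_in_reverse` and its parameters are hypothetical names used only
for illustration; they are not DPDK code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not DPDK code): record the order in which an
 * unsigned index visits n entries when counting down from n - 1.
 * Returns the number of entries visited.
 */
static size_t
visit_in_reverse(unsigned int n, unsigned int *order, size_t cap)
{
	size_t visited = 0;
	unsigned int i;

	assert(order != NULL);

	/*
	 * When i is 0, i-- wraps to UINT_MAX, so "i < n" becomes false
	 * and the loop ends. With n == 0 the start value n - 1 is
	 * already UINT_MAX, so the body never runs.
	 */
	for (i = n - 1; i < n && visited < cap; i--)
		order[visited++] = i;

	return visited;
}
```

The same condition therefore serves both the forward and the reversed
loop, which is why only the start value and the step differ between the
two branches of the #ifdef.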
lib/librte_eal/linuxapp/eal/eal_memory.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 16a181c..2dcb3b5 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -1455,7 +1455,11 @@ void numa_error(char *where)
/* find the hugepages for this segment and map them
* we don't need to worry about order, as the server sorted the
* entries before it did the second mmap of them */
+#ifdef RTE_ARCH_PPC_64
+ for (i = num_hp-1; i < num_hp && offset < mcfg->memseg[s].len; i--){
+#else
for (i = 0; i < num_hp && offset < mcfg->memseg[s].len; i++){
+#endif
if (hp[i].memseg_id == (int)s){
fd = open(hp[i].filepath, O_RDWR);
if (fd < 0) {
--
1.9.1
* [dpdk-stable] [PATCH 2/3] eal: reorder calling munmap on zero-mapped memory
2018-05-23 12:53 [dpdk-stable] [PATCH 0/3] eal: clean up mapping hugepages in secondary process for ppc64le Gowrishankar
2018-06-21 5:27 ` Gowrishankar
2018-06-21 5:27 ` [dpdk-stable] [PATCH 1/3] eal: access hugepage_file in reverse order for powerpc Gowrishankar
@ 2018-06-21 5:27 ` Gowrishankar
2018-06-21 5:27 ` [dpdk-stable] [PATCH 3/3] eal: reverse powerpc changes done for hugepage overcommit Gowrishankar
2018-06-21 8:50 ` [dpdk-stable] [PATCH 0/3] eal: clean up mapping hugepages in secondary process for ppc64le Luca Boccassi
4 siblings, 0 replies; 7+ messages in thread
From: Gowrishankar @ 2018-06-21 5:27 UTC (permalink / raw)
To: Sergio Gonzalez Monroy, Bruce Richardson, Konstantin Ananyev
Cc: stable, Chao Zhu, Gowrishankar Muthukrishnan
From: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
In the current code path, where a secondary process attaches the
memory-mapped area that the primary process created, it releases the
zero-mapped memory in VA one rte_memseg at a time, for the length of
each iterated rte_memseg. The powerpc mm is then unable to respect the
address hint when the secondary process later asks to mmap hugepages
in this freed area.
For the powerpc mm to respect the address hint, the entire underlying
memory slice (about 1 TB of VA) must have been freed before the
re-mapping is requested. So move this munmap() slightly earlier than its
current call site, so that the entire zero-mapped area is already freed.
Fixes: 284ae3e9ff ("eal/ppc: fix mmap for memory initialization")
Cc: stable@dpdk.org
Signed-off-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
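The sequence this patch relies on is: reserve a VA range, release the
whole reservation, then ask mmap for the same address as a hint and
check that the kernel honoured it. The block below is a minimal sketch
of that sequence only. As a simplifying assumption it uses anonymous
memory in place of /dev/zero and the hugepage files, and
`remap_at_hint` is a hypothetical name, not DPDK code. Whether the hint
is honoured depends on the kernel and the arch, so the helper reports
the outcome instead of assuming it.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (not DPDK code). Returns 1 if the kernel honoured
 * the address hint, 0 if it placed the mapping elsewhere, -1 on error.
 */
static int
remap_at_hint(size_t len)
{
	void *reserved, *mapped;
	int honoured;

	assert(len > 0);

	/* Step 1: reserve a VA range, as the zero mapping does. */
	reserved = mmap(NULL, len, PROT_READ,
			MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (reserved == MAP_FAILED)
		return -1;

	/* Step 2: release the whole reservation before any re-mapping. */
	if (munmap(reserved, len) != 0)
		return -1;

	/* Step 3: ask for the same address as a hint (no MAP_FIXED). */
	mapped = mmap(reserved, len, PROT_READ | PROT_WRITE,
		      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (mapped == MAP_FAILED)
		return -1;

	/* Mirrors the "base_addr != mcfg->memseg[s].addr" style of check. */
	honoured = (mapped == reserved);
	munmap(mapped, len);
	return honoured;
}
```

A return value of 0 corresponds to the failure this patch addresses:
the mapping succeeded but not at the address the primary process used.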
lib/librte_eal/linuxapp/eal/eal_memory.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 2dcb3b5..3dcd6c2 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -1428,6 +1428,11 @@ void numa_error(char *where)
}
goto error;
}
+ /*
+ * free previously mapped memory so we can map the
+ * hugepages into this space shortly
+ */
+ munmap(base_addr, mcfg->memseg[s].len);
}
size = getFileSize(fd_hugepage);
@@ -1445,12 +1450,7 @@ void numa_error(char *where)
void *addr, *base_addr;
uintptr_t offset = 0;
size_t mapping_size;
- /*
- * free previously mapped memory so we can map the
- * hugepages into the space
- */
base_addr = mcfg->memseg[s].addr;
- munmap(base_addr, mcfg->memseg[s].len);
/* find the hugepages for this segment and map them
* we don't need to worry about order, as the server sorted the
--
1.9.1
* [dpdk-stable] [PATCH 3/3] eal: reverse powerpc changes done for hugepage overcommit
2018-05-23 12:53 [dpdk-stable] [PATCH 0/3] eal: clean up mapping hugepages in secondary process for ppc64le Gowrishankar
` (2 preceding siblings ...)
2018-06-21 5:27 ` [dpdk-stable] [PATCH 2/3] eal: reorder calling munmap on zero-mapped memory Gowrishankar
@ 2018-06-21 5:27 ` Gowrishankar
2018-06-21 8:50 ` [dpdk-stable] [PATCH 0/3] eal: clean up mapping hugepages in secondary process for ppc64le Luca Boccassi
4 siblings, 0 replies; 7+ messages in thread
From: Gowrishankar @ 2018-06-21 5:27 UTC (permalink / raw)
To: Sergio Gonzalez Monroy, Bruce Richardson, Konstantin Ananyev
Cc: stable, Chao Zhu, Gowrishankar Muthukrishnan
From: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
Revert the previous changes made for multiprocess support, as that is
now addressed without relying on nr_overcommit_hugepages on powerpc.
Fixes: 284ae3e9ff ("eal/ppc: fix mmap for memory initialization")
Cc: stable@dpdk.org
Signed-off-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
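After this revert, the VA reservation on powerpc uses the same flags as
on other archs: a read-only MAP_PRIVATE mapping backed by a file
descriptor, without MAP_ANONYMOUS | MAP_HUGETLB. The block below is a
minimal sketch of such a reservation, assuming /dev/zero as the backing
file (as the `fd_zero` hunk suggests). `reserve_va_from_dev_zero` is a
hypothetical name, not DPDK code.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not DPDK code): reserve len bytes of read-only
 * virtual address space backed by /dev/zero. Returns NULL on failure.
 */
static void *
reserve_va_from_dev_zero(size_t len)
{
	void *addr;
	int fd;

	assert(len > 0);

	fd = open("/dev/zero", O_RDONLY);
	if (fd < 0)
		return NULL;

	/* Plain MAP_PRIVATE, no MAP_ANONYMOUS | MAP_HUGETLB. */
	addr = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);

	/* The mapping stays valid after the descriptor is closed. */
	close(fd);

	return addr == MAP_FAILED ? NULL : addr;
}
```

Because this reservation is not a hugepage mapping, it does not depend
on the nr_overcommit_hugepages setting that the removed documentation
described.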
doc/guides/linux_gsg/sys_reqs.rst | 6 ------
lib/librte_eal/linuxapp/eal/eal_memory.c | 8 --------
2 files changed, 14 deletions(-)
diff --git a/doc/guides/linux_gsg/sys_reqs.rst b/doc/guides/linux_gsg/sys_reqs.rst
index 3e7fe63..ee69c1a 100644
--- a/doc/guides/linux_gsg/sys_reqs.rst
+++ b/doc/guides/linux_gsg/sys_reqs.rst
@@ -207,12 +207,6 @@ On a NUMA machine, pages should be allocated explicitly on separate nodes::
For 1G pages, it is not possible to reserve the hugepage memory after the system has booted.
- On IBM POWER system, the nr_overcommit_hugepages should be set to the same value as nr_hugepages.
- For example, if the required page number is 128, the following commands are used::
-
- echo 128 > /sys/kernel/mm/hugepages/hugepages-16384kB/nr_hugepages
- echo 128 > /sys/kernel/mm/hugepages/hugepages-16384kB/nr_overcommit_hugepages
-
Using Hugepages with the DPDK
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 3dcd6c2..56515cc 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -280,11 +280,7 @@
do {
addr = mmap(addr,
(*size) + hugepage_sz, PROT_READ,
-#ifdef RTE_ARCH_PPC_64
- MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB,
-#else
MAP_PRIVATE,
-#endif
fd, 0);
if (addr == MAP_FAILED)
*size -= hugepage_sz;
@@ -1397,11 +1393,7 @@ void numa_error(char *where)
*/
base_addr = mmap(mcfg->memseg[s].addr, mcfg->memseg[s].len,
PROT_READ,
-#ifdef RTE_ARCH_PPC_64
- MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB,
-#else
MAP_PRIVATE,
-#endif
fd_zero, 0);
if (base_addr == MAP_FAILED ||
base_addr != mcfg->memseg[s].addr) {
--
1.9.1
* Re: [dpdk-stable] [PATCH 0/3] eal: clean up mapping hugepages in secondary process for ppc64le
2018-05-23 12:53 [dpdk-stable] [PATCH 0/3] eal: clean up mapping hugepages in secondary process for ppc64le Gowrishankar
` (3 preceding siblings ...)
2018-06-21 5:27 ` [dpdk-stable] [PATCH 3/3] eal: reverse powerpc changes done for hugepage overcommit Gowrishankar
@ 2018-06-21 8:50 ` Luca Boccassi
2018-06-21 8:54 ` gowrishankar muthukrishnan
4 siblings, 1 reply; 7+ messages in thread
From: Luca Boccassi @ 2018-06-21 8:50 UTC (permalink / raw)
To: Gowrishankar, Sergio Gonzalez Monroy, Bruce Richardson,
Konstantin Ananyev
Cc: stable, Chao Zhu
On Thu, 2018-06-21 at 10:57 +0530, Gowrishankar wrote:
> From: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
>
> Earlier powerpc arch encountered an issue in secondary process
> to map hugepages in same VA range as mapped by primary process.
> By then, proposed fix was to use nr_overcommit_hugepages from
> the kernel and mmap using MAP_HUGETLB|MAP_ANONYMOUS flags. Though
> it solved respecting address hints in mmap calls, this fix
> introduced limitation of maximum VA space that, primary process
> in DPDK can create upon hugepages, to physical RAM size (almost).
>
> This patch cleans up this limitation by
>
> 1. reverting the previous patch so that, virtual address space
> range is not a constraint (like other arch).
>
> 2. reverse-indexing on hugepage files as the secondary
> process mmap them. Reversed addressing sequence makes
> this mandate.
>
> 3. Move slightly where munmap() is called in zero-mapped VA
> block, as secondary process would attach them.
>
> All these changes have also been verified in x86 arch (and request
> other arch maintainers to test this and give feedback).
>
> Fixes: 284ae3e9ff ("eal/ppc: fix mmap for memory initialization")
> Cc: stable@dpdk.org
>
> Signed-off-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
>
>
> Gowrishankar Muthukrishnan (3):
> eal: access hugepage_file in reverse order for powerpc
> eal: reorder calling munmap on zero-mapped memory
> eal: reverse powerpc changes done for hugepage overcommit
>
> doc/guides/linux_gsg/sys_reqs.rst | 6 ------
> lib/librte_eal/linuxapp/eal/eal_memory.c | 22 +++++++++-------------
> 2 files changed, 9 insertions(+), 19 deletions(-)
Hi,
which stable releases are these patches aimed at?
In the future, please consider using git send-email with
--subject-prefix 'PATCH xx.yy' so that it's included in the subject.
--
Kind regards,
Luca Boccassi
* Re: [dpdk-stable] [PATCH 0/3] eal: clean up mapping hugepages in secondary process for ppc64le
2018-06-21 8:50 ` [dpdk-stable] [PATCH 0/3] eal: clean up mapping hugepages in secondary process for ppc64le Luca Boccassi
@ 2018-06-21 8:54 ` gowrishankar muthukrishnan
0 siblings, 0 replies; 7+ messages in thread
From: gowrishankar muthukrishnan @ 2018-06-21 8:54 UTC (permalink / raw)
To: Luca Boccassi, Sergio Gonzalez Monroy, Bruce Richardson,
Konstantin Ananyev
Cc: stable, Chao Zhu
On Thursday 21 June 2018 02:20 PM, Luca Boccassi wrote:
> On Thu, 2018-06-21 at 10:57 +0530, Gowrishankar wrote:
>> From: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
>>
>> Earlier powerpc arch encountered an issue in secondary process
>> to map hugepages in same VA range as mapped by primary process.
>> By then, proposed fix was to use nr_overcommit_hugepages from
>> the kernel and mmap using MAP_HUGETLB|MAP_ANONYMOUS flags. Though
>> it solved respecting address hints in mmap calls, this fix
>> introduced limitation of maximum VA space that, primary process
>> in DPDK can create upon hugepages, to physical RAM size (almost).
>>
>> This patch cleans up this limitation by
>>
>> 1. reverting the previous patch so that, virtual address space
>> range is not a constraint (like other arch).
>>
>> 2. reverse-indexing on hugepage files as the secondary
>> process mmap them. Reversed addressing sequence makes
>> this mandate.
>>
>> 3. Move slightly where munmap() is called in zero-mapped VA
>> block, as secondary process would attach them.
>>
>> All these changes have also been verified in x86 arch (and request
>> other arch maintainers to test this and give feedback).
>>
>> Fixes: 284ae3e9ff ("eal/ppc: fix mmap for memory initialization")
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
>>
>>
>> Gowrishankar Muthukrishnan (3):
>> eal: access hugepage_file in reverse order for powerpc
>> eal: reorder calling munmap on zero-mapped memory
>> eal: reverse powerpc changes done for hugepage overcommit
>>
>> doc/guides/linux_gsg/sys_reqs.rst | 6 ------
>> lib/librte_eal/linuxapp/eal/eal_memory.c | 22 +++++++++-------------
>> 2 files changed, 9 insertions(+), 19 deletions(-)
> Hi,
>
> which stable releases are these patches aimed at?
>
> In the future, please consider using git send-email with
> --subject-prefix 'PATCH xx.yy' so that it's included in the subject.
>
Sure, Luca. This patch set is for the v17.11-based LTS.
Thanks,
Gowrishankar