DPDK patches and discussions
* DPDK 19.11.3 with multi processes and external physical memory: unable to receive traffic in the secondary processes
@ 2022-07-12  6:05 Asaf Sinai
  2022-07-12 13:13 ` Burakov, Anatoly
  0 siblings, 1 reply; 8+ messages in thread
From: Asaf Sinai @ 2022-07-12  6:05 UTC (permalink / raw)
  To: dev

Hi,

We run DPDK 19.11.3 with multiple processes.
When using external memory for DPDK (instead of huge pages), we are unable to receive traffic in the secondary processes.


  *   We have external physical memory regions (excluded from Linux):

[ULP-NG]# cat /proc/cmdline
console=ttyS0,19200 isolcpus=1-127 smp_affinity=1 root=/dev/ram0 rwmode=r net.ifnames=0 biosdevname=0 systemd.show_status=0
memmap=0x1000000!0x60000000 memmap=0x90000000!0x800000000 memmap=0x90000000!0x1800000000 quiet cloud-init=disabled


  *   We perform all DPDK initialization in the primary process, including mapping the mentioned memory regions via "/dev/mem".

After these memory mappings, we register the external memory, as described in http://doc.dpdk.org/guides/prog_guide/env_abstraction_layer.html#support-for-externally-allocated-memory, and allocate memory pools:



...
/* Map physical to virtual */
int memFd = open("/dev/mem", O_RDWR);
extMemInfo[i].pageSize   = pPagesInfo->maxPageSize;
extMemInfo[i].memRegSize = 0x0000000080000000;
extMemInfo[i].memRegAddr = mmap(NULL, extMemInfo[i].memRegSize, PROT_READ | PROT_WRITE,
                             MAP_SHARED, memFd, memRegPhysAddr[i]);

/* Create heap */
sprintf(extMemInfo[i].heapName, "extMemHeapSocket_%u", i);
rv = rte_malloc_heap_create(extMemInfo[i].heapName);

/* Save heap socket ID */
rv = rte_malloc_heap_get_socket(extMemInfo[i].heapName);
extMemInfo[i].socketId = rv;

/* Add memory region to heap */
rv = rte_malloc_heap_memory_add(extMemInfo[i].heapName, extMemInfo[i].memRegAddr,
                                extMemInfo[i].memRegSize, NULL, 0, extMemInfo[i].pageSize);
...
/* Allocate memory pool */
memPool = rte_mempool_create(poolName, nbufs, DPDK_MBUF_SIZE, poolCacheSize,
                             sizeof(struct rte_pktmbuf_pool_private),
                             rte_pktmbuf_pool_init, NULL,
                             und_pktmbuf_init, NULL, extMemInfo[i].socketId, DPDK_NO_FLAGS);
...



Please note that during calls to "rte_malloc_heap_memory_add", we see the following warnings:


EAL: WARNING! Base virtual address hint (0x2101005000 != 0x7ffff4627000) not respected!
EAL:    This may cause issues with mapping memory into secondary processes
EAL: WARNING! Base virtual address hint (0x210100b000 != 0x7ffff4619000) not respected!
EAL:    This may cause issues with mapping memory into secondary processes



And after executing "rte_mempool_create", the physical addresses of the allocated memory zones are invalid:


[Jul 12 12:02:17] [1875] extMemConfig: heapName=extMemHeapSocket_0, socketId=256, memRegAddr=0x7fff58000000, memRegSize=2415919104, pageSize=2097152
[Jul 12 12:02:18] [1875] memZone: name=MP_ndPool_0, socket_id=256, vaddr=0x7fffe7e7bec0-0x7fffe7ffffc0, paddr=0xffffffffffffffff-0x1840ff, len=1589504, hugepage_sz=2MB



  *   In the next step, we spawn the secondary processes from the primary one, using fork().

This way, all DPDK data and memory mappings are the same in all processes. For example:

Mapping in primary process:
cat /proc/1875/maps
...
7ffec8000000-7fff58000000 rw-s 1800000000 00:06 5                        /dev/mem
7fff58000000-7fffe8000000 rw-s 800000000 00:06 5                         /dev/mem
...

Mapping in secondary process:
cat /proc/2676/maps
...
7ffec8000000-7fff58000000 rw-s 1800000000 00:06 5                        /dev/mem
7fff58000000-7fffe8000000 rw-s 800000000 00:06 5                         /dev/mem
...


  *   Unfortunately, when traffic is received by the secondary processes, we see the following printout on the primary process:

i40e_dev_alarm_handler(): ICR0: malicious programming detected


  *   Additionally, we found no example of using external physical shared memory in secondary processes.

The DPDK test applications only show usage of anonymous private memory in the primary process.

What are we missing?
Can you please advise?

Thanks,
Asaf


* Re: DPDK 19.11.3 with multi processes and external physical memory: unable to receive traffic in the secondary processes
  2022-07-12  6:05 DPDK 19.11.3 with multi processes and external physical memory: unable to receive traffic in the secondary processes Asaf Sinai
@ 2022-07-12 13:13 ` Burakov, Anatoly
  2022-07-14 10:24   ` Asaf Sinai
  0 siblings, 1 reply; 8+ messages in thread
From: Burakov, Anatoly @ 2022-07-12 13:13 UTC (permalink / raw)
  To: Asaf Sinai, dev

Hi Asaf,

On 12-Jul-22 7:05 AM, Asaf Sinai wrote:
> Hi,
> 
> We run DPDK 19.11.3 with multiple processes.
> 
> When using external memory for DPDK (instead of huge pages), we are
> unable to receive traffic in the secondary processes.
> 
>   * We have external physical memory regions (excluded from Linux):
> 
> [ULP-NG]# cat /proc/cmdline
> console=ttyS0,19200 isolcpus=1-127 smp_affinity=1 root=/dev/ram0
> rwmode=r net.ifnames=0 biosdevname=0 systemd.show_status=0
> memmap=0x1000000!0x60000000 memmap=0x90000000!0x800000000
> memmap=0x90000000!0x1800000000 quiet cloud-init=disabled
> 
>   * We perform all DPDK initialization in the primary process, including
>     mapping the mentioned memory regions via "/dev/mem".
> 
> After these memory mappings, we register the external memory, as
> described in
> http://doc.dpdk.org/guides/prog_guide/env_abstraction_layer.html#support-for-externally-allocated-memory,
> and allocate memory pools:
> 
> ...
> /* Map physical to virtual */
> int memFd = open("/dev/mem", O_RDWR);
> extMemInfo[i].pageSize   = pPagesInfo->maxPageSize;
> extMemInfo[i].memRegSize = 0x0000000080000000;
> extMemInfo[i].memRegAddr = mmap(NULL, extMemInfo[i].memRegSize, PROT_READ | PROT_WRITE,
>                              MAP_SHARED, memFd, memRegPhysAddr[i]);
> 
> /* Create heap */
> sprintf(extMemInfo[i].heapName, "extMemHeapSocket_%u", i);
> rv = rte_malloc_heap_create(extMemInfo[i].heapName);
> 
> /* Save heap socket ID */
> rv = rte_malloc_heap_get_socket(extMemInfo[i].heapName);
> extMemInfo[i].socketId = rv;
> 
> /* Add memory region to heap */
> rv = rte_malloc_heap_memory_add(extMemInfo[i].heapName, extMemInfo[i].memRegAddr,
>                                 extMemInfo[i].memRegSize, NULL, 0, extMemInfo[i].pageSize);
> ...

Please correct me if I'm wrong, but it seems like you're specifying NULL 
as the IOVA table, which means the external memory will not get IOVA 
addresses set. You need to find the individual page addresses (probably 
using the rte_mem_virt2phy() function), put them into an array, and pass 
it as an argument to the add_memory call. See documentation[1], 
particularly the iova_addrs[] parameter.

[1] 
http://doc.dpdk.org/api/rte__malloc_8h.html#af6360dea35bdf162feeb2b62cf149fd3
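
To illustrate, a rough and untested sketch, reusing the names from your 
code (error handling omitted):

/* Sketch: build a per-page IOVA table and hand it to the heap */
size_t pgsz = extMemInfo[i].pageSize;
unsigned int n_pages = extMemInfo[i].memRegSize / pgsz;
rte_iova_t *iova_tbl = malloc(n_pages * sizeof(*iova_tbl));
unsigned int pg;

for (pg = 0; pg < n_pages; pg++) {
        void *va = RTE_PTR_ADD(extMemInfo[i].memRegAddr, pg * pgsz);
        iova_tbl[pg] = rte_mem_virt2phy(va); /* check for RTE_BAD_IOVA */
}

rv = rte_malloc_heap_memory_add(extMemInfo[i].heapName,
                                extMemInfo[i].memRegAddr,
                                extMemInfo[i].memRegSize,
                                iova_tbl, n_pages, pgsz);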

> 
> /* Allocate memory pool */
> memPool = rte_mempool_create(poolName, nbufs, DPDK_MBUF_SIZE, poolCacheSize,
>                              sizeof(struct rte_pktmbuf_pool_private),
>                              rte_pktmbuf_pool_init, NULL,
>                              und_pktmbuf_init, NULL, extMemInfo[i].socketId, DPDK_NO_FLAGS);
> ...
> 
> Please note that during calls to "rte_malloc_heap_memory_add", we see
> the following warnings:
> 
> EAL: WARNING! Base virtual address hint (0x2101005000 != 0x7ffff4627000) not respected!
> EAL:    This may cause issues with mapping memory into secondary processes
> EAL: WARNING! Base virtual address hint (0x210100b000 != 0x7ffff4619000) not respected!
> EAL:    This may cause issues with mapping memory into secondary processes
> 
> And after executing "rte_mempool_create", the physical addresses of the
> allocated memory zones are invalid:

Yes, because you did not specify them as part of the 
rte_malloc_heap_memory_add() call :)

> 
> [Jul 12 12:02:17] [1875] extMemConfig: heapName=extMemHeapSocket_0, socketId=256, memRegAddr=0x7fff58000000, memRegSize=2415919104, pageSize=2097152
> [Jul 12 12:02:18] [1875] memZone: name=MP_ndPool_0, socket_id=256, vaddr=0x7fffe7e7bec0-0x7fffe7ffffc0, paddr=0xffffffffffffffff-0x1840ff, len=1589504, hugepage_sz=2MB
> 
>   * In the next step, we spawn the secondary processes from the primary
>     one, using fork().
> 
> This way, all DPDK data and memory mappings are the same in all
> processes. For example:

This is a recipe for disaster, because DPDK creates internal data 
structures that are mapped as files, and the primary process expects to 
be the sole owner and user of those data structures. If you would like 
to create a secondary process using fork() at all, you should fork() 
first and then call rte_eal_init() in the secondary process. A forked 
primary process is not what we mean when we say "secondary process".

In addition, see documentation[1] for `rte_malloc_heap_memory_add()` - 
there's a note there that states:

Before accessing this memory in other processes, it needs to be attached 
in each of those processes by calling rte_malloc_heap_memory_attach in 
each other process.

So not only do you need to run the secondary-process EAL init, you will 
also have to call `rte_malloc_heap_memory_attach()` before you can use 
your shared external memory in your secondary process.
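
In outline, the flow would look roughly like this (untested sketch; 
heap_name/va_addr/len stand for whatever you use on the primary side, 
and synchronizing the attach with the primary's add is left out):

/* Sketch: fork BEFORE any EAL init, then init per process */
pid_t pid = fork();
if (pid == 0) {
        /* child: becomes a DPDK secondary process */
        char *sec_argv[] = { "app", "--proc-type=secondary" };
        rte_eal_init(2, sec_argv);
        /* after the primary has created the heap and added the
         * memory, map the same region at the same address, then: */
        rte_malloc_heap_memory_attach(heap_name, va_addr, len);
} else {
        /* parent: becomes the primary, then creates the heap and
         * calls rte_malloc_heap_memory_add() as you already do */
        char *pri_argv[] = { "app", "--proc-type=primary" };
        rte_eal_init(2, pri_argv);
}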

Hope this is helpful!

> 
> Mapping in primary process:
> cat /proc/1875/maps
> ...
> 7ffec8000000-7fff58000000 rw-s 1800000000 00:06 5                        /dev/mem
> 7fff58000000-7fffe8000000 rw-s 800000000 00:06 5                         /dev/mem
> ...
> 
> Mapping in secondary process:
> cat /proc/2676/maps
> ...
> 7ffec8000000-7fff58000000 rw-s 1800000000 00:06 5                        /dev/mem
> 7fff58000000-7fffe8000000 rw-s 800000000 00:06 5                         /dev/mem
> ...
> 
>   * Unfortunately, when traffic is received by the secondary processes,
>     we see the following printout on the primary process:
> 
> i40e_dev_alarm_handler(): ICR0: malicious programming detected
> 
>   * Additionally, we found no example of using external physical shared
>     memory in secondary processes.
> 
> The DPDK test applications only show usage of anonymous private memory
> in the primary process.
> 
> What are we missing?
> Can you please advise?
> 
> Thanks,
> Asaf


-- 
Thanks,
Anatoly

* RE: DPDK 19.11.3 with multi processes and external physical memory: unable to receive traffic in the secondary processes
  2022-07-12 13:13 ` Burakov, Anatoly
@ 2022-07-14 10:24   ` Asaf Sinai
  2022-07-14 10:41     ` Asaf Sinai
  0 siblings, 1 reply; 8+ messages in thread
From: Asaf Sinai @ 2022-07-14 10:24 UTC (permalink / raw)
  To: Burakov, Anatoly, dev

Hi Anatoly,

Thanks for the comments!
So now running with the correct initializations: first fork(), then call rte_eal_init() in the secondary processes.
When using hugepages, DPDK is up and running!

Then, tried to use the external memory:
- On primary: followed the example of create_extmem() in testpmd.c [without calling calc_mem_size() & alloc_mem(), as we already have virtual addresses and sizes from previous mmap() to the external memory regions].
  Unfortunately, calling rte_mem_virt2iova() [which calls rte_mem_virt2phy()] always returns RTE_BAD_IOVA.
- On secondaries, rte_eal_init() successfully finished, but then calling rte_mempool_lookup() gets null data... ☹

Do you have any idea?

Thanks,
Asaf

* RE: DPDK 19.11.3 with multi processes and external physical memory: unable to receive traffic in the secondary processes
  2022-07-14 10:24   ` Asaf Sinai
@ 2022-07-14 10:41     ` Asaf Sinai
  2022-07-15 10:17       ` Burakov, Anatoly
  0 siblings, 1 reply; 8+ messages in thread
From: Asaf Sinai @ 2022-07-14 10:41 UTC (permalink / raw)
  To: Burakov, Anatoly, dev

As calling rte_mem_virt2iova() returns RTE_BAD_IOVA, we tried to supply the correct matching physical addresses in iova_addrs[], but it did not help either: secondary processes still get null data from rte_mempool_lookup().
How does a secondary process get info from the primary? Via the files in /run/dpdk/rte/?

* Re: DPDK 19.11.3 with multi processes and external physical memory: unable to receive traffic in the secondary processes
  2022-07-14 10:41     ` Asaf Sinai
@ 2022-07-15 10:17       ` Burakov, Anatoly
  2022-07-18 11:58         ` Asaf Sinai
  0 siblings, 1 reply; 8+ messages in thread
From: Burakov, Anatoly @ 2022-07-15 10:17 UTC (permalink / raw)
  To: Asaf Sinai, dev

On 14-Jul-22 11:41 AM, Asaf Sinai wrote:
> As calling rte_mem_virt2iova() returns RTE_BAD_IOVA, we tried to supply 
> the correct matching physical addresses in iova_addrs[], but it did not 
> help either: secondary processes still get null data from 
> rte_mempool_lookup().
> 
> How does a secondary process get info from the primary? Via the files 
> in /run/dpdk/rte/?

Hi,

I'm assuming the sequence of events was as follows:

1) run binary
2) fork()
3) call rte_eal_init() (note: in both primary and secondary, fork() must 
happen *before* EAL init!)
4) add_memory in primary, then attach_memory in secondary

Can you double-check this sequence? If the mempool lookup still returns 
NULL, that probably means you don't have the same mempool 
drivers/libraries linked into your secondary process. You would have to 
verify that all of the same drivers/libraries are linked into both the 
primary and the secondary process. Once that is the case, at least the 
mempool lookup should work.
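
That is, in the secondary (after EAL init and the heap attach), a lookup 
like this should succeed (sketch; substitute whatever name you passed to 
rte_mempool_create()):

struct rte_mempool *mp = rte_mempool_lookup("ndPool_0");
if (mp == NULL)
        /* NULL here usually means a mempool driver mismatch between
         * the two binaries (e.g. the ring driver not linked in) */
        rte_exit(EXIT_FAILURE, "mempool lookup failed\n");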

> 
> From: Asaf Sinai <AsafSi@Radware.com>
> Sent: Thursday, July 14, 2022 13:25
> To: Burakov, Anatoly <anatoly.burakov@intel.com>; dev@dpdk.org
> Subject: RE: DPDK 19.11.3 with multi processes and external physical 
> memory: unable to receive traffic in the secondary processes
> 
> Hi Anatoly,
> 
> Thanks for the comments!
> 
> So now running with the correct initializations: first fork(), then call 
> rte_eal_init() in the secondary processes.
> 
> When using hugepages, DPDK is up and running!
> 
> Then, tried to use the external memory:
> 
> - On primary: followed the example of create_extmem() in testpmd.c 
> [without calling calc_mem_size() & alloc_mem(), as we already have 
> virtual addresses and sizes from previous mmap() to the external memory 
> regions].
> 
>    Unfortunately, calling rte_mem_virt2iova() [which calls 
> rte_mem_virt2phy()] always returns RTE_BAD_IOVA.

Looking at the code of that function, if you get RTE_BAD_IOVA, it means 
an error is happening. Usually when something goes wrong, errors are 
printed to the DPDK log, but there are two cases where that doesn't 
happen: when the pfn turns out to be invalid (which I assume isn't the 
case for you, as you can see your mappings in procfs, as you indicated 
in earlier mails), and when physical addresses are not available, 
usually because you're not running as root.

Are you running DPDK as root? If not, real physical addresses will not 
be available, which will make it so that rte_mem_virt2phy() will return 
RTE_BAD_IOVA.

If you are running as root, you should see the cause of the error in 
your EAL logs (if necessary, please run with `--log-level=*eal*:debug` 
to display any additional logs that might otherwise be hidden).

> 
> - On secondaries, rte_eal_init() successfully finished, but then calling 
> rte_mempool_lookup() gets null data... ☹
> 

I'm assuming by "null data" you mean you can't look up the mempool (i.e. 
mempool_lookup() returns NULL)?

> Do you have any idea?
> 
> Thanks,
> 
> Asaf
> 
> -----Original Message-----
> From: Burakov, Anatoly <anatoly.burakov@intel.com 
> <mailto:anatoly.burakov@intel.com>>
> Sent: Tuesday, July 12, 2022 16:14
> To: Asaf Sinai <AsafSi@Radware.com <mailto:AsafSi@Radware.com>>; 
> dev@dpdk.org <mailto:dev@dpdk.org>
> Subject: Re: DPDK 19.11.3 with multi processes and external physical 
> memory: unable to receive traffic in the secondary processes
> 
> Hi Asaf,
> 
> On 12-Jul-22 7:05 AM, Asaf Sinai wrote:
> 
>  > Hi,
> 
>  >
> 
>  > We run DPDK 19.11.3 with multi processes.
> 
>  >
> 
>  > When using external memory for DPDK (instead of huge pages), we are
> 
>  > unable to receive traffic in the secondary processes.
> 
>  >
> 
>  >   * We have external physical memory regions (excluded from Linux):
> 
>  >
> 
>  > /[ULP-NG]# cat /proc/cmdline/
> 
>  >
> 
>  > /console=ttyS0,19200 isolcpus=1-127 smp_affinity=1 root=/dev/ram0
> 
>  > rwmode=r net.ifnames=0 biosdevname=0 systemd.show_status=0/
> 
>  >
> 
>  > /memmap=0x1000000!0x60000000 *memmap=0x90000000!0x800000000
> 
>  > memmap=0x90000000!0x1800000000*quiet cloud-init=disabled/
> 
>  >
> 
>  >   * We make all DPDK initializations in the primary process, including
> 
>  >     mapping the mentioned memory regions via “/dev/mem”.
> 
>  >
> 
>  > After these memory mapping, we register the external memory, as
> 
>  > described in
> 
>  > 
> http://doc.dpdk.org/guides/prog_guide/env_abstraction_layer.html#suppo 
> <http://doc.dpdk.org/guides/prog_guide/env_abstraction_layer.html#suppo>
> 
>  > rt-for-externally-allocated-memory
> 
>  > <http://doc.dpdk.org/guides/prog_guide/env_abstraction_layer.html#supp
> 
>  > ort-for-externally-allocated-memory>,
> 
>  > and allocate memory pools:
> 
>  >
> 
>  > /.../
> 
>  >
> 
>  > //* Map physical to virtual *//
> 
>  >
> 
>  > /int memFd = open("/dev/mem", O_RDWR);/
> 
>  >
> 
>  > /extMemInfo[i].pageSize   = pPagesInfo->maxPageSize;/
> 
>  >
> 
>  > /extMemInfo[i].memRegSize = 0x0000000080000000;/
> 
>  >
> 
>  > /extMemInfo[i].memRegAddr = mmap(NULL, extMemInfo[i].memRegSize,
> 
>  > PROT_READ | PROT_WRITE,/
> 
>  >
> 
>  > /                             MAP_SHARED, memFd, memRegPhysAddr[i]);/
> 
>  >
> 
>  > //
> 
>  >
> 
>  > //* Create heap *//
> 
>  >
> 
>  > /sprintf(extMemInfo[i].heapName, "extMemHeapSocket_%u", i);/
> 
>  >
> 
>  > /rv = rte_malloc_heap_create(extMemInfo[i].heapName);/
> 
>  >
> 
>  > //
> 
>  >
> 
>  > //* Save heap socket ID *//
> 
>  >
> 
>  > /rv = rte_malloc_heap_get_socket(extMemInfo[i].heapName);/
> 
>  >
> 
>  > /extMemInfo[i].socketId = rv;/
> 
>  >
> 
>  > //
> 
>  >
> 
>  > //* Add memory region to heap *//
> 
>  >
> 
>  > /rv = rte_malloc_heap_memory_add(extMemInfo[i].heapName,
> 
>  > extMemInfo[i].memRegAddr,/
> 
>  >
> 
>  > /                                extMemInfo[i].memRegSize, NULL, 0,
> 
>  > extMemInfo[i].pageSize);/
> 
>  >
> 
>  > /.../
> 
> Please correct me if I'm wrong, but it seems like you're specifying NULL 
> as IOVA table, so that means external memory will not get IOVA addresses 
> set. You need to find (probably using rte_mem_virt2phys() function) the 
> individual page addresses, put them into an array, and pass it as an 
> argument to add_memory call. See documentation[1], particularly the 
> iova_addrs[] parameter.
> 
> [1]
> 
> http://doc.dpdk.org/api/rte__malloc_8h.html#af6360dea35bdf162feeb2b62cf149fd3 
> <http://doc.dpdk.org/api/rte__malloc_8h.html#af6360dea35bdf162feeb2b62cf149fd3>
> 
>  >
> 
>  > //* Allocate memory pool *//
> 
>  >
> 
>  > /memPool = rte_mempool_create(poolName, nbufs, DPDK_MBUF_SIZE,
> 
>  > poolCacheSize,/
> 
>  >
> 
>  > /                             sizeof(struct rte_pktmbuf_pool_private),/
> 
>  >
> 
>  > /                             rte_pktmbuf_pool_init, NULL,/
> 
>  >
> 
>  > /                             und_pktmbuf_init, NULL,
> 
>  > extMemInfo[i].socketId, DPDK_NO_FLAGS);/
> 
>  >
> 
>  > /.../
> 
>  >
> 
>  > Please note, that during calls to “/rte_malloc_heap_memory_add/”, we see
> 
>  > the following warnings:
> 
>  >
> 
>  > */EAL: WARNING! Base virtual address hint (0x2101005000 !=
> 
>  > 0x7ffff4627000) not respected!/*
> 
>  >
> 
>  > /EAL:    This may cause issues with mapping memory into secondary 
> processes/
> 
>  >
> 
>  > */EAL: WARNING! Base virtual address hint (0x210100b000 !=
> 
>  > 0x7ffff4619000) not respected!/*
> 
>  >
> 
>  > /EAL:    This may cause issues with mapping memory into secondary 
> processes/
> 
>  >
> 
>  > And after executing “/rte_mempool_create/”, physical addresses of the
> 
>  > allocated memory zones are bad:
> 
> Yes, because you did not specify them as part of
> 
> rte_malloc_heap_add_memory() call :)
> 
>  >
> 
>  > /[Jul 12 12:02:17] [1875] extMemConfig: heapName=extMemHeapSocket_0,
> 
>  > socketId=256, memRegAddr=0x7fff58000000, memRegSize=2415919104,
> 
>  > pageSize=2097152/
> 
>  >
> 
>  > /[Jul 12 12:02:18] [1875] memZone: name=MP_ndPool_0, socket_id=256,
> 
>  > vaddr=0x7fffe7e7bec0-0x7fffe7ffffc0,
> 
>  > *paddr=0xffffffffffffffff-0x1840ff*, len=1589504, hugepage_sz=2MB/
> 
>  >
> 
>  >   * In the next step, we spawn the secondary processes from the primary
> 
>  >     one, using fork().
> 
>  >
> 
>  > This way, all DPDK data and memory mappings are the same on all
> 
>  > processes. For example:
> 
> This is a recipe for disaster, because DPDK creates internal data
> 
> structures that are mapped as files, and primary process expects to be
> 
> the sole owner and user of those data structures. If you would like to
> 
> create a secondary process using fork() at all, you should fork() first,
> 
> and then call rte_eal_init() in the secondary process. A forked primary
> 
> process is not what we mean when we say "secondary process".
> 
> In addition, see documentation[1] for `rte_malloc_heap_memory_add()` -
> 
> there's a note there that states:
> 
> Before accessing this memory in other processes, it needs to be attached
> 
> in each of those processes by calling rte_malloc_heap_memory_attach in
> 
> each other process.
> 
> So, not only you need to call secondary process EAL init, you will also
> 
> have to call `rte_malloc_heap_memory_attach()` before you can use your
> 
> shared external memory in your secondary process.
> 
> Hope this is helpful!
> 
>  >
> 
>  > _Mapping in primary process:_
> 
>  >
> 
>  > /cat /proc/1875/maps/
> 
>  >
> 
>  > /.../
> 
>  >
> 
>  > */7ffec8000000-7fff58000000 rw-s 1800000000 00:06
> 
>  > 5                        /dev/mem/*
> 
>  >
> 
>  > */7fff58000000-7fffe8000000 rw-s 800000000 00:06
> 
>  > 5                         /dev/mem/*
> 
>  >
> 
>  > /.../
> 
>  >
> 
>  > _Mapping in secondary process:_
> 
>  >
> 
>  > /cat /proc/2676/maps/
> 
>  >
> 
>  > /.../
> 
>  >
> 
>  > */7ffec8000000-7fff58000000 rw-s 1800000000 00:06
> 
>  > 5                        /dev/mem/*
> 
>  >
> 
>  > */7fff58000000-7fffe8000000 rw-s 800000000 00:06
> 
>  > 5                         /dev/mem/*
> 
>  >
> 
>  > /.../
> 
>  >
> 
>  >   * Unfortunately, when traffic is received by the secondary processes,
> 
>  >     we see the following printout on the primary process:
> 
>  >
> 
>  > */i40e_dev_alarm_handler(): ICR0: malicious programming detected/*
> 
>  >
> 
>  >   * Additionally, we saw no example of using external physical shared
> 
>  >     memory on secondary processes.
> 
>  >
> 
>  > DPDK test applications show usage of anonymous private memory on the
> 
>  > primary process only.
> 
>  >
> 
>  > What are we missing?
> 
>  >
> 
>  > Can you please advise?
> 
>  >
> 
>  > Thanks,
> 
>  >
> 
>  > Asaf
> 
>  >
> 
>  > Radware
> 
>  >
> 
> -- 
> 
> Thanks,
> 
> Anatoly
> 


-- 
Thanks,
Anatoly

* RE: DPDK 19.11.3 with multi processes and external physical memory: unable to receive traffic in the secondary processes
  2022-07-15 10:17       ` Burakov, Anatoly
@ 2022-07-18 11:58         ` Asaf Sinai
  2022-07-25  9:21           ` Burakov, Anatoly
  0 siblings, 1 reply; 8+ messages in thread
From: Asaf Sinai @ 2022-07-18 11:58 UTC (permalink / raw)
  To: Burakov, Anatoly, dev

Hi Anatoly,

DPDK runs as root, and secondary processes have all the info.
The problem was as follows:
The external memory regions are not managed by the Linux OS (by using "memmap=x" in 'grub.conf'). Therefore, the kernel cannot supply their physical addresses.
So, we added these addresses in code, and now it works fine!
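
For reference, a simplified sketch of what we added (our real code 
differs; memRegPhysAddr[i] is the hard-coded physical base that we also 
pass to memmap=):

/* Build the IOVA table from the known physical base of the region,
 * since the kernel cannot report PAs for memmap= memory */
size_t pgsz = extMemInfo[i].pageSize;
unsigned int n_pages = extMemInfo[i].memRegSize / pgsz;
rte_iova_t *iova_tbl = malloc(n_pages * sizeof(*iova_tbl));
unsigned int pg;

for (pg = 0; pg < n_pages; pg++)
        iova_tbl[pg] = (rte_iova_t)memRegPhysAddr[i] + (rte_iova_t)pg * pgsz;

rv = rte_malloc_heap_memory_add(extMemInfo[i].heapName, extMemInfo[i].memRegAddr,
                                extMemInfo[i].memRegSize, iova_tbl, n_pages, pgsz);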

Thanks for your help!

We have several additional questions:

1. Usage of huge pages in "rte_eal_init":
We see that "rte_eal_init" requires allocating huge pages for configuring the drivers.
It seems impossible to use the external memory for this, as the "rte_malloc_heap_memory_add" API cannot be used yet at that point.
So, we tried to use regular pages instead of huge pages, using the option of "--no-huge".
It failed with the following printouts:

...
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: FATAL: Cannot use IOVA as 'PA' since physical addresses are not available
EAL: Cannot use IOVA as 'PA' since physical addresses are not available
...

1a. Is there a way to use the external memory for "rte_eal_init"?
1b. Why does using regular pages cause DPDK to complain that "physical addresses are not available"?
1c. Why is the "--no-huge" option defined as one of the "EAL options for DEBUG use only" (in the "eal_common_usage" routine)?

2. Explanation for some details in the "create_extmem" routine:
2a. What is the purpose of calling "mlock" before populating IOVA addresses?
2b. Why is "munlock" not used afterwards?

Thanks,
Asaf


* Re: DPDK 19.11.3 with multi processes and external physical memory: unable to receive traffic in the secondary processes
  2022-07-18 11:58         ` Asaf Sinai
@ 2022-07-25  9:21           ` Burakov, Anatoly
  2022-07-25  9:31             ` Asaf Sinai
  0 siblings, 1 reply; 8+ messages in thread
From: Burakov, Anatoly @ 2022-07-25  9:21 UTC (permalink / raw)
  To: Asaf Sinai, dev

On 18-Jul-22 12:58 PM, Asaf Sinai wrote:
> Hi Anatoly,
> 
> DPDK runs as root, and secondary processes have all the info.
> 
> The problem was as follows:
> 
> The external memory regions are not managed by the Linux OS (by using 
> "memmap=x" in 'grub.conf'). Therefore, the kernel cannot supply their 
> physical addresses.
> 
> So, we added these addresses in code, and now it works fine!
> 
> Thanks for your help!

Happy to hear that!

> 
> We have several additional questions:
> 
> 1. Usage of huge pages in "rte_eal_init":
> 
> We see that "rte_eal_init" requires allocating huge pages for 
> configuring the drivers.
> 
> It seems impossible to use the external memory for this, as the 
> "rte_malloc_heap_memory_add" API cannot be used yet at that point.

Yes, there is currently no way to initialize anything using the external 
memory. This is because at the time of initialization, external memory 
is not yet discovered and therefore cannot be acted upon. It could be 
possible to implement this using an EAL plugin, but I have not looked 
into it and know very little about the EAL plugin infrastructure, so I 
cannot offer suggestions off the top of my head.

> 
> So, we tried to use regular pages instead of huge pages, using the 
> option of "--no-huge".
> 
> It failed with the following printouts:
> 
> ...
> EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
> EAL: FATAL: Cannot use IOVA as 'PA' since physical addresses are not available
> EAL: Cannot use IOVA as 'PA' since physical addresses are not available

This happens because no-huge will not attempt to find physical addresses 
of the memory backing the allocated segments.

> 
> ...
> 
> 1a. Is there a way to use the external memory for "rte_eal_init"?

There is currently no way to do that, no.

> 
> 1b. Why does using regular pages cause DPDK to complain that "physical 
> addresses are not available"?

We do not populate physical addresses in the no-huge case, as per 
eal_legacy_hugepage_init(). Technically it should be possible to do so 
using calls into rte_mem_virt2phy(); we just don't. I believe the 
rationale is that 1) we have no control over that memory and the kernel 
might change its PAs at any time, 2) the init would take a long time 
because there are quite a few pages even in small no-huge segments (and 
we'd need to query pagemap for every single one of them), and 3) no-huge 
is really meant to be a debug option and is not intended for production 
use, so this path is not heavily tested by intent.
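
For the curious, such a pagemap lookup is roughly this (sketch, headers 
omitted; needs root, and the page must be resident):

/* Sketch: resolve a virtual address to a physical one via pagemap */
uint64_t va_to_pa(const void *va)
{
        long pgsz = sysconf(_SC_PAGESIZE);
        uint64_t entry;
        int fd = open("/proc/self/pagemap", O_RDONLY);
        off_t off = ((uintptr_t)va / pgsz) * sizeof(entry);

        if (fd < 0)
                return UINT64_MAX;              /* cf. RTE_BAD_IOVA */
        if (pread(fd, &entry, sizeof(entry), off) != sizeof(entry)) {
                close(fd);
                return UINT64_MAX;
        }
        close(fd);
        if (!(entry & (1ULL << 63)))            /* bit 63: present */
                return UINT64_MAX;
        /* bits 0-54 hold the page frame number */
        return (entry & ((1ULL << 55) - 1)) * pgsz + (uintptr_t)va % pgsz;
}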

> 
> 1c. Why is the "--no-huge" option defined as one of the "EAL options for 
> DEBUG use only" (in the "eal_common_usage" routine)?

That's kind of why it was created: to test DPDK without hugepages. The 
intended use case for DPDK is to be run using hugepages.

> 
> 2. Explanation for some details in the "create_extmem" routine:
> 
> 2a. What is the purpose of calling "mlock" before populating IOVA addresses?
> 
> 2b. Why is "munlock" not used afterwards?

We want these pages to stay pinned in memory (i.e. the kernel shouldn't 
be allowed to move them).
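
In other words, the pattern there is essentially (sketch; ext_mem_addr 
and ext_mem_len are placeholder names, and <sys/mman.h> is needed):

/* Pin the pages so their physical location cannot change, then
 * resolve the IOVAs; there is no munlock afterwards, because the
 * pages must stay resident for as long as the device may DMA into
 * them */
if (mlock(ext_mem_addr, ext_mem_len) != 0)
        rte_exit(EXIT_FAILURE, "mlock() failed\n");
/* ... now populate the IOVA table ... */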

-- 
Thanks,
Anatoly

* RE: DPDK 19.11.3 with multi processes and external physical memory: unable to receive traffic in the secondary processes
  2022-07-25  9:21           ` Burakov, Anatoly
@ 2022-07-25  9:31             ` Asaf Sinai
  0 siblings, 0 replies; 8+ messages in thread
From: Asaf Sinai @ 2022-07-25  9:31 UTC (permalink / raw)
  To: Burakov, Anatoly, dev

Hi Anatoly,

Thank you very much for the helpful information and support!

Regards,
Asaf


end of thread

Thread overview: 8 messages
2022-07-12  6:05 DPDK 19.11.3 with multi processes and external physical memory: unable to receive traffic in the secondary processes Asaf Sinai
2022-07-12 13:13 ` Burakov, Anatoly
2022-07-14 10:24   ` Asaf Sinai
2022-07-14 10:41     ` Asaf Sinai
2022-07-15 10:17       ` Burakov, Anatoly
2022-07-18 11:58         ` Asaf Sinai
2022-07-25  9:21           ` Burakov, Anatoly
2022-07-25  9:31             ` Asaf Sinai
