* tailqs issue
@ 2025-03-19 17:50 Lombardo, Ed
2025-03-19 20:23 ` Stephen Hemminger
0 siblings, 1 reply; 17+ messages in thread
From: Lombardo, Ed @ 2025-03-19 17:50 UTC (permalink / raw)
To: users
Hi,
My goal is to test DPDK applications running on the same server as a primary process and secondary process.
When I execute two dpdk-simple-mp processes, one as primary and other as secondary, I see them both startup with no issues.
# ./dpdk-simple_mp -c 0x2 -n 4 --legacy-mem --proc-type primary --
# ./dpdk-simple_mp -c 0x8 -n 4 --legacy-mem --proc-type secondary --
Now when I test our DPDK application (as primary) and the same dpdk-simple-mp (as secondary) I get the error "EAL: Cannot initialize tailq: RTE_FIB".
EAL args: MyApp, -l 25,26,27,28 -n 4 -socket-mem=2048, --legacy-mem -no-telemetry -proc_type=primary
# ./dpdk-simple_mp -l 24 -n 4 --legacy-mem --proc-type secondary --
When I use gdb I see that t->head is 0x0 in eal_common_tailqs.c Line 148.
(gdb) p *t
$40 = {head = 0x0, next = {tqe_next = 0x1c68c40 <rte_fib6_tailq>, tqe_prev = 0x1c67108 <rte_swx_ctl_pipeline_tailq+8>},
name = "RTE_FIB", '\000' <repeats 24 times>}
I created 2 - 1G hugepages per CPU socket for each test case listed above.
[root@localhost ~]# /opt/dpdk/dpdk-hugepages.py -s
Node Pages Size Total
0 2 1Gb 2Gb
1 2 1Gb 2Gb
The dpdk-simple_mp execution output is shown below:
[root@localhost ~]# ./dpdk-simple_mp -l 24 -n 4 --legacy-mem --huge-dir /dev/mnt/huge --proc-type secondary --
EAL: Detected CPU lcores: 128
EAL: Detected NUMA nodes: 2
EAL: Static memory layout is selected, amount of reserved memory can be adjusted with -m or --socket-mem
EAL: Detected static linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket_18349_8160ff395b1
EAL: Selected IOVA mode 'PA'
EAL: WARNING: Address Space Layout Randomization (ASLR) is enabled in the kernel.
EAL: This may cause issues with mapping memory into secondary processes
EAL: Cannot initialize tailq: RTE_FIB
Tailq 0: qname:<RTE_DIST_BURST>, tqh_first:(nil), tqh_last:0x100004490
Tailq 1: qname:<RTE_DISTRIBUTOR>, tqh_first:(nil), tqh_last:0x1000044c0
Tailq 2: qname:<RTE_REORDER>, tqh_first:(nil), tqh_last:0x1000044f0
Tailq 3: qname:<RTE_IPSEC_SAD>, tqh_first:(nil), tqh_last:0x100004520
Tailq 4: qname:<RTE_SWX_IPSEC>, tqh_first:(nil), tqh_last:0x100004550
Tailq 5: qname:<RTE_SWX_PIPELINE>, tqh_first:(nil), tqh_last:0x100004580
Tailq 6: qname:<RTE_SWX_CTL_PIPELINE>, tqh_first:(nil), tqh_last:0x1000045b0
Tailq 7: qname:<RTE_HASH>, tqh_first:0x1bfd9f140, tqh_last:0x1bf6f4240
Tailq 8: qname:<RTE_FBK_HASH>, tqh_first:(nil), tqh_last:0x100004610
Tailq 9: qname:<RTE_THASH>, tqh_first:(nil), tqh_last:0x100004640
Tailq 10: qname:<RTE_LPM>, tqh_first:(nil), tqh_last:0x100004670
Tailq 11: qname:<RTE_LPM6>, tqh_first:(nil), tqh_last:0x1000046a0
Tailq 12: qname:<RTE_ACL>, tqh_first:(nil), tqh_last:0x1000046d0
Tailq 13: qname:<RTE_MEMPOOL>, tqh_first:0x1bf282000, tqh_last:0x1bf282000
Tailq 14: qname:<RTE_RING>, tqh_first:0x1bfdc79c0, tqh_last:0x14f261ac0
Tailq 15: qname:<RTE_MBUF_DYNFIELD>, tqh_first:0x14f871680, tqh_last:0x14f870cc0
Tailq 16: qname:<RTE_MBUF_DYNFLAG>, tqh_first:0x14f871080, tqh_last:0x14f871080
Tailq 17: qname:<UIO_RESOURCE_LIST>, tqh_first:0x1bfffce00, tqh_last:0x1bf939e40
Tailq 18: qname:<VFIO_RESOURCE_LIST>, tqh_first:(nil), tqh_last:0x1000047f0
Tailq 19: qname:<>, tqh_first:(nil), tqh_last:(nil)
Tailq 20: qname:<>, tqh_first:(nil), tqh_last:(nil)
Tailq 21: qname:<>, tqh_first:(nil), tqh_last:(nil)
Tailq 22: qname:<>, tqh_first:(nil), tqh_last:(nil)
Tailq 23: qname:<>, tqh_first:(nil), tqh_last:(nil)
Tailq 24: qname:<>, tqh_first:(nil), tqh_last:(nil)
Tailq 25: qname:<>, tqh_first:(nil), tqh_last:(nil)
Tailq 26: qname:<>, tqh_first:(nil), tqh_last:(nil)
Tailq 27: qname:<>, tqh_first:(nil), tqh_last:(nil)
Tailq 28: qname:<>, tqh_first:(nil), tqh_last:(nil)
Tailq 29: qname:<>, tqh_first:(nil), tqh_last:(nil)
Tailq 30: qname:<>, tqh_first:(nil), tqh_last:(nil)
Tailq 31: qname:<>, tqh_first:(nil), tqh_last:(nil)
EAL: Cannot init tail queues for objects
EAL: Error - exiting with code: 1
Cannot init EAL
How do I resolve this issue?
Thanks,
Ed
* Re: tailqs issue
2025-03-19 17:50 tailqs issue Lombardo, Ed
@ 2025-03-19 20:23 ` Stephen Hemminger
2025-03-19 21:52 ` Lombardo, Ed
0 siblings, 1 reply; 17+ messages in thread
From: Stephen Hemminger @ 2025-03-19 20:23 UTC (permalink / raw)
To: Lombardo, Ed; +Cc: users
On Wed, 19 Mar 2025 17:50:46 +0000
"Lombardo, Ed" <Ed.Lombardo@netscout.com> wrote:
> Hi,
> My goal is to test DPDK applications running on the same server as a primary process and secondary process.
> When I execute two dpdk-simple-mp processes, one as primary and other as secondary, I see them both startup with no issues.
>
> # ./dpdk-simple_mp -c 0x2 -n 4 --legacy-mem --proc-type primary --
> # ./dpdk-simple_mp -c 0x8 -n 4 --legacy-mem --proc-type secondary --
>
>
> Now when I test our DPDK application (as primary) and same dpdk-simple-mp (as secondary) I get error "EAL: Cannot initialize tailq: RTE_FIB).
> EAL args: MyApp, -l 25,26,27,28 -n 4 -socket-mem=2048, --legacy-mem -no-telemetry -proc_type=primary
>
> # ./dpdk-simple_mp -l 24 -n 4 --legacy-mem --proc-type secondary --
>
>
> When I use gdb I see that t->head is 0x0 in eal_common_tailqs.c Line 148.
> (gdb) p *t
> $40 = {head = 0x0, next = {tqe_next = 0x1c68c40 <rte_fib6_tailq>, tqe_prev = 0x1c67108 <rte_swx_ctl_pipeline_tailq+8>},
> name = "RTE_FIB", '\000' <repeats 24 times>}
>
> I created 2 - 1G hugepages per CPU socket for each test case listed above.
>
> [root@localhost ~]# /opt/dpdk/dpdk-hugepages.py -s
> Node Pages Size Total
> 0 2 1Gb 2Gb
> 1 2 1Gb 2Gb
>
>
> The dpdk-simple_mp execution output is shown below:
> [root@localhost ~]# ./dpdk-simple_mp -l 24 -n 4 --legacy-mem --huge-dir /dev/mnt/huge --proc-type secondary --
> EAL: Detected CPU lcores: 128
> EAL: Detected NUMA nodes: 2
> EAL: Static memory layout is selected, amount of reserved memory can be adjusted with -m or --socket-mem
> EAL: Detected static linkage of DPDK
> EAL: Multi-process socket /var/run/dpdk/rte/mp_socket_18349_8160ff395b1
> EAL: Selected IOVA mode 'PA'
> EAL: WARNING: Address Space Layout Randomization (ASLR) is enabled in the kernel.
> EAL: This may cause issues with mapping memory into secondary processes
> EAL: Cannot initialize tailq: RTE_FIB
> Tailq 0: qname:<RTE_DIST_BURST>, tqh_first:(nil), tqh_last:0x100004490
> Tailq 1: qname:<RTE_DISTRIBUTOR>, tqh_first:(nil), tqh_last:0x1000044c0
> Tailq 2: qname:<RTE_REORDER>, tqh_first:(nil), tqh_last:0x1000044f0
> Tailq 3: qname:<RTE_IPSEC_SAD>, tqh_first:(nil), tqh_last:0x100004520
> Tailq 4: qname:<RTE_SWX_IPSEC>, tqh_first:(nil), tqh_last:0x100004550
> Tailq 5: qname:<RTE_SWX_PIPELINE>, tqh_first:(nil), tqh_last:0x100004580
> Tailq 6: qname:<RTE_SWX_CTL_PIPELINE>, tqh_first:(nil), tqh_last:0x1000045b0
> Tailq 7: qname:<RTE_HASH>, tqh_first:0x1bfd9f140, tqh_last:0x1bf6f4240
> Tailq 8: qname:<RTE_FBK_HASH>, tqh_first:(nil), tqh_last:0x100004610
> Tailq 9: qname:<RTE_THASH>, tqh_first:(nil), tqh_last:0x100004640
> Tailq 10: qname:<RTE_LPM>, tqh_first:(nil), tqh_last:0x100004670
> Tailq 11: qname:<RTE_LPM6>, tqh_first:(nil), tqh_last:0x1000046a0
> Tailq 12: qname:<RTE_ACL>, tqh_first:(nil), tqh_last:0x1000046d0
> Tailq 13: qname:<RTE_MEMPOOL>, tqh_first:0x1bf282000, tqh_last:0x1bf282000
> Tailq 14: qname:<RTE_RING>, tqh_first:0x1bfdc79c0, tqh_last:0x14f261ac0
> Tailq 15: qname:<RTE_MBUF_DYNFIELD>, tqh_first:0x14f871680, tqh_last:0x14f870cc0
> Tailq 16: qname:<RTE_MBUF_DYNFLAG>, tqh_first:0x14f871080, tqh_last:0x14f871080
> Tailq 17: qname:<UIO_RESOURCE_LIST>, tqh_first:0x1bfffce00, tqh_last:0x1bf939e40
> Tailq 18: qname:<VFIO_RESOURCE_LIST>, tqh_first:(nil), tqh_last:0x1000047f0
> Tailq 19: qname:<>, tqh_first:(nil), tqh_last:(nil)
> Tailq 20: qname:<>, tqh_first:(nil), tqh_last:(nil)
> Tailq 21: qname:<>, tqh_first:(nil), tqh_last:(nil)
> Tailq 22: qname:<>, tqh_first:(nil), tqh_last:(nil)
> Tailq 23: qname:<>, tqh_first:(nil), tqh_last:(nil)
> Tailq 24: qname:<>, tqh_first:(nil), tqh_last:(nil)
> Tailq 25: qname:<>, tqh_first:(nil), tqh_last:(nil)
> Tailq 26: qname:<>, tqh_first:(nil), tqh_last:(nil)
> Tailq 27: qname:<>, tqh_first:(nil), tqh_last:(nil)
> Tailq 28: qname:<>, tqh_first:(nil), tqh_last:(nil)
> Tailq 29: qname:<>, tqh_first:(nil), tqh_last:(nil)
> Tailq 30: qname:<>, tqh_first:(nil), tqh_last:(nil)
> Tailq 31: qname:<>, tqh_first:(nil), tqh_last:(nil)
> EAL: Cannot init tail queues for objects
> EAL: Error - exiting with code: 1
> Cannot init EAL
>
> How do I resolve this issue?
>
> Thanks,
> Ed
The problem is that the primary process has not linked in the fib library.
The primary process is the only one that can register tailq's at initialization.
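For context, each DPDK library registers its tailq from an init-time constructor, so a tailq head is only allocated in shared memory if the primary binary actually links that library. A minimal sketch of the registration, modelled on lib/fib (exact code differs between DPDK versions):

/* Sketch modelled on lib/fib/rte_fib.c: the tailq is registered by a
 * constructor that runs before main(). The primary allocates the tailq head
 * in shared memory; a secondary can only look it up, so it fails with
 * "Cannot initialize tailq: RTE_FIB" if the primary never linked this code. */
#include <rte_tailq.h>

static struct rte_tailq_elem rte_fib_tailq = {
        .name = "RTE_FIB",
};
EAL_REGISTER_TAILQ(rte_fib_tailq)

So every tailq the secondary expects has to come from a library that is linked into (and initialized by) the primary.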
* RE: tailqs issue
2025-03-19 20:23 ` Stephen Hemminger
@ 2025-03-19 21:52 ` Lombardo, Ed
2025-03-19 23:16 ` Stephen Hemminger
0 siblings, 1 reply; 17+ messages in thread
From: Lombardo, Ed @ 2025-03-19 21:52 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: users
Hi Stephen,
I added the fib library, but I now see there are many more DPDK libraries I need to add. Is this typically the case when the example applications are used alongside a primary DPDK application?
I am using meson and ninja to build the examples, but I don't know how to determine the library dependencies.
How do I find out, before building my application, which extra libraries I need to include for the DPDK example to work?
I am doing incremental build-test-find_missing_library.
So far, I needed to add these: -lrte_fib -lrte_rib -lrte_stack -lrte_member -lrte_efd
Thanks,
Ed
-----Original Message-----
From: Stephen Hemminger <stephen@networkplumber.org>
Sent: Wednesday, March 19, 2025 4:24 PM
To: Lombardo, Ed <Ed.Lombardo@netscout.com>
Cc: users@dpdk.org
Subject: Re: tailqs issue
On Wed, 19 Mar 2025 17:50:46 +0000
"Lombardo, Ed" <Ed.Lombardo@netscout.com> wrote:
> Hi,
> My goal is to test DPDK applications running on the same server as a primary process and secondary process.
> When I execute two dpdk-simple-mp processes, one as primary and other as secondary, I see them both startup with no issues.
>
> # ./dpdk-simple_mp -c 0x2 -n 4 --legacy-mem --proc-type primary -- #
> ./dpdk-simple_mp -c 0x8 -n 4 --legacy-mem --proc-type secondary --
>
>
> Now when I test our DPDK application (as primary) and same dpdk-simple-mp (as secondary) I get error "EAL: Cannot initialize tailq: RTE_FIB).
> EAL args: MyApp, -l 25,26,27,28 -n 4 -socket-mem=2048,
> --legacy-mem -no-telemetry -proc_type=primary
>
> # ./dpdk-simple_mp -l 24 -n 4 --legacy-mem --proc-type
> secondary --
>
>
> When I use gdb I see that t->head is 0x0 in eal_common_tailqs.c Line 148.
> (gdb) p *t
> $40 = {head = 0x0, next = {tqe_next = 0x1c68c40 <rte_fib6_tailq>, tqe_prev = 0x1c67108 <rte_swx_ctl_pipeline_tailq+8>},
> name = "RTE_FIB", '\000' <repeats 24 times>}
>
> I created 2 - 1G hugepages per CPU socket for each test case listed above.
>
> [root@localhost ~]# /opt/dpdk/dpdk-hugepages.py -s Node Pages Size
> Total
> 0 2 1Gb 2Gb
> 1 2 1Gb 2Gb
>
>
> The dpdk-simple_mp execution output is shown below:
> [root@localhost ~]# ./dpdk-simple_mp -l 24 -n 4 --legacy-mem
> --huge-dir /dev/mnt/huge --proc-type secondary --
> EAL: Detected CPU lcores: 128
> EAL: Detected NUMA nodes: 2
> EAL: Static memory layout is selected, amount of reserved memory can
> be adjusted with -m or --socket-mem
> EAL: Detected static linkage of DPDK
> EAL: Multi-process socket
> /var/run/dpdk/rte/mp_socket_18349_8160ff395b1
> EAL: Selected IOVA mode 'PA'
> EAL: WARNING: Address Space Layout Randomization (ASLR) is enabled in the kernel.
> EAL: This may cause issues with mapping memory into secondary processes
> EAL: Cannot initialize tailq: RTE_FIB
> Tailq 0: qname:<RTE_DIST_BURST>, tqh_first:(nil), tqh_last:0x100004490
> Tailq 1: qname:<RTE_DISTRIBUTOR>, tqh_first:(nil),
> tqh_last:0x1000044c0 Tailq 2: qname:<RTE_REORDER>, tqh_first:(nil),
> tqh_last:0x1000044f0 Tailq 3: qname:<RTE_IPSEC_SAD>, tqh_first:(nil),
> tqh_last:0x100004520 Tailq 4: qname:<RTE_SWX_IPSEC>, tqh_first:(nil),
> tqh_last:0x100004550 Tailq 5: qname:<RTE_SWX_PIPELINE>,
> tqh_first:(nil), tqh_last:0x100004580 Tailq 6:
> qname:<RTE_SWX_CTL_PIPELINE>, tqh_first:(nil), tqh_last:0x1000045b0
> Tailq 7: qname:<RTE_HASH>, tqh_first:0x1bfd9f140, tqh_last:0x1bf6f4240
> Tailq 8: qname:<RTE_FBK_HASH>, tqh_first:(nil), tqh_last:0x100004610
> Tailq 9: qname:<RTE_THASH>, tqh_first:(nil), tqh_last:0x100004640
> Tailq 10: qname:<RTE_LPM>, tqh_first:(nil), tqh_last:0x100004670 Tailq
> 11: qname:<RTE_LPM6>, tqh_first:(nil), tqh_last:0x1000046a0 Tailq 12:
> qname:<RTE_ACL>, tqh_first:(nil), tqh_last:0x1000046d0 Tailq 13:
> qname:<RTE_MEMPOOL>, tqh_first:0x1bf282000, tqh_last:0x1bf282000 Tailq
> 14: qname:<RTE_RING>, tqh_first:0x1bfdc79c0, tqh_last:0x14f261ac0
> Tailq 15: qname:<RTE_MBUF_DYNFIELD>, tqh_first:0x14f871680,
> tqh_last:0x14f870cc0 Tailq 16: qname:<RTE_MBUF_DYNFLAG>,
> tqh_first:0x14f871080, tqh_last:0x14f871080 Tailq 17:
> qname:<UIO_RESOURCE_LIST>, tqh_first:0x1bfffce00, tqh_last:0x1bf939e40
> Tailq 18: qname:<VFIO_RESOURCE_LIST>, tqh_first:(nil),
> tqh_last:0x1000047f0 Tailq 19: qname:<>, tqh_first:(nil),
> tqh_last:(nil) Tailq 20: qname:<>, tqh_first:(nil), tqh_last:(nil)
> Tailq 21: qname:<>, tqh_first:(nil), tqh_last:(nil) Tailq 22:
> qname:<>, tqh_first:(nil), tqh_last:(nil) Tailq 23: qname:<>,
> tqh_first:(nil), tqh_last:(nil) Tailq 24: qname:<>, tqh_first:(nil),
> tqh_last:(nil) Tailq 25: qname:<>, tqh_first:(nil), tqh_last:(nil)
> Tailq 26: qname:<>, tqh_first:(nil), tqh_last:(nil) Tailq 27:
> qname:<>, tqh_first:(nil), tqh_last:(nil) Tailq 28: qname:<>,
> tqh_first:(nil), tqh_last:(nil) Tailq 29: qname:<>, tqh_first:(nil),
> tqh_last:(nil) Tailq 30: qname:<>, tqh_first:(nil), tqh_last:(nil)
> Tailq 31: qname:<>, tqh_first:(nil), tqh_last:(nil)
> EAL: Cannot init tail queues for objects
> EAL: Error - exiting with code: 1
> Cannot init EAL
>
> How do I resolve this issue?
>
> Thanks,
> Ed
The problem is that the primary process has not linked in the fib library.
The primary process is the only one that can register tailq's at initialization.
* Re: tailqs issue
2025-03-19 21:52 ` Lombardo, Ed
@ 2025-03-19 23:16 ` Stephen Hemminger
2025-03-21 18:18 ` Lombardo, Ed
0 siblings, 1 reply; 17+ messages in thread
From: Stephen Hemminger @ 2025-03-19 23:16 UTC (permalink / raw)
To: Lombardo, Ed; +Cc: users
On Wed, 19 Mar 2025 21:52:39 +0000
"Lombardo, Ed" <Ed.Lombardo@netscout.com> wrote:
> Hi Stephen,
> I added the fib library, but I now see there are many more dpdk libraries I need to add. Is this typically the case with the example files working with primary DPDK application?
>
> I am using meson and ninja to build the examples, but I don't know how to know the library dependencies.
>
> How do I learn ahead of building my Application as to what extra libraries I need to include for the DPDK example to work?
>
> I am doing incremental build-test-find_missing_library.
>
> So far, I needed to add these: -lrte_fib -lrte_rib -lrte_stack -lrte_member -lrte_efd
>
> Thanks,
> Ed
The typical case is to make sure that primary and secondary are built with the same libraries.
* RE: tailqs issue
2025-03-19 23:16 ` Stephen Hemminger
@ 2025-03-21 18:18 ` Lombardo, Ed
2025-03-24 5:01 ` Lombardo, Ed
0 siblings, 1 reply; 17+ messages in thread
From: Lombardo, Ed @ 2025-03-21 18:18 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: users
Hi Stephen,
Thank you for your help. I made good progress up to now.
When I try to use the dpdk_simple_mp application to send a message to my application I get a Segmentation fault.
First, I re-verified that dpdk_simple_mp as primary and dpdk_simple_mp as secondary do pass messages successfully. So, my hugepages are created and DPDK initializes successfully on both at startup.
In my application I created the send and recv rings and message_pool as the primary process. The logs I added do not show any errors.
Once my application starts and settles, I start the dpdk_simple_mp application: # ./dpdk-simple_mp_dbg -l 30-31 -n 4 --legacy-mem --proc-type secondary --
However, when I do "send hello" on the dpdk_simple_mp side, I then get a segmentation fault.
The debugger takes me deep within the DPDK libraries, which I am not too familiar with.
The segmentation fault ends up in rte_ring_dequeue_bulk_elem() in rte_ring_elem.h. I notice that the variables are optimized out; I am not sure why, since I built the DPDK libraries with the debug flag.
Here is the backtrace; could you point me in the right direction to look?
# gdb dpdk-simple_mp /core/core.dpdk-simple_mp.241269
warning: Unexpected size of section `.reg-xstate/241269' in core file.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `./dpdk-simple_mp -l 30-31 -n 4 --legacy-mem --proc-type secondary --'.
Program terminated with signal SIGSEGV, Segmentation fault.
warning: Unexpected size of section `.reg-xstate/241269' in core file.
#0 0x0000000000cf446a in bucket_dequeue ()
[Current thread is 1 (Thread 0x7f946f835c00 (LWP 241269))]
Missing separate debuginfos, use: dnf debuginfo-install elfutils-libelf-0.189-3.el9.x86_64 glibc-2.34-83.0.1.el9_3.7.x86_64 libibverbs-46.0-1.el9.x86_64 libnl3-3.7.0-1.el9.x86_64 libpcap-1.10.0-4.el9.x86_64 libzstd-1.5.1-2.el9.x86_64 numactl-libs-2.0.16-1.el9.x86_64 openssl-libs-3.0.7-25.0.1.el9_3.x86_64 zlib-1.2.11-40.el9.x86_64
(gdb) bt
#0 0x0000000000cf446a in bucket_dequeue ()
#1 0x00000000007ce77d in cmd_send_parsed ()
#2 0x0000000000aa5d96 in __cmdline_parse ()
#3 0x0000000000aa4d70 in cmdline_valid_buffer ()
#4 0x0000000000aa826b in rdline_char_in ()
#5 0x0000000000aa4e41 in cmdline_in ()
#6 0x0000000000aa4f60 in cmdline_interact ()
#7 0x00000000004fe47a in main.cold ()
#8 0x00007f946f03feb0 in __libc_start_call_main () from /lib64/libc.so.6
#9 0x00007f946f03ff60 in __libc_start_main_impl () from /lib64/libc.so.6
#10 0x00000000007ce605 in _start ()
Gdb - stepping through the code, gdb attached to dpdk_simple_mp_debug
(gdb)
0x0000000000cf42c5 in rte_ring_dequeue_bulk_elem (available=<optimized out>, n=<optimized out>, esize=<optimized out>,
obj_table=<optimized out>, r=<optimized out>) at ../lib/ring/rte_ring_elem.h:375
375 ../lib/ring/rte_ring_elem.h: No such file or directory.
(gdb) p r
$17 = <optimized out>
(gdb) p obj_table
$18 = <optimized out>
(gdb) p available
$19 = <optimized out>
(gdb) n
Thread 1 "dpdk-simple_mp_" received signal SIGSEGV, Segmentation fault.
bucket_dequeue_orphans (n_orphans=33, obj_table=0x14f09b5c0, bd=0x14f05aa80)
at ../drivers/mempool/bucket/rte_mempool_bucket.c:191
191 ../drivers/mempool/bucket/rte_mempool_bucket.c: No such file or directory.
(gdb) bt
#0 bucket_dequeue_orphans (n_orphans=33, obj_table=0x14f09b5c0, bd=0x14f05aa80)
at ../drivers/mempool/bucket/rte_mempool_bucket.c:191
#1 bucket_dequeue (mp=<optimized out>, obj_table=0x14f09b5c0, n=33) at ../drivers/mempool/bucket/rte_mempool_bucket.c:289
#2 0x00000000007ce77d in rte_mempool_ops_dequeue_bulk (n=<optimized out>, obj_table=0x14f09b5c0, mp=0x14f05ed40)
at ../lib/mempool/rte_mempool.h:793
#3 rte_mempool_do_generic_get (cache=0x14f09b580, n=1, obj_table=0x7fff8df066f0, mp=0x14f05ed40)
at ../lib/mempool/rte_mempool.h:1570
#4 rte_mempool_generic_get (cache=0x14f09b580, n=1, obj_table=0x7fff8df066f0, mp=0x14f05ed40)
at ../lib/mempool/rte_mempool.h:1649
#5 rte_mempool_get_bulk (n=1, obj_table=0x7fff8df066f0, mp=0x14f05ed40) at ../lib/mempool/rte_mempool.h:1684
#6 rte_mempool_get (obj_p=0x7fff8df066f0, mp=0x14f05ed40) at ../lib/mempool/rte_mempool.h:1710
#7 cmd_send_parsed (parsed_result=parsed_result@entry=0x7fff8df06790, cl=cl@entry=0x2f73220, data=data@entry=0x0)
at ../examples/multi_process/simple_mp/mp_commands.c:18
#8 0x0000000000aa5d96 in __cmdline_parse (cl=cl@entry=0x2f73220, buf=0x2f73268 "send hello\n",
call_fn=call_fn@entry=true) at ../lib/cmdline/cmdline_parse.c:294
#9 0x0000000000aa5f1a in cmdline_parse (cl=cl@entry=0x2f73220, buf=<optimized out>) at ../lib/cmdline/cmdline_parse.c:302
#10 0x0000000000aa4d70 in cmdline_valid_buffer (rdl=<optimized out>, buf=<optimized out>, size=<optimized out>)
at ../lib/cmdline/cmdline.c:24
#11 0x0000000000aa826b in rdline_char_in (rdl=rdl@entry=0x2f73230, c=<optimized out>)
at ../lib/cmdline/cmdline_rdline.c:444
#12 0x0000000000aa4e41 in cmdline_in (size=<optimized out>, buf=<optimized out>, cl=<optimized out>)
at ../lib/cmdline/cmdline.c:146
#13 cmdline_in (cl=0x2f73220, buf=0x7fff8df0c89f "\n\200", size=<optimized out>) at ../lib/cmdline/cmdline.c:135
#14 0x0000000000aa4f60 in cmdline_interact (cl=cl@entry=0x2f73220) at ../lib/cmdline/cmdline.c:192
#15 0x00000000004fe47a in main (argc=<optimized out>, argv=<optimized out>)
at ../examples/multi_process/simple_mp/main.c:122
Appreciate if you can help.
Thanks,
Ed
-----Original Message-----
From: Stephen Hemminger <stephen@networkplumber.org>
Sent: Wednesday, March 19, 2025 7:17 PM
To: Lombardo, Ed <Ed.Lombardo@netscout.com>
Cc: users@dpdk.org
Subject: Re: tailqs issue
On Wed, 19 Mar 2025 21:52:39 +0000
"Lombardo, Ed" <Ed.Lombardo@netscout.com> wrote:
> Hi Stephen,
> I added the fib library, but I now see there are many more dpdk libraries I need to add. Is this typically the case with the example files working with primary DPDK application?
>
> I am using meson and ninja to build the examples, but I don't know how to know the library dependencies.
>
> How do I learn ahead of building my Application as to what extra libraries I need to include for the DPDK example to work?
>
> I am doing incremental build-test-find_missing_library.
>
> So far, I needed to add these: -lrte_fib -lrte_rib -lrte_stack -lrte_member -lrte_efd
>
> Thanks,
> Ed
The typical case is to make sure that primary and secondary are built with the same libraries.
* RE: tailqs issue
2025-03-21 18:18 ` Lombardo, Ed
@ 2025-03-24 5:01 ` Lombardo, Ed
2025-03-24 14:59 ` Kompella V, Purnima
0 siblings, 1 reply; 17+ messages in thread
From: Lombardo, Ed @ 2025-03-24 5:01 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: users
Hi,
Further debugging has left me clueless about why the secondary process cannot get the first object from the mempool to send to the primary process (my application).
PRIMARY:
My application side (primary process) EAL Arguments:
"--proc-type=primary", "--file-prefix=dpdk_shared", "-l 25,26,27,28", "-n4" , "--socket-mem=2048", "--legacy-mem",
Create the _MSG_POOL
t_message_pool = rte_mempool_create(_MSG_POOL, // Pool name
t_pool_size, // Number of elements in the pool
STR_TOKEN_SIZE, // Size of each message
t_pool_cache, // Cache size
t_priv_data_sz, // Private data size
NULL, // mp_init
NULL, // mp_init_arg
NULL, // obj_init
NULL, // obj_init_arg
0, // rte_socket_id(), // socket_id
t_flags // flags
);
The t_message_pool pointer value matches what the secondary process obtains when it executes "message_pool = rte_mempool_lookup(_MSG_POOL);".
NTSMON: send ring (0x1bfb43480), recv ring (0x1bfb45a00) and message pool (0x14f570ac0) are created successfully.
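For reference, the secondary-side path being exercised here, mirroring the simple_mp example, is roughly the sketch below; the ring and pool names and the token size are assumptions taken from the example and must match whatever the primary actually created:

/* Minimal sketch of the secondary-side send path (modelled on
 * examples/multi_process/simple_mp); the names below are assumptions. */
#include <stdio.h>
#include <rte_ring.h>
#include <rte_mempool.h>

#define STR_TOKEN_SIZE 64  /* assumed to match the primary's element size */

static int send_one(const char *text)
{
        struct rte_ring *send_ring = rte_ring_lookup("SEC_2_PRI");   /* assumed name */
        struct rte_mempool *pool   = rte_mempool_lookup("MSG_POOL"); /* assumed name */
        void *msg = NULL;
        int ret;

        if (send_ring == NULL || pool == NULL)
                return -1;              /* objects were not created by the primary */

        ret = rte_mempool_get(pool, &msg);
        if (ret < 0)
                return ret;             /* -ENOENT (-2) is the failure shown below */

        snprintf((char *)msg, STR_TOKEN_SIZE, "%s", text);
        if (rte_ring_enqueue(send_ring, msg) < 0) {
                rte_mempool_put(pool, msg); /* give the buffer back on failure */
                return -1;
        }
        return 0;
}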
SECONDARY:
Secondary process I execute: " # ./dpdk-simple_mp_dbg3 -l 30-31 -n 4 --file-prefix=dpdk_shared --proc-type=secondary --"
Notes:
* hugepages are created 2x1G on NUMA Node 0
[root@localhost ~]# /opt/dpdk/dpdk-hugepages.py -s
Node Pages Size Total
0 2 1Gb 2Gb
* --file-prefix=dpdk_shared is provided to both primary and secondary
* --proc-type is correctly defined for both primary and secondary process.
* The secondary process reports correct socket=0 (See dump command output below)
* The secondary process showed Available mempool count is 0 and in-use count is 1024 (which looks incorrect).
* Secondary process reports mempool is empty
* Secondary audit passed (rte_mempool_audit()), no panic occurred.
* Tried disabling ASLR
* Tried turning off legacy-mem
EXECUTE Secondary process application "dpdk-simple_mp_dbg3"
[root@localhost ~]# ./dpdk-simple_mp_dbg3 -l 30-31 -n 4 --file-prefix=dpdk_shared --proc-type=secondary --legacy-mem --
EAL: Detected CPU lcores: 128
EAL: Detected NUMA nodes: 2
EAL: Detected static linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/dpdk_shared/mp_socket_221890_1ba0562a6ae95
EAL: Selected IOVA mode 'PA'
APP: Finished Process Init.
APP: Availble mempool count is 0, in-use count is 1024, (mempool ptr=0x14f570ac0)
APP: is mempool empty (1) or full (0)?
APP: check audit on message_pool
APP: Secondary mempool dump file write
APP: NO Objs in message pool (_MSG_POOL), exit the app
simple_mp > Starting core 31
simple_mp > send hello
message_pool pointer is 0x14f570ac0
Failed to get msg obj from mem pool: Success (ret=-2)
EAL: PANIC in cmd_send_parsed():
Failed to get message buffer
0: ./dpdk-simple_mp_dbg3 (rte_dump_stack+0x2b) [b1ee1b]
1: ./dpdk-simple_mp_dbg3 (__rte_panic+0xbd) [525ede]
2: ./dpdk-simple_mp_dbg3 (cmd_send_parsed+0x2d8) [7ceb68]
3: ./dpdk-simple_mp_dbg3 (400000+0x6a5f46) [aa5f46]
4: ./dpdk-simple_mp_dbg3 (400000+0x6a4f20) [aa4f20]
5: ./dpdk-simple_mp_dbg3 (rdline_char_in+0x34b) [aa841b]
6: ./dpdk-simple_mp_dbg3 (cmdline_in+0x71) [aa4ff1]
7: ./dpdk-simple_mp_dbg3 (cmdline_interact+0x30) [aa5110]
8: ./dpdk-simple_mp_dbg3 (400000+0xfe5cb) [4fe5cb]
9: /lib64/libc.so.6 (7f2d76600000+0x3feb0) [7f2d7663feb0]
10: /lib64/libc.so.6 (__libc_start_main+0x80) [7f2d7663ff60]
11: ./dpdk-simple_mp_dbg3 (_start+0x25) [7ce795]
Aborted (core dumped)
The rte_mempool_dump() is:
mempool <MSG_POOL>@0x14f570ac0
flags=1c
socket_id=0
pool=0x14f56c8c0
iova=0x1cf570ac0
nb_mem_chunks=1
size=1024
populated_size=1024
header_size=64
elt_size=64
trailer_size=64
total_obj_size=192
private_data_size=0
ops_index=1
ops_name: <cn9k_mempool_ops>
memory chunk at 0x1bfb43180, addr=0x14f53c780, iova=0x1cf53c780, len=196800
avg bytes/object=192.187500
internal cache infos:
cache_size=32 cache_count[0]=0
.....
cache_count[127]=0
total_cache_count=0
common_pool_count=1024
no statistics available
Any guidance is appreciated.
Thanks,
Ed
-----Original Message-----
From: Lombardo, Ed
Sent: Friday, March 21, 2025 2:19 PM
To: Stephen Hemminger <stephen@networkplumber.org>
Cc: users@dpdk.org
Subject: RE: tailqs issue
Hi Sephen,
Thank you for your help. I made good progress up to now.
When I try to use the dpdk_simple_mp application to send a message to my application I get a Segmentation fault.
First, I re-verified the dpdk_simple_mp process=primary and dpdk_simple_mp process-secondary does pass messages successfully. So, my hugepages are created and DPDK initializes successfully on both at startup.
In my application I created the send and recv rings and message_pool as the primary process. The logs I added do not show any errors.
Once my application starts and settles I started the dpdk_simple_mp application: # ./dpdk-simple_mp_dbg -l 30-31 -n 4 --legacy-mem --proc-type secondary --
However, on the dpdk_simple_mp side do "send hello" and I then get a segmentation fault.
The debugger takes me deep within the dpdk libraries which I am not too familiar with.
The rte_ring_elem.h file function: rte_ring_dequeue_build_elem() is where I end up with segmentation fault. I notice that the variables are optimized out, not sure why since I built the dpdk libraries with debug flag.
Here is the back trace and could you point me in the direction to look.
# gdb dpdk-simple_mp /core/core.dpdk-simple_mp.241269
warning: Unexpected size of section `.reg-xstate/241269' in core file.
[Thread debugging using libthread_db enabled] Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `./dpdk-simple_mp -l 30-31 -n 4 --legacy-mem --proc-type secondary --'.
Program terminated with signal SIGSEGV, Segmentation fault.
warning: Unexpected size of section `.reg-xstate/241269' in core file.
#0 0x0000000000cf446a in bucket_dequeue () [Current thread is 1 (Thread 0x7f946f835c00 (LWP 241269))] Missing separate debuginfos, use: dnf debuginfo-install elfutils-libelf-0.189-3.el9.x86_64 glibc-2.34-83.0.1.el9_3.7.x86_64 libibverbs-46.0-1.el9.x86_64 libnl3-3.7.0-1.el9.x86_64 libpcap-1.10.0-4.el9.x86_64 libzstd-1.5.1-2.el9.x86_64 numactl-libs-2.0.16-1.el9.x86_64 openssl-libs-3.0.7-25.0.1.el9_3.x86_64 zlib-1.2.11-40.el9.x86_64
(gdb) bt
#0 0x0000000000cf446a in bucket_dequeue ()
#1 0x00000000007ce77d in cmd_send_parsed ()
#2 0x0000000000aa5d96 in __cmdline_parse ()
#3 0x0000000000aa4d70 in cmdline_valid_buffer ()
#4 0x0000000000aa826b in rdline_char_in ()
#5 0x0000000000aa4e41 in cmdline_in ()
#6 0x0000000000aa4f60 in cmdline_interact ()
#7 0x00000000004fe47a in main.cold ()
#8 0x00007f946f03feb0 in __libc_start_call_main () from /lib64/libc.so.6
#9 0x00007f946f03ff60 in __libc_start_main_impl () from /lib64/libc.so.6
#10 0x00000000007ce605 in _start ()
Gdb - stepping through the code, gdb attached to dpdk_simple_mp_debug
(gdb)
0x0000000000cf42c5 in rte_ring_dequeue_bulk_elem (available=<optimized out>, n=<optimized out>, esize=<optimized out>,
obj_table=<optimized out>, r=<optimized out>) at ../lib/ring/rte_ring_elem.h:375
375 ../lib/ring/rte_ring_elem.h: No such file or directory.
(gdb) p r
$17 = <optimized out>
(gdb) p obj_table
$18 = <optimized out>
(gdb) p available
$19 = <optimized out>
(gdb) n
Thread 1 "dpdk-simple_mp_" received signal SIGSEGV, Segmentation fault.
bucket_dequeue_orphans (n_orphans=33, obj_table=0x14f09b5c0, bd=0x14f05aa80)
at ../drivers/mempool/bucket/rte_mempool_bucket.c:191
191 ../drivers/mempool/bucket/rte_mempool_bucket.c: No such file or directory.
(gdb) bt
#0 bucket_dequeue_orphans (n_orphans=33, obj_table=0x14f09b5c0, bd=0x14f05aa80)
at ../drivers/mempool/bucket/rte_mempool_bucket.c:191
#1 bucket_dequeue (mp=<optimized out>, obj_table=0x14f09b5c0, n=33) at ../drivers/mempool/bucket/rte_mempool_bucket.c:289
#2 0x00000000007ce77d in rte_mempool_ops_dequeue_bulk (n=<optimized out>, obj_table=0x14f09b5c0, mp=0x14f05ed40)
at ../lib/mempool/rte_mempool.h:793
#3 rte_mempool_do_generic_get (cache=0x14f09b580, n=1, obj_table=0x7fff8df066f0, mp=0x14f05ed40)
at ../lib/mempool/rte_mempool.h:1570
#4 rte_mempool_generic_get (cache=0x14f09b580, n=1, obj_table=0x7fff8df066f0, mp=0x14f05ed40)
at ../lib/mempool/rte_mempool.h:1649
#5 rte_mempool_get_bulk (n=1, obj_table=0x7fff8df066f0, mp=0x14f05ed40) at ../lib/mempool/rte_mempool.h:1684
#6 rte_mempool_get (obj_p=0x7fff8df066f0, mp=0x14f05ed40) at ../lib/mempool/rte_mempool.h:1710
#7 cmd_send_parsed (parsed_result=parsed_result@entry=0x7fff8df06790, cl=cl@entry=0x2f73220, data=data@entry=0x0)
at ../examples/multi_process/simple_mp/mp_commands.c:18
#8 0x0000000000aa5d96 in __cmdline_parse (cl=cl@entry=0x2f73220, buf=0x2f73268 "send hello\n",
call_fn=call_fn@entry=true) at ../lib/cmdline/cmdline_parse.c:294
#9 0x0000000000aa5f1a in cmdline_parse (cl=cl@entry=0x2f73220, buf=<optimized out>) at ../lib/cmdline/cmdline_parse.c:302
#10 0x0000000000aa4d70 in cmdline_valid_buffer (rdl=<optimized out>, buf=<optimized out>, size=<optimized out>)
at ../lib/cmdline/cmdline.c:24
#11 0x0000000000aa826b in rdline_char_in (rdl=rdl@entry=0x2f73230, c=<optimized out>)
at ../lib/cmdline/cmdline_rdline.c:444
#12 0x0000000000aa4e41 in cmdline_in (size=<optimized out>, buf=<optimized out>, cl=<optimized out>)
at ../lib/cmdline/cmdline.c:146
#13 cmdline_in (cl=0x2f73220, buf=0x7fff8df0c89f "\n\200", size=<optimized out>) at ../lib/cmdline/cmdline.c:135
#14 0x0000000000aa4f60 in cmdline_interact (cl=cl@entry=0x2f73220) at ../lib/cmdline/cmdline.c:192
#15 0x00000000004fe47a in main (argc=<optimized out>, argv=<optimized out>)
at ../examples/multi_process/simple_mp/main.c:122
Appreciate if you can help.
Thanks,
Ed
-----Original Message-----
From: Stephen Hemminger <stephen@networkplumber.org>
Sent: Wednesday, March 19, 2025 7:17 PM
To: Lombardo, Ed <Ed.Lombardo@netscout.com>
Cc: users@dpdk.org
Subject: Re: tailqs issue
On Wed, 19 Mar 2025 21:52:39 +0000
"Lombardo, Ed" <Ed.Lombardo@netscout.com> wrote:
> Hi Stephen,
> I added the fib library, but I now see there are many more dpdk libraries I need to add. Is this typically the case with the example files working with primary DPDK application?
>
> I am using meson and ninja to build the examples, but I don't know how to know the library dependencies.
>
> How do I learn ahead of building my Application as to what extra libraries I need to include for the DPDK example to work?
>
> I am doing incremental build-test-find_missing_library.
>
> So far, I needed to add these: -lrte_fib -lrte_rib -lrte_stack
> -lrte_member -lrte_efd
>
> Thanks,
> Ed
The typical case is to make sure that primary and secondary are built with the same libraries.
* RE: tailqs issue
2025-03-24 5:01 ` Lombardo, Ed
@ 2025-03-24 14:59 ` Kompella V, Purnima
2025-03-24 16:39 ` Lombardo, Ed
0 siblings, 1 reply; 17+ messages in thread
From: Kompella V, Purnima @ 2025-03-24 14:59 UTC (permalink / raw)
To: Lombardo, Ed, Stephen Hemminger; +Cc: users
Hi,
I am not sure if this could be your problem, but we had run into a similar issue with mbuf pools.
Do you list the DPDK libs in exactly the same order when linking the Primary and the Secondary process? If not,
the OPs that ops_index points to in the Primary and in the Secondary are different.
Because of this, rte_mempool_ops_dequeue_bulk() called from the Secondary does not use the same OPs that the Primary used to populate this mempool (during create).
So, when the pool is accessed from the Secondary, it appears empty even though the in-use count is non-zero.
This happens because each DPDK lib "appends" its OPs to the OPs list on the fly as it is pulled in by the linker.
So the order of OPs in the OPs database is mismatched between Primary and Secondary. Primary and Secondary share information about OPs via ops_index, but since the order differs, ops_index=X in the Primary is not the same OPs in the Secondary.
And this leads to the segmentation fault.
One of your processes is showing ops_index=1, ops_name: <cn9k_mempool_ops>. Can you confirm it is the same in the other process also?
By OPs, and ops_index, I am referring to this internal data structure of DPDK fw.
struct rte_mempool_ops {
char name[RTE_MEMPOOL_OPS_NAMESIZE]; /**< Name of mempool ops struct. */
rte_mempool_alloc_t alloc; /**< Allocate private data. */
rte_mempool_free_t free; /**< Free the external pool. */
rte_mempool_enqueue_t enqueue; /**< Enqueue an object. */
rte_mempool_dequeue_t dequeue; /**< Dequeue an object. */
rte_mempool_get_count get_count; /**< Get qty of available objs. */
/**
* Optional callback to calculate memory size required to
* store specified number of objects.
*/
rte_mempool_calc_mem_size_t calc_mem_size;
/**
* Optional callback to populate mempool objects using
* provided memory chunk.
*/
rte_mempool_populate_t populate;
/**
* Get mempool info
*/
rte_mempool_get_info_t get_info;
/**
* Dequeue a number of contiguous object blocks.
*/
rte_mempool_dequeue_contig_blocks_t dequeue_contig_blocks;
} __rte_cache_aligned;
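A quick way to confirm what each process resolves the pool's ops_index to is a small check like the sketch below (assuming the pool can be looked up by name in both processes; it only uses public rte_mempool.h accessors):

/* Sketch: run this in both primary and secondary and compare the output. */
#include <stdio.h>
#include <rte_mempool.h>

static void dump_pool_ops(const char *pool_name)
{
        struct rte_mempool *mp = rte_mempool_lookup(pool_name);

        if (mp == NULL) {
                printf("%s: not found\n", pool_name);
                return;
        }
        printf("%s: ops_index=%d ops_name=%s avail=%u in_use=%u\n",
               pool_name, mp->ops_index,
               rte_mempool_get_ops(mp->ops_index)->name,
               rte_mempool_avail_count(mp), rte_mempool_in_use_count(mp));
}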
Also, the link provides some insight into how OPs is significant.
https://www.intel.com/content/www/us/en/developer/articles/technical/optimize-memory-usage-in-multi-threaded-data-plane-development-kit-dpdk-applications.html
Regards,
Purnima
-----Original Message-----
From: Lombardo, Ed <Ed.Lombardo@netscout.com>
Sent: Monday, March 24, 2025 10:32 AM
To: Stephen Hemminger <stephen@networkplumber.org>
Cc: users@dpdk.org
Subject: RE: tailqs issue
Hi,
Further debugging has left me clueless as to the problem with getting the first Obj from the mempool on the secondary process to send to the primary process (my Application).
PRIMARY:
My application side (primary process) EAL Arguments:
"--proc-type=primary", "--file-prefix=dpdk_shared", "-l 25,26,27,28", "-n4" , "--socket-mem=2048", "--legacy-mem",
Create the _MSG_POOL
t_message_pool = rte_mempool_create(_MSG_POOL, // Pool name
t_pool_size, // Number of elements in the pool
STR_TOKEN_SIZE, // Size of each message
t_pool_cache, // Cache size
t_priv_data_sz, // Private data size
NULL, // mp_init
NULL, // mp_init_arg
NULL, // obj_init
NULL, // obj_init_arg
0, // rte_socket_id(), // socket_id
t_flags // flags
);
The t_message_pool pointer value matches the secondary process when execute " message_pool = rte_mempool_lookup(_MSG_POOL);"
NTSMON: send ring (0x1bfb43480), recv ring (0x1bfb45a00) and message pool (0x14f570ac0) are created successfully.
SECONDARY:
Secondary process I execute: " # ./dpdk-simple_mp_dbg3 -l 30-31 -n 4 --file-prefix=dpdk_shared --proc-type=secondary --"
Notes:
* hugepages are created 2x1G on NUMA Node 0
[root@localhost ~]# /opt/dpdk/dpdk-hugepages.py -s
Node Pages Size Total
0 2 1Gb 2Gb
* --file-prefix=dpdk_shared is provided to both primary and secondary
* --proc-type is correctly defined for both primary and secondary process.
* The secondary process reports correct socket=0 (See dump command output below)
* The secondary process showed Available mempool count is 0 and in-use count is 1024 (which looks incorrect).
* Secondary process reports mempool is empty
* Secondary audit passed (rte_mempool_audit()), no panic occurred.
* Tried disabling ASLR
* Tried turning off legacy-mem
EXECUTE Secondary process application "dpdk-simple_mp_dbg3"
[root@localhost ~]# ./dpdk-simple_mp_dbg3 -l 30-31 -n 4 --file-prefix=dpdk_shared --proc-type=secondary --legacy-mem --
EAL: Detected CPU lcores: 128
EAL: Detected NUMA nodes: 2
EAL: Detected static linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/dpdk_shared/mp_socket_221890_1ba0562a6ae95
EAL: Selected IOVA mode 'PA'
APP: Finished Process Init.
APP: Availble mempool count is 0, in-use count is 1024, (mempool ptr=0x14f570ac0)
APP: is mempool empty (1) or full (0)?
APP: check audit on message_pool
APP: Secondary mempool dump file write
APP: NO Objs in message pool (_MSG_POOL), exit the app
simple_mp > Starting core 31
simple_mp > send hello
message_pool pointer is 0x14f570ac0
Failed to get msg obj from mem pool: Success (ret=-2)
EAL: PANIC in cmd_send_parsed():
Failed to get message buffer
0: ./dpdk-simple_mp_dbg3 (rte_dump_stack+0x2b) [b1ee1b]
1: ./dpdk-simple_mp_dbg3 (__rte_panic+0xbd) [525ede]
2: ./dpdk-simple_mp_dbg3 (cmd_send_parsed+0x2d8) [7ceb68]
3: ./dpdk-simple_mp_dbg3 (400000+0x6a5f46) [aa5f46]
4: ./dpdk-simple_mp_dbg3 (400000+0x6a4f20) [aa4f20]
5: ./dpdk-simple_mp_dbg3 (rdline_char_in+0x34b) [aa841b]
6: ./dpdk-simple_mp_dbg3 (cmdline_in+0x71) [aa4ff1]
7: ./dpdk-simple_mp_dbg3 (cmdline_interact+0x30) [aa5110]
8: ./dpdk-simple_mp_dbg3 (400000+0xfe5cb) [4fe5cb]
9: /lib64/libc.so.6 (7f2d76600000+0x3feb0) [7f2d7663feb0]
10: /lib64/libc.so.6 (__libc_start_main+0x80) [7f2d7663ff60]
11: ./dpdk-simple_mp_dbg3 (_start+0x25) [7ce795] Aborted (core dumped)
The rte_mempool_dump() is:
mempool <MSG_POOL>@0x14f570ac0
flags=1c
socket_id=0
pool=0x14f56c8c0
iova=0x1cf570ac0
nrte_mempool_getb_mem_chunks=1
size=1024
populated_size=1024
header_size=64
elt_size=64
trailer_size=64
total_obj_size=192
private_data_size=0
ops_index=1
ops_name: <cn9k_mempool_ops>
memory chunk at 0x1bfb43180, addr=0x14f53c780, iova=0x1cf53c780, len=196800
avg bytes/object=192.187500
internal cache infos:
cache_size=32 cache_count[0]=0
.....
cache_count[127]=0
total_cache_count=0
common_pool_count=1024
no statistics available
Any guidance is appreciated.
Thanks,
Ed
-----Original Message-----
From: Lombardo, Ed
Sent: Friday, March 21, 2025 2:19 PM
To: Stephen Hemminger <stephen@networkplumber.org>
Cc: users@dpdk.org
Subject: RE: tailqs issue
Hi Sephen,
Thank you for your help. I made good progress up to now.
When I try to use the dpdk_simple_mp application to send a message to my application I get a Segmentation fault.
First, I re-verified the dpdk_simple_mp process=primary and dpdk_simple_mp process-secondary does pass messages successfully. So, my hugepages are created and DPDK initializes successfully on both at startup.
In my application I created the send and recv rings and message_pool as the primary process. The logs I added do not show any errors.
Once my application starts and settles I started the dpdk_simple_mp application: # ./dpdk-simple_mp_dbg -l 30-31 -n 4 --legacy-mem --proc-type secondary --
However, on the dpdk_simple_mp side do "send hello" and I then get a segmentation fault.
The debugger takes me deep within the dpdk libraries which I am not too familiar with.
The rte_ring_elem.h file function: rte_ring_dequeue_build_elem() is where I end up with segmentation fault. I notice that the variables are optimized out, not sure why since I built the dpdk libraries with debug flag.
Here is the back trace and could you point me in the direction to look.
# gdb dpdk-simple_mp /core/core.dpdk-simple_mp.241269
warning: Unexpected size of section `.reg-xstate/241269' in core file.
[Thread debugging using libthread_db enabled] Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `./dpdk-simple_mp -l 30-31 -n 4 --legacy-mem --proc-type secondary --'.
Program terminated with signal SIGSEGV, Segmentation fault.
warning: Unexpected size of section `.reg-xstate/241269' in core file.
#0 0x0000000000cf446a in bucket_dequeue () [Current thread is 1 (Thread 0x7f946f835c00 (LWP 241269))] Missing separate debuginfos, use: dnf debuginfo-install elfutils-libelf-0.189-3.el9.x86_64 glibc-2.34-83.0.1.el9_3.7.x86_64 libibverbs-46.0-1.el9.x86_64 libnl3-3.7.0-1.el9.x86_64 libpcap-1.10.0-4.el9.x86_64 libzstd-1.5.1-2.el9.x86_64 numactl-libs-2.0.16-1.el9.x86_64 openssl-libs-3.0.7-25.0.1.el9_3.x86_64 zlib-1.2.11-40.el9.x86_64
(gdb) bt
#0 0x0000000000cf446a in bucket_dequeue ()
#1 0x00000000007ce77d in cmd_send_parsed ()
#2 0x0000000000aa5d96 in __cmdline_parse ()
#3 0x0000000000aa4d70 in cmdline_valid_buffer ()
#4 0x0000000000aa826b in rdline_char_in ()
#5 0x0000000000aa4e41 in cmdline_in ()
#6 0x0000000000aa4f60 in cmdline_interact ()
#7 0x00000000004fe47a in main.cold ()
#8 0x00007f946f03feb0 in __libc_start_call_main () from /lib64/libc.so.6
#9 0x00007f946f03ff60 in __libc_start_main_impl () from /lib64/libc.so.6
#10 0x00000000007ce605 in _start ()
Gdb - stepping through the code, gdb attached to dpdk_simple_mp_debug
(gdb)
0x0000000000cf42c5 in rte_ring_dequeue_bulk_elem (available=<optimized out>, n=<optimized out>, esize=<optimized out>,
obj_table=<optimized out>, r=<optimized out>) at ../lib/ring/rte_ring_elem.h:375
375 ../lib/ring/rte_ring_elem.h: No such file or directory.
(gdb) p r
$17 = <optimized out>
(gdb) p obj_table
$18 = <optimized out>
(gdb) p available
$19 = <optimized out>
(gdb) n
Thread 1 "dpdk-simple_mp_" received signal SIGSEGV, Segmentation fault.
bucket_dequeue_orphans (n_orphans=33, obj_table=0x14f09b5c0, bd=0x14f05aa80)
at ../drivers/mempool/bucket/rte_mempool_bucket.c:191
191 ../drivers/mempool/bucket/rte_mempool_bucket.c: No such file or directory.
(gdb) bt
#0 bucket_dequeue_orphans (n_orphans=33, obj_table=0x14f09b5c0, bd=0x14f05aa80)
at ../drivers/mempool/bucket/rte_mempool_bucket.c:191
#1 bucket_dequeue (mp=<optimized out>, obj_table=0x14f09b5c0, n=33) at ../drivers/mempool/bucket/rte_mempool_bucket.c:289
#2 0x00000000007ce77d in rte_mempool_ops_dequeue_bulk (n=<optimized out>, obj_table=0x14f09b5c0, mp=0x14f05ed40)
at ../lib/mempool/rte_mempool.h:793
#3 rte_mempool_do_generic_get (cache=0x14f09b580, n=1, obj_table=0x7fff8df066f0, mp=0x14f05ed40)
at ../lib/mempool/rte_mempool.h:1570
#4 rte_mempool_generic_get (cache=0x14f09b580, n=1, obj_table=0x7fff8df066f0, mp=0x14f05ed40)
at ../lib/mempool/rte_mempool.h:1649
#5 rte_mempool_get_bulk (n=1, obj_table=0x7fff8df066f0, mp=0x14f05ed40) at ../lib/mempool/rte_mempool.h:1684
#6 rte_mempool_get (obj_p=0x7fff8df066f0, mp=0x14f05ed40) at ../lib/mempool/rte_mempool.h:1710
#7 cmd_send_parsed (parsed_result=parsed_result@entry=0x7fff8df06790, cl=cl@entry=0x2f73220, data=data@entry=0x0)
at ../examples/multi_process/simple_mp/mp_commands.c:18
#8 0x0000000000aa5d96 in __cmdline_parse (cl=cl@entry=0x2f73220, buf=0x2f73268 "send hello\n",
call_fn=call_fn@entry=true) at ../lib/cmdline/cmdline_parse.c:294
#9 0x0000000000aa5f1a in cmdline_parse (cl=cl@entry=0x2f73220, buf=<optimized out>) at ../lib/cmdline/cmdline_parse.c:302
#10 0x0000000000aa4d70 in cmdline_valid_buffer (rdl=<optimized out>, buf=<optimized out>, size=<optimized out>)
at ../lib/cmdline/cmdline.c:24
#11 0x0000000000aa826b in rdline_char_in (rdl=rdl@entry=0x2f73230, c=<optimized out>)
at ../lib/cmdline/cmdline_rdline.c:444
#12 0x0000000000aa4e41 in cmdline_in (size=<optimized out>, buf=<optimized out>, cl=<optimized out>)
at ../lib/cmdline/cmdline.c:146
#13 cmdline_in (cl=0x2f73220, buf=0x7fff8df0c89f "\n\200", size=<optimized out>) at ../lib/cmdline/cmdline.c:135
#14 0x0000000000aa4f60 in cmdline_interact (cl=cl@entry=0x2f73220) at ../lib/cmdline/cmdline.c:192
#15 0x00000000004fe47a in main (argc=<optimized out>, argv=<optimized out>)
at ../examples/multi_process/simple_mp/main.c:122
Appreciate if you can help.
Thanks,
Ed
-----Original Message-----
From: Stephen Hemminger <stephen@networkplumber.org>
Sent: Wednesday, March 19, 2025 7:17 PM
To: Lombardo, Ed <Ed.Lombardo@netscout.com>
Cc: users@dpdk.org
Subject: Re: tailqs issue
On Wed, 19 Mar 2025 21:52:39 +0000
"Lombardo, Ed" <Ed.Lombardo@netscout.com> wrote:
> Hi Stephen,
> I added the fib library, but I now see there are many more dpdk libraries I need to add. Is this typically the case with the example files working with primary DPDK application?
>
> I am using meson and ninja to build the examples, but I don't know how to know the library dependencies.
>
> How do I learn ahead of building my Application as to what extra libraries I need to include for the DPDK example to work?
>
> I am doing incremental build-test-find_missing_library.
>
> So far, I needed to add these: -lrte_fib -lrte_rib -lrte_stack
> -lrte_member -lrte_efd
>
> Thanks,
> Ed
The typical case is to make sure that primary and secondary are built with the same libraries.
* RE: tailqs issue
2025-03-24 14:59 ` Kompella V, Purnima
@ 2025-03-24 16:39 ` Lombardo, Ed
2025-03-25 10:25 ` Kompella V, Purnima
0 siblings, 1 reply; 17+ messages in thread
From: Lombardo, Ed @ 2025-03-24 16:39 UTC (permalink / raw)
To: Kompella V, Purnima, Stephen Hemminger; +Cc: users
Hi,
I have the ops_name from both the primary and the secondary, and they are different.
The two dump files are identical except for the following:
PRIMARY:
ops_name: <ring_sp_sc>
common_pool_count=1024
SECONDARY:
ops_name: <cn9k_mempool_ops>
common_pool_count=0
[root@localhost ~]# diff prim_mempool_dump.txt ex_second_mempool_dump.txt
15c15
< ops_name: <ring_sp_sc>
---
> ops_name: <cn9k_mempool_ops>
149c149
< common_pool_count=1024
---
> common_pool_count=0
Should the ops_name be the same among the primary and secondary processes?
I build our application in our own build environment, and dpdk-simple_mp is built from the DPDK SDK (meson/ninja).
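To compare the registration order directly, I can dump the per-process OPs table with something like the sketch below (assuming the exported rte_mempool_ops_table declared in rte_mempool.h is what ops_index indexes into):

/* Sketch: print each process's mempool ops table so the link-order-dependent
 * registration order can be compared between primary and secondary. */
#include <stdio.h>
#include <rte_mempool.h>

static void dump_ops_table(const char *who)
{
        uint32_t i;

        for (i = 0; i < rte_mempool_ops_table.num_ops; i++)
                printf("%s: ops[%u] = %s\n", who, i,
                       rte_mempool_ops_table.ops[i].name);
}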
Thanks,
Ed
-----Original Message-----
From: Kompella V, Purnima <Kompella.Purnima@commscope.com>
Sent: Monday, March 24, 2025 11:00 AM
To: Lombardo, Ed <Ed.Lombardo@netscout.com>; Stephen Hemminger <stephen@networkplumber.org>
Cc: users@dpdk.org
Subject: RE: tailqs issue
Hi,
I am not sure if this could be your problem, but we had run into a similar issue with mbuf pools.
Do you list the DPDK libs in the exact same order when linking Primary and Secondary process? If not,
the OPs to which the ops_index is pointing to in Primary and in secondary is different.
Due to this, rte_mempool_ops_dequeue_bulk called from Secondary does not match the OPs that Primary had used to populate this memPool (during create)
So, accessing the pool from Secondary appears like the pool is empty but non-zero in-use Count.
This happens because each DPDK lib "appends" to the OPs list on the fly as it is linked by the linker.
So the order of OPs in the OPs database is a mismatch between Primary and Secondary. Primary and Secondary share information about OPs using ops_index, but since the order is different, ops_index=X in Primary not the same OPs in Secondary.
And this leads to the segmentation fault.
One of your processes is showing ops_index=1, ops_name: <cn9k_mempool_ops>. Can you confirm it is the same in the other process also?
By OPs, and ops_index, I am referring to this internal data structure of DPDK fw.
struct rte_mempool_ops {
char name[RTE_MEMPOOL_OPS_NAMESIZE]; /**< Name of mempool ops struct. */
rte_mempool_alloc_t alloc; /**< Allocate private data. */
rte_mempool_free_t free; /**< Free the external pool. */
rte_mempool_enqueue_t enqueue; /**< Enqueue an object. */
rte_mempool_dequeue_t dequeue; /**< Dequeue an object. */
rte_mempool_get_count get_count; /**< Get qty of available objs. */
/**
* Optional callback to calculate memory size required to
* store specified number of objects.
*/
rte_mempool_calc_mem_size_t calc_mem_size;
/**
* Optional callback to populate mempool objects using
* provided memory chunk.
*/
rte_mempool_populate_t populate;
/**
* Get mempool info
*/
rte_mempool_get_info_t get_info;
/**
* Dequeue a number of contiguous object blocks.
*/
rte_mempool_dequeue_contig_blocks_t dequeue_contig_blocks; } __rte_cache_aligned;
Also, the link provides some insight into how OPs is significant.
https://urldefense.com/v3/__https://www.intel.com/content/www/us/en/developer/articles/technical/optimize-memory-usage-in-multi-threaded-data-plane-development-kit-dpdk-applications.html__;!!Nzg7nt7_!ElXZ6jVzd3ey8732oohMXbI9iJA_a5BF67ev7sypu4Vf8LAfVMQzZiEReuDIb-jaLhNO6PswM4RTXVt-qxmR3d_493_Z2SSC$
Regards,
Purnima
-----Original Message-----
From: Lombardo, Ed <Ed.Lombardo@netscout.com>
Sent: Monday, March 24, 2025 10:32 AM
To: Stephen Hemminger <stephen@networkplumber.org>
Cc: users@dpdk.org
Subject: RE: tailqs issue
Hi,
Further debugging has left me clueless as to the problem with getting the first Obj from the mempool on the secondary process to send to the primary process (my Application).
PRIMARY:
My application side (primary process) EAL Arguments:
"--proc-type=primary", "--file-prefix=dpdk_shared", "-l 25,26,27,28", "-n4" , "--socket-mem=2048", "--legacy-mem",
Create the _MSG_POOL
t_message_pool = rte_mempool_create(_MSG_POOL, // Pool name
t_pool_size, // Number of elements in the pool
STR_TOKEN_SIZE, // Size of each message
t_pool_cache, // Cache size
t_priv_data_sz, // Private data size
NULL, // mp_init
NULL, // mp_init_arg
NULL, // obj_init
NULL, // obj_init_arg
0, // rte_socket_id(), // socket_id
t_flags // flags
);
The t_message_pool pointer value matches the secondary process when execute " message_pool = rte_mempool_lookup(_MSG_POOL);"
NTSMON: send ring (0x1bfb43480), recv ring (0x1bfb45a00) and message pool (0x14f570ac0) are created successfully.
SECONDARY:
Secondary process I execute: " # ./dpdk-simple_mp_dbg3 -l 30-31 -n 4 --file-prefix=dpdk_shared --proc-type=secondary --"
Notes:
* hugepages are created 2x1G on NUMA Node 0
[root@localhost ~]# /opt/dpdk/dpdk-hugepages.py -s
Node Pages Size Total
0 2 1Gb 2Gb
* --file-prefix=dpdk_shared is provided to both primary and secondary
* --proc-type is correctly defined for both primary and secondary process.
* The secondary process reports correct socket=0 (See dump command output below)
* The secondary process showed Available mempool count is 0 and in-use count is 1024 (which looks incorrect).
* Secondary process reports mempool is empty
* Secondary audit passed (rte_mempool_audit()), no panic occurred.
* Tried disabling ASLR
* Tried turning off legacy-mem
EXECUTE Secondary process application "dpdk-simple_mp_dbg3"
[root@localhost ~]# ./dpdk-simple_mp_dbg3 -l 30-31 -n 4 --file-prefix=dpdk_shared --proc-type=secondary --legacy-mem --
EAL: Detected CPU lcores: 128
EAL: Detected NUMA nodes: 2
EAL: Detected static linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/dpdk_shared/mp_socket_221890_1ba0562a6ae95
EAL: Selected IOVA mode 'PA'
APP: Finished Process Init.
APP: Availble mempool count is 0, in-use count is 1024, (mempool ptr=0x14f570ac0)
APP: is mempool empty (1) or full (0)?
APP: check audit on message_pool
APP: Secondary mempool dump file write
APP: NO Objs in message pool (_MSG_POOL), exit the app
simple_mp > Starting core 31
simple_mp > send hello
message_pool pointer is 0x14f570ac0
Failed to get msg obj from mem pool: Success (ret=-2)
EAL: PANIC in cmd_send_parsed():
Failed to get message buffer
0: ./dpdk-simple_mp_dbg3 (rte_dump_stack+0x2b) [b1ee1b]
1: ./dpdk-simple_mp_dbg3 (__rte_panic+0xbd) [525ede]
2: ./dpdk-simple_mp_dbg3 (cmd_send_parsed+0x2d8) [7ceb68]
3: ./dpdk-simple_mp_dbg3 (400000+0x6a5f46) [aa5f46]
4: ./dpdk-simple_mp_dbg3 (400000+0x6a4f20) [aa4f20]
5: ./dpdk-simple_mp_dbg3 (rdline_char_in+0x34b) [aa841b]
6: ./dpdk-simple_mp_dbg3 (cmdline_in+0x71) [aa4ff1]
7: ./dpdk-simple_mp_dbg3 (cmdline_interact+0x30) [aa5110]
8: ./dpdk-simple_mp_dbg3 (400000+0xfe5cb) [4fe5cb]
9: /lib64/libc.so.6 (7f2d76600000+0x3feb0) [7f2d7663feb0]
10: /lib64/libc.so.6 (__libc_start_main+0x80) [7f2d7663ff60]
11: ./dpdk-simple_mp_dbg3 (_start+0x25) [7ce795] Aborted (core dumped)
The rte_mempool_dump() is:
mempool <MSG_POOL>@0x14f570ac0
flags=1c
socket_id=0
pool=0x14f56c8c0
iova=0x1cf570ac0
nrte_mempool_getb_mem_chunks=1
size=1024
populated_size=1024
header_size=64
elt_size=64
trailer_size=64
total_obj_size=192
private_data_size=0
ops_index=1
ops_name: <cn9k_mempool_ops>
memory chunk at 0x1bfb43180, addr=0x14f53c780, iova=0x1cf53c780, len=196800
avg bytes/object=192.187500
internal cache infos:
cache_size=32 cache_count[0]=0
.....
cache_count[127]=0
total_cache_count=0
common_pool_count=1024
no statistics available
Any guidance is appreciated.
Thanks,
Ed
-----Original Message-----
From: Lombardo, Ed
Sent: Friday, March 21, 2025 2:19 PM
To: Stephen Hemminger <stephen@networkplumber.org>
Cc: users@dpdk.org
Subject: RE: tailqs issue
Hi Sephen,
Thank you for your help. I made good progress up to now.
When I try to use the dpdk_simple_mp application to send a message to my application I get a Segmentation fault.
First, I re-verified the dpdk_simple_mp process=primary and dpdk_simple_mp process-secondary does pass messages successfully. So, my hugepages are created and DPDK initializes successfully on both at startup.
In my application I created the send and recv rings and message_pool as the primary process. The logs I added do not show any errors.
Once my application starts and settles I started the dpdk_simple_mp application: # ./dpdk-simple_mp_dbg -l 30-31 -n 4 --legacy-mem --proc-type secondary --
However, on the dpdk_simple_mp side do "send hello" and I then get a segmentation fault.
The debugger takes me deep within the dpdk libraries which I am not too familiar with.
The rte_ring_elem.h file function: rte_ring_dequeue_build_elem() is where I end up with segmentation fault. I notice that the variables are optimized out, not sure why since I built the dpdk libraries with debug flag.
Here is the back trace and could you point me in the direction to look.
# gdb dpdk-simple_mp /core/core.dpdk-simple_mp.241269
warning: Unexpected size of section `.reg-xstate/241269' in core file.
[Thread debugging using libthread_db enabled] Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `./dpdk-simple_mp -l 30-31 -n 4 --legacy-mem --proc-type secondary --'.
Program terminated with signal SIGSEGV, Segmentation fault.
warning: Unexpected size of section `.reg-xstate/241269' in core file.
#0 0x0000000000cf446a in bucket_dequeue () [Current thread is 1 (Thread 0x7f946f835c00 (LWP 241269))] Missing separate debuginfos, use: dnf debuginfo-install elfutils-libelf-0.189-3.el9.x86_64 glibc-2.34-83.0.1.el9_3.7.x86_64 libibverbs-46.0-1.el9.x86_64 libnl3-3.7.0-1.el9.x86_64 libpcap-1.10.0-4.el9.x86_64 libzstd-1.5.1-2.el9.x86_64 numactl-libs-2.0.16-1.el9.x86_64 openssl-libs-3.0.7-25.0.1.el9_3.x86_64 zlib-1.2.11-40.el9.x86_64
(gdb) bt
#0 0x0000000000cf446a in bucket_dequeue ()
#1 0x00000000007ce77d in cmd_send_parsed ()
#2 0x0000000000aa5d96 in __cmdline_parse ()
#3 0x0000000000aa4d70 in cmdline_valid_buffer ()
#4 0x0000000000aa826b in rdline_char_in ()
#5 0x0000000000aa4e41 in cmdline_in ()
#6 0x0000000000aa4f60 in cmdline_interact ()
#7 0x00000000004fe47a in main.cold ()
#8 0x00007f946f03feb0 in __libc_start_call_main () from /lib64/libc.so.6
#9 0x00007f946f03ff60 in __libc_start_main_impl () from /lib64/libc.so.6
#10 0x00000000007ce605 in _start ()
GDB stepping through the code, with gdb attached to dpdk_simple_mp_debug:
(gdb)
0x0000000000cf42c5 in rte_ring_dequeue_bulk_elem (available=<optimized out>, n=<optimized out>, esize=<optimized out>,
obj_table=<optimized out>, r=<optimized out>) at ../lib/ring/rte_ring_elem.h:375
375 ../lib/ring/rte_ring_elem.h: No such file or directory.
(gdb) p r
$17 = <optimized out>
(gdb) p obj_table
$18 = <optimized out>
(gdb) p available
$19 = <optimized out>
(gdb) n
Thread 1 "dpdk-simple_mp_" received signal SIGSEGV, Segmentation fault.
bucket_dequeue_orphans (n_orphans=33, obj_table=0x14f09b5c0, bd=0x14f05aa80)
at ../drivers/mempool/bucket/rte_mempool_bucket.c:191
191 ../drivers/mempool/bucket/rte_mempool_bucket.c: No such file or directory.
(gdb) bt
#0 bucket_dequeue_orphans (n_orphans=33, obj_table=0x14f09b5c0, bd=0x14f05aa80)
at ../drivers/mempool/bucket/rte_mempool_bucket.c:191
#1 bucket_dequeue (mp=<optimized out>, obj_table=0x14f09b5c0, n=33) at ../drivers/mempool/bucket/rte_mempool_bucket.c:289
#2 0x00000000007ce77d in rte_mempool_ops_dequeue_bulk (n=<optimized out>, obj_table=0x14f09b5c0, mp=0x14f05ed40)
at ../lib/mempool/rte_mempool.h:793
#3 rte_mempool_do_generic_get (cache=0x14f09b580, n=1, obj_table=0x7fff8df066f0, mp=0x14f05ed40)
at ../lib/mempool/rte_mempool.h:1570
#4 rte_mempool_generic_get (cache=0x14f09b580, n=1, obj_table=0x7fff8df066f0, mp=0x14f05ed40)
at ../lib/mempool/rte_mempool.h:1649
#5 rte_mempool_get_bulk (n=1, obj_table=0x7fff8df066f0, mp=0x14f05ed40) at ../lib/mempool/rte_mempool.h:1684
#6 rte_mempool_get (obj_p=0x7fff8df066f0, mp=0x14f05ed40) at ../lib/mempool/rte_mempool.h:1710
#7 cmd_send_parsed (parsed_result=parsed_result@entry=0x7fff8df06790, cl=cl@entry=0x2f73220, data=data@entry=0x0)
at ../examples/multi_process/simple_mp/mp_commands.c:18
#8 0x0000000000aa5d96 in __cmdline_parse (cl=cl@entry=0x2f73220, buf=0x2f73268 "send hello\n",
call_fn=call_fn@entry=true) at ../lib/cmdline/cmdline_parse.c:294
#9 0x0000000000aa5f1a in cmdline_parse (cl=cl@entry=0x2f73220, buf=<optimized out>) at ../lib/cmdline/cmdline_parse.c:302
#10 0x0000000000aa4d70 in cmdline_valid_buffer (rdl=<optimized out>, buf=<optimized out>, size=<optimized out>)
at ../lib/cmdline/cmdline.c:24
#11 0x0000000000aa826b in rdline_char_in (rdl=rdl@entry=0x2f73230, c=<optimized out>)
at ../lib/cmdline/cmdline_rdline.c:444
#12 0x0000000000aa4e41 in cmdline_in (size=<optimized out>, buf=<optimized out>, cl=<optimized out>)
at ../lib/cmdline/cmdline.c:146
#13 cmdline_in (cl=0x2f73220, buf=0x7fff8df0c89f "\n\200", size=<optimized out>) at ../lib/cmdline/cmdline.c:135
#14 0x0000000000aa4f60 in cmdline_interact (cl=cl@entry=0x2f73220) at ../lib/cmdline/cmdline.c:192
#15 0x00000000004fe47a in main (argc=<optimized out>, argv=<optimized out>)
at ../examples/multi_process/simple_mp/main.c:122
I would appreciate it if you can help.
Thanks,
Ed
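A typical way to avoid the <optimized out> variables seen in the backtrace above is to rebuild DPDK and the example with an unoptimized build type. A minimal sketch of the commands, assuming a meson-based DPDK tree (exact option values may differ between DPDK versions):
# meson setup build -Dbuildtype=debug -Dexamples=all
# ninja -C build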
-----Original Message-----
From: Stephen Hemminger <stephen@networkplumber.org>
Sent: Wednesday, March 19, 2025 7:17 PM
To: Lombardo, Ed <Ed.Lombardo@netscout.com>
Cc: users@dpdk.org
Subject: Re: tailqs issue
On Wed, 19 Mar 2025 21:52:39 +0000
"Lombardo, Ed" <Ed.Lombardo@netscout.com> wrote:
> Hi Stephen,
> I added the fib library, but I now see there are many more dpdk libraries I need to add. Is this typically the case with the example files working with primary DPDK application?
>
> I am using meson and ninja to build the examples, but I don't know how to know the library dependencies.
>
> How do I learn ahead of building my Application as to what extra libraries I need to include for the DPDK example to work?
>
> I am doing incremental build-test-find_missing_library.
>
> So far, I needed to add these: -lrte_fib -lrte_rib -lrte_stack
> -lrte_member -lrte_efd
>
> Thanks,
> Ed
The typical case is to make sure that primary and secondary are built with the same libraries.
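A common way to get the full list of required DPDK libraries without trial and error, assuming DPDK was installed with its pkg-config file (libdpdk.pc), is to let pkg-config emit the flags and reuse them on the application's compile and link lines:
# pkg-config --cflags libdpdk
# pkg-config --static --libs libdpdk
This is only a sketch; the exact output depends on which libraries and drivers were enabled in the DPDK build.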
^ permalink raw reply [flat|nested] 17+ messages in thread
* RE: tailqs issue
2025-03-24 16:39 ` Lombardo, Ed
@ 2025-03-25 10:25 ` Kompella V, Purnima
2025-03-25 14:39 ` Lombardo, Ed
0 siblings, 1 reply; 17+ messages in thread
From: Kompella V, Purnima @ 2025-03-25 10:25 UTC (permalink / raw)
To: Lombardo, Ed, Stephen Hemminger; +Cc: users
Hi Ed,
The GDB backtrace seems to indicate that one of your processes (the one that suffered the segfault) uses the <bucket> OPs, not <ring_sp_sc> or <cn9k_mempool_ops> as you have listed!
#0 bucket_dequeue_orphans (n_orphans=33, obj_table=0x14f09b5c0, bd=0x14f05aa80)
at ../drivers/mempool/bucket/rte_mempool_bucket.c:191
#1 bucket_dequeue (mp=<optimized out>, obj_table=0x14f09b5c0, n=33) at ../drivers/mempool/bucket/rte_mempool_bucket.c:289
#2 0x00000000007ce77d in rte_mempool_ops_dequeue_bulk (n=<optimized out>, obj_table=0x14f09b5c0, mp=0x14f05ed40)
at ../lib/mempool/rte_mempool.h:793
#3 rte_mempool_do_generic_get (cache=0x14f09b580, n=1, obj_table=0x7fff8df066f0, mp=0x14f05ed40)
at ../lib/mempool/rte_mempool.h:1570
In any case, can you try keeping the order of DPDK libs exactly the same when compiling the Primary process and the Secondary process?
This way, both processes will get the same OPs database, so ops_index=X will refer to the same OPs in either of them, and hopefully things will work fine.
Thanks,
Purnima
-----Original Message-----
From: Lombardo, Ed <Ed.Lombardo@netscout.com>
Sent: Monday, March 24, 2025 10:09 PM
To: Kompella V, Purnima <Kompella.Purnima@commscope.com>; Stephen Hemminger <stephen@networkplumber.org>
Cc: users@dpdk.org
Subject: RE: tailqs issue
Hi,
I have both the primary and secondary ops_name values, and they are different.
The two files are identical, except for these:
PRIMARY:
ops_name: <ring_sp_sc>
common_pool_count=1024
SECONDARY:
ops_name: <cn9k_mempool_ops>
common_pool_count=0
[root@localhost ~]# diff prim_mempool_dump.txt ex_second_mempool_dump.txt
15c15
< ops_name: <ring_sp_sc>
---
> ops_name: <cn9k_mempool_ops>
149c149
< common_pool_count=1024
---
> common_pool_count=0
Should the ops_name be the same among the primary and secondary processes?
I build our application in our own build environment, and the dpdk-simple_mp is built from the DPDK SDK (meson/ninja).
Thanks,
Ed
-----Original Message-----
From: Kompella V, Purnima <Kompella.Purnima@commscope.com>
Sent: Monday, March 24, 2025 11:00 AM
To: Lombardo, Ed <Ed.Lombardo@netscout.com>; Stephen Hemminger <stephen@networkplumber.org>
Cc: users@dpdk.org
Subject: RE: tailqs issue
Hi,
I am not sure if this could be your problem, but we had run into a similar issue with mbuf pools.
Do you list the DPDK libs in the exact same order when linking the Primary and the Secondary process? If not,
the OPs that the ops_index points to in the Primary and in the Secondary are different.
Because of this, rte_mempool_ops_dequeue_bulk called from the Secondary does not match the OPs that the Primary used to populate this mempool (during create).
So, accessing the pool from the Secondary makes the pool look empty even though the in-use count is non-zero.
This happens because each DPDK lib "appends" to the OPs list on the fly as it is linked by the linker.
So the order of OPs in the OPs database is a mismatch between Primary and Secondary. Primary and Secondary share information about OPs using ops_index, but since the order is different, ops_index=X in the Primary does not refer to the same OPs in the Secondary.
And this leads to the segmentation fault.
One of your processes is showing ops_index=1, ops_name: <cn9k_mempool_ops>. Can you confirm it is the same in the other process also?
By OPs and ops_index, I am referring to this internal data structure of the DPDK framework:
struct rte_mempool_ops {
char name[RTE_MEMPOOL_OPS_NAMESIZE]; /**< Name of mempool ops struct. */
rte_mempool_alloc_t alloc; /**< Allocate private data. */
rte_mempool_free_t free; /**< Free the external pool. */
rte_mempool_enqueue_t enqueue; /**< Enqueue an object. */
rte_mempool_dequeue_t dequeue; /**< Dequeue an object. */
rte_mempool_get_count get_count; /**< Get qty of available objs. */
/**
* Optional callback to calculate memory size required to
* store specified number of objects.
*/
rte_mempool_calc_mem_size_t calc_mem_size;
/**
* Optional callback to populate mempool objects using
* provided memory chunk.
*/
rte_mempool_populate_t populate;
/**
* Get mempool info
*/
rte_mempool_get_info_t get_info;
/**
* Dequeue a number of contiguous object blocks.
*/
rte_mempool_dequeue_contig_blocks_t dequeue_contig_blocks;
} __rte_cache_aligned;
Also, the link below provides some insight into why the OPs are significant.
https://www.intel.com/content/www/us/en/developer/articles/technical/optimize-memory-usage-in-multi-threaded-data-plane-development-kit-dpdk-applications.html
Regards,
Purnima
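A quick way to test this hypothesis in the secondary is to print which OPs the shared ops_index resolves to in that binary and compare it with the ops_name the primary reports in its dump. A minimal sketch, assuming the MSG_POOL name from the example above (this is not code from the thread):

#include <stdio.h>
#include <string.h>
#include <rte_mempool.h>

/* Sketch: report which mempool ops this process resolves ops_index to.
 * expected_ops_name is what the primary's rte_mempool_dump() printed,
 * e.g. "ring_sp_sc". */
static int
check_msg_pool_ops(const char *expected_ops_name)
{
    struct rte_mempool *mp = rte_mempool_lookup("MSG_POOL");
    struct rte_mempool_ops *ops;

    if (mp == NULL)
        return -1;

    ops = rte_mempool_get_ops(mp->ops_index);
    printf("ops_index=%d resolves to <%s> here; avail=%u, in-use=%u\n",
           mp->ops_index, ops->name,
           rte_mempool_avail_count(mp), rte_mempool_in_use_count(mp));

    /* A name mismatch means the two binaries registered mempool ops in a
     * different order, which matches the symptom described above. */
    return strcmp(ops->name, expected_ops_name) == 0 ? 0 : -1;
}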
-----Original Message-----
From: Lombardo, Ed <Ed.Lombardo@netscout.com>
Sent: Monday, March 24, 2025 10:32 AM
To: Stephen Hemminger <stephen@networkplumber.org>
Cc: users@dpdk.org
Subject: RE: tailqs issue
Hi,
Further debugging has left me clueless about why the secondary process cannot get the first object from the mempool to send to the primary process (my application).
PRIMARY:
My application side (primary process) EAL Arguments:
"--proc-type=primary", "--file-prefix=dpdk_shared", "-l 25,26,27,28", "-n4" , "--socket-mem=2048", "--legacy-mem",
Create the _MSG_POOL
t_message_pool = rte_mempool_create(_MSG_POOL, // Pool name
t_pool_size, // Number of elements in the pool
STR_TOKEN_SIZE, // Size of each message
t_pool_cache, // Cache size
t_priv_data_sz, // Private data size
NULL, // mp_init
NULL, // mp_init_arg
NULL, // obj_init
NULL, // obj_init_arg
0, // rte_socket_id(), // socket_id
t_flags // flags
);
The t_message_pool pointer value matches what the secondary process gets when it executes "message_pool = rte_mempool_lookup(_MSG_POOL);".
NTSMON: send ring (0x1bfb43480), recv ring (0x1bfb45a00) and message pool (0x14f570ac0) are created successfully.
SECONDARY:
For the secondary process I execute: "# ./dpdk-simple_mp_dbg3 -l 30-31 -n 4 --file-prefix=dpdk_shared --proc-type=secondary --"
Notes:
* hugepages are created 2x1G on NUMA Node 0
[root@localhost ~]# /opt/dpdk/dpdk-hugepages.py -s
Node Pages Size Total
0 2 1Gb 2Gb
* --file-prefix=dpdk_shared is provided to both primary and secondary
* --proc-type is correctly defined for both primary and secondary process.
* The secondary process reports correct socket=0 (See dump command output below)
* The secondary process showed Available mempool count is 0 and in-use count is 1024 (which looks incorrect).
* Secondary process reports mempool is empty
* Secondary audit passed (rte_mempool_audit()), no panic occurred.
* Tried disabling ASLR
* Tried turning off legacy-mem
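Regarding the rte_mempool_create() call shown in the PRIMARY section above, one way to make the primary's choice of OPs explicit, rather than relying on the default selection, is to create the pool empty and pin the ring ops by name. This is only a sketch (it reuses the MSG_POOL name from above and does not by itself cure an ops_index ordering mismatch between differently linked binaries):

#include <rte_mempool.h>

static struct rte_mempool *
create_msg_pool(unsigned int n_elems, unsigned int elt_size,
                unsigned int cache_size, unsigned int flags)
{
    struct rte_mempool *mp;

    mp = rte_mempool_create_empty("MSG_POOL", n_elems, elt_size,
                                  cache_size, 0 /* priv data size */,
                                  0 /* socket_id */, flags);
    if (mp == NULL)
        return NULL;

    /* Pin the ops by name so the dump always shows ring_sp_sc. */
    if (rte_mempool_set_ops_byname(mp, "ring_sp_sc", NULL) != 0 ||
        rte_mempool_populate_default(mp) < 0) {
        rte_mempool_free(mp);
        return NULL;
    }
    return mp;
}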
^ permalink raw reply [flat|nested] 17+ messages in thread
* RE: tailqs issue
2025-03-25 10:25 ` Kompella V, Purnima
@ 2025-03-25 14:39 ` Lombardo, Ed
2025-03-25 22:20 ` Stephen Hemminger
0 siblings, 1 reply; 17+ messages in thread
From: Lombardo, Ed @ 2025-03-25 14:39 UTC (permalink / raw)
To: Kompella V, Purnima, Stephen Hemminger; +Cc: users
Hi Purnima,
I will try your suggestion, but this seems weird. What if I have a 3rd-party application that I want to integrate with our application? It could be impossible to coordinate this requirement.
Thanks,
Ed
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: tailqs issue
2025-03-25 14:39 ` Lombardo, Ed
@ 2025-03-25 22:20 ` Stephen Hemminger
2025-03-25 22:24 ` Lombardo, Ed
0 siblings, 1 reply; 17+ messages in thread
From: Stephen Hemminger @ 2025-03-25 22:20 UTC (permalink / raw)
To: Lombardo, Ed; +Cc: Kompella V, Purnima, users
On Tue, 25 Mar 2025 14:39:08 +0000
"Lombardo, Ed" <Ed.Lombardo@netscout.com> wrote:
> Hi Purnima,
> I will try your suggestion, but this seems weird. What if I have a 3rd party application that I want to integrate with our application. This could be impossible to coordinate this requirement.
>
> Thanks,
> Ed
Primary and secondary are tightly coupled. They have to be built from the same base, so true 3rd party support would be very difficult.
How are you building? Using meson and DPDK process or something custom?
I seem to remember there was an issue long ago with shared libraries and build flags, where initializers would not get run unless a flag was passed during the shared library link step.
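One related point worth checking for the static case: the mempool OPs (and DPDK drivers in general) register themselves from constructors, so static archives usually have to be pulled in with --whole-archive or the linker drops the otherwise unreferenced registration code. The flags emitted by pkg-config for a static build normally include this already; a rough sketch of a link line (object and binary names here are assumptions):
# cc -o myapp myapp.o $(pkg-config --static --libs libdpdk)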
^ permalink raw reply [flat|nested] 17+ messages in thread
* RE: tailqs issue
2025-03-25 22:20 ` Stephen Hemminger
@ 2025-03-25 22:24 ` Lombardo, Ed
2025-03-25 22:41 ` Stephen Hemminger
0 siblings, 1 reply; 17+ messages in thread
From: Lombardo, Ed @ 2025-03-25 22:24 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: Kompella V, Purnima, users
Hi Stephen,
I am building the dpdk-simple_mp example in meson/ninja.
Our application is built in our custom build environment, and we are not using DPDK shared libraries, but are linking to DPDK static libs.
Thanks,
Ed
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: tailqs issue
2025-03-25 22:24 ` Lombardo, Ed
@ 2025-03-25 22:41 ` Stephen Hemminger
2025-03-25 22:56 ` Lombardo, Ed
0 siblings, 1 reply; 17+ messages in thread
From: Stephen Hemminger @ 2025-03-25 22:41 UTC (permalink / raw)
To: Lombardo, Ed; +Cc: Kompella V, Purnima, users
On Tue, 25 Mar 2025 22:24:33 +0000
"Lombardo, Ed" <Ed.Lombardo@netscout.com> wrote:
> Hi Stephen,
> I am building the dpdk-simple_mp example in meson/ninja.
>
> Our application is built in our custom build environment, and we are not using DPDK shared libraries, but are linking to DPDK static libs.
>
> Thanks,
> Ed
Both need to be built the same way.
^ permalink raw reply [flat|nested] 17+ messages in thread
* RE: tailqs issue
2025-03-25 22:41 ` Stephen Hemminger
@ 2025-03-25 22:56 ` Lombardo, Ed
2025-03-26 10:27 ` Kompella V, Purnima
0 siblings, 1 reply; 17+ messages in thread
From: Lombardo, Ed @ 2025-03-25 22:56 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: Kompella V, Purnima, users
Hi Stephen,
Is there development work to remove this restriction, or is this impossible?
Thanks,
Ed
^ permalink raw reply [flat|nested] 17+ messages in thread
* RE: tailqs issue
2025-03-25 22:56 ` Lombardo, Ed
@ 2025-03-26 10:27 ` Kompella V, Purnima
2025-03-26 14:14 ` Stephen Hemminger
0 siblings, 1 reply; 17+ messages in thread
From: Kompella V, Purnima @ 2025-03-26 10:27 UTC (permalink / raw)
To: Lombardo, Ed, Stephen Hemminger; +Cc: users
Hi Stephen
Isn't it possible to gather all the MEMPOOL_REGISTER_OPS(xxx) calls into a separate .c file in the DPDK source code?
Say, drivers/mempool/common/rte_mempool_ops_reg.c containing the lines below:
MEMPOOL_REGISTER_OPS(ops_stack);
MEMPOOL_REGISTER_OPS(ops_lf_stack);
MEMPOOL_REGISTER_OPS(ops_bucket);
MEMPOOL_REGISTER_OPS(ops_bucket);
MEMPOOL_REGISTER_OPS(octeontx_fpavf_ops);
Etc
Etc
This way both Primary and Secondary processes get the same order of OPs, irrespective of the order in which the DPDK libs are listed in their compilation.
In the current method, each lib calls MEMPOOL_REGISTER_OPS(xx) in its own source code, and hence the order of OPs in the OPs table of a DPDK process depends on the order of libs listed during compilation.
Also, in struct rte_mempool, if we can add ops_name as a new data member (along with ops_index, which is already present), then
the Primary process can populate struct rte_mempool::ops_name with the name of the OP corresponding to the struct rte_mempool::ops_index it used to create this mempool.
Secondary processes can then validate whether, in their own OPs database, struct rte_mempool::ops_index matches struct rte_mempool::ops_name.
If a mismatch is detected, the Secondary can call panic; this kind of early failure is better than everything only 'looking good' but not actually being good.
Regards,
Purnima
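A sketch of the early check described above. Note that the ops_name member does not exist in struct rte_mempool today; it is assumed here purely to illustrate the proposal:

#include <string.h>
#include <rte_mempool.h>
#include <rte_debug.h>

/* Hypothetical: mp->ops_name would be filled in by the primary at create
 * time; the secondary verifies that its own OPs table agrees. */
static void
validate_pool_ops(const struct rte_mempool *mp)
{
    struct rte_mempool_ops *ops = rte_mempool_get_ops(mp->ops_index);

    if (strcmp(ops->name, mp->ops_name) != 0) /* ops_name: proposed field */
        rte_panic("mempool %s: ops_index %d is <%s> here, primary used <%s>\n",
                  mp->name, mp->ops_index, ops->name, mp->ops_name);
}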
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: tailqs issue
2025-03-26 10:27 ` Kompella V, Purnima
@ 2025-03-26 14:14 ` Stephen Hemminger
2025-03-26 14:33 ` Kompella V, Purnima
0 siblings, 1 reply; 17+ messages in thread
From: Stephen Hemminger @ 2025-03-26 14:14 UTC (permalink / raw)
To: Kompella V, Purnima; +Cc: Lombardo, Ed, users
On Wed, 26 Mar 2025 10:27:40 +0000
"Kompella V, Purnima" <Kompella.Purnima@commscope.com> wrote:
> Hi Stephen
>
> Isn't it possible to gather all the MEMPOOL_REGISTER_OPS (xxx) calls to a separate .c file in the dpdk source code
> Like say drivers/mempool/common/rte_mempool_ops_reg.c containing below lines
>
> MEMPOOL_REGISTER_OPS(ops_stack);
> MEMPOOL_REGISTER_OPS(ops_lf_stack);
> MEMPOOL_REGISTER_OPS(ops_bucket);
> MEMPOOL_REGISTER_OPS(ops_bucket);
> MEMPOOL_REGISTER_OPS(octeontx_fpavf_ops);
That would be inflexible. Not every build needs all the ops.
If you want to fix, a better approach would be to harden the registration process.
Initializers and destructors are a problematic construct to debug.
^ permalink raw reply [flat|nested] 17+ messages in thread
* RE: tailqs issue
2025-03-26 14:14 ` Stephen Hemminger
@ 2025-03-26 14:33 ` Kompella V, Purnima
0 siblings, 0 replies; 17+ messages in thread
From: Kompella V, Purnima @ 2025-03-26 14:33 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: Lombardo, Ed, users
OK.
Is it possible to declare an enum or a lookup table of "ops_name" (string) and "ops_index" (integer) pairs in some common file?
Whether or not an OP is registered in the current build, the OP will always be associated with a fixed ops_index and this way ops_index really becomes shareable (like a database key) across different DPDK processes.
Anyone contributing a new OP to DPDK must add it to the LOOKUP table.
Both our Primary and Secondary were built on the same code base and listed all the DPDK libs; it is just that the Makefiles listed the DPDK libs in a different order, because Primary and Secondary are built by different teams in our org.
The OPs choice is not user-visible when creating a mempool; the mempool-create DPDK API doesn't take the OP name as input.
This makes it very hard to troubleshoot and to find a starting point for where things may have gone wrong.
Just a thought.
Regards,
Purnima
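A hypothetical illustration of such a fixed lookup table (neither the enum nor the array exists in DPDK today; the names are taken from OPs that do exist):

enum fixed_mempool_ops_index {
    FIXED_OPS_RING_MP_MC = 0,
    FIXED_OPS_RING_SP_SC,
    FIXED_OPS_STACK,
    FIXED_OPS_BUCKET,
    FIXED_OPS_MAX
};

/* Fixed name-to-index mapping shared by all processes, regardless of
 * which OPs happen to be registered in a given build. */
static const char *const fixed_ops_name[FIXED_OPS_MAX] = {
    [FIXED_OPS_RING_MP_MC] = "ring_mp_mc",
    [FIXED_OPS_RING_SP_SC] = "ring_sp_sc",
    [FIXED_OPS_STACK]      = "stack",
    [FIXED_OPS_BUCKET]     = "bucket",
};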
^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2025-03-26 14:34 UTC | newest]
Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-03-19 17:50 tailqs issue Lombardo, Ed
2025-03-19 20:23 ` Stephen Hemminger
2025-03-19 21:52 ` Lombardo, Ed
2025-03-19 23:16 ` Stephen Hemminger
2025-03-21 18:18 ` Lombardo, Ed
2025-03-24 5:01 ` Lombardo, Ed
2025-03-24 14:59 ` Kompella V, Purnima
2025-03-24 16:39 ` Lombardo, Ed
2025-03-25 10:25 ` Kompella V, Purnima
2025-03-25 14:39 ` Lombardo, Ed
2025-03-25 22:20 ` Stephen Hemminger
2025-03-25 22:24 ` Lombardo, Ed
2025-03-25 22:41 ` Stephen Hemminger
2025-03-25 22:56 ` Lombardo, Ed
2025-03-26 10:27 ` Kompella V, Purnima
2025-03-26 14:14 ` Stephen Hemminger
2025-03-26 14:33 ` Kompella V, Purnima
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).