* [PATCH] mempool: fix rte primary program coredump
@ 2021-11-10 15:57 Tianli Lai
2021-11-10 16:00 ` David Marchand
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Tianli Lai @ 2021-11-10 15:57 UTC (permalink / raw)
To: dev
When the primary program (such as an OFP app) runs first and the secondary
program (such as dpdk-pdump) is started afterwards, the primary program
receives signal SIGSEGV. The call stack is as follows:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffee60e700 (LWP 112613)]
0x00007ffff5f2cc0b in bucket_stack_pop (stack=0xffff00010000) at
/ofp/dpdk/drivers/mempool/bucket/rte_mempool_bucket.c:95
95 if (stack->top == 0)
Missing separate debuginfos, use: debuginfo-install
glibc-2.17-196.el7.x86_64 libatomic-4.8.5-16.el7.x86_64
libconfig-1.4.9-5.el7.x86_64 libgcc-4.8.5-16.el7.x86_64
libpcap-1.5.3-12.el7.x86_64 numactl-libs-2.0.9-6.el7_2.x86_64
openssl-libs-1.0.2k-8.el7.x86_64 zlib-1.2.7-17.el7.x86_64
(gdb) bt
#0 0x00007ffff5f2cc0b in bucket_stack_pop (stack=0xffff00010000) at /ofp/dpdk/drivers/mempool/bucket/rte_mempool_bucket.c:95
#1 0x00007ffff5f2e5dc in bucket_dequeue_orphans (bd=0x2209e5fac0,obj_table=0x220b083710, n_orphans=251) at /ofp/dpdk/drivers/mempool/bucket/rte_mempool_bucket.c:190
#2 0x00007ffff5f30192 in bucket_dequeue (mp=0x220b07d5c0,obj_table=0x220b083710, n=251) at /ofp/dpdk/drivers/mempool/bucket/rte_mempool_bucket.c:288
#3 0x00007ffff5f47e18 in rte_mempool_ops_dequeue_bulk (mp=0x220b07d5c0,obj_table=0x220b083710, n=251) at /ofp/dpdk/x86_64-native-linuxapp-gcc/include/rte_mempool.h:739
#4 0x00007ffff5f4819d in __mempool_generic_get (cache=0x220b083700, n=1, obj_table=0x7fffee5deb18, mp=0x220b07d5c0) at /ofp/dpdk/x86_64-native-linuxapp-gcc/include/rte_mempool.h:1443
#5 rte_mempool_generic_get (cache=0x220b083700, n=1, obj_table=0x7fffee5deb18, mp=0x220b07d5c0) at /ofp/dpdk/x86_64-native-linuxapp-gcc/include/rte_mempool.h:1506
#6 rte_mempool_get_bulk (n=1, obj_table=0x7fffee5deb18, mp=0x220b07d5c0) at /ofp/dpdk/x86_64-native-linuxapp-gcc/include/rte_mempool.h:1539
#7 rte_mempool_get (obj_p=0x7fffee5deb18, mp=0x220b07d5c0) at /ofp/dpdk/x86_64-native-linuxapp-gcc/include/rte_mempool.h:1565
#8 rte_mbuf_raw_alloc (mp=0x220b07d5c0) at /ofp/dpdk/x86_64-native-linuxapp-gcc/include/rte_mbuf.h:551
#9 0x00007ffff5f483a4 in rte_pktmbuf_alloc (mp=0x220b07d5c0) at /ofp/dpdk/x86_64-native-linuxapp-gcc/include/rte_mbuf.h:804
#10 0x00007ffff5f4c9d9 in pdump_pktmbuf_copy (m=0x220746ad80, mp=0x220b07d5c0) at /ofp/dpdk/lib/librte_pdump/rte_pdump.c:99
#11 0x00007ffff5f4e42e in pdump_copy (pkts=0x7fffee5dfdf0, nb_pkts=1, user_params=0x7ffff76d7cc0 <rx_cbs>) at /ofp/dpdk/lib/librte_pdump/rte_pdump.c:151
#12 0x00007ffff5f4eadd in pdump_rx (port=0, qidx=0, pkts=0x7fffee5dfdf0, nb_pkts=1, max_pkts=16, user_params=0x7ffff76d7cc0 <rx_cbs>) at /ofp/dpdk/lib/librte_pdump/rte_pdump.c:172
#13 0x00007ffff5d0e9e8 in rte_eth_rx_burst (port_id=0, queue_id=0, rx_pkts=0x7fffee5dfdf0, nb_pkts=16) at /ofp/dpdk/x86_64-native-linuxapp-gcc/usr/local/include/dpdk/rte_ethdev.h:4396
#14 0x00007ffff5d114c3 in recv_pkt_dpdk (pktio_entry=0x22005436c0, index=0, pkt_table=0x7fffee5dfdf0, num=16) at odp_packet_dpdk.c:1081
#15 0x00007ffff5d2f931 in odp_pktin_recv (queue=...,packets=0x7fffee5dfdf0, num=16) at ../linux-generic/odp_packet_io.c:1896
#16 0x000000000040a344 in rx_burst (pktin=...) at app_main.c:223
#17 0x000000000040aca4 in run_server_single (arg=0x7fffffffe2b0) at app_main.c:417
#18 0x00007ffff7bd6883 in run_thread (arg=0x7fffffffe3b8) at threads.c:67
#19 0x00007ffff53c8e25 in start_thread () from /lib64/libpthread.so.0
#20 0x00007ffff433e34d in clone () from /lib64/libc.so.6
The reason for the crash is:

In the primary and the secondary program, the global array
rte_mempool_ops_table.ops[] looks like this:

           primary               secondary
 [0]:      "bucket"              "ring_mp_mc"
 [1]:      "dpaa"                "ring_sp_sc"
 [2]:      "dpaa2"               "ring_mp_sc"
 [3]:      "octeontx_fpavf"      "ring_sp_mc"
 [4]:      "octeontx2_npa"       "octeontx2_npa"
 [5]:      "ring_mp_mc"          "bucket"
 [6]:      "ring_sp_sc"          "stack"
 [7]:      "ring_mp_sc"          "if_stack"
 [8]:      "ring_sp_mc"          "dpaa"
 [9]:      "stack"               "dpaa2"
 [10]:     "if_stack"            "octeontx_fpavf"
 [11]:     NULL                  NULL

The array is ordered differently in the primary and the secondary program.
So when the secondary program creates a pool with
rte_pktmbuf_pool_create_by_ops() and the ops name "ring_mp_mc", the ops
index stored in the shared mempool resolves to the "bucket" handler in the
primary program, which then uses it to allocate rte_mbufs.

Fix this by sorting the array in both the primary and the secondary
program when the memzone is initialized.
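
For reference, a condensed view of the lookup path (simplified from
rte_mempool.h, stats and checks dropped) shows why the mismatch crashes:
the ops index stored in the shared mempool is resolved through each
process's own copy of rte_mempool_ops_table:

    /* Simplified from rte_mempool.h for illustration.  mp->ops_index is
     * written by the process that created the pool, but it is resolved
     * through the per-process rte_mempool_ops_table, so a different
     * registration order makes the same index name a different handler
     * ("ring_mp_mc" in the secondary, "bucket" in the primary above). */
    static inline struct rte_mempool_ops *
    rte_mempool_get_ops(int ops_index)
    {
        return &rte_mempool_ops_table.ops[ops_index]; /* local table */
    }

    static inline int
    rte_mempool_ops_dequeue_bulk(struct rte_mempool *mp,
            void **obj_table, unsigned int n)
    {
        struct rte_mempool_ops *ops = rte_mempool_get_ops(mp->ops_index);

        return ops->dequeue(mp, obj_table, n); /* "bucket" dequeue here */
    }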
Signed-off-by: Tianli Lai <laitianli@tom.com>
---
lib/librte_eal/common/eal_common_memzone.c | 2 +-
lib/librte_mempool/rte_mempool.h | 6 ++++++
lib/librte_mempool/rte_mempool_ops.c | 31 ++++++++++++++++++++++++++++++
3 files changed, 38 insertions(+), 1 deletion(-)
mode change 100644 => 100755 lib/librte_mempool/rte_mempool_ops.c
diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index 99b8d65..b59f3f5 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -384,7 +384,7 @@
}
rte_rwlock_write_unlock(&mcfg->mlock);
-
+ rte_sort_mempool_ops();
return ret;
}
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index f81152a..a22850b 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -910,6 +910,12 @@ int rte_mempool_ops_get_info(const struct rte_mempool *mp,
int rte_mempool_register_ops(const struct rte_mempool_ops *ops);
/**
+ * Sort global array rte_mempool_ops_table.ops[] .
+ * Used by rte_eal_memzone_init()
+ */
+int rte_sort_mempool_ops(void);
+
+/**
* Macro to statically register the ops of a mempool handler.
* Note that the rte_mempool_register_ops fails silently here when
* more than RTE_MEMPOOL_MAX_OPS_IDX is registered.
diff --git a/lib/librte_mempool/rte_mempool_ops.c b/lib/librte_mempool/rte_mempool_ops.c
old mode 100644
new mode 100755
index 22c5251..8e10488
--- a/lib/librte_mempool/rte_mempool_ops.c
+++ b/lib/librte_mempool/rte_mempool_ops.c
@@ -68,6 +68,37 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
return ops_index;
}
+
+int rte_sort_mempool_ops(void)
+{
+ /* same with rte_mempool_ops.name */
+ static const char *memops_name[RTE_MEMPOOL_MAX_OPS_IDX] = {
+ "ring_mp_mc", "ring_sp_sc", "ring_mp_sc", "ring_sp_mc",
+ "stack", "lf_stack", "octeontx2_npa", "octeontx_fpavf",
+ "dpaa2", "dpaa", "bucket",
+ };
+ struct rte_mempool_ops_table tmp_mempool_ops_table = {
+ .sl = rte_mempool_ops_table.sl,
+ .num_ops = rte_mempool_ops_table.num_ops
+ };
+ uint32_t i = 0, j= 0;
+ struct rte_mempool_ops *ops = NULL;
+ for (i = 0; i < 16; i++) {
+ const char* name = memops_name[i];
+ if(name && strlen(name)) {
+ for(j = 0; j < rte_mempool_ops_table.num_ops; j++) {
+ if(strcmp(name, rte_mempool_ops_table.ops[j].name))
+ continue;
+ ops = &rte_mempool_ops_table.ops[j];
+ memcpy(&tmp_mempool_ops_table.ops[i], ops, sizeof(*ops));
+ break;
+ }
+ }
+ }
+ memcpy(&rte_mempool_ops_table, &tmp_mempool_ops_table, sizeof(tmp_mempool_ops_table));
+ return 0;
+}
+
/* wrapper to allocate an external mempool's private (pool) data. */
int
rte_mempool_ops_alloc(struct rte_mempool *mp)
--
1.8.3.1
* Re: [PATCH] mempool: fix rte primary program coredump
2021-11-10 15:57 [PATCH] mempool: fix rte primary program coredump Tianli Lai
@ 2021-11-10 16:00 ` David Marchand
2021-11-10 16:07 ` laitianli
2021-11-10 17:15 ` Jerin Jacob
2022-01-27 10:06 ` Olivier Matz
2 siblings, 1 reply; 6+ messages in thread
From: David Marchand @ 2021-11-10 16:00 UTC (permalink / raw)
To: Tianli Lai; +Cc: dev
On Wed, Nov 10, 2021 at 4:57 PM Tianli Lai <laitianli@tom.com> wrote:
>
> When the primary program (such as an OFP app) runs first and the secondary
> program (such as dpdk-pdump) is started afterwards, the primary program
> receives signal SIGSEGV. The call stack is as follows:
Is OpenFastPath linked against the same DPDK binary as your dpdk-pdump tool?
--
David Marchand
* Re: [PATCH] mempool: fix rte primary program coredump
2021-11-10 16:00 ` David Marchand
@ 2021-11-10 16:07 ` laitianli
0 siblings, 0 replies; 6+ messages in thread
From: laitianli @ 2021-11-10 16:07 UTC (permalink / raw)
To: david.marchand; +Cc: dev
[-- Attachment #1: Type: text/html, Size: 4544 bytes --]
* Re: [PATCH] mempool: fix rte primary program coredump
2021-11-10 15:57 [PATCH] mempool: fix rte primary program coredump Tianli Lai
2021-11-10 16:00 ` David Marchand
@ 2021-11-10 17:15 ` Jerin Jacob
2022-01-27 10:06 ` Olivier Matz
2 siblings, 0 replies; 6+ messages in thread
From: Jerin Jacob @ 2021-11-10 17:15 UTC (permalink / raw)
To: Tianli Lai; +Cc: dpdk-dev
On Wed, Nov 10, 2021 at 9:38 PM Tianli Lai <laitianli@tom.com> wrote:
>
> When the primary program (such as an OFP app) runs first and the secondary
> program (such as dpdk-pdump) is started afterwards, the primary program
> receives signal SIGSEGV. The call stack is as follows:
>
> [backtrace snipped]
>
> The reason for the crash is:
>
> In the primary and the secondary program, the global array
> rte_mempool_ops_table.ops[] looks like this:
>
>            primary               secondary
>  [0]:      "bucket"              "ring_mp_mc"
>  [1]:      "dpaa"                "ring_sp_sc"
>  [2]:      "dpaa2"               "ring_mp_sc"
>  [3]:      "octeontx_fpavf"      "ring_sp_mc"
>  [4]:      "octeontx2_npa"       "octeontx2_npa"
>  [5]:      "ring_mp_mc"          "bucket"
>  [6]:      "ring_sp_sc"          "stack"
>  [7]:      "ring_mp_sc"          "if_stack"
>  [8]:      "ring_sp_mc"          "dpaa"
>  [9]:      "stack"               "dpaa2"
>  [10]:     "if_stack"            "octeontx_fpavf"
>  [11]:     NULL                  NULL
>
> The array is ordered differently in the primary and the secondary program.
> So when the secondary program creates a pool with
> rte_pktmbuf_pool_create_by_ops() and the ops name "ring_mp_mc", the ops
> index stored in the shared mempool resolves to the "bucket" handler in the
> primary program, which then uses it to allocate rte_mbufs.
>
> Fix this by sorting the array in both the primary and the secondary
> program when the memzone is initialized.
>
> Signed-off-by: Tianli Lai <laitianli@tom.com>
> ---
> lib/librte_eal/common/eal_common_memzone.c | 2 +-
> lib/librte_mempool/rte_mempool.h | 6 ++++++
> lib/librte_mempool/rte_mempool_ops.c | 31 ++++++++++++++++++++++++++++++
> 3 files changed, 38 insertions(+), 1 deletion(-)
> mode change 100644 => 100755 lib/librte_mempool/rte_mempool_ops.c
>
> diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
> index 99b8d65..b59f3f5 100644
> --- a/lib/librte_eal/common/eal_common_memzone.c
> +++ b/lib/librte_eal/common/eal_common_memzone.c
> @@ -384,7 +384,7 @@
> }
>
> rte_rwlock_write_unlock(&mcfg->mlock);
> -
> + rte_sort_mempool_ops();
> return ret;
> }
>
> diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
> index f81152a..a22850b 100644
> --- a/lib/librte_mempool/rte_mempool.h
> +++ b/lib/librte_mempool/rte_mempool.h
> @@ -910,6 +910,12 @@ int rte_mempool_ops_get_info(const struct rte_mempool *mp,
> int rte_mempool_register_ops(const struct rte_mempool_ops *ops);
>
> /**
> + * Sort global array rte_mempool_ops_table.ops[] .
> + * Used by rte_eal_memzone_init()
> + */
> +int rte_sort_mempool_ops(void);
Since it is an internal API, there is no need for the rte_ prefix.
> +
> +/**
> * Macro to statically register the ops of a mempool handler.
> * Note that the rte_mempool_register_ops fails silently here when
> * more than RTE_MEMPOOL_MAX_OPS_IDX is registered.
> diff --git a/lib/librte_mempool/rte_mempool_ops.c b/lib/librte_mempool/rte_mempool_ops.c
> old mode 100644
> new mode 100755
> index 22c5251..8e10488
> --- a/lib/librte_mempool/rte_mempool_ops.c
> +++ b/lib/librte_mempool/rte_mempool_ops.c
> @@ -68,6 +68,37 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
> return ops_index;
> }
>
> +
> +int rte_sort_mempool_ops(void)
> +{
> + /* same with rte_mempool_ops.name */
> + static const char *memops_name[RTE_MEMPOOL_MAX_OPS_IDX] = {
> + "ring_mp_mc", "ring_sp_sc", "ring_mp_sc", "ring_sp_mc",
> + "stack", "lf_stack", "octeontx2_npa", "octeontx_fpavf",
> + "dpaa2", "dpaa", "bucket",
I think it is not foolproof. Either:
1) you can use the primary/secondary communication mechanism to get the
library order from the primary, or
2) at the end of the primary's rte_eal_init() (or thereabouts), copy the
array to a memzone, then look up the memzone by name (string) in the
secondary to fill this array; see the rough sketch below.
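
For what it is worth, here is a rough, untested sketch of option 2. The
memzone name, the ops_order struct and mempool_ops_share_order() are made
up for illustration; only the rte_memzone_*() calls, rte_eal_process_type()
and rte_mempool_ops_table come from DPDK. It assumes both processes
register the same set of mempool drivers:

    #include <string.h>
    #include <rte_eal.h>
    #include <rte_memzone.h>
    #include <rte_mempool.h>

    #define OPS_ORDER_MZ "mempool_ops_order" /* illustrative name */

    struct ops_order {
        uint32_t num_ops;
        char names[RTE_MEMPOOL_MAX_OPS_IDX][RTE_MEMPOOL_OPS_NAMESIZE];
    };

    static int
    mempool_ops_share_order(void)
    {
        const struct rte_memzone *mz;
        struct ops_order *order;
        uint32_t i, j;

        if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
            /* Primary: publish its registration order by name. */
            mz = rte_memzone_reserve(OPS_ORDER_MZ, sizeof(*order),
                                     SOCKET_ID_ANY, 0);
            if (mz == NULL)
                return -1;
            order = mz->addr;
            order->num_ops = rte_mempool_ops_table.num_ops;
            for (i = 0; i < order->num_ops; i++)
                strcpy(order->names[i],
                       rte_mempool_ops_table.ops[i].name);
            return 0;
        }

        /* Secondary: permute the local table so that index i names the
         * same handler as in the primary. */
        mz = rte_memzone_lookup(OPS_ORDER_MZ);
        if (mz == NULL)
            return -1;
        order = mz->addr;

        struct rte_mempool_ops_table tmp = {
            .sl = rte_mempool_ops_table.sl,
            .num_ops = rte_mempool_ops_table.num_ops,
        };
        for (i = 0; i < order->num_ops; i++) {
            for (j = 0; j < rte_mempool_ops_table.num_ops; j++) {
                if (strcmp(order->names[i],
                           rte_mempool_ops_table.ops[j].name) == 0) {
                    tmp.ops[i] = rte_mempool_ops_table.ops[j];
                    break;
                }
            }
        }
        rte_mempool_ops_table = tmp;
        return 0;
    }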
> + };
> + struct rte_mempool_ops_table tmp_mempool_ops_table = {
> + .sl = rte_mempool_ops_table.sl,
> + .num_ops = rte_mempool_ops_table.num_ops
> + };
> + uint32_t i = 0, j= 0;
> + struct rte_mempool_ops *ops = NULL;
> + for (i = 0; i < 16; i++) {
> + const char* name = memops_name[i];
> + if(name && strlen(name)) {
> + for(j = 0; j < rte_mempool_ops_table.num_ops; j++) {
> + if(strcmp(name, rte_mempool_ops_table.ops[j].name))
> + continue;
> + ops = &rte_mempool_ops_table.ops[j];
> + memcpy(&tmp_mempool_ops_table.ops[i], ops, sizeof(*ops));
> + break;
> + }
> + }
> + }
> + memcpy(&rte_mempool_ops_table, &tmp_mempool_ops_table, sizeof(tmp_mempool_ops_table));
> + return 0;
> +}
> +
> /* wrapper to allocate an external mempool's private (pool) data. */
> int
> rte_mempool_ops_alloc(struct rte_mempool *mp)
> --
> 1.8.3.1
>
* Re: [PATCH] mempool: fix rte primary program coredump
2021-11-10 15:57 [PATCH] mempool: fix rte primary program coredump Tianli Lai
2021-11-10 16:00 ` David Marchand
2021-11-10 17:15 ` Jerin Jacob
@ 2022-01-27 10:06 ` Olivier Matz
2023-06-30 21:36 ` Stephen Hemminger
2 siblings, 1 reply; 6+ messages in thread
From: Olivier Matz @ 2022-01-27 10:06 UTC (permalink / raw)
To: Tianli Lai; +Cc: dev
Hi Tianli,
On Wed, Nov 10, 2021 at 11:57:19PM +0800, Tianli Lai wrote:
> When the primary program (such as an OFP app) runs first and the secondary
> program (such as dpdk-pdump) is started afterwards, the primary program
> receives signal SIGSEGV. The call stack is as follows:
>
> [backtrace snipped]
>
> The reason for the crash is:
>
> In the primary and the secondary program, the global array
> rte_mempool_ops_table.ops[] looks like this:
>
>            primary               secondary
>  [0]:      "bucket"              "ring_mp_mc"
>  [1]:      "dpaa"                "ring_sp_sc"
>  [2]:      "dpaa2"               "ring_mp_sc"
>  [3]:      "octeontx_fpavf"      "ring_sp_mc"
>  [4]:      "octeontx2_npa"       "octeontx2_npa"
>  [5]:      "ring_mp_mc"          "bucket"
>  [6]:      "ring_sp_sc"          "stack"
>  [7]:      "ring_mp_sc"          "if_stack"
>  [8]:      "ring_sp_mc"          "dpaa"
>  [9]:      "stack"               "dpaa2"
>  [10]:     "if_stack"            "octeontx_fpavf"
>  [11]:     NULL                  NULL
>
> The array is ordered differently in the primary and the secondary program.
> So when the secondary program creates a pool with
> rte_pktmbuf_pool_create_by_ops() and the ops name "ring_mp_mc", the ops
> index stored in the shared mempool resolves to the "bucket" handler in the
> primary program, which then uses it to allocate rte_mbufs.
>
> Fix this by sorting the array in both the primary and the secondary
> program when the memzone is initialized.
>
> Signed-off-by: Tianli Lai <laitianli@tom.com>
I think it is the same problem as the one described here:
http://inbox.dpdk.org/dev/1583114253-15345-1-git-send-email-xiangxia.m.yue@gmail.com/#r
To summarize what is said in that thread, sorting the ops looks dangerous
because it changes the indexes during the lifetime of the application. A new
proposal was made to use shared memory to ensure the indexes are the same in
the primary and the secondaries, but it requires some changes in EAL to have
init callbacks at a specific place.
I have a draft patchset that may fix this issue by using the vdev
infrastructure instead of a specific init, but it is not heavily tested. I can
send it here as an RFC if you want to try it.
One thing that is not clear to me is how you trigger this issue: why are the
mempool ops not loaded in the same order in the primary and the secondary?
Thanks,
Olivier
* Re: [PATCH] mempool: fix rte primary program coredump
2022-01-27 10:06 ` Olivier Matz
@ 2023-06-30 21:36 ` Stephen Hemminger
0 siblings, 0 replies; 6+ messages in thread
From: Stephen Hemminger @ 2023-06-30 21:36 UTC (permalink / raw)
To: Olivier Matz; +Cc: Tianli Lai, dev
On Thu, 27 Jan 2022 11:06:56 +0100
Olivier Matz <olivier.matz@6wind.com> wrote:
> >
> > The array is ordered differently in the primary and the secondary program.
> > So when the secondary program creates a pool with
> > rte_pktmbuf_pool_create_by_ops() and the ops name "ring_mp_mc", the ops
> > index stored in the shared mempool resolves to the "bucket" handler in the
> > primary program, which then uses it to allocate rte_mbufs.
> >
> > Fix this by sorting the array in both the primary and the secondary
> > program when the memzone is initialized.
> >
> > Signed-off-by: Tianli Lai <laitianli@tom.com>
>
> I think it is the same problem as the one described here:
> http://inbox.dpdk.org/dev/1583114253-15345-1-git-send-email-xiangxia.m.yue@gmail.com/#r
>
> To summarize what is said in that thread, sorting the ops looks dangerous
> because it changes the indexes during the lifetime of the application. A new
> proposal was made to use shared memory to ensure the indexes are the same in
> the primary and the secondaries, but it requires some changes in EAL to have
> init callbacks at a specific place.
>
> I have a draft patchset that may fix this issue by using the vdev
> infrastructure instead of a specific init, but it is not heavily tested. I
> can send it here as an RFC if you want to try it.
>
> One thing that is not clear to me is how you trigger this issue: why are the
> mempool ops not loaded in the same order in the primary and the secondary?
>
> Thanks,
> Olivier
Agree with Olivier, a hard-coded sort is not the best way to fix this.
Some work is needed to either address the ordering or communicate the list
from the primary to the secondary.