DPDK patches and discussions
From: Jerin Jacob <jerinjacobk@gmail.com>
To: Tianli Lai <laitianli@tom.com>
Cc: dpdk-dev <dev@dpdk.org>
Subject: Re: [PATCH] mempool: fix rte primary program coredump
Date: Wed, 10 Nov 2021 22:45:22 +0530	[thread overview]
Message-ID: <CALBAE1PjEio=Q8GyW78Q73T9ZQknxZ_A5RZzgKNO8LVK=2sv0w@mail.gmail.com>
In-Reply-To: <1636559839-6553-1-git-send-email-laitianli@tom.com>

On Wed, Nov 10, 2021 at 9:38 PM Tianli Lai <laitianli@tom.com> wrote:
>
> If the primary program (such as an OFP app) runs first and the secondary
> program (such as dpdk-pdump) is then started, the primary program
> receives signal SIGSEGV. The call stack is as follows:
>
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0x7fffee60e700 (LWP 112613)]
> 0x00007ffff5f2cc0b in bucket_stack_pop (stack=0xffff00010000) at
> /ofp/dpdk/drivers/mempool/bucket/rte_mempool_bucket.c:95
> 95      if (stack->top == 0)
> Missing separate debuginfos, use: debuginfo-install
> glibc-2.17-196.el7.x86_64 libatomic-4.8.5-16.el7.x86_64
> libconfig-1.4.9-5.el7.x86_64 libgcc-4.8.5-16.el7.x86_64
> libpcap-1.5.3-12.el7.x86_64 numactl-libs-2.0.9-6.el7_2.x86_64
> openssl-libs-1.0.2k-8.el7.x86_64 zlib-1.2.7-17.el7.x86_64
> (gdb) bt
>  #0  0x00007ffff5f2cc0b in bucket_stack_pop (stack=0xffff00010000) at /ofp/dpdk/drivers/mempool/bucket/rte_mempool_bucket.c:95
>  #1  0x00007ffff5f2e5dc in bucket_dequeue_orphans (bd=0x2209e5fac0,obj_table=0x220b083710, n_orphans=251) at /ofp/dpdk/drivers/mempool/bucket/rte_mempool_bucket.c:190
>  #2  0x00007ffff5f30192 in bucket_dequeue (mp=0x220b07d5c0,obj_table=0x220b083710, n=251) at /ofp/dpdk/drivers/mempool/bucket/rte_mempool_bucket.c:288
>  #3  0x00007ffff5f47e18 in rte_mempool_ops_dequeue_bulk (mp=0x220b07d5c0,obj_table=0x220b083710, n=251) at /ofp/dpdk/x86_64-native-linuxapp-gcc/include/rte_mempool.h:739
>  #4  0x00007ffff5f4819d in __mempool_generic_get (cache=0x220b083700, n=1, obj_table=0x7fffee5deb18, mp=0x220b07d5c0) at /ofp/dpdk/x86_64-native-linuxapp-gcc/include/rte_mempool.h:1443
>  #5  rte_mempool_generic_get (cache=0x220b083700, n=1, obj_table=0x7fffee5deb18, mp=0x220b07d5c0) at /ofp/dpdk/x86_64-native-linuxapp-gcc/include/rte_mempool.h:1506
>  #6  rte_mempool_get_bulk (n=1, obj_table=0x7fffee5deb18, mp=0x220b07d5c0) at /ofp/dpdk/x86_64-native-linuxapp-gcc/include/rte_mempool.h:1539
>  #7  rte_mempool_get (obj_p=0x7fffee5deb18, mp=0x220b07d5c0) at /ofp/dpdk/x86_64-native-linuxapp-gcc/include/rte_mempool.h:1565
>  #8  rte_mbuf_raw_alloc (mp=0x220b07d5c0) at /ofp/dpdk/x86_64-native-linuxapp-gcc/include/rte_mbuf.h:551
>  #9  0x00007ffff5f483a4 in rte_pktmbuf_alloc (mp=0x220b07d5c0) at /ofp/dpdk/x86_64-native-linuxapp-gcc/include/rte_mbuf.h:804
>  #10 0x00007ffff5f4c9d9 in pdump_pktmbuf_copy (m=0x220746ad80, mp=0x220b07d5c0) at /ofp/dpdk/lib/librte_pdump/rte_pdump.c:99
>  #11 0x00007ffff5f4e42e in pdump_copy (pkts=0x7fffee5dfdf0, nb_pkts=1, user_params=0x7ffff76d7cc0 <rx_cbs>) at /ofp/dpdk/lib/librte_pdump/rte_pdump.c:151
>  #12 0x00007ffff5f4eadd in pdump_rx (port=0, qidx=0, pkts=0x7fffee5dfdf0, nb_pkts=1, max_pkts=16, user_params=0x7ffff76d7cc0 <rx_cbs>) at /ofp/dpdk/lib/librte_pdump/rte_pdump.c:172
>  #13 0x00007ffff5d0e9e8 in rte_eth_rx_burst (port_id=0, queue_id=0, rx_pkts=0x7fffee5dfdf0, nb_pkts=16) at /ofp/dpdk/x86_64-native-linuxapp-gcc/usr/local/include/dpdk/rte_ethdev.h:4396
>  #14 0x00007ffff5d114c3 in recv_pkt_dpdk (pktio_entry=0x22005436c0, index=0, pkt_table=0x7fffee5dfdf0, num=16) at odp_packet_dpdk.c:1081
>  #15 0x00007ffff5d2f931 in odp_pktin_recv (queue=...,packets=0x7fffee5dfdf0, num=16) at ../linux-generic/odp_packet_io.c:1896
>  #16 0x000000000040a344 in rx_burst (pktin=...) at app_main.c:223
>  #17 0x000000000040aca4 in run_server_single (arg=0x7fffffffe2b0) at app_main.c:417
>  #18 0x00007ffff7bd6883 in run_thread (arg=0x7fffffffe3b8) at threads.c:67
>  #19 0x00007ffff53c8e25 in start_thread () from /lib64/libpthread.so.0
>  #20 0x00007ffff433e34d in clone () from /lib64/libc.so.6
>
> The reason for the crash is:
>
> In the primary and the secondary program, the global array rte_mempool_ops_table.ops[] contains:
>         primary name            secondary name
>  [0]:   "bucket"                "ring_mp_mc"
>  [1]:   "dpaa"                  "ring_sp_sc"
>  [2]:   "dpaa2"                 "ring_mp_sc"
>  [3]:   "octeontx_fpavf"        "ring_sp_mc"
>  [4]:   "octeontx2_npa"         "octeontx2_npa"
>  [5]:   "ring_mp_mc"            "bucket"
>  [6]:   "ring_sp_sc"            "stack"
>  [7]:   "ring_mp_sc"            "if_stack"
>  [8]:   "ring_sp_mc"            "dpaa"
>  [9]:   "stack"                 "dpaa2"
>  [10]:  "if_stack"              "octeontx_fpavf"
>  [11]:  NULL                    NULL
>
>  This array in the primary program differs from the one in the secondary
>  program. So when the secondary program creates a mempool with
>  rte_pktmbuf_pool_create_by_ops() and ops name "ring_mp_mc", the stored
>  ops index resolves, in the primary program, to the "bucket" ops, and the
>  primary program then uses the "bucket" handler to allocate rte_mbufs.
>
>  Fix this by sorting the array into the same order in both the primary
>  and the secondary program when the memzone is initialized.
>
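
To make the failure mode concrete: rte_mempool records only the ops slot
index (mp->ops_index), and each process dereferences that index through
its own copy of the table. Here is a minimal standalone sketch (not DPDK
code; the two registration orders are invented for illustration) showing
how the same index selects different handlers:

#include <stdio.h>
#include <string.h>

/* Hypothetical registration orders: each process fills its table in
 * whatever order its mempool drivers' constructors happened to run. */
static const char *primary_ops[]   = { "bucket", "ring_mp_mc", "stack" };
static const char *secondary_ops[] = { "ring_mp_mc", "bucket", "stack" };

static int lookup(const char **table, int n, const char *name)
{
	for (int i = 0; i < n; i++)
		if (strcmp(table[i], name) == 0)
			return i;
	return -1;
}

int main(void)
{
	/* The secondary creates the pool and stores the index, not the name. */
	int ops_index = lookup(secondary_ops, 3, "ring_mp_mc"); /* -> 0 */

	/* The primary later allocates through its own table with that index. */
	printf("secondary asked for \"ring_mp_mc\", primary dispatches to \"%s\"\n",
	       primary_ops[ops_index]); /* -> "bucket": wrong ops, crash */
	return 0;
}
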
> Signed-off-by: Tianli Lai <laitianli@tom.com>
> ---
>  lib/librte_eal/common/eal_common_memzone.c |  2 +-
>  lib/librte_mempool/rte_mempool.h           |  6 ++++++
>  lib/librte_mempool/rte_mempool_ops.c       | 31 ++++++++++++++++++++++++++++++
>  3 files changed, 38 insertions(+), 1 deletion(-)
>  mode change 100644 => 100755 lib/librte_mempool/rte_mempool_ops.c
>
> diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
> index 99b8d65..b59f3f5 100644
> --- a/lib/librte_eal/common/eal_common_memzone.c
> +++ b/lib/librte_eal/common/eal_common_memzone.c
> @@ -384,7 +384,7 @@
>         }
>
>         rte_rwlock_write_unlock(&mcfg->mlock);
> -
> +       rte_sort_mempool_ops();
>         return ret;
>  }
>
> diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
> index f81152a..a22850b 100644
> --- a/lib/librte_mempool/rte_mempool.h
> +++ b/lib/librte_mempool/rte_mempool.h
> @@ -910,6 +910,12 @@ int rte_mempool_ops_get_info(const struct rte_mempool *mp,
>  int rte_mempool_register_ops(const struct rte_mempool_ops *ops);
>
>  /**
> + * Sort the global array rte_mempool_ops_table.ops[].
> + * Used by rte_eal_memzone_init().
> + */
> +int rte_sort_mempool_ops(void);

Since it is an internal API, there is no need for the rte_ prefix.


> +
> +/**
>   * Macro to statically register the ops of a mempool handler.
>   * Note that the rte_mempool_register_ops fails silently here when
>   * more than RTE_MEMPOOL_MAX_OPS_IDX is registered.
> diff --git a/lib/librte_mempool/rte_mempool_ops.c b/lib/librte_mempool/rte_mempool_ops.c
> old mode 100644
> new mode 100755
> index 22c5251..8e10488
> --- a/lib/librte_mempool/rte_mempool_ops.c
> +++ b/lib/librte_mempool/rte_mempool_ops.c
> @@ -68,6 +68,37 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
>         return ops_index;
>  }
>
> +
> +int rte_sort_mempool_ops(void)
> +{
> +       /* same as the names registered in rte_mempool_ops_table */
> +       static const char *memops_name[RTE_MEMPOOL_MAX_OPS_IDX] = {
> +               "ring_mp_mc", "ring_sp_sc", "ring_mp_sc", "ring_sp_mc",
> +               "stack", "lf_stack", "octeontx2_npa", "octeontx_fpavf",
> +               "dpaa2", "dpaa", "bucket",

I think it is not foolproof. Either:

1) Use the primary/secondary communication mechanism to get the ops
registration order from the primary.

OR

2) At the end of the primary's rte_eal_init() or so, copy the array to a
memzone, and then look up that memzone by name (string) in the secondary
to fill this array.
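
For the record, a rough sketch of what option 2 could look like (the
memzone name, publish_ops_order() and adopt_ops_order() below are
invented for illustration, not existing DPDK API; error handling and
locking are glossed over, and both processes are assumed to register
the same set of ops). Option 1 would instead use the
rte_mp_request_sync()/rte_mp_action_register() IPC machinery to ask
the primary for its registration order.

#include <string.h>
#include <rte_memzone.h>
#include <rte_mempool.h>

#define OPS_ORDER_MZ "mempool_ops_order" /* hypothetical zone name */

/* Primary: publish the ops names, in registration order, at the end of
 * rte_eal_init(). */
static int
publish_ops_order(void)
{
	const struct rte_memzone *mz = rte_memzone_reserve(OPS_ORDER_MZ,
		RTE_MEMPOOL_MAX_OPS_IDX * RTE_MEMPOOL_OPS_NAMESIZE,
		SOCKET_ID_ANY, 0);
	char (*names)[RTE_MEMPOOL_OPS_NAMESIZE];
	unsigned int i;

	if (mz == NULL)
		return -1;
	names = mz->addr;
	for (i = 0; i < rte_mempool_ops_table.num_ops; i++)
		strcpy(names[i], rte_mempool_ops_table.ops[i].name);
	return 0;
}

/* Secondary: look the zone up and rearrange the local table so that each
 * ops entry sits at the same index as in the primary. */
static int
adopt_ops_order(void)
{
	const struct rte_memzone *mz = rte_memzone_lookup(OPS_ORDER_MZ);
	char (*names)[RTE_MEMPOOL_OPS_NAMESIZE];
	struct rte_mempool_ops_table tmp;
	unsigned int i, j;

	if (mz == NULL)
		return -1;
	names = mz->addr;
	memset(&tmp, 0, sizeof(tmp));
	tmp.sl = rte_mempool_ops_table.sl;
	tmp.num_ops = rte_mempool_ops_table.num_ops;
	for (i = 0; i < tmp.num_ops; i++)
		for (j = 0; j < rte_mempool_ops_table.num_ops; j++)
			if (strcmp(names[i], rte_mempool_ops_table.ops[j].name) == 0) {
				tmp.ops[i] = rte_mempool_ops_table.ops[j];
				break;
			}
	rte_mempool_ops_table = tmp;
	return 0;
}
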


> +       };
> +       struct rte_mempool_ops_table tmp_mempool_ops_table = {
> +               .sl = rte_mempool_ops_table.sl,
> +               .num_ops = rte_mempool_ops_table.num_ops
> +       };
> +       uint32_t i = 0, j = 0;
> +       struct rte_mempool_ops *ops = NULL;
> +       for (i = 0; i < RTE_MEMPOOL_MAX_OPS_IDX; i++) {
> +               const char *name = memops_name[i];
> +               if (name && strlen(name)) {
> +                       for (j = 0; j < rte_mempool_ops_table.num_ops; j++) {
> +                               if (strcmp(name, rte_mempool_ops_table.ops[j].name))
> +                                       continue;
> +                               ops = &rte_mempool_ops_table.ops[j];
> +                               memcpy(&tmp_mempool_ops_table.ops[i], ops, sizeof(*ops));
> +                               break;
> +                       }
> +               }
> +       }
> +       memcpy(&rte_mempool_ops_table, &tmp_mempool_ops_table, sizeof(tmp_mempool_ops_table));
> +       return 0;
> +}
> +
>  /* wrapper to allocate an external mempool's private (pool) data. */
>  int
>  rte_mempool_ops_alloc(struct rte_mempool *mp)
> --
> 1.8.3.1
>

Thread overview: 6+ messages
2021-11-10 15:57 Tianli Lai
2021-11-10 16:00 ` David Marchand
2021-11-10 16:07   ` laitianli
2021-11-10 17:15 ` Jerin Jacob [this message]
2022-01-27 10:06 ` Olivier Matz
2023-06-30 21:36   ` Stephen Hemminger
