* Segmentation fault when running MPRQ on testpmd
@ 2025-08-20 8:40 Joni
2025-08-20 10:07 ` Khadem Ullah
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Joni @ 2025-08-20 8:40 UTC (permalink / raw)
To: dev
Hi,
I hope this is the correct place to report these issues, since they seem to
be related to the DPDK code. I reported this to NVIDIA a few days ago but
have yet to receive any response from them.
My server is currently using a ConnectX-5 MT27800 NIC (mlx5_core 5.7-1.0.2)
with firmware 16.35.4506 (MT_0000000011). My DPDK version is 22.11.
I ran the following testpmd command, which resulted in a segmentation fault
(I am currently running on filtered traffic with packets >1000 bytes to
increase the odds of hitting it):
dpdk-testpmd -l 1-5 -n 4 -a
0000:1f:00.0,rxq_comp_en=1,rxq_pkt_pad_en=1,rxqs_min_mprq=1,mprq_en=1,mprq_log_stride_num=6,mprq_log_stride_size=9,mprq_max_memcpy_len=64,rx_vec_en=1
-- -i --rxd=8192 --max-pkt-len=1700 --rxq=1 --total-num-mbufs=16384
--mbuf-size=3000 --enable-drop-en --enable-scatter
The segmentation fault goes away when I disable vectorization
(rx_vec_en=0). (Note that it does not occur with forward-mode=rxonly.)
It also seems to happen more often when there are rx_nombuf events.
Upon some investigation, I noticed that in the DPDK source file
drivers/net/mlx5/mlx5_rxtx_vec.c (function rxq_copy_mprq_mbuf_v()), the
consumed stride count can exceed the stride number (64 in this case), which
should not happen. I suspect there is some CQE misalignment here upon
encountering rx_nombuf.
rxq_copy_mprq_mbuf_v(...) {
    ...
    if (rxq->consumed_strd == strd_n) {
        // replenish WQE
    }
    ...
    strd_cnt = (elts[i]->pkt_len / strd_sz) +
               ((elts[i]->pkt_len % strd_sz) ? 1 : 0);
    rxq_code = mprq_buf_to_pkt(rxq, elts[i], elts[i]->pkt_len, buf,
                               rxq->consumed_strd, strd_cnt);
    rxq->consumed_strd += strd_cnt; // encountering cases where
                                    // rxq->consumed_strd > strd_n
    ...
}
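For reference, a sanity check right after that increment (illustration only,
not part of the upstream driver) would make the overrun visible as soon as it
happens:

    rxq->consumed_strd += strd_cnt;
    // Illustration only (not upstream code): the consumed stride count
    // should never exceed the number of strides per MPRQ WQE.
    if (unlikely(rxq->consumed_strd > strd_n))
        DRV_LOG(ERR, "consumed_strd %u exceeds strd_n %u",
                rxq->consumed_strd, strd_n);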
In addition, there were also cases in mprq_buf_to_pkt() where the allocated
seg address was exactly the same as the pkt (elts[i]) address passed in,
which should not happen.
mprq_buf_to_pkt(...) {
    ...
    if (hdrm_overlap > 0) {
        MLX5_ASSERT(rxq->strd_scatter_en);
        struct rte_mbuf *seg = rte_pktmbuf_alloc(rxq->mp);

        if (unlikely(seg == NULL))
            return MLX5_RXQ_CODE_NOMBUF;
        SET_DATA_OFF(seg, 0);
        // added debug statement
        DRV_LOG(DEBUG, "pkt %p seg %p", (void *)pkt, (void *)seg);
        rte_memcpy(rte_pktmbuf_mtod(seg, void *),
                   RTE_PTR_ADD(addr, len - hdrm_overlap), hdrm_overlap);
        ...
    }
}
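An explicit check in the same spot (again purely illustrative, not upstream
code) would flag the aliasing directly instead of relying on reading the
debug print:

    // Illustration only (not upstream code): a freshly allocated seg
    // should never alias the mbuf the packet was delivered in.
    if (unlikely(seg == pkt))
        DRV_LOG(ERR, "allocated seg %p aliases pkt %p",
                (void *)seg, (void *)pkt);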
I have tried upgrading my DPDK version to 24.11, but the segmentation fault
persists.
In addition, there are a few other issues that I've noticed:
- max-pkt-len does not seem to work for values < 1500, even though "show
  port info X" shows that the MTU was set to the value I passed in.
- In mprq_buf_to_pkt():
  - The RTE_MIN() in "uint32_t seg_len = RTE_MIN(len, (uint32_t)(pkt->buf_len -
    RTE_PKTMBUF_HEADROOM))" seems unnecessary: to reach this code, len has to
    be greater than (uint32_t)(pkt->buf_len - RTE_PKTMBUF_HEADROOM) due to the
    preceding if condition.
  - If the allocation "struct rte_mbuf *next = rte_pktmbuf_alloc(rxq->mp)"
    fails and the packet has more than 2 segments, the segments that were
    allocated previously are not freed (see the sketch after the snippet
    below).
mprq_buf_to_pkt(...) {
    ...
    } else if (rxq->strd_scatter_en) {
        struct rte_mbuf *prev = pkt;
        uint32_t seg_len = RTE_MIN(len, (uint32_t)
                                   (pkt->buf_len - RTE_PKTMBUF_HEADROOM));
        uint32_t rem_len = len - seg_len;

        rte_memcpy(rte_pktmbuf_mtod(pkt, void *), addr, seg_len);
        DATA_LEN(pkt) = seg_len;
        while (rem_len) {
            struct rte_mbuf *next = rte_pktmbuf_alloc(rxq->mp);

            if (unlikely(next == NULL))
                return MLX5_RXQ_CODE_NOMBUF;
            ...
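For illustration, a possible cleanup on allocation failure could look roughly
like this (a sketch only, assuming pkt->next is NULL on entry; other recovery
strategies may well be preferable):

    while (rem_len) {
        struct rte_mbuf *next = rte_pktmbuf_alloc(rxq->mp);

        if (unlikely(next == NULL)) {
            // Sketch only: free the segments already chained onto pkt so
            // they are not leaked, then restore pkt to a single segment.
            rte_pktmbuf_free(pkt->next);
            pkt->next = NULL;
            pkt->nb_segs = 1;
            return MLX5_RXQ_CODE_NOMBUF;
        }
        ...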
- In the external-buffer attach case where hdrm_overlap > 0, the code does
  not decrement the buffer refcnt if the allocation "struct rte_mbuf *seg =
  rte_pktmbuf_alloc(rxq->mp)" fails (a possible fix is sketched after the
  snippet below).
mprq_buf_to_pkt(...) {
    ...
    if (hdrm_overlap > 0) {
        __atomic_add_fetch(&buf->refcnt, 1, __ATOMIC_RELAXED);
        ...
        MLX5_ASSERT(rxq->strd_scatter_en);
        struct rte_mbuf *seg = rte_pktmbuf_alloc(rxq->mp);

        if (unlikely(seg == NULL))
            return MLX5_RXQ_CODE_NOMBUF;
        SET_DATA_OFF(seg, 0);
        ...
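For illustration, the kind of fix being suggested could look roughly like this
(a sketch only; whether simply dropping the reference is the right recovery
here is for the maintainers to confirm):

    struct rte_mbuf *seg = rte_pktmbuf_alloc(rxq->mp);

    if (unlikely(seg == NULL)) {
        // Sketch only: drop the reference taken on the MPRQ buffer above
        // so the external buffer chunk is not leaked.
        __atomic_sub_fetch(&buf->refcnt, 1, __ATOMIC_RELAXED);
        return MLX5_RXQ_CODE_NOMBUF;
    }
    SET_DATA_OFF(seg, 0);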
Hope to hear from you soon!
With regards,
Joni
* Re: Segmentation fault when running MPRQ on testpmd
2025-08-20 8:40 Segmentation fault when running MPRQ on testpmd Joni
@ 2025-08-20 10:07 ` Khadem Ullah
2025-08-20 10:34 ` Khadem Ullah
2025-08-20 12:02 ` Dariusz Sosnowski
2 siblings, 0 replies; 5+ messages in thread
From: Khadem Ullah @ 2025-08-20 10:07 UTC (permalink / raw)
To: Joni; +Cc: dev
Hi Joni,
Please try the following command to enable MPRQ:
./build/app/dpdk-testpmd -l 0-15 -n 4 -a 03:00.0,mprq_en=1 -- -i
--nb-cores=12 --rxq=12 --txq=12 --rxd=4096 --txd=4096 --burst=64
Regards,
Khadem
* Re: Segmentation fault when running MPRQ on testpmd
2025-08-20 8:40 Segmentation fault when running MPRQ on testpmd Joni
2025-08-20 10:07 ` Khadem Ullah
@ 2025-08-20 10:34 ` Khadem Ullah
2025-08-20 12:02 ` Dariusz Sosnowski
2 siblings, 0 replies; 5+ messages in thread
From: Khadem Ullah @ 2025-08-20 10:34 UTC (permalink / raw)
To: Joni; +Cc: dev
Hi,
I have run your command with a minor modification on DPDK 22.11, and it runs
without any seg fault; please try the one below:
./build/app/dpdk-testpmd -l 1-5 -n 4 -a
0000:03:00.0,rxq_pkt_pad_en=1,rxqs_min_mprq=1,mprq_en=1,mprq_log_stride_num=6,mprq_log_stride_size=9,mprq_max_memcpy_len=64,rx_vec_en=1
-- -i --rxd=8192 --max-pkt-len=1700 --rxq=1 --total-num-mbufs=16384
--mbuf-size=3000
Regards,
Khadem
* Re: Segmentation fault when running MPRQ on testpmd
2025-08-20 8:40 Segmentation fault when running MPRQ on testpmd Joni
2025-08-20 10:07 ` Khadem Ullah
2025-08-20 10:34 ` Khadem Ullah
@ 2025-08-20 12:02 ` Dariusz Sosnowski
2025-08-21 3:14 ` Joni
2 siblings, 1 reply; 5+ messages in thread
From: Dariusz Sosnowski @ 2025-08-20 12:02 UTC (permalink / raw)
To: Joni, Viacheslav Ovsiienko; +Cc: dev
Hi,
On Wed, Aug 20, 2025 at 04:40:16PM +0800, Joni wrote:
> Hi,
>
> I hope this is the correct place to report these issues, since they seem to
> be related to the DPDK code. I reported this to NVIDIA a few days ago but
> have yet to receive any response from them.
>
> My server is currently using a ConnectX-5 MT27800 NIC (mlx5_core 5.7-1.0.2)
> with firmware 16.35.4506 (MT_0000000011). My DPDK version is 22.11.
>
> I ran the following testpmd command, which resulted in a segmentation fault
> (I am currently running on filtered traffic with packets >1000 bytes to
> increase the odds of hitting it):
>
> dpdk-testpmd -l 1-5 -n 4 -a
> 0000:1f:00.0,rxq_comp_en=1,rxq_pkt_pad_en=1,rxqs_min_mprq=1,mprq_en=1,mprq_log_stride_num=6,mprq_log_stride_size=9,mprq_max_memcpy_len=64,rx_vec_en=1
> -- -i --rxd=8192 --max-pkt-len=1700 --rxq=1 --total-num-mbufs=16384
> --mbuf-size=3000 --enable-drop-en --enable-scatter
>
> The segmentation fault goes away when I disable vectorization
> (rx_vec_en=0). (Note that it does not occur with forward-mode=rxonly.)
> It also seems to happen more often when there are rx_nombuf events.
Thank you for reporting and for the analysis.
Could you please open a bug on https://bugs.dpdk.org/ with all the
details?
Do you happen to have a stack trace from the segmentation fault?
Slava: Could you please take a look at the issue described by Joni in this mail?
> [...]
Best regards,
Dariusz Sosnowski
* Re: Segmentation fault when running MPRQ on testpmd
2025-08-20 12:02 ` Dariusz Sosnowski
@ 2025-08-21 3:14 ` Joni
0 siblings, 0 replies; 5+ messages in thread
From: Joni @ 2025-08-21 3:14 UTC (permalink / raw)
To: Dariusz Sosnowski; +Cc: Viacheslav Ovsiienko, dev
Hi Dariusz,
Sure!
#0 0x0000000001c34912 in __rte_pktmbuf_free_extbuf ()
#1 0x0000000001c36a10 in rte_pktmbuf_detach ()
#2 0x0000000001c4a9ec in rxq_copy_mprq_mbuf_v ()
#3 0x0000000001c4d63b in rxq_burst_mprq_v ()
#4 0x0000000001c4d7a7 in mlx5_rx_burst_mprq_vec ()
#5 0x000000000050be66 in rte_eth_rx_burst ()
#6 0x000000000050c53d in pkt_burst_io_forward ()
#7 0x00000000005427b4 in run_pkt_fwd_on_lcore ()
#8 0x000000000054289b in start_pkt_forward_on_core ()
#9 0x0000000000a473c9 in eal_thread_loop ()
#10 0x00007ffff60061ca in start_thread () from /lib64/libpthread.so.0
#11 0x00007ffff5c72e73 in clone () from /lib64/libc.so.6
I've raised the bugs as instructed (IDs 1776, 1777, 1778 and 1779) and
included the stack trace there too.
With regards,
Joni
On Wed, Aug 20, 2025 at 8:04 PM Dariusz Sosnowski <dsosnowski@nvidia.com>
wrote:
> Hi,
>
> On Wed, Aug 20, 2025 at 04:40:16PM +0800, Joni wrote:
> > [...]
>
> Thank you for reporting and for the analysis.
>
> Could you please open a bug on https://bugs.dpdk.org/ with all the
> details?
>
> Do you happen to have a stack trace from the segmentation fault?
>
> Slava: Could you please take a look at the issue described by Joni in this
> mail?
>
> Best regards,
> Dariusz Sosnowski
>