DPDK usage discussions
From: "jiangheng (G)" <jiangheng14@huawei.com>
To: "users@dpdk.org" <users@dpdk.org>,
	"matan@nvidia.com" <matan@nvidia.com>,
	 Slava Ovsiienko <viacheslavo@nvidia.com>,
	"orika@nvidia.com" <orika@nvidia.com>
Subject: [mlx5] CX6 NIC bug, the process exits abnormally.
Date: Thu, 14 Sep 2023 12:08:37 +0000
Message-ID: <ce8cbb28302f43eca87461ad35562db3@huawei.com>


During a stress test on a CX6 NIC using DPDK, the process exits abnormally. We traced the problem to a bug in the DPDK mlx5 driver. Please check whether the latest firmware and driver fix this core dump.

By default, DPDK enables the vectorized Rx/Tx path (rxtx_vect) and CQE compression, and the receive ring size is 1024. Under load, the service process receives SIGSEGV (signo=11) and exits.
Call stack information:
    #2  0x0000000000e72437 in signal_captured_function (signo=11, si=0x7f6310f46eb0, ucontext=0x7f6310f46d80) at ../v1/handle_signal.c:499
    #3  <signal handler called>
    #4  _mm_storeu_si128 (__B=..., __P=<optimized out>) at /usr/lib/gcc/x86_64-linux-gnu/7.3.0/include/emmintrin.h:720
    #5  rxq_cq_decompress_v (elts=0x20217ff394e8, cq=0x20217f8538c0, rxq=0x20217ff36e00) at ../drivers/net/mlx5/mlx5_rxtx_vec_sse.h:159
    #6  rxq_burst_v (no_cq=<synthetic pointer>, err=<synthetic pointer>, pkts_n=9, pkts=0x2004e278c9d8, rxq=0x20217ff36e00) at ../drivers/net/mlx5/mlx5_rxtx_vec.c:349
    #7  mlx5_rx_burst_vec (dpdk_rxq=0x20217ff36e00, pkts=0x2004e278c9d8, pkts_n=128) at ../drivers/net/mlx5/mlx5_rxtx_vec.c:393
    #8  0x0000000001086448 in rte_eth_rx_burst (nb_pkts=128, rx_pkts=0x2004e278c9d8, queue_id=7, port_id=<optimized out>) at ../include/dpdk/rte_ethdev.h:5339

Environment:
Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]
    [root@localhost ~]# ofed_info -s

    [root@localhost ~]# ethtool -i eth6|grep fir
    firmware-version: 22.37.1014 (MT_0000000359)
DPDK version: DPDK 21.11

Code at the faulting location (drivers/net/mlx5/mlx5_rxtx_vec_sse.h, frame #5):
157:          /* B.1 store rearm data to mbuf. */
158:          _mm_storeu_si128((__m128i *)&elts[pos + 2]->rearm_data, rearm);
159:          _mm_storeu_si128((__m128i *)&elts[pos + 3]->rearm_data, rearm);

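Root cause: When decompressing a compressed CQE that carries 9 mini-CQEs, the driver starts at (*rxq->elts)[1021] and accesses entries up to (*rxq->elts)[1028]. Receive queue initialization reserves only indices [0, 1027], that is, the 1024 ring entries plus 4 extra ones. Index 1028 is therefore out of bounds and holds a NULL pointer, so the store at mlx5_rxtx_vec_sse.h:159 writes through a NULL mbuf pointer and the process dumps core. In the gdb session below, elts points at (*rxq->elts)[1021], so elts[7] is (*rxq->elts)[1028].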
(gdb) p elts[0]
$149 = (struct rte_mbuf *) 0x2006945a8000  //first round
(gdb) p elts[1]
$150 = (struct rte_mbuf *) 0x2006945aa1c0
(gdb) p elts[2]
$151 = (struct rte_mbuf *) 0x2006945ac380
(gdb) p elts[3]
$152 = (struct rte_mbuf *) 0x20217ff36f80
(gdb) p elts[4]
$153 = (struct rte_mbuf *) 0x20217ff36f80  //Second round
(gdb) p elts[5]
$154 = (struct rte_mbuf *) 0x20217ff36f80
(gdb) p elts[6]
$155 = (struct rte_mbuf *) 0x20217ff36f80
(gdb) p elts[7]
$156 = (struct rte_mbuf *) 0x0     //coredump
(gdb) p elts - (*rxq->elts)
$157 = 1021
