Hi, All:
I am using Mellanox ConnectX-5 and ConnectX-4 Lx NICs with DPDK v21.11,
and the NIC intermittently fails to send packets.
One condition that triggers it is poor physical contiguity of the
hugepages allocated on the host. For example, if the environment is
configured with 10 GB of hugepages but the hugepages are not physically
contiguous with one another, the problem can be reproduced.
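To make "physically discontinuous" concrete, here is a minimal sketch that groups hugepage physical addresses into contiguous runs. The 2 MB page size and the addresses are made-up assumptions for illustration (the hugepage size is not stated above), and `contiguous_runs` is a hypothetical helper, not a DPDK API. The failing case corresponds to every hugepage ending up in its own run.

```python
def contiguous_runs(phys_addrs, page_size):
    """Group hugepage physical addresses into physically contiguous runs.

    Returns a list of (start_address, number_of_pages) tuples.
    """
    runs = []
    for addr in sorted(phys_addrs):
        if runs and runs[-1][0] + runs[-1][1] * page_size == addr:
            # This page starts exactly where the previous run ends: extend it.
            start, count = runs[-1]
            runs[-1] = (start, count + 1)
        else:
            # Gap in physical memory: start a new run.
            runs.append((addr, 1))
    return runs


PAGE_2M = 2 * 1024 * 1024  # assumed hugepage size, for illustration only

# Hypothetical addresses: four back-to-back pages vs. four pages with gaps.
contiguous = [0x100000000 + i * PAGE_2M for i in range(4)]
fragmented = [0x100000000 + i * 2 * PAGE_2M for i in range(4)]

print(contiguous_runs(contiguous, PAGE_2M))       # one run of 4 pages
print(len(contiguous_runs(fragmented, PAGE_2M)))  # 4 separate runs
```

On a real host the physical addresses would have to come from the system (for example via /proc/self/pagemap, which needs root); that part is left out here.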
The problem was introduced by this commit:
https://git.dpdk.org/dpdk/commit/?id=fec28ca0e3a93143829f3b41a28a8da933f28499
LOG:
dpdk # ./x86_64-native-linuxapp-gcc/app/dpdk-testpmd -c 0xFC0 --iova-mode pa
--legacy-mem -a 03:00.0 -a 03:00.1 -m 8192,0 -- -a -i --forward-mode=fwd
--rxq=4 --txq=4 --total-num-mbufs=1000000
EAL: Detected CPU lcores: 72
EAL: Detected NUMA nodes: 2
EAL: Detected static linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: Probe PCI driver: mlx5_pci (15b3:1017) device: 0000:03:00.0 (socket 0)
mlx5_net: Default miss action is not supported.
EAL: Probe PCI driver: mlx5_pci (15b3:1017) device: 0000:03:00.1 (socket 0)
mlx5_net: Default miss action is not supported.
TELEMETRY: No legacy callbacks, legacy socket not created
Auto-start selected
Interactive-mode selected
Invalid fwd packet forwarding mode
testpmd: create a new mbuf pool <mb_pool_0>: n=1000000, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
Configuring Port 0 (socket 0)
Port 0: 28:DE:E5:AB:9D:CA
Configuring Port 1 (socket 0)
Port 1: 28:DE:E5:AB:9D:CB
Checking link statuses...
Done
Start automatic packet forwarding
io packet forwarding - ports=2 - cores=1 - streams=8 - NUMA support enabled, MP allocation mode: native
Logical Core 7 (socket 0) forwards packets on 8 streams:
RX P=0/Q=0 (socket 0) -> TX P=1/Q=0 (socket 0) peer=02:00:00:00:00:01
RX P=1/Q=0 (socket 0) -> TX P=0/Q=0 (socket 0) peer=02:00:00:00:00:00
RX P=0/Q=1 (socket 0) -> TX P=1/Q=1 (socket 0) peer=02:00:00:00:00:01
RX P=1/Q=1 (socket 0) -> TX P=0/Q=1 (socket 0) peer=02:00:00:00:00:00
RX P=0/Q=2 (socket 0) -> TX P=1/Q=2 (socket 0) peer=02:00:00:00:00:01
RX P=1/Q=2 (socket 0) -> TX P=0/Q=2 (socket 0) peer=02:00:00:00:00:00
RX P=0/Q=3 (socket 0) -> TX P=1/Q=3 (socket 0) peer=02:00:00:00:00:01
RX P=1/Q=3 (socket 0) -> TX P=0/Q=3 (socket 0) peer=02:00:00:00:00:00
io packet forwarding packets/burst=32
nb forwarding cores=1 - nb forwarding ports=2
port 0: RX queue number: 4 Tx queue number: 4
Rx offloads=0x0 Tx offloads=0x10000
RX queue: 0
RX desc=4096 - RX free threshold=64
RX threshold registers: pthresh=0 hthresh=0 wthresh=0
RX Offloads=0x0
TX queue: 0
TX desc=4096 - TX free threshold=0
TX threshold registers: pthresh=0 hthresh=0 wthresh=0
TX offloads=0x10000 - TX RS bit threshold=0
port 1: RX queue number: 4 Tx queue number: 4
Rx offloads=0x0 Tx offloads=0x10000
RX queue: 0
RX desc=4096 - RX free threshold=64
RX threshold registers: pthresh=0 hthresh=0 wthresh=0
RX Offloads=0x0
TX queue: 0
TX desc=4096 - TX free threshold=0
TX threshold registers: pthresh=0 hthresh=0 wthresh=0
TX offloads=0x10000 - TX RS bit threshold=0
testpmd> mlx5_net: Cannot change Tx QP state to INIT Invalid argument
mlx5_net: Cannot change Tx QP state to INIT Invalid argument
mlx5_net: Cannot change Tx QP state to INIT Invalid argument
mlx5_net: Cannot change Tx QP state to INIT Invalid argument
testpmd> mlx5_net: Cannot change Tx QP state to INIT Invalid argument
mlx5_net: Cannot change Tx QP state to INIT Invalid argument
quimlx5_net: Cannot change Tx QP state to INIT Invalid argument
mlx5_net: Cannot change Tx QP state to INIT Invalid argument
The following files are also created:
/var/log/dpdk_mlx5_port_0_txq_0_index_0_1883249505
/var/log/dpdk_mlx5_port_0_txq_0_index_0_2291454530
/var/log/dpdk_mlx5_port_0_txq_0_index_0_2880295119
/var/log/dpdk_mlx5_port_1_txq_0_index_0_2198716197
/var/log/dpdk_mlx5_port_1_txq_0_index_0_2498129310
/var/log/dpdk_mlx5_port_1_txq_0_index_0_3046021743
Unexpected CQE error syndrome 0x04 CQN = 256 SQN = 6612 wqe_counter = 0 wq_ci = 1 cq_ci = 0
MLX5 Error CQ: at [0x7f6edca57000], len=16384
00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
00000020: 00 00 00 01 73 65 65 6E 00 00 00 00 00 00 00 00 | ....seen........
00000030: 00 00 00 00 9D 00 53 04 29 00 19 D4 00 00 02 D2 | ......S.).......
00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 F0 | ................
00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
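For reference, the first 64 bytes of the dump above can be decoded by hand. This is a minimal sketch that assumes the error CQE layout I recall from `struct mlx5_err_cqe` in DPDK's mlx5_prm.h (vendor syndrome at byte 54, syndrome at byte 55, then a big-endian 32-bit opcode/QPN word, a 16-bit WQE counter, signature and op_own); the syndrome names are taken from memory of the Linux mlx5 headers, so treat both as assumptions. `decode_err_cqe` is a hypothetical helper, not part of DPDK.

```python
import struct

# Offsets 0x00..0x3F of the "MLX5 Error CQ" dump, copied from the log.
CQE_HEX = (
    "00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "
    "00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "
    "00 00 00 01 73 65 65 6E 00 00 00 00 00 00 00 00 "
    "00 00 00 00 9D 00 53 04 29 00 19 D4 00 00 02 D2"
)

# Assumed subset of syndrome codes (from memory of the Linux mlx5 headers).
SYNDROMES = {
    0x01: "local length error",
    0x02: "local QP operation error",
    0x04: "local protection error",
    0x05: "WR flushed error",
}


def decode_err_cqe(raw):
    """Decode the trailing 10 bytes of a 64-byte error CQE (assumed layout)."""
    vendor, syndrome, opcode_qpn, wqe_counter, _sig, op_own = struct.unpack_from(
        ">BBIHBB", raw, 54
    )
    return {
        "vendor_err_synd": vendor,
        "syndrome": syndrome,
        "syndrome_name": SYNDROMES.get(syndrome, "unknown"),
        "qpn": opcode_qpn & 0xFFFFFF,  # low 24 bits hold the queue number
        "wqe_counter": wqe_counter,
        "opcode": op_own >> 4,         # high nibble of op_own
    }


cqe = decode_err_cqe(bytes.fromhex(CQE_HEX))
print(cqe)
```

Under that layout the decoded fields agree with the "Unexpected CQE error syndrome 0x04 ... SQN = 6612 wqe_counter = 0" line, and 0x04 would read as a local protection error, with a vendor syndrome of 0x53.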