* Mellanox - Unexpected CEQ error, rx stops receiving packets
@ 2024-06-28 9:22 Ken Andrews
2024-07-22 20:34 ` Alexander Kozyrev
0 siblings, 1 reply; 2+ messages in thread
From: Ken Andrews @ 2024-06-28 9:22 UTC (permalink / raw)
To: users
Hi,
I'm seeing an issue previously mentioned on this list in 2022, where my Mellanox NIC logs an "Unexpected CQE error syndrome" message. Once this condition is hit,
rxq->err_state in the mlx5 PMD is never cleared, and the Rx path just loops without receiving any further packets.
It's not clear what causes the initial CQE error, as it can take upwards of an hour to occur.
The full log entry is:
Unexpected CQE error syndrome 0x04 CQN = 256 SQN = 4679 wqe_counter = 2149 wq_ci = 3294 cq_ci = 42609
MLX5 Error CQ: at [0x292d26000], len=16384
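For triage across long runs, the fields of a log line like the one above can be pulled out with a small script. This is only a sketch: the function name `parse_cqe_error` is hypothetical, and it assumes the syndrome is printed in hex and the remaining counters in decimal, as they appear in this log entry.

```python
import re

def parse_cqe_error(line):
    """Extract the syndrome and counters from an mlx5 CQE error log line.

    Returns a dict of integer fields, or None if the line is not a
    CQE error report. The field layout is assumed from the sample line.
    """
    m = re.search(r"CQE error syndrome (0x[0-9a-fA-F]+)", line)
    if m is None:
        return None
    fields = {"syndrome": int(m.group(1), 16)}
    # Remaining fields are printed as "name = value" pairs.
    for name, value in re.findall(r"(\w+) = (\d+)", line):
        fields[name] = int(value)
    return fields

line = ("Unexpected CQE error syndrome 0x04 CQN = 256 SQN = 4679 "
        "wqe_counter = 2149 wq_ci = 3294 cq_ci = 42609")
fields = parse_cqe_error(line)
print(fields)
```

Collecting these values over several occurrences may show whether the failure always hits the same queue or the same syndrome.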
The NIC is: NVIDIA ConnectX-7 HHHL Adapter card, 400GbE / NDR IB (default mode), Single-port OSFP, PCIe 5.0 x16, Crypto Enabled, Secure Boot Enabled
Part number: MCX75310AAC-NEA_Ax
Firmware: 28.36.1010
OFED Version: 24.04-0.6.6.0
DPDK Version: 23.11.0
This issue was previously mentioned in this post: https://mails.dpdk.org/archives/users/2022-October/006779.html
Can anyone please help shed some light on this?
Thanks,
Ken Andrews
R&D Department
t: +44 1506 671416
e: ken.andrews@calnexsol.com
w: calnexsol.com
Calnex Solutions
Oracle Campus
Linlithgow
EH49 7LR
United Kingdom
Calnex Solutions plc is registered in Scotland. Registration number: SC299625. Registered office: Oracle Campus, Linlithgow, Scotland, EH49 7LR, United Kingdom.
* Re: Mellanox - Unexpected CEQ error, rx stops receiving packets
2024-06-28 9:22 Mellanox - Unexpected CEQ error, rx stops receiving packets Ken Andrews
@ 2024-07-22 20:34 ` Alexander Kozyrev
0 siblings, 0 replies; 2+ messages in thread
From: Alexander Kozyrev @ 2024-07-22 20:34 UTC (permalink / raw)
To: Ken Andrews, users
Hi Ken, here is what error syndrome 0x04 means:
0x4: Local_Protection_Error
"This event is generated when a user attempts to access an address outside of the registered memory region.
For example, this may happen if the Lkey does not match the address in the WR."
It looks like a wrong buffer was passed to the NIC to receive a packet into.
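The condition quoted above can be pictured as a range check: a buffer is acceptable only if it lies entirely inside a registered memory region. The following is a toy model of that idea, not the NIC's or the mlx5 PMD's actual logic. The helper `covered_by_region`, the `SYNDROME_NAMES` table (which holds only the one code named in this thread), and the addresses are all hypothetical illustrations.

```python
# Only the syndrome discussed in this thread; other codes are omitted on purpose.
SYNDROME_NAMES = {0x04: "Local_Protection_Error"}

def covered_by_region(addr, length, regions):
    """Return True if [addr, addr + length) lies fully inside one region.

    `regions` is a list of (start, size) tuples standing in for
    registered memory regions.
    """
    end = addr + length
    return any(start <= addr and end <= start + size
               for start, size in regions)

# Hypothetical registered region: 64 KiB starting at 0x100000.
regions = [(0x100000, 0x10000)]

print(covered_by_region(0x100800, 2048, regions))  # buffer fully inside
print(covered_by_region(0x10F800, 4096, regions))  # buffer runs past the end
```

A buffer that starts inside a region but extends past its end fails the check, which is one way a buffer that looks valid could still trigger this kind of error.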
Could you please share more details on your test case? What is the traffic pattern? What is the Rx/Tx queue configuration? How are the mbufs set up?
Regards,
Alex
end of thread, other threads:[~2024-07-22 20:34 UTC | newest]