* [Bug 996] DPDK:20.11.1: net/ena crash while fetching xstats
From: bugzilla @ 2022-04-18 8:50 UTC (permalink / raw)
To: dev
https://bugs.dpdk.org/show_bug.cgi?id=996
Bug ID: 996
Summary: DPDK:20.11.1: net/ena crash while fetching xstats
Product: DPDK
Version: 20.11
Hardware: x86
OS: Linux
Status: UNCONFIRMED
Severity: major
Priority: Normal
Component: ethdev
Assignee: dev@dpdk.org
Reporter: amohakud@paloaltonetworks.com
Target Milestone: ---
I am hitting an issue in the net/ena driver while fetching extended stats (xstats):
DPDK segfaults with the backtrace below.
DPDK Version: 20.11.1
ENA version: 2.2.1
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/opt/dpfs/usr/local/bin/brdagent'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 __memmove_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:232
232 VMOVU %VEC(0), (%rdi)
[Current thread is 1 (Thread 0x7fffed93a400 (LWP 5060))]
Thread 1 (Thread 0x7fffed93a400 (LWP 5060)):
#0 __memmove_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:232
#1 0x00007ffff3c246df in ena_com_handle_admin_completion () from ../lib64/../../lib64/libdpdk.so.20
#2 0x00007ffff3c1e7f5 in ena_interrupt_handler_rte () from ../lib64/../../lib64/libdpdk.so.20
#3 0x00007ffff3519902 in eal_intr_thread_main () from /../lib64/../../lib64/libdpdk.so.20
#4 0x00007ffff510714a in start_thread (arg=<optimized out>) at pthread_create.c:479
#5 0x00007ffff561ff23 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Background:
This worked fine with DPDK 19.11.3: no crash was observed with that
version. After upgrading to DPDK 20.11.1, DPDK crashes with the above
trace, so this looks like a DPDK issue to me.
I can see multiple fixes/patches in the net/ena area, but I am not able to
identify which patch exactly would fix this issue.
For example:
http://git.dpdk.org/dpdk/diff/?h=releases&id=aab58857330bb4bd03f6699bf1ee716f72993774
https://inbox.dpdk.org/dev/20210430125725.28796-6-mk@semihalf.com/T/#me99457c706718bb236d1fd8006ee7a0319ce76fc
Could you please help here and let me know which patch could fix this issue?
--
You are receiving this mail because:
You are the assignee for the bug.
* [Bug 996] DPDK:20.11.1: net/ena crash while fetching xstats
From: bugzilla @ 2022-04-19 20:35 UTC (permalink / raw)
To: dev
https://bugs.dpdk.org/show_bug.cgi?id=996
Michal Krawczyk (mk@semihalf.com) changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |RESOLVED
Resolution|--- |WONTFIX
CC| |mk@semihalf.com
--- Comment #1 from Michal Krawczyk (mk@semihalf.com) ---
Hi Amiya,
this is caused by the lack of proper multi-process (MP) support in the older ENA PMD (it
supports MP properly starting from DPDK v22.03).
In DPDK v20.11 the xstats API uses the ENA admin queue, and the admin queue cannot be
used safely from the secondary process.
These commits improved the MP support between v20.11 and v22.03:
net/ena: make ethdev references multi-process safe
aab58857330bb4bd03f6699bf1ee716f72993774
net/ena: disable ops not supported by secondary process
39ecdd3dfa15d5ac591ce8d77d362480bff32355
net/ena: proxy AQ calls to primary process (this is the critical patch)
e3595539e0e03f0dbb81904f8edaaef0447a4f62
net/ena: enable stats for multi-process mode
3aa3fa851f58873457bdc5c387d0e5956f812322
net/ena/base: make IO memzone unique per port
850e1bb1c72b3d1163b2857ab7a02af11ba29c40
Thanks,
Michal
* [Bug 996] DPDK:20.11.1: net/ena crash while fetching xstats
From: bugzilla @ 2022-04-27 11:04 UTC (permalink / raw)
To: dev
https://bugs.dpdk.org/show_bug.cgi?id=996
Michal Krawczyk (mk@semihalf.com) changed:
What |Removed |Added
----------------------------------------------------------------------------
Resolution|WONTFIX |---
Status|RESOLVED |UNCONFIRMED
--- Comment #3 from Michal Krawczyk (mk@semihalf.com) ---
Hey Amiya,
sorry for the late reply, I was OOO for one week.
Thank you for providing us with more details.
If you aren't calling any API that needs the ENA admin queue from the
secondary process, the situation you're seeing shouldn't happen.
I just ran a simple application on DPDK v20.11.1 in MP mode: the primary
process fetches the xstats, while the secondary process simply forwards
packets. The application does not crash in my case.
From what I understand, the crash happens because:
1. The ENA admin queue does not use shared memory.
2. The secondary process sends the request and saves it in the secondary
process's memory.
3. The primary process receives the interrupt and executes the completion
handler.
4. The completion handler cannot find the relevant request (as it is in the
secondary process's memory) and the app crashes.
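The four steps above can be illustrated with a small toy model. It is only a sketch: every name in it (model_process, model_submit, model_complete, MODEL_QUEUE_DEPTH) is hypothetical, and it does not reproduce the ena_com internals. Each "process" keeps its own private table of in-flight requests, so a completion handled by the primary cannot see a request that the secondary registered. Unlike the real driver, the model checks for the missing request and returns an error instead of copying into an invalid buffer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_QUEUE_DEPTH 4 /* hypothetical number of in-flight commands */

/* Hypothetical bookkeeping for one in-flight admin command. */
struct model_ctx {
    bool   occupied;  /* a command using this ID is in flight */
    void  *resp_buf;  /* where the completion data must be copied */
    size_t resp_len;  /* capacity of resp_buf in bytes */
};

/* One table per process: private memory, NOT shared between processes. */
struct model_process {
    struct model_ctx ctx[MODEL_QUEUE_DEPTH];
};

/* Step 2: the submitting process records the request in its own table. */
static int model_submit(struct model_process *p, uint16_t cmd_id,
                        void *resp_buf, size_t resp_len)
{
    if (cmd_id >= MODEL_QUEUE_DEPTH || p->ctx[cmd_id].occupied)
        return -1;
    p->ctx[cmd_id] = (struct model_ctx){ true, resp_buf, resp_len };
    return 0;
}

/* Steps 3-4: the process that owns the interrupt handles the completion
 * and looks the request up in ITS table. */
static int model_complete(struct model_process *p, uint16_t cmd_id,
                          const void *data, size_t len)
{
    if (cmd_id >= MODEL_QUEUE_DEPTH)
        return -1;

    struct model_ctx *c = &p->ctx[cmd_id];
    if (!c->occupied || c->resp_buf == NULL)
        return -1; /* no matching request in this process */

    memcpy(c->resp_buf, data, len < c->resp_len ? len : c->resp_len);
    c->occupied = false;
    return 0;
}
```

With two zero-initialized model_process instances standing in for the primary and the secondary, a request submitted on the secondary and completed on the primary returns -1, while completing it on the secondary copies the data as expected.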
Please double-check that:
1. The xstats aren't being fetched from the secondary process.
2. You aren't calling any of the APIs below from the secondary process, as they also
use the ENA admin queue:
- rte_eth_dev_set_mtu()
- rte_eth_dev_rss_reta_update()
- rte_eth_dev_rss_reta_query()
Point 1 is much more likely, as you've described this as a regression in
v20.11, and indeed the xstats were extended after the v19.11 release.
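One way to audit an application against this checklist is a small policy helper that refuses admin-queue operations outside the primary process. The sketch below is an assumption about how such a check could look, not part of DPDK: the enums and functions (app_proc_role, app_op, app_op_uses_admin_queue, app_op_allowed) are hypothetical application-side names. In a real application the role would presumably come from comparing rte_eal_process_type() against RTE_PROC_PRIMARY.

```c
#include <stdbool.h>

/* Hypothetical application-side names; none of these are DPDK symbols. */
enum app_proc_role { APP_PROC_PRIMARY, APP_PROC_SECONDARY };

enum app_op {
    APP_OP_XSTATS_GET,   /* fetching extended stats */
    APP_OP_SET_MTU,      /* rte_eth_dev_set_mtu() */
    APP_OP_RETA_UPDATE,  /* rte_eth_dev_rss_reta_update() */
    APP_OP_RETA_QUERY,   /* rte_eth_dev_rss_reta_query() */
    APP_OP_PKT_FORWARD   /* datapath only, no admin queue involved */
};

/* Operations the thread lists as going through the ENA admin queue
 * on DPDK v20.11. */
static bool app_op_uses_admin_queue(enum app_op op)
{
    switch (op) {
    case APP_OP_XSTATS_GET:
    case APP_OP_SET_MTU:
    case APP_OP_RETA_UPDATE:
    case APP_OP_RETA_QUERY:
        return true;
    default:
        return false;
    }
}

/* The primary may do anything; a secondary may only run operations
 * that stay away from the admin queue. */
static bool app_op_allowed(enum app_proc_role role, enum app_op op)
{
    return role == APP_PROC_PRIMARY || !app_op_uses_admin_queue(op);
}
```

Calling app_op_allowed() before each control-path call, and logging whenever it returns false, would show quickly whether the secondary process is the one fetching xstats.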
If neither of the above applies, any other information that could
get us closer to the root of the issue would be helpful (we can't reproduce this
on our side).
Thanks,
Michal
* [Bug 996] DPDK:20.11.1: net/ena crash while fetching xstats
From: bugzilla @ 2023-08-20 11:03 UTC (permalink / raw)
To: dev
https://bugs.dpdk.org/show_bug.cgi?id=996
shaibran@amazon.com changed:
What |Removed |Added
----------------------------------------------------------------------------
Resolution|--- |FIXED
Status|UNCONFIRMED |RESOLVED
CC| |shaibran@amazon.com
--- Comment #5 from shaibran@amazon.com ---
ENA PMD 2.6.0 and later (available since DPDK 22.03) can be used safely from
the secondary process to retrieve xstats.
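An application that must run on several DPDK releases could gate secondary-process xstats on this version boundary. The helper below is a minimal sketch under that assumption: app_version and secondary_xstats_supported are hypothetical names, and the year and month would presumably be taken from the RTE_VER_YEAR and RTE_VER_MONTH macros in rte_version.h. It does not account for any stable-branch backports.

```c
#include <stdbool.h>

/* Hypothetical holder for a DPDK release such as 20.11 or 22.03. */
struct app_version {
    unsigned year;
    unsigned month;
};

/* True when the release is 22.03 or newer, the first release the thread
 * names as safe for fetching xstats from a secondary process. */
static bool secondary_xstats_supported(struct app_version dpdk)
{
    return dpdk.year > 22 || (dpdk.year == 22 && dpdk.month >= 3);
}
```

For the versions mentioned in this thread, 20.11 yields false and 22.03 yields true.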