DPDK patches and discussions
From: "Michał Krawczyk" <mk@semihalf.com>
To: Amiya Mohakud <amohakud@paloaltonetworks.com>
Cc: dev <dev@dpdk.org>, Sachin Kanoje <skanoje@paloaltonetworks.com>,
	 Megha Punjani <mpunjani@paloaltonetworks.com>,
	Sharad Saha <ssaha@paloaltonetworks.com>,
	 Eswar Sadaram <esadaram@paloaltonetworks.com>,
	"Brandes, Shai" <shaibran@amazon.com>,
	ena-dev <ena-dev@semihalf.com>
Subject: Re: DPDK:20.11.1: net/ena crash while fetching xstats
Date: Tue, 19 Apr 2022 14:10:23 +0200
Message-ID: <CAJMMOfP6yzCWVfeY4ZwYSx12QgHXd3PkdtcsF=qMksJUn68Czw@mail.gmail.com>
In-Reply-To: <CAC6kMfMhL9mWhDL1EHKCA=SUSaemEvjTOLMO5PYAFqT01oxD5A@mail.gmail.com>

On Mon, 18 Apr 2022 at 17:19, Amiya Mohakud
<amohakud@paloaltonetworks.com> wrote:
>
> + Megha, Sharad and Eswar.
>
> On Mon, Apr 18, 2022 at 2:03 PM Amiya Mohakud <amohakud@paloaltonetworks.com> wrote:
>>
>> Hi Michal/DPDK-Experts,
>>
>> I am facing an issue in the net/ena driver while fetching extended stats (xstats). DPDK segfaults with the backtrace below.
>>
>> DPDK Version: 20.11.1
>> ENA version: 2.2.1
>>
>>
>> Using host libthread_db library "/lib64/libthread_db.so.1".
>>
>> Core was generated by `/opt/dpfs/usr/local/bin/brdagent'.
>>
>> Program terminated with signal SIGSEGV, Segmentation fault.
>>
>> #0  __memmove_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:232
>>
>> 232             VMOVU   %VEC(0), (%rdi)
>>
>> [Current thread is 1 (Thread 0x7fffed93a400 (LWP 5060))]
>>
>>
>> Thread 1 (Thread 0x7fffed93a400 (LWP 5060)):
>>
>> #0  __memmove_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:232
>>
>> #1  0x00007ffff3c246df in ena_com_handle_admin_completion () from ../lib64/../../lib64/libdpdk.so.20
>>
>> #2  0x00007ffff3c1e7f5 in ena_interrupt_handler_rte () from ../lib64/../../lib64/libdpdk.so.20
>>
>> #3  0x00007ffff3519902 in eal_intr_thread_main () from /../lib64/../../lib64/libdpdk.so.20
>>
>> #4  0x00007ffff510714a in start_thread (arg=<optimized out>) at pthread_create.c:479
>>
>> #5  0x00007ffff561ff23 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
>>
>>
>>
>>
>> Background:
>>
>> This used to work fine with DPDK 19.11.3: no crash was observed with that version. After upgrading to DPDK 20.11.1, DPDK crashes with the above trace.
>> It looks to me like a DPDK issue.
>> I can see multiple fixes/patches in the net/ena area, but I am not able to identify which patch would fix this issue.
>>
>> For example: http://git.dpdk.org/dpdk/diff/?h=releases&id=aab58857330bb4bd03f6699bf1ee716f72993774
>> https://inbox.dpdk.org/dev/20210430125725.28796-6-mk@semihalf.com/T/#me99457c706718bb236d1fd8006ee7a0319ce76fc
>>
>>
>> Could you please help here and let me know which patch would fix this issue?
>>

+ Shai Brandes and ena-dev

Hi Amiya,

Thanks for reaching out. Could you please provide us with more
details regarding the reproduction? I cannot reproduce this on my
setup with DPDK v20.11.1 when using testpmd and probing for the xstats.

=======================================================================
[ec2-user@<removed> dpdk]$ sudo ./build/app/dpdk-testpmd -- -i
EAL: Detected 8 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Detected static linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: No available hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
EAL:   Invalid NUMA socket, default to 0
EAL:   Invalid NUMA socket, default to 0
EAL: Probe PCI driver: net_ena (1d0f:ec20) device: 0000:00:06.0 (socket 0)
EAL: No legacy callbacks, legacy socket not created
Interactive-mode selected
ena_mtu_set(): Set MTU: 1500

testpmd: create a new mbuf pool <mb_pool_0>: n=203456, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc

Warning! port-topology=paired and odd forward ports number, the last
port will pair with itself.

Configuring Port 0 (socket 0)
Port 0: <removed>
Checking link statuses...
Done
Error during enabling promiscuous mode for port 0: Operation not
supported - ignore
testpmd> start
io packet forwarding - ports=1 - cores=1 - streams=1 - NUMA support
enabled, MP allocation mode: native
Logical Core 1 (socket 0) forwards packets on 1 streams:
  RX P=0/Q=0 (socket 0) -> TX P=0/Q=0 (socket 0) peer=02:00:00:00:00:00

  io packet forwarding packets/burst=32
  nb forwarding cores=1 - nb forwarding ports=1
  port 0: RX queue number: 1 Tx queue number: 1
    Rx offloads=0x0 Tx offloads=0x0
    RX queue: 0
      RX desc=0 - RX free threshold=0
      RX threshold registers: pthresh=0 hthresh=0  wthresh=0
      RX Offloads=0x0
    TX queue: 0
      TX desc=0 - TX free threshold=0
      TX threshold registers: pthresh=0 hthresh=0  wthresh=0
      TX offloads=0x0 - TX RS bit threshold=0
testpmd> show port xstats 0
###### NIC extended statistics for port 0
rx_good_packets: 1
tx_good_packets: 1
rx_good_bytes: 42
tx_good_bytes: 42
rx_missed_errors: 0
rx_errors: 0
tx_errors: 0
rx_mbuf_allocation_errors: 0
rx_q0_packets: 1
rx_q0_bytes: 42
rx_q0_errors: 0
tx_q0_packets: 1
tx_q0_bytes: 42
wd_expired: 0
dev_start: 1
dev_stop: 0
tx_drops: 0
bw_in_allowance_exceeded: 0
bw_out_allowance_exceeded: 0
pps_allowance_exceeded: 0
conntrack_allowance_exceeded: 0
linklocal_allowance_exceeded: 0
rx_q0_cnt: 1
rx_q0_bytes: 42
rx_q0_refill_partial: 0
rx_q0_bad_csum: 0
rx_q0_mbuf_alloc_fail: 0
rx_q0_bad_desc_num: 0
rx_q0_bad_req_id: 0
tx_q0_cnt: 1
tx_q0_bytes: 42
tx_q0_prepare_ctx_err: 0
tx_q0_linearize: 0
tx_q0_linearize_failed: 0
tx_q0_tx_poll: 1
tx_q0_doorbells: 1
tx_q0_bad_req_id: 0
tx_q0_available_desc: 1022
=======================================================================

I think you may be seeing the regression because of the new xstats (the
ENI limiters), which were added after DPDK v19.11 (mainline commit
45718ada5fa12619db4821646ba964a2df365c68), but I'm not sure yet why it
shows up on your setup.
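
To narrow that down, it may help to fetch the xstats the same way testpmd
does, straight through the rte_ethdev API, and check whether the number of
names and the number of values agree. Below is a minimal, illustrative
sketch of that standard call sequence on DPDK 20.11 (the dump_xstats()
helper name is made up; it is not part of your application or the PMD):

=======================================================================
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>

#include <rte_ethdev.h>

static void
dump_xstats(uint16_t port_id)
{
        struct rte_eth_xstat_name *names = NULL;
        struct rte_eth_xstat *values = NULL;
        int nb, ret, i;

        /* A first call with NULL/0 only reports how many xstats the port
         * exposes. */
        nb = rte_eth_xstats_get_names(port_id, NULL, 0);
        if (nb < 0) {
                printf("rte_eth_xstats_get_names() failed: %d\n", nb);
                return;
        }

        names = calloc(nb, sizeof(*names));
        values = calloc(nb, sizeof(*values));
        if (names == NULL || values == NULL)
                goto out;

        if (rte_eth_xstats_get_names(port_id, names, nb) != nb)
                goto out;

        /* Fetching the values is the call that, for the ENA device-side
         * counters, ends up going over the admin queue. */
        ret = rte_eth_xstats_get(port_id, values, nb);
        if (ret < 0 || ret > nb) {
                printf("rte_eth_xstats_get() returned %d, expected at most %d\n",
                       ret, nb);
                goto out;
        }

        for (i = 0; i < ret; i++)
                printf("%s: %" PRIu64 "\n",
                       names[values[i].id].name, values[i].value);

out:
        free(names);
        free(values);
}
=======================================================================

Calling such a helper right after rte_eth_dev_start() on your setup, and
noting whether the very first value fetch already triggers the fault,
would already tell us a lot.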

In particular, I've got a few questions below.

1. Is the application you're using single-process or multi-process? If it
is multi-process, from which process are you probing for the xstats?
(See the sketch after these questions.)
2. Have you tried running the latest DPDK v20.11 LTS release?
3. What kernel module are you using (igb_uio/vfio-pci)?
4. On which AWS instance type was it reproduced?
5. Does the segfault happen the first time you query the xstats?
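
Regarding question 1, what I mean is something along the lines of the
sketch below (fetch_xstats_count() is a hypothetical helper, only there to
illustrate the check): logging rte_eal_process_type() at the spot where
the xstats are queried will tell us whether a secondary process is the one
triggering the ENA admin queue traffic.

=======================================================================
#include <rte_eal.h>
#include <rte_ethdev.h>
#include <rte_log.h>

/* Hypothetical helper, not taken from any existing code base: report
 * which EAL process type performs the xstats query before fetching the
 * counters. */
static int
fetch_xstats_count(uint16_t port_id)
{
        enum rte_proc_type_t ptype = rte_eal_process_type();

        RTE_LOG(INFO, USER1, "xstats query on port %u from the %s process\n",
                port_id,
                ptype == RTE_PROC_PRIMARY ? "primary" : "secondary");

        /* Only asks for the required array size here; the values would be
         * fetched in a follow-up call. */
        return rte_eth_xstats_get(port_id, NULL, 0);
}
=======================================================================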

If you've got any other information that could be useful, please share
it; it will help us resolve the cause of the issue.

Thanks,
Michal

>>
>> Regards
>> Amiya

Thread overview: 8+ messages
2022-04-18  8:33 Amiya Mohakud
2022-04-18 15:18 ` Amiya Mohakud
2022-04-19 12:10   ` Michał Krawczyk [this message]
2022-04-19 15:01     ` Stephen Hemminger
2022-04-19 20:27       ` Michał Krawczyk
2022-04-19 22:25         ` Stephen Hemminger
2022-04-19 23:09         ` Stephen Hemminger
2022-04-20  8:37           ` Amiya Mohakud
