DPDK patches and discussions
From: Stephen Hemminger <stephen@networkplumber.org>
To: "Michał Krawczyk" <mk@semihalf.com>
Cc: Amiya Mohakud <amohakud@paloaltonetworks.com>, dev <dev@dpdk.org>,
	Sachin Kanoje <skanoje@paloaltonetworks.com>,
	Megha Punjani <mpunjani@paloaltonetworks.com>,
	Sharad Saha <ssaha@paloaltonetworks.com>,
	Eswar Sadaram <esadaram@paloaltonetworks.com>,
	"Brandes, Shai" <shaibran@amazon.com>,
	ena-dev <ena-dev@semihalf.com>
Subject: Re: DPDK:20.11.1: net/ena crash while fetching xstats
Date: Tue, 19 Apr 2022 08:01:50 -0700	[thread overview]
Message-ID: <20220419080150.2511dee2@hermes.local> (raw)
In-Reply-To: <CAJMMOfP6yzCWVfeY4ZwYSx12QgHXd3PkdtcsF=qMksJUn68Czw@mail.gmail.com>

On Tue, 19 Apr 2022 14:10:23 +0200
Michał Krawczyk <mk@semihalf.com> wrote:

> On Mon, 18 Apr 2022 at 17:19, Amiya Mohakud
> <amohakud@paloaltonetworks.com> wrote:
> >
> > + Megha, Sharad and Eswar.
> >
> > On Mon, Apr 18, 2022 at 2:03 PM Amiya Mohakud <amohakud@paloaltonetworks.com> wrote:  
> >>
> >> Hi Michal/DPDK-Experts,
> >>
> >> I am facing an issue in the net/ena driver while fetching extended stats (xstats). DPDK segfaults with the backtrace below.
> >>
> >> DPDK Version: 20.11.1
> >> ENA version: 2.2.1
> >>
> >>
> >> Using host libthread_db library "/lib64/libthread_db.so.1".
> >>
> >> Core was generated by `/opt/dpfs/usr/local/bin/brdagent'.
> >>
> >> Program terminated with signal SIGSEGV, Segmentation fault.
> >>
> >> #0  __memmove_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:232
> >>
> >> 232             VMOVU   %VEC(0), (%rdi)
> >>
> >> [Current thread is 1 (Thread 0x7fffed93a400 (LWP 5060))]
> >>
> >>
> >> Thread 1 (Thread 0x7fffed93a400 (LWP 5060)):
> >>
> >> #0  __memmove_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:232
> >>
> >> #1  0x00007ffff3c246df in ena_com_handle_admin_completion () from ../lib64/../../lib64/libdpdk.so.20
> >>
> >> #2  0x00007ffff3c1e7f5 in ena_interrupt_handler_rte () from ../lib64/../../lib64/libdpdk.so.20
> >>
> >> #3  0x00007ffff3519902 in eal_intr_thread_main () from /../lib64/../../lib64/libdpdk.so.20
> >>
> >> #4  0x00007ffff510714a in start_thread (arg=<optimized out>) at pthread_create.c:479
> >>
> >> #5  0x00007ffff561ff23 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
> >>
> >>
> >>
> >>
> >> Background:
> >>
> >> This used to work fine with DPDK 19.11.3: no crash was observed with that version, but after upgrading to DPDK 20.11.1, DPDK crashes with the above trace.
> >> It looks to me like a DPDK issue.
> >> I can see multiple fixes/patches in the net/ena area, but I am not able to identify which patch would fix this issue.
> >>
> >> For example: http://git.dpdk.org/dpdk/diff/?h=releases&id=aab58857330bb4bd03f6699bf1ee716f72993774
> >> https://inbox.dpdk.org/dev/20210430125725.28796-6-mk@semihalf.com/T/#me99457c706718bb236d1fd8006ee7a0319ce76fc
> >>
> >>
> >> Could you please help here and let me know which patch could fix this issue?
> >>  
> 
> + Shai Brandes and ena-dev
> 
> Hi Amiya,
> 
> Thanks for reaching out. Could you please provide us with more
> details regarding the reproduction? I cannot reproduce this on my
> setup with DPDK v20.11.1 when using testpmd and probing for the xstats.
> 
> =======================================================================
> [ec2-user@<removed> dpdk]$ sudo ./build/app/dpdk-testpmd -- -i
> EAL: Detected 8 lcore(s)
> EAL: Detected 1 NUMA nodes
> EAL: Detected static linkage of DPDK
> EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
> EAL: Selected IOVA mode 'PA'
> EAL: No available hugepages reported in hugepages-1048576kB
> EAL: Probing VFIO support...
> EAL:   Invalid NUMA socket, default to 0
> EAL:   Invalid NUMA socket, default to 0
> EAL: Probe PCI driver: net_ena (1d0f:ec20) device: 0000:00:06.0 (socket 0)
> EAL: No legacy callbacks, legacy socket not created
> Interactive-mode selected
> ena_mtu_set(): Set MTU: 1500
> 
> testpmd: create a new mbuf pool <mb_pool_0>: n=203456, size=2176, socket=0
> testpmd: preferred mempool ops selected: ring_mp_mc
> 
> Warning! port-topology=paired and odd forward ports number, the last
> port will pair with itself.
> 
> Configuring Port 0 (socket 0)
> Port 0: <removed>
> Checking link statuses...
> Done
> Error during enabling promiscuous mode for port 0: Operation not
> supported - ignore
> testpmd> start  
> io packet forwarding - ports=1 - cores=1 - streams=1 - NUMA support
> enabled, MP allocation mode: native
> Logical Core 1 (socket 0) forwards packets on 1 streams:
>   RX P=0/Q=0 (socket 0) -> TX P=0/Q=0 (socket 0) peer=02:00:00:00:00:00
> 
>   io packet forwarding packets/burst=32
>   nb forwarding cores=1 - nb forwarding ports=1
>   port 0: RX queue number: 1 Tx queue number: 1
>     Rx offloads=0x0 Tx offloads=0x0
>     RX queue: 0
>       RX desc=0 - RX free threshold=0
>       RX threshold registers: pthresh=0 hthresh=0  wthresh=0
>       RX Offloads=0x0
>     TX queue: 0
>       TX desc=0 - TX free threshold=0
>       TX threshold registers: pthresh=0 hthresh=0  wthresh=0
>       TX offloads=0x0 - TX RS bit threshold=0
> testpmd> show port xstats 0  
> ###### NIC extended statistics for port 0
> rx_good_packets: 1
> tx_good_packets: 1
> rx_good_bytes: 42
> tx_good_bytes: 42
> rx_missed_errors: 0
> rx_errors: 0
> tx_errors: 0
> rx_mbuf_allocation_errors: 0
> rx_q0_packets: 1
> rx_q0_bytes: 42
> rx_q0_errors: 0
> tx_q0_packets: 1
> tx_q0_bytes: 42
> wd_expired: 0
> dev_start: 1
> dev_stop: 0
> tx_drops: 0
> bw_in_allowance_exceeded: 0
> bw_out_allowance_exceeded: 0
> pps_allowance_exceeded: 0
> conntrack_allowance_exceeded: 0
> linklocal_allowance_exceeded: 0
> rx_q0_cnt: 1
> rx_q0_bytes: 42
> rx_q0_refill_partial: 0
> rx_q0_bad_csum: 0
> rx_q0_mbuf_alloc_fail: 0
> rx_q0_bad_desc_num: 0
> rx_q0_bad_req_id: 0
> tx_q0_cnt: 1
> tx_q0_bytes: 42
> tx_q0_prepare_ctx_err: 0
> tx_q0_linearize: 0
> tx_q0_linearize_failed: 0
> tx_q0_tx_poll: 1
> tx_q0_doorbells: 1
> tx_q0_bad_req_id: 0
> tx_q0_available_desc: 1022
> =======================================================================
> 
> I think you are seeing the regression because of the new xstats (ENI
> limiters), which were added after DPDK v19.11 (mainline commit:
> 45718ada5fa12619db4821646ba964a2df365c68), but I'm not sure why it
> triggers in your setup.
> 
> In particular, I've got a few questions below.
> 
> 1. Is the application you're using single-process or multi-process?
> If multi-process, from which process are you probing for the xstats?
> 2. Have you tried running the latest DPDK v20.11 LTS release?
> 3. Which kernel module are you using (igb_uio/vfio-pci)?
> 4. On which AWS instance type was it reproduced?
> 5. Does the segfault happen the first time you call for the xstats?
> 
> If you've got any other information that could be useful, please
> share it; it will help us resolve the cause of the issue.
> 
> Thanks,
> Michal
> 
> >>
> >> Regards
> >> Amiya  

Try getting xstats in a secondary process;
I think that is where the bug was found.
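For reference, a minimal sketch of such a secondary-process reproducer (illustrative only, not a tested program: it assumes DPDK v20.11 headers, a primary process such as testpmd already owning port 0, and launching with EAL options like --proc-type=secondary --file-prefix matching the primary; error handling is abbreviated):

```c
/* Sketch: fetch xstats for port 0 from a DPDK secondary process,
 * exercising the code path suspected of crashing in net/ena. */
#include <stdio.h>
#include <stdlib.h>
#include <inttypes.h>
#include <rte_eal.h>
#include <rte_ethdev.h>

int main(int argc, char **argv)
{
	/* Run with e.g.: --proc-type=secondary --file-prefix=<same as primary> */
	if (rte_eal_init(argc, argv) < 0)
		rte_exit(EXIT_FAILURE, "EAL init failed\n");

	uint16_t port = 0;
	/* Passing NULL/0 queries the number of xstats without copying. */
	int n = rte_eth_xstats_get(port, NULL, 0);
	if (n < 0)
		rte_exit(EXIT_FAILURE, "xstats count query failed\n");

	struct rte_eth_xstat *stats = calloc(n, sizeof(*stats));
	struct rte_eth_xstat_name *names = calloc(n, sizeof(*names));
	if (stats == NULL || names == NULL)
		rte_exit(EXIT_FAILURE, "allocation failed\n");

	if (rte_eth_xstats_get(port, stats, n) != n ||
	    rte_eth_xstats_get_names(port, names, n) != n)
		rte_exit(EXIT_FAILURE, "xstats fetch failed\n");

	for (int i = 0; i < n; i++)
		printf("%s: %" PRIu64 "\n", names[i].name, stats[i].value);

	free(stats);
	free(names);
	rte_eal_cleanup();
	return 0;
}
```

If the crash is specific to the secondary-process path, this loop should trigger it on the first fetch while the primary keeps the port running.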


Thread overview: 8+ messages
2022-04-18  8:33 Amiya Mohakud
2022-04-18 15:18 ` Amiya Mohakud
2022-04-19 12:10   ` Michał Krawczyk
2022-04-19 15:01     ` Stephen Hemminger [this message]
2022-04-19 20:27       ` Michał Krawczyk
2022-04-19 22:25         ` Stephen Hemminger
2022-04-19 23:09         ` Stephen Hemminger
2022-04-20  8:37           ` Amiya Mohakud
