DPDK patches and discussions
From: "Juraj Linkeš" <juraj.linkes@pantheon.tech>
To: "Xing, Beilei" <beilei.xing@intel.com>,
	David Marchand <david.marchand@redhat.com>,
	"Guo, Jia" <jia.guo@intel.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>,
	"Kinsella, Ray" <ray.kinsella@intel.com>,
	"Andrew Yourtchenko (ayourtch)" <ayourtch@cisco.com>,
	"Yigit, Ferruh" <ferruh.yigit@intel.com>
Subject: Re: [dpdk-dev] Faulty VF initialization during DPDK startup when multiple DPDK instances use different VFs with the same PF
Date: Fri, 11 Dec 2020 09:00:10 +0000
Message-ID: <1514c90912fb46d998af46c64708eb65@pantheon.tech>
In-Reply-To: <MN2PR11MB3807F2BCC718487FF595843DF7CC0@MN2PR11MB3807.namprd11.prod.outlook.com>



> -----Original Message-----
> From: Xing, Beilei <beilei.xing@intel.com>
> Sent: Wednesday, December 9, 2020 1:45 AM
> To: Juraj Linkeš <juraj.linkes@pantheon.tech>; David Marchand
> <david.marchand@redhat.com>; Guo, Jia <jia.guo@intel.com>
> Cc: dev@dpdk.org; Kinsella, Ray <ray.kinsella@intel.com>; Andrew Yourtchenko
> (ayourtch) <ayourtch@cisco.com>; Yigit, Ferruh <ferruh.yigit@intel.com>
> Subject: RE: [dpdk-dev] Faulty VF initialization during DPDK startup when
> multiple DPDK instances use different VFs with the same PF
> 
> 
> 
> > -----Original Message-----
> > From: Juraj Linkeš <juraj.linkes@pantheon.tech>
> > Sent: Tuesday, December 8, 2020 5:27 PM
> > To: Xing, Beilei <beilei.xing@intel.com>; David Marchand
> > <david.marchand@redhat.com>; Guo, Jia <jia.guo@intel.com>
> > Cc: dev@dpdk.org; Kinsella, Ray <ray.kinsella@intel.com>; Andrew
> > Yourtchenko (ayourtch) <ayourtch@cisco.com>; Yigit, Ferruh
> > <ferruh.yigit@intel.com>
> > Subject: RE: [dpdk-dev] Faulty VF initialization during DPDK startup
> > when multiple DPDK instances use different VFs with the same PF
> >
> >
> >
> > > -----Original Message-----
> > > From: Xing, Beilei <beilei.xing@intel.com>
> > > Sent: Tuesday, December 8, 2020 8:14 AM
> > > To: David Marchand <david.marchand@redhat.com>; Guo, Jia
> > > <jia.guo@intel.com>
> > > Cc: dev@dpdk.org; Kinsella, Ray <ray.kinsella@intel.com>; Andrew
> > > Yourtchenko (ayourtch) <ayourtch@cisco.com>; Juraj Linkeš
> > > <juraj.linkes@pantheon.tech>; Yigit, Ferruh <ferruh.yigit@intel.com>
> > > Subject: RE: [dpdk-dev] Faulty VF initialization during DPDK startup
> > > when multiple DPDK instances use different VFs with the same PF
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: dev <dev-bounces@dpdk.org> On Behalf Of David Marchand
> > > > Sent: Monday, December 7, 2020 6:55 PM
> > > > To: Xing, Beilei <beilei.xing@intel.com>; Guo, Jia
> > > > <jia.guo@intel.com>
> > > > Cc: dev@dpdk.org; Kinsella, Ray <ray.kinsella@intel.com>; Andrew
> > > > Yourtchenko (ayourtch) <ayourtch@cisco.com>; Juraj Linkeš
> > > > <juraj.linkes@pantheon.tech>; Yigit, Ferruh
> > > > <ferruh.yigit@intel.com>
> > > > Subject: Re: [dpdk-dev] Faulty VF initialization during DPDK
> > > > startup when multiple DPDK instances use different VFs with the
> > > > same PF
> > > >
> > > > On Mon, Dec 7, 2020 at 11:49 AM Juraj Linkeš
> > > > <juraj.linkes@pantheon.tech>
> > > > wrote:
> > > > >
> > > > > Hi DPDK devs,
> > > > >
> > > > > > A while back I submitted this bug:
> > > > > > https://bugs.dpdk.org/show_bug.cgi?id=578 and now we have a pretty
> > > > > > good idea where the issue stems from. TL;DR: it seems to be in
> > > > > > either XL710 firmware or i40e driver, with downstream effects
> > > > > > which we may need to address in DPDK.
> > > > >
> > > > > What is the issue?
> > > > > > We're using an XL710 NIC with SR-IOV setup with multiple virtual
> > > > > > functions (VFs) that belong to the same physical function (PF).
> > > > > > We're observing intermittent failures when multiple DPDK EAL
> > > > > > instances are trying to initialize different VFs from the PF.
> > > > > > One of the failures looks like this:
> > > > > > i40evf_check_api_version(): PF/VF API version mismatch:(0.0)-(1.1)
> > > > >
> > > > > > This results in VPP (which uses DPDK to initialize these VFs)
> > > > > > not being able to use the VFs. There is an associated syslog entry:
> > > > > >
> > > > > > [Thu Dec  3 02:30:56 2020] i40e 0000:05:00.1: Unable to send the message to VF 49 aq_err 12
> > > > >
> > > > > > Digging in the sources we've found that this is the error message:
> > > > > >
> > > > > > https://elixir.bootlin.com/linux/v4.15/source/drivers/net/ethernet/intel/i40evf/i40e_adminq_cmd.h#L115
> > > > >
> > > > > > This suggests it's an issue with either the driver or firmware
> > > > > > and that leads us to two questions:
> > > > > > 1) Is this an expected condition to happen? What is the reason
> > > > > > for this contention and is it normal to have it, and what is the
> > > > > > expected correct behavior of the calling code?
> > >
> > > aq_err 12 is I40E_AQ_RC_EBUSY, which is returned by firmware. It
> > > indicates that the mailbox is full and the device is too busy to
> > > handle other requests. So when multiple DPDK instances are trying to
> > > initialize different VFs from the PF, there will be many requests
> > > from the PF to firmware, and the mailbox can easily fill up.
> > >
> > > > > > 2) If "yes" to the previous question - then, since the caller in
> > > > > > this case is the initialization code of DPDK, should we address
> > > > > > it there (e.g. some retries or a lock)?
> > >
> > > I agree that a retry or a lock should be used to address it, but it
> > > should be addressed in the kernel driver, not DPDK, since the kernel
> > > PF is responsible for communicating with firmware. When aq_err 12 is
> > > returned, the PF should retry sending the AQ command to firmware.
> > >
> >
> > Thanks, Beilei, for the clarification. Do you know how/where should I
> > raise the bug with the i40e driver? The kernel bugzilla [0]?
> >
> > [0] https://bugzilla.kernel.org/
> >
> 
> I think so. You should report it to the kernel community or to Intel PAE.
> 

Thanks.
I don't know what an Intel PAE is, so I've submitted the bug here: https://bugzilla.kernel.org/show_bug.cgi?id=210627
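
For reference, here is roughly what the retry suggested above could look like. This is only a minimal sketch in C and not i40e driver code: the value 12 for EBUSY comes from this thread, but aq_send_with_retry(), the aq_send_fn callback type and the fake mailbox are hypothetical names made up for illustration, and a real driver would also need to wait or back off between attempts.

```c
#include <stddef.h>

/* Return codes for the sketch; 12 is the aq_err value discussed in this thread. */
#define AQ_RC_OK    0
#define AQ_RC_EBUSY 12

/* Hypothetical callback: sends one message and returns an AQ return code. */
typedef int (*aq_send_fn)(void *ctx);

/*
 * Hypothetical helper: call send() until it returns something other than
 * AQ_RC_EBUSY or until max_tries attempts have been made. Returns the last
 * return code (AQ_RC_EBUSY if max_tries is 0, since nothing was sent).
 * The number of attempts is stored in *tries_used when it is not NULL.
 */
static int aq_send_with_retry(aq_send_fn send, void *ctx,
                              unsigned max_tries, unsigned *tries_used)
{
    int rc = AQ_RC_EBUSY;
    unsigned tries = 0;

    while (tries < max_tries) {
        rc = send(ctx);
        tries++;
        if (rc != AQ_RC_EBUSY)
            break;
        /* A real driver would sleep or back off here before retrying. */
    }

    if (tries_used != NULL)
        *tries_used = tries;
    return rc;
}

/* Fake mailbox for demonstration: reports busy for the first N sends. */
struct fake_mailbox {
    unsigned busy_calls_left;
};

static int fake_send(void *ctx)
{
    struct fake_mailbox *mb = ctx;

    if (mb->busy_calls_left > 0) {
        mb->busy_calls_left--;
        return AQ_RC_EBUSY;
    }
    return AQ_RC_OK;
}
```

With a fake mailbox that is busy for two sends, aq_send_with_retry(fake_send, &mb, 5, &tries) succeeds on the third attempt. If the mailbox stays busy longer than max_tries, the caller still sees AQ_RC_EBUSY and has to handle it.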

> > > > >
> > > > > > Are there any Intel (or SR-IOV) experts who could help with
> > > > > > answering the first question? Or should we address it in DPDK
> > > > > > no matter what the expected behavior is?
> > > >
> > > > Added i40e maintainers.
> > > >
> > > >
> > > > --
> > > > David Marchand


Thread overview: 6+ messages
2020-12-07 10:48 Juraj Linkeš
2020-12-07 10:55 ` David Marchand
2020-12-08  7:14   ` Xing, Beilei
2020-12-08  9:27     ` Juraj Linkeš
2020-12-09  0:45       ` Xing, Beilei
2020-12-11  9:00         ` Juraj Linkeš [this message]
