DPDK patches and discussions
* [dpdk-dev] Faulty VF initialization during DPDK startup when multiple DPDK instances use different VFs with the same PF
@ 2020-12-07 10:48 Juraj Linkeš
  2020-12-07 10:55 ` David Marchand
  0 siblings, 1 reply; 6+ messages in thread
From: Juraj Linkeš @ 2020-12-07 10:48 UTC
  To: dev; +Cc: Kinsella, Ray, Andrew Yourtchenko (ayourtch)

Hi DPDK devs,

A while back I submitted this bug: https://bugs.dpdk.org/show_bug.cgi?id=578 and we now have a pretty good idea of where the issue stems from. TL;DR: it seems to be in either the XL710 firmware or the i40e driver, with downstream effects that we may need to address in DPDK.

What is the issue?
We're using an XL710 NIC with SR-IOV set up with multiple virtual functions (VFs) that belong to the same physical function (PF). We're observing intermittent failures when multiple DPDK EAL instances try to initialize different VFs from that PF. One of the failures looks like this:
i40evf_check_api_version(): PF/VF API version mismatch:(0.0)-(1.1)

This results in VPP (which uses DPDK to initialize these VFs) not being able to use the VFs. There is an associated syslog entry:

[Thu Dec  3 02:30:56 2020] i40e 0000:05:00.1: Unable to send the message to VF 49 aq_err 12

Digging through the sources, we found the definition of this error code:
https://elixir.bootlin.com/linux/v4.15/source/drivers/net/ethernet/intel/i40evf/i40e_adminq_cmd.h#L115

This suggests it's an issue with either the driver or firmware and that leads us to two questions:
1) Is this an expected condition to happen? What is the reason for this contention and is it normal to have it, and what is the expected correct behavior of the calling code?
2) If "yes" to the previous question, then, since the caller in this case is the DPDK initialization code, should we address it there (e.g. with some retries or a lock)?
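
The "lock" option from question 2 could be sketched as follows. This is only an illustration of serializing VF initialization across processes with an advisory file lock; it is not DPDK code. The helper names `pf_init_lock` and `pf_init_unlock` and the idea of one lock file per PF are assumptions made for the example.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* for flock() under strict -std modes */
#endif
#include <assert.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Hypothetical helper: take an exclusive advisory lock on a lock file
 * that all cooperating processes agree on (e.g. one file per PF).
 * Blocks until the lock is available.
 * Returns an open file descriptor on success, -1 on failure.
 */
static int pf_init_lock(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_CREAT | O_RDWR, 0600);
    if (fd < 0)
        return -1;

    if (flock(fd, LOCK_EX) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: release the lock taken by pf_init_lock(). */
static void pf_init_unlock(int fd)
{
    flock(fd, LOCK_UN);
    close(fd);
}
```

Each process would call `pf_init_lock()` before initializing its VF and `pf_init_unlock()` afterwards, so that only one initialization talks to the PF at a time. The lock is advisory, so it only helps if every process sharing the PF uses the same lock file.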

Are there any Intel (or SR-IOV) experts who could help answer the first question? Or should we address it in DPDK regardless of what the expected behavior is?

This is just a short description; there's more information in Bugzilla.

Thanks,
Juraj


* Re: [dpdk-dev] Faulty VF initialization during DPDK startup when multiple DPDK instances use different VFs with the same PF
  2020-12-07 10:48 [dpdk-dev] Faulty VF initialization during DPDK startup when multiple DPDK instances use different VFs with the same PF Juraj Linkeš
@ 2020-12-07 10:55 ` David Marchand
  2020-12-08  7:14   ` Xing, Beilei
  0 siblings, 1 reply; 6+ messages in thread
From: David Marchand @ 2020-12-07 10:55 UTC
  To: Beilei Xing, Jeff Guo
  Cc: dev, Kinsella, Ray, Andrew Yourtchenko (ayourtch),
	Juraj Linkeš,
	Yigit, Ferruh

Added i40e maintainers.


-- 
David Marchand



* Re: [dpdk-dev] Faulty VF initialization during DPDK startup when multiple DPDK instances use different VFs with the same PF
  2020-12-07 10:55 ` David Marchand
@ 2020-12-08  7:14   ` Xing, Beilei
  2020-12-08  9:27     ` Juraj Linkeš
  0 siblings, 1 reply; 6+ messages in thread
From: Xing, Beilei @ 2020-12-08  7:14 UTC
  To: David Marchand, Guo, Jia
  Cc: dev, Kinsella, Ray, Andrew Yourtchenko (ayourtch),
	Juraj Linkeš,
	Yigit, Ferruh



> On Mon, Dec 7, 2020 at 11:49 AM Juraj Linkeš <juraj.linkes@pantheon.tech> wrote:
> > 1) Is this an expected condition to happen? What is the reason for this contention and is it normal to have it, and what is the expected correct behavior of the calling code?

aq_err 12 is I40E_AQ_RC_EBUSY, which is returned by the firmware. It indicates that the mailbox
is full and the device is too busy to handle further requests. When multiple DPDK instances
are trying to initialize different VFs from the same PF, the PF sends many requests to the
firmware, so the mailbox fills up easily.

> > 2) If "yes" to the previous question, then, since the caller in this case is the DPDK initialization code, should we address it there (e.g. with some retries or a lock)?

I agree that a retry or a lock should be used to address it, but it should be addressed in the
kernel driver, not in DPDK, since the kernel PF driver is responsible for communicating with the
firmware. When aq_err 12 is returned, the PF should retry sending the AQ command to the firmware.
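
To make the retry idea concrete, here is a minimal sketch of a bounded retry loop with a doubling delay. It is not the i40e driver's code and does not use its API. The only value taken from this thread is that aq_err 12 means "busy"; the callback type `aq_send_fn`, the function `aq_send_with_retry`, and the `fake_mailbox` test double are hypothetical names made up for the example.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for nanosleep() under strict -std modes */
#endif
#include <assert.h>
#include <time.h>

#define AQ_RC_OK     0
#define AQ_RC_EBUSY 12   /* aq_err 12, as reported in the syslog line */

/* Hypothetical callback standing in for "send one AQ command". */
typedef int (*aq_send_fn)(void *ctx);

/* Sleep for the given number of microseconds. */
static void sleep_us(unsigned int us)
{
    struct timespec ts = {
        .tv_sec  = us / 1000000u,
        .tv_nsec = (long)(us % 1000000u) * 1000L,
    };
    nanosleep(&ts, NULL);
}

/*
 * Call send_cmd() until it returns something other than AQ_RC_EBUSY or
 * until max_tries attempts have been made. The delay between attempts
 * doubles after each busy reply. Returns the last status seen.
 */
static int aq_send_with_retry(aq_send_fn send_cmd, void *ctx,
                              unsigned int max_tries, unsigned int delay_us)
{
    assert(send_cmd != NULL);

    int rc = AQ_RC_EBUSY;
    for (unsigned int i = 0; i < max_tries; i++) {
        rc = send_cmd(ctx);
        if (rc != AQ_RC_EBUSY)
            break;
        if (i + 1 < max_tries) {
            sleep_us(delay_us);
            delay_us *= 2;
        }
    }
    return rc;
}

/* Test double: reports busy a fixed number of times, then succeeds. */
struct fake_mailbox {
    unsigned int busy_left;
    unsigned int calls;
};

static int fake_send(void *ctx)
{
    struct fake_mailbox *mb = ctx;

    mb->calls++;
    if (mb->busy_left > 0) {
        mb->busy_left--;
        return AQ_RC_EBUSY;
    }
    return AQ_RC_OK;
}
```

With a mailbox that is busy twice, `aq_send_with_retry(fake_send, &mb, 5, 1)` makes three calls and returns `AQ_RC_OK`. Bounding the number of attempts keeps a persistently busy device from stalling the caller forever; the caller still has to handle a final busy status.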


* Re: [dpdk-dev] Faulty VF initialization during DPDK startup when multiple DPDK instances use different VFs with the same PF
  2020-12-08  7:14   ` Xing, Beilei
@ 2020-12-08  9:27     ` Juraj Linkeš
  2020-12-09  0:45       ` Xing, Beilei
  0 siblings, 1 reply; 6+ messages in thread
From: Juraj Linkeš @ 2020-12-08  9:27 UTC
  To: Xing, Beilei, David Marchand, Guo, Jia
  Cc: dev, Kinsella, Ray, Andrew Yourtchenko (ayourtch), Yigit, Ferruh



> On Tuesday, December 8, 2020 8:14 AM, Xing, Beilei <beilei.xing@intel.com> wrote:
> aq_err 12 is I40E_AQ_RC_EBUSY, which is returned by the firmware. It indicates
> that the mailbox is full and the device is too busy to handle further requests.
>
> I agree that a retry or a lock should be used to address it, but it should be
> addressed in the kernel driver, not in DPDK, since the kernel PF driver is
> responsible for communicating with the firmware. When aq_err 12 is returned,
> the PF should retry sending the AQ command to the firmware.

Thanks, Beilei, for the clarification. Do you know where I should raise the bug with the i40e driver? The kernel Bugzilla [0]?

[0] https://bugzilla.kernel.org/


* Re: [dpdk-dev] Faulty VF initialization during DPDK startup when multiple DPDK instances use different VFs with the same PF
  2020-12-08  9:27     ` Juraj Linkeš
@ 2020-12-09  0:45       ` Xing, Beilei
  2020-12-11  9:00         ` Juraj Linkeš
  0 siblings, 1 reply; 6+ messages in thread
From: Xing, Beilei @ 2020-12-09  0:45 UTC
  To: Juraj Linkeš, David Marchand, Guo, Jia
  Cc: dev, Kinsella, Ray, Andrew Yourtchenko (ayourtch), Yigit, Ferruh



> On Tuesday, December 8, 2020 5:27 PM, Juraj Linkeš <juraj.linkes@pantheon.tech> wrote:
> Thanks, Beilei, for the clarification. Do you know where I should raise the
> bug with the i40e driver? The kernel Bugzilla [0]?
>
> [0] https://bugzilla.kernel.org/

I think so. You should report it to the kernel community or to Intel PAE.


* Re: [dpdk-dev] Faulty VF initialization during DPDK startup when multiple DPDK instances use different VFs with the same PF
  2020-12-09  0:45       ` Xing, Beilei
@ 2020-12-11  9:00         ` Juraj Linkeš
  0 siblings, 0 replies; 6+ messages in thread
From: Juraj Linkeš @ 2020-12-11  9:00 UTC
  To: Xing, Beilei, David Marchand, Guo, Jia
  Cc: dev, Kinsella, Ray, Andrew Yourtchenko (ayourtch), Yigit, Ferruh



> On Wednesday, December 9, 2020 1:45 AM, Xing, Beilei <beilei.xing@intel.com> wrote:
> I think so. You should report it to the kernel community or to Intel PAE.

Thanks.
I don't know what an Intel PAE is, so I've submitted the bug here: https://bugzilla.kernel.org/show_bug.cgi?id=210627

