DPDK patches and discussions
From: "Juraj Linkeš" <juraj.linkes@pantheon.tech>
To: "dev@dpdk.org" <dev@dpdk.org>
Cc: "Kinsella, Ray" <ray.kinsella@intel.com>,
	"Andrew Yourtchenko (ayourtch)" <ayourtch@cisco.com>
Subject: [dpdk-dev] Faulty VF initialization during DPDK startup when multiple DPDK instances use different VFs with the same PF
Date: Mon, 7 Dec 2020 10:48:38 +0000
Message-ID: <9d5b0d3a3bb648d5a296eb794006db14@pantheon.tech>

Hi DPDK devs,

A while back I submitted this bug: https://bugs.dpdk.org/show_bug.cgi?id=578 and we now have a pretty good idea of where the issue stems from. TL;DR: it seems to be in either the XL710 firmware or the i40e driver, with downstream effects that we may need to address in DPDK.

What is the issue?
We're using an XL710 NIC with SR-IOV set up, with multiple virtual functions (VFs) that belong to the same physical function (PF). We're observing intermittent failures when multiple DPDK EAL instances try to initialize different VFs of the same PF. One of the failures looks like this:
i40evf_check_api_version(): PF/VF API version mismatch:(0.0)-(1.1)

This results in VPP (which uses DPDK to initialize these VFs) not being able to use the VFs. There is an associated syslog entry:

[Thu Dec  3 02:30:56 2020] i40e 0000:05:00.1: Unable to send the message to VF 49 aq_err 12

Digging through the sources, we found that aq_err 12 corresponds to this error code:
https://elixir.bootlin.com/linux/v4.15/source/drivers/net/ethernet/intel/i40evf/i40e_adminq_cmd.h#L115

This suggests it's an issue with either the driver or firmware and that leads us to two questions:
1) Is this condition expected? What causes this contention, is it normal, and how is the calling code expected to behave when it occurs?
2) If the answer to the previous question is "yes", then, since the caller in this case is DPDK's initialization code, should we address it there (e.g. with some retries or a lock)?
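
To make the "retries" option in question 2 more concrete, here is a minimal sketch of a retry loop with exponential backoff. It is only an illustration under assumptions, not existing DPDK code: retry_with_backoff(), sleep_ms(), struct fake_vf and fake_check_api_version() are hypothetical names, and the fake check merely simulates a PF that is busy for a few polls rather than talking to real hardware.

```c
#define _POSIX_C_SOURCE 199309L /* for nanosleep() */

#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical callback type: returns 0 on success, non-zero on failure. */
typedef int (*init_step_fn)(void *ctx);

/* Sleep for the given number of milliseconds. */
static void sleep_ms(unsigned int ms)
{
	struct timespec ts;

	ts.tv_sec = ms / 1000;
	ts.tv_nsec = (long)(ms % 1000) * 1000000L;
	nanosleep(&ts, NULL);
}

/*
 * Call step(ctx) up to max_attempts times, doubling the delay after each
 * failed attempt. Returns 0 as soon as one attempt succeeds, otherwise the
 * return value of the last failed attempt.
 */
static int retry_with_backoff(init_step_fn step, void *ctx,
			      unsigned int max_attempts, unsigned int delay_ms)
{
	int ret = -1;
	unsigned int attempt;

	assert(step != NULL && max_attempts > 0);

	for (attempt = 1; attempt <= max_attempts; attempt++) {
		ret = step(ctx);
		if (ret == 0)
			return 0;

		fprintf(stderr, "attempt %u/%u failed (ret=%d)\n",
			attempt, max_attempts, ret);

		/* No point in sleeping after the final attempt. */
		if (attempt < max_attempts) {
			sleep_ms(delay_ms);
			delay_ms *= 2;
		}
	}
	return ret;
}

/*
 * Stand-in for the VF API version check (an assumption for illustration):
 * it fails while the simulated PF is still busy, then succeeds.
 */
struct fake_vf {
	unsigned int busy_polls_left;
};

static int fake_check_api_version(void *ctx)
{
	struct fake_vf *vf = ctx;

	if (vf->busy_polls_left > 0) {
		vf->busy_polls_left--;
		return -1;
	}
	return 0;
}
```

A caller would wrap the failing step, for example retry_with_backoff(fake_check_api_version, &vf, 5, 10) for up to five attempts starting with a 10 ms delay. Whether retrying is actually the right fix depends on the answer to question 1.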

Are there any Intel (or SR-IOV) experts who could help answer the first question? Or, regardless of what the expected behavior is, should we address it in DPDK anyway?

This is just a short description; there's more information in Bugzilla.

Thanks,
Juraj

Thread overview: 6+ messages
2020-12-07 10:48 Juraj Linkeš [this message]
2020-12-07 10:55 ` David Marchand
2020-12-08  7:14   ` Xing, Beilei
2020-12-08  9:27     ` Juraj Linkeš
2020-12-09  0:45       ` Xing, Beilei
2020-12-11  9:00         ` Juraj Linkeš
