DPDK patches and discussions
 help / color / Atom feed
From: Alex Williamson <alex.williamson@redhat.com>
To: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: kvm@vger.kernel.org, linux-pci@vger.kernel.org,
	linux-kernel@vger.kernel.org, dev@dpdk.org, mtosatti@redhat.com,
	thomas@monjalon.net, bluca@debian.org, jerinjacobk@gmail.com,
	bruce.richardson@intel.com, cohuck@redhat.com
Subject: Re: [dpdk-dev] [PATCH 0/7] vfio/pci: SR-IOV support
Date: Fri, 14 Feb 2020 08:27:20 -0700
Message-ID: <20200214082720.7dc33bdf@x1.home> (raw)
In-Reply-To: <22153755-598f-d25c-55a2-799c008d8d2b@ozlabs.ru>

On Fri, 14 Feb 2020 15:57:04 +1100
Alexey Kardashevskiy <aik@ozlabs.ru> wrote:

> On 12/02/2020 10:05, Alex Williamson wrote:
> > Given the mostly positive feedback from the RFC[1], here's a new
> > non-RFC revision.  Changes since RFC:
> > 
> >  - vfio_device_ops.match semantics refined
> >  - Use helpers for struct pci_dev.physfn to avoid breakage without
> >  - Relax to allow SR-IOV configuration changes while PF is opened.
> >    There are potentially interesting use cases here, including
> >    perhaps QEMU emulating an SR-IOV capability and calling out
> >    to a privileged entity to manipulate sriov_numvfs and corral
> >    the resulting devices.
> >  - Retest vfio_device_feature.argsz to include uuid length.
> >  - Add Connie's R-b on 6/7
> > 
> > I still wish we had a solution to make it less opaque to the user
> > why a VFIO_GROUP_GET_DEVICE_FD() has failed if a VF token is
> > required, but this is still the best I've been able to come up with.
> > If there are objections or better ideas, please raise them now.
> > 
> > The synopsis of this series is that we have an ongoing desire to drive
> > PCIe SR-IOV PFs from userspace with VFIO.  There's an immediate need
> > for this with DPDK drivers and potentially interesting future use
> > cases in virtualization.  We've been reluctant to add this support
> > previously due to the dependency and trust relationship between the
> > VF device and PF driver.  Minimally the PF driver can induce a denial
> > of service to the VF, but depending on the specific implementation,
> > the PF driver might also be responsible for moving data between VFs
> > or have direct access to the state of the VF, including data or state
> > otherwise private to the VF or VF driver.
> > 
> > To help resolve these concerns, we introduce a VF token into the VFIO
> > PCI ABI, which acts as a shared secret key between drivers.  The
> > userspace PF driver is required to set the VF token to a known value
> > and userspace VF drivers are required to provide the token to access
> > the VF device.  If a PF driver is restarted with VF drivers in use, it
> > must also provide the current token in order to prevent a rogue
> > untrusted PF driver from replacing a known driver.  The degree to
> > which this new token is considered secret is left to the userspace
> > drivers, the kernel intentionally provides no means to retrieve the
> > current token.
> > 
> > Note that the above token is only required for this new model where
> > both the PF and VF devices are usable through vfio-pci.  Existing
> > models of VFIO drivers where the PF is used without SR-IOV enabled
> > or the VF is bound to a userspace driver with an in-kernel, host PF
> > driver are unaffected.
> > 
> > The latter configuration above also highlights a new inverted scenario
> > that is now possible, a userspace PF driver with in-kernel VF drivers.
> > I believe this is a scenario that should be allowed, but should not be
> > enabled by default.  This series includes code to set a default
> > driver_override for VFs sourced from a vfio-pci user owned PF, such
> > that the VFs are also bound to vfio-pci.  This model is compatible
> > with tools like driverctl and allows the system administrator to
> > decide if other bindings should be enabled.  The VF token interface
> > above exists only between vfio-pci PF and VF drivers, once a VF is
> > bound to another driver, the administrator has effectively pronounced
> > the device as trusted.  The vfio-pci driver will note alternate
> > binding in dmesg for logging and debugging purposes.
> > 
> > Please review, comment, and test.  The example QEMU implementation
> > provided with the RFC[2] is still current for this version.  Thanks,  
> It is a cool feature. One question - what device have you tested it with?
> Does not a PF want to control/manage VFs on a PF driver side? I am
> thinking of Mellanox CX5 or similar NIC and it acts as an managed
> ethernet switch which might want to do something to VFs and VFs may not
> work as expected without PF's native driver doing things to it, or this
> is not a concern, is it? Thanks,

TBH, I'm starting with the premise that a userspace PF driver already
works.  The DPDK folks have produced some "interesting" code that
allows SR-IOV to be enabled on a PF underneath vfio-pci.  There's also
a non-upstream igb-uio driver associated with DPDK that seems to be
recommended for SR-IOV PF driver use cases, particularly for an FPGA
device.  The testing I've done, and what's provided by the QEMU patch I
reference, is really only unit testing the vf_token support and
DEVICE_FEATURE ioctl provided here.  I used this with an Intel 82576
(igb) where the PF driver doesn't particularly like being assigned to a
VM with SR-IOV enabled.  Likewise, I can prove that the interfaces here
provide the correct restrictions for the VF, but the VF doesn't work in
a VM due to the state of the PF.  I'm hoping we'll have some
confirmation from the DPDK folks that this provides what they need to
abandon the non-upstream drivers and more nefarious hacks.  There's a
lot more virtualization work to be done in QEMU before I'd propose
patch I reference above upstream.

To your specific question regarding CX5, I think there are very few
SR-IOV devices where the PF doesn't act as some kind of packet router
or ring management engine.  The Amazon device listed in the pci-pf-stub
driver seems to be one of the few SR-IOV devices which claim the PF has
no special interfaces other than exposing the SR-IOV capability itself.
So I think we generally expect a device specific SR-IOV aware driver
running on the PF via this interface.  That's certainly the case for
the DPDK code for the FPGA device above.  Thanks,


      reply index

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-11 23:05 Alex Williamson
2020-02-11 23:05 ` [dpdk-dev] [PATCH 1/7] vfio: Include optional device match in vfio_device_ops callbacks Alex Williamson
2020-02-13 10:31   ` Cornelia Huck
2020-02-11 23:05 ` [dpdk-dev] [PATCH 2/7] vfio/pci: Implement match ops Alex Williamson
2020-02-13 11:04   ` Cornelia Huck
2020-02-11 23:05 ` [dpdk-dev] [PATCH 3/7] vfio/pci: Introduce VF token Alex Williamson
2020-02-13 11:46   ` Cornelia Huck
2020-02-13 17:23     ` Alex Williamson
2020-02-13 18:35       ` Cornelia Huck
2020-02-14 23:40         ` Alex Williamson
2020-02-11 23:05 ` [dpdk-dev] [PATCH 4/7] vfio: Introduce VFIO_DEVICE_FEATURE ioctl and first user Alex Williamson
2020-02-13 12:41   ` Cornelia Huck
2020-02-13 17:39     ` Alex Williamson
2020-02-13 18:08       ` Cornelia Huck
2020-02-11 23:06 ` [dpdk-dev] [PATCH 5/7] vfio/pci: Add sriov_configure support Alex Williamson
2020-02-11 23:06 ` [dpdk-dev] [PATCH 6/7] vfio/pci: Remove dev_fmt definition Alex Williamson
2020-02-11 23:06 ` [dpdk-dev] [PATCH 7/7] vfio/pci: Cleanup .probe() exit paths Alex Williamson
2020-02-14  4:57 ` [dpdk-dev] [PATCH 0/7] vfio/pci: SR-IOV support Alexey Kardashevskiy
2020-02-14 15:27   ` Alex Williamson [this message]

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200214082720.7dc33bdf@x1.home \
    --to=alex.williamson@redhat.com \
    --cc=aik@ozlabs.ru \
    --cc=bluca@debian.org \
    --cc=bruce.richardson@intel.com \
    --cc=cohuck@redhat.com \
    --cc=dev@dpdk.org \
    --cc=jerinjacobk@gmail.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=mtosatti@redhat.com \
    --cc=thomas@monjalon.net \


* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

DPDK patches and discussions

Archives are clonable:
	git clone --mirror http://inbox.dpdk.org/dev/0 dev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 dev dev/ http://inbox.dpdk.org/dev \
	public-inbox-index dev

Newsgroup available over NNTP:

AGPL code for this site: git clone https://public-inbox.org/ public-inbox