DPDK patches and discussions
From: "Xia, Chenbo" <chenbo.xia@intel.com>
To: Thomas Monjalon <thomas@monjalon.net>
Cc: "dev@dpdk.org" <dev@dpdk.org>,
	"Liang, Cunming" <cunming.liang@intel.com>,
	 "Wu, Jingjing" <jingjing.wu@intel.com>,
	"Burakov, Anatoly" <anatoly.burakov@intel.com>,
	"Yigit, Ferruh" <ferruh.yigit@intel.com>,
	"mdr@ashroe.eu" <mdr@ashroe.eu>,
	"nhorman@tuxdriver.com" <nhorman@tuxdriver.com>,
	"Richardson, Bruce" <bruce.richardson@intel.com>,
	"david.marchand@redhat.com" <david.marchand@redhat.com>,
	"stephen@networkplumber.org" <stephen@networkplumber.org>,
	"Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
	"jgg@nvidia.com" <jgg@nvidia.com>,
	"parav@nvidia.com" <parav@nvidia.com>,
	"xuemingl@nvidia.com" <xuemingl@nvidia.com>
Subject: Re: [dpdk-dev] [RFC v3 0/6] Add mdev (Mediated device) support in DPDK
Date: Tue, 15 Jun 2021 10:44:42 +0000
Message-ID: <MN2PR11MB4063DA43F72B1BFE8374473A9C309@MN2PR11MB4063.namprd11.prod.outlook.com> (raw)
In-Reply-To: <50744230.0ZSezZt4d8@thomas>

Hi Thomas,

> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Tuesday, June 15, 2021 3:48 PM
> To: Xia, Chenbo <chenbo.xia@intel.com>
> Cc: dev@dpdk.org; Liang, Cunming <cunming.liang@intel.com>; Wu, Jingjing
> <jingjing.wu@intel.com>; Burakov, Anatoly <anatoly.burakov@intel.com>; Yigit,
> Ferruh <ferruh.yigit@intel.com>; mdr@ashroe.eu; nhorman@tuxdriver.com;
> Richardson, Bruce <bruce.richardson@intel.com>; david.marchand@redhat.com;
> stephen@networkplumber.org; Ananyev, Konstantin <konstantin.ananyev@intel.com>;
> jgg@nvidia.com; parav@nvidia.com; xuemingl@nvidia.com
> Subject: Re: [dpdk-dev] [RFC v3 0/6] Add mdev (Mediated device) support in
> DPDK
> 
> 15/06/2021 04:49, Xia, Chenbo:
> > From: Thomas Monjalon <thomas@monjalon.net>
> > > 01/06/2021 05:06, Chenbo Xia:
> > > > Hi everyone,
> > > >
> > > > This is a draft implementation of the mdev (Mediated device [1])
> > > > support in DPDK PCI bus driver. Mdev is a way to virtualize devices
> > > > in Linux kernel. Based on the device-api (mdev_type/device_api),
> > > > there could be different types of mdev devices (e.g. vfio-pci).
> > >
> > > Please could you illustrate with an usage of mdev in DPDK?
> > > What does it enable which is not possible today?
> >
> > The main purpose is for DPDK to drive mdev-based devices, which is not
> > possible today.
> >
> > I'd take PCI devices for an example. Currently DPDK can only drive devices
> > of physical pci bus under /sys/bus/pci and kernel exposes the pci devices
> > to APP in that way.
> >
> > But there are PCI devices using vfio-mdev as a software framework to expose
> > Mdev to APP under /sys/bus/mdev. Devices could choose this way of
> > virtualizing itself to let multiple APPs share one physical device. For
> > example, Intel Scalable IOV technology is known to use vfio-mdev as SW
> > framework for Scalable IOV enabled devices (and Intel net/crypto/raw devices
> > support this tech). For those mdev-based devices, DPDK needs support on the
> > bus layer to scan/plug/probe/.. them, which is the main effort this patchset
> > does. There are also other devices using the vfio-mdev framework, AFAIK,
> > Nvidia's GPU is the first one using mdev and Intel's GPU virtualization also
> > uses it.
> 
> Yes mdev was designed for virtualization I think.
> The use of mdev for Scalable IOV without virtualization
> may be seen as an abuse by Linux maintainers,
> as they currently seem to prefer the auxiliary bus (which is a real bus).
> 
> Mellanox got a push back when trying to use mdev for the same purpose
> (Scalable Function, also called Sub-Function) in the kernel.
> The Linux community decided to use the auxiliary bus.
> 
> Any other feedback on the choice mdev vs aux?

OK. Thanks for the info. Much appreciated.

I will look into the mdev vs aux choice and get back to you later.

> Is there any kernel code supporting this mdev model for Intel devices?

Currently there is only the Intel GPU. But I think you care more about devices
that DPDK could drive: a DMA device (the ioat driver in DPDK, under raw/ioat)
is on its way upstream
(https://www.spinics.net/lists/kvm/msg244417.html)

Thanks,
Chenbo

> 
> > > > In this patchset, the PCI bus driver is extended to support scanning
> > > > and probing the mdev devices whose device-api is "vfio-pci".
> > > >
> > > >                      +---------+
> > > >                      | PCI bus |
> > > >                      +----+----+
> > > >                           |
> > > >          +--------+-------+-------+--------+
> > > >          |        |               |        |
> > > >   Physical PCI devices ...   Mediated PCI devices ...
> > > >
> > > > The first four patches in this patchset are mainly preparation of mdev
> > > > bus support. The left two patches are the key implementation of mdev bus.
> > > >
> > > > The implementation of mdev bus in DPDK has several options:
> > > >
> > > > 1: Embed mdev bus in current pci bus
> > > >
> > > >    This patchset takes this option for an example. Mdev has several
> > > >    device types: pci/platform/amba/ccw/ap. DPDK currently only cares
> > > >    pci devices in all mdev device types so we could embed the mdev bus
> > > >    into current pci bus. Then pci bus with mdev support will scan/plug/
> > > >    unplug/.. not only normal pci devices but also mediated pci devices.
> > >
> > > I think it is a different bus.
> > > It would be cleaner to not touch the PCI bus.
> > > Having a separate bus will allow an easy way to identify a device
> > > with the new generic devargs syntax, example:
> > > 	bus=mdev,uuid=XXX
> > > or more complex:
> > > 	bus=mdev,uuid=XXX/class=crypto/driver=qat,foo=bar
> >
> > OK. Agree on cleaner to not touch PCI bus. And there may also be a
> > 'type=pci' as mdev has several types in its definition
> > (pci/ap/platform/ccw/...).
> >
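To make the layered syntax above concrete, here is a minimal sketch that
splits such a string on '/' into layers and on ',' into key=value pairs. It
is not DPDK's devargs parser and does not use any DPDK API. The types
kv_pair and devargs_layer, the function names, and the size limits are all
hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define MAX_PAIRS 8

/* One key=value pair taken from a devargs layer (hypothetical type). */
struct kv_pair {
    char key[32];
    char value[64];
};

/* One layer of the string, e.g. "bus=mdev,uuid=XXX" (hypothetical type). */
struct devargs_layer {
    struct kv_pair pairs[MAX_PAIRS];
    int nb_pairs;
};

/*
 * Split one layer of the form "k1=v1,k2=v2" into key/value pairs.
 * Returns 0 on success, -1 on malformed or oversized input.
 */
static int
parse_layer(const char *text, size_t len, struct devargs_layer *layer)
{
    const char *end = text + len;

    layer->nb_pairs = 0;
    while (text < end) {
        size_t item_len = 0;
        size_t key_len, val_len;
        const char *eq;
        struct kv_pair *p;

        while (text + item_len < end && text[item_len] != ',')
            item_len++;
        eq = memchr(text, '=', item_len);
        if (eq == NULL || layer->nb_pairs == MAX_PAIRS)
            return -1;
        key_len = (size_t)(eq - text);
        val_len = item_len - key_len - 1;
        p = &layer->pairs[layer->nb_pairs];
        if (key_len == 0 || key_len >= sizeof(p->key) ||
            val_len >= sizeof(p->value))
            return -1;
        memcpy(p->key, text, key_len);
        p->key[key_len] = '\0';
        memcpy(p->value, eq + 1, val_len);
        p->value[val_len] = '\0';
        layer->nb_pairs++;
        text += item_len;
        if (text < end)
            text++; /* skip the ',' separator */
    }
    return layer->nb_pairs > 0 ? 0 : -1;
}

/*
 * Split "bus=.../class=.../driver=..." on '/' and parse each layer.
 * Returns the number of layers parsed, or -1 on error.
 */
int
parse_devargs(const char *str, struct devargs_layer *layers, int max_layers)
{
    int n = 0;

    while (*str != '\0') {
        size_t len = strcspn(str, "/");

        if (n == max_layers || parse_layer(str, len, &layers[n]) != 0)
            return -1;
        n++;
        str += len;
        if (*str == '/')
            str++;
    }
    return n;
}
```

With the second example string above, this yields three layers: the bus
layer (bus, uuid), the class layer (class) and the driver layer (driver,
foo). A 'type=pci' key would simply appear as one more pair in the bus layer.
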
> > > > 2: A new mdev bus that scans mediated pci devices and probes mdev driver
> > > >    to plug-in pci devices to pci bus
> > > >
> > > >    If we took this option, a new mdev bus will be implemented to scan
> > > >    mediated pci devices and a new mdev driver for pci devices will be
> > > >    implemented in pci bus to plug-in mediated pci devices to pci bus.
> > > >
> > > >    Our RFC v1 takes this option:
> > > >    http://patchwork.dpdk.org/project/dpdk/cover/20190403071844.21126-1-tiwei.bie@intel.com/
> > > >
> > > >    Note that: for either option 1 or 2, device drivers do not know the
> > > >    implementation difference but only use structs/functions exposed by
> > > >    pci bus. Mediated pci devices are different from normal pci devices
> > > >    on: 1. Mediated pci devices use UUID as address but normal ones use
> > > >    BDF. 2. Mediated pci devices may have some capabilities that normal
> > > >    pci devices do not have. For example, mediated pci devices could have
> > > >    regions that have sparse mmap capability, which allows a region to
> > > >    have multiple mmap areas. Another example is mediated pci devices may
> > > >    have regions/part of regions not mmaped but need to access them. Above
> > > >    difference will change the current ABI (i.e., struct rte_pci_device).
> > > >    Please check 5th and 6th patch for details.
> > > >
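As a rough illustration of the sparse mmap point above (not code from the
5th or 6th patch): when a region exposes only some mmap-able areas, each
access has to be checked against those areas, and anything outside them has
to go through read/write on the device fd instead of a mapped pointer. In
Linux VFIO the areas are reported through the sparse mmap capability of the
region info. The struct mmap_area and the function access_is_mmapped below
are simplified, hypothetical stand-ins, not the VFIO or DPDK definitions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One mmap-able window inside a device region (hypothetical type). */
struct mmap_area {
    uint64_t offset; /* start of the window, relative to the region */
    uint64_t size;   /* length of the window in bytes */
};

/*
 * Return true if the access [offset, offset + len) lies entirely inside
 * one mmap-able area, so it can be served from the mapping. Otherwise the
 * caller has to fall back to read/write on the device fd.
 * The comparison is arranged so that it cannot overflow.
 */
bool
access_is_mmapped(const struct mmap_area *areas, size_t nb_areas,
                  uint64_t offset, uint64_t len)
{
    size_t i;

    for (i = 0; i < nb_areas; i++) {
        const struct mmap_area *a = &areas[i];

        if (offset >= a->offset && len <= a->size &&
            offset - a->offset <= a->size - len)
            return true;
    }
    return false;
}
```

For a region with areas at 0x0 and 0x3000, each 0x1000 bytes long, a 4-byte
access at 0x10 is served from the mapping, while one at 0x1000 falls in the
gap and needs the read/write path.
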
> > > > 3: A brand new mdev bus that does everything
> > > >
> > > >    This option will implement a new and standalone mdev bus. This option
> > > >    does not need any changes in current pci bus but only needs some
> > > >    shared code (linux vfio part) in pci bus. Drivers of devices that
> > > >    support mdev will register itself as a mdev driver and do not rely on
> > > >    pci bus anymore. This option, IMHO, will make the code clean. The only
> > > >    potential problem may be code duplication, which could be solved by
> > > >    making code of linux vfio part of pci bus common and shared.
> > >
> > > Yes I prefer this third option.
> > > We can find an elegant way of sharing some VFIO code between buses.
> >
> > Yes, I have not thought about the details of the code sharing but will try
> > to make it elegant.
> 
> Great, thanks.
> 


Thread overview: 41+ messages
2019-04-03  7:18 [dpdk-dev] [RFC 0/3] " Tiwei Bie
2019-04-03  7:18 ` Tiwei Bie
2019-04-03  7:18 ` [dpdk-dev] [RFC 1/3] eal: add a helper for reading string from sysfs Tiwei Bie
2019-04-03  7:18   ` Tiwei Bie
2019-04-03  7:18 ` [dpdk-dev] [RFC 2/3] bus/mdev: add mdev bus support Tiwei Bie
2019-04-03  7:18   ` Tiwei Bie
2019-04-03  7:18 ` [dpdk-dev] [RFC 3/3] bus/pci: add mdev support Tiwei Bie
2019-04-03  7:18   ` Tiwei Bie
2019-04-03 14:13   ` Wiles, Keith
2019-04-03 14:13     ` Wiles, Keith
2019-04-04  4:19     ` Tiwei Bie
2019-04-04  4:19       ` Tiwei Bie
2019-04-08  8:44 ` [dpdk-dev] [RFC 0/3] Add mdev (Mediated device) support in DPDK Alejandro Lucero
2019-04-08  8:44   ` Alejandro Lucero
2019-04-08  9:36   ` Tiwei Bie
2019-04-08  9:36     ` Tiwei Bie
2019-04-10 10:02     ` Francois Ozog
2019-04-10 10:02       ` Francois Ozog
2019-07-15  7:52 ` [dpdk-dev] [RFC v2 0/5] " Tiwei Bie
2019-07-15  7:52   ` [dpdk-dev] [RFC v2 1/5] bus/pci: introduce an internal representation of PCI device Tiwei Bie
2019-07-15  7:52   ` [dpdk-dev] [RFC v2 2/5] bus/pci: avoid depending on private value in kernel source Tiwei Bie
2019-07-15  7:52   ` [dpdk-dev] [RFC v2 3/5] bus/pci: introduce helper for MMIO read and write Tiwei Bie
2019-07-15  7:52   ` [dpdk-dev] [RFC v2 4/5] eal: add a helper for reading string from sysfs Tiwei Bie
2019-07-15  7:52   ` [dpdk-dev] [RFC v2 5/5] bus/pci: add mdev support Tiwei Bie
2021-06-01  3:06     ` [dpdk-dev] [RFC v3 0/6] Add mdev (Mediated device) support in DPDK Chenbo Xia
2021-06-01  3:06       ` [dpdk-dev] [RFC v3 1/6] bus/pci: introduce an internal representation of PCI device Chenbo Xia
2021-06-01  3:06       ` [dpdk-dev] [RFC v3 2/6] bus/pci: avoid depending on private value in kernel source Chenbo Xia
2021-06-01  3:06       ` [dpdk-dev] [RFC v3 3/6] bus/pci: introduce helper for MMIO read and write Chenbo Xia
2021-06-01  3:06       ` [dpdk-dev] [RFC v3 4/6] eal: add a helper for reading string from sysfs Chenbo Xia
2021-06-01  5:37         ` Stephen Hemminger
2021-06-08  5:47           ` Xia, Chenbo
2021-06-01  5:39         ` Stephen Hemminger
2021-06-08  5:48           ` Xia, Chenbo
2021-06-11  7:19         ` Thomas Monjalon
2021-06-01  3:06       ` [dpdk-dev] [RFC v3 5/6] bus/pci: add mdev support Chenbo Xia
2021-06-01  3:06       ` [dpdk-dev] [RFC v3 6/6] bus/pci: add sparse mmap support for mediated PCI devices Chenbo Xia
2021-06-11  7:15       ` [dpdk-dev] [RFC v3 0/6] Add mdev (Mediated device) support in DPDK Thomas Monjalon
2021-06-15  2:49         ` Xia, Chenbo
2021-06-15  7:48           ` Thomas Monjalon
2021-06-15 10:44             ` Xia, Chenbo [this message]
2021-06-15 11:57             ` Jason Gunthorpe
