From: "Wang, Xiao W" <xiao.w.wang@intel.com>
To: Maxime Coquelin <maxime.coquelin@redhat.com>,
"dev@dpdk.org" <dev@dpdk.org>
Cc: "Tan, Jianfeng" <jianfeng.tan@intel.com>,
"Bie, Tiwei" <tiwei.bie@intel.com>,
"yliu@fridaylinux.org" <yliu@fridaylinux.org>,
"Liang, Cunming" <cunming.liang@intel.com>,
"Daly, Dan" <dan.daly@intel.com>,
"Wang, Zhihong" <zhihong.wang@intel.com>
Subject: Re: [dpdk-dev] [PATCH 2/3] net/vdpa_virtio_pci: introduce vdpa sample driver
Date: Mon, 12 Feb 2018 15:36:09 +0000 [thread overview]
Message-ID: <B7F2E978279D1D49A3034B7786DACF406F83EC66@SHSMSX101.ccr.corp.intel.com> (raw)
In-Reply-To: <1909c444-4b88-bbb3-9fc2-b85ac8a3f5cb@redhat.com>
Hi Maxime,
> -----Original Message-----
> From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
> Sent: Thursday, February 8, 2018 5:09 PM
> To: Wang, Xiao W <xiao.w.wang@intel.com>; dev@dpdk.org
> Cc: Tan, Jianfeng <jianfeng.tan@intel.com>; Bie, Tiwei <tiwei.bie@intel.com>;
> yliu@fridaylinux.org; Liang, Cunming <cunming.liang@intel.com>; Daly, Dan
> <dan.daly@intel.com>; Wang, Zhihong <zhihong.wang@intel.com>
> Subject: Re: [PATCH 2/3] net/vdpa_virtio_pci: introduce vdpa sample driver
>
> Hi Xiao,
>
> On 02/08/2018 03:23 AM, Wang, Xiao W wrote:
> > Hi Maxime,
> >
> >> -----Original Message-----
> >> From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
> >> Sent: Tuesday, February 6, 2018 10:24 PM
> >> To: Wang, Xiao W <xiao.w.wang@intel.com>; dev@dpdk.org
> >> Cc: Tan, Jianfeng <jianfeng.tan@intel.com>; Bie, Tiwei
> <tiwei.bie@intel.com>;
> >> yliu@fridaylinux.org; Liang, Cunming <cunming.liang@intel.com>; Daly, Dan
> >> <dan.daly@intel.com>; Wang, Zhihong <zhihong.wang@intel.com>
> >> Subject: Re: [PATCH 2/3] net/vdpa_virtio_pci: introduce vdpa sample driver
> >>
> >> Hi Xiao,
> >>
> >> On 02/04/2018 03:55 PM, Xiao Wang wrote:
> >>> This driver is a reference sample of building a vDPA device driver
> >>> based on the vhost lib. It uses a standard virtio-net PCI device as
> >>> the vDPA device, which can serve as a backend for a virtio-net PCI
> >>> device in a nested VM.
> >>>
> >>> The key driver ops implemented are:
> >>>
> >>> * vdpa_virtio_eng_init
> >>> Map the virtio PCI device into userspace with VFIO, read the device
> >>> capability and initialize internal data.
> >>>
> >>> * vdpa_virtio_eng_uninit
> >>> Release the mapped device.
> >>>
> >>> * vdpa_virtio_info_query
> >>> Device capability reporting, e.g. queue number, features.
> >>>
> >>> * vdpa_virtio_dev_config
> >>> With the guest virtio information provided by the vhost lib, this
> >>> function configures the device and the IOMMU to set up the vhost
> >>> datapath, which includes: Rx/Tx vring, VFIO interrupt, kick relay.
> >>>
> >>> * vdpa_virtio_dev_close
> >>> Tear down everything that was previously configured by dev_conf.
> >>>
> >>> This driver requires the virtio device to support
> >>> VIRTIO_F_IOMMU_PLATFORM, because the buffer addresses written in the
> >>> descriptors are IOVAs.
> >>>
> >>> Because the vDPA driver needs to set up MSI-X vectors to interrupt the
> >>> guest, only vfio-pci is currently supported.
> >>>
> >>> Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
> >>> ---
> >>>  config/common_base                                 |    6 +
> >>>  config/common_linuxapp                             |    1 +
> >>>  drivers/net/Makefile                               |    1 +
> >>>  drivers/net/vdpa_virtio_pci/Makefile               |   31 +
> >>>  .../net/vdpa_virtio_pci/rte_eth_vdpa_virtio_pci.c  | 1527 ++++++++++++++++++++
> >>>  .../rte_vdpa_virtio_pci_version.map                |    4 +
> >>>  mk/rte.app.mk                                      |    1 +
> >>>  7 files changed, 1571 insertions(+)
> >>>  create mode 100644 drivers/net/vdpa_virtio_pci/Makefile
> >>>  create mode 100644 drivers/net/vdpa_virtio_pci/rte_eth_vdpa_virtio_pci.c
> >>>  create mode 100644 drivers/net/vdpa_virtio_pci/rte_vdpa_virtio_pci_version.map
> >>
> >> Is there a specific constraint that makes you expose PCI functions and
> >> duplicate a lot of vfio code into the driver?
> >
> > The existing vfio code doesn't fit vDPA well: this vDPA driver needs to
> > program the IOMMU for a vDPA device with a VM's memory table, while
> > eal/vfio uses a struct vfio_cfg that takes all regular devices, adds them
> > to a single vfio_container, and programs the IOMMU with the DPDK
> > process's memory table.
> >
> > Having this driver do the PCI VFIO initialization itself avoids affecting
> > the global vfio_cfg structure.
>
> Ok, I get it.
> So I think what you have to do is to extend eal/vfio for this case.
> Or at least, have a vdpa layer to perform this, else every offload
> driver will have to duplicate the code.
I think I need to extend eal/vfio to provide container-based APIs, such as creating a container,
binding a vfio group fd to a container, DMAR (DMA remapping) programming, etc. Roughly, it could
look like the sketch below.
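The sketch is only to illustrate the idea; the rte_vfio_* container names and prototypes are
placeholders, not a final API. rte_vhost_get_mem_table() is the existing vhost lib call for
getting the guest memory table.

/*
 * Placeholder container-based VFIO APIs (names/prototypes for discussion
 * only): a driver-private container, independent of the default one used
 * by eal/vfio for regular devices.
 */
#include <stdint.h>
#include <stdlib.h>
#include <rte_vhost.h>

int rte_vfio_create_container(void);
int rte_vfio_bind_group(int container_fd, int iommu_group_no);
int rte_vfio_dma_map(int container_fd, uint64_t vaddr, uint64_t iova,
		     uint64_t len);

/*
 * How the vDPA driver could program the device's IOMMU domain with the
 * guest memory table from the vhost lib, without touching the global
 * vfio_cfg used for regular devices.
 */
static int
vdpa_dma_map_guest_mem(int vid, int iommu_group_no)
{
	struct rte_vhost_memory *mem = NULL;
	int container_fd;
	uint32_t i;

	container_fd = rte_vfio_create_container();
	if (container_fd < 0)
		return -1;
	if (rte_vfio_bind_group(container_fd, iommu_group_no) < 0)
		return -1;
	if (rte_vhost_get_mem_table(vid, &mem) < 0)
		return -1;

	for (i = 0; i < mem->nregions; i++) {
		struct rte_vhost_mem_region *reg = &mem->regions[i];

		/* vaddr: where the region is mapped in our process;
		 * iova: the guest physical address the device will use.
		 */
		if (rte_vfio_dma_map(container_fd, reg->host_user_addr,
				     reg->guest_phys_addr, reg->size) < 0) {
			free(mem);
			return -1;
		}
	}
	free(mem);
	return container_fd;
}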
>
> >>
> >> Wouldn't it be better (if possible) to use RTE_PMD_REGISTER_PCI() & co.
> >> to benefit from all the existing infrastructure?
> >
> > RTE_PMD_REGISTER_PCI() & co. would make this a PCI driver (for a physical
> > device), which would then conflict with the virtio PMD.
> > So I made the vDPA device driver a vdev driver.
>
> Yes, but it is a PCI device, not a virtual device. You have to extend
> the EAL to support this new class of devices/drivers. Think of it as in
> the kernel, where a NIC device can be bound either to its NIC driver, VFIO
> or UIO.
>
> If I look at patch 3, you have to set --no-pci, or at least, I think,
> blacklist the Virtio device.
>
> I wonder if real vDPA cards will either support vDPA mode or behave
> like a regular NIC, like the Virtio case in your example.
> If this is the case, maybe the vDPA code for a NIC could be in the same
> driver as the "NIC" mode.
> A new struct rte_pci_driver driver flag could be introduced to specify
> that the driver supports vDPA.
> Then, in the EAL arguments, if a vhost vdev specifies that it wants the
> Virtio device at PCI addr 00:01:00 as offload, the PCI layer could probe
> this device in "vdpa" mode.
Considering that we could have a pool of vDPA devices, we need a port that supports port
representors; it defines the control domain to which these vDPA devices belong. We can have a
vdev port for this purpose, and this vdev helps register the vDPA ports via the port-representor
library (patch submitted):
                                   +------+
                                   | vdev |
+---+                              |------|
|app|--register representor port-->|broker|-->add port with vDPA device 0/1/2...
+---+                              +------+
I plan to submit a vdpa driver patch for a real vDPA card; that card will have a different
subsystem device_id/vendor_id, so we won't have a conflict issue with that driver.
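A very rough sketch of the broker flow above (representor_port_register() and
vdpa_find_device_by_addr() are made-up names standing in for the port-representor library and
the vDPA device pool; only the control flow is what I want to show):

#include <rte_common.h>
#include <rte_bus_vdev.h>

struct vdpa_device;	/* one entry in the vDPA device pool */

/* Stand-ins for the real interfaces. */
struct vdpa_device *vdpa_find_device_by_addr(const char *pci_addr);
int representor_port_register(struct rte_vdev_device *broker,
		struct vdpa_device *vdpa, unsigned int port_id);

static int
vdpa_broker_probe(struct rte_vdev_device *dev)
{
	/*
	 * PCI addresses of the vDPA devices in this control domain; in
	 * practice they would be parsed from the --vdev devargs, e.g.
	 * --vdev 'net_vdpa_broker0,vdpa=06:00.2,vdpa=06:00.3'.
	 */
	static const char * const vdpa_addrs[] = { "06:00.2", "06:00.3" };
	unsigned int i;

	for (i = 0; i < RTE_DIM(vdpa_addrs); i++) {
		struct vdpa_device *vd;

		vd = vdpa_find_device_by_addr(vdpa_addrs[i]);
		if (vd == NULL)
			return -1;
		/*
		 * The app sees one representor port per vDPA device, all
		 * belonging to the broker's control domain.
		 */
		if (representor_port_register(dev, vd, i) < 0)
			return -1;
	}
	return 0;
}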
>
> Also, I don't know if this will be possible with real vDPA cards, but we
> could have the application doing packet switching between vhost-user
> vdev and the Virtio device. And at some point, at runtime, switch into
> vDPA mode. This use-case would be much easier to implement if vDPA
> relied on existing PCI layer.
In vDPA mode, each vhost-user datapath is performed by a vDPA device. If we switch over to
normal SW packet switching, the setup will typically be many vhost-user ports and one uplink port.
Thanks,
Xiao
>
> I may not be very clear; don't hesitate to ask questions.
> But generally, I think vDPA has to fit in existing DPDK architecture,
> and not try to live outside of it.
>
> Thanks,
> Maxime
> >>
> >> Maxime
> >
> > Thanks for the comments,
> > Xiao
> >