DPDK patches and discussions
From: beilei.xing@intel.com
To: anatoly.burakov@intel.com
Cc: dev@dpdk.org, thomas@monjalon.net, ferruh.yigit@amd.com,
	bruce.richardson@intel.com, chenbox@nvidia.com,
	yahui.cao@intel.com, Beilei Xing <beilei.xing@intel.com>
Subject: [PATCH 0/4] add VFIO IOMMUFD/CDEV support
Date: Fri, 22 Dec 2023 19:44:49 +0000
Message-ID: <20231222194453.3049693-1-beilei.xing@intel.com>

From: Beilei Xing <beilei.xing@intel.com>

This is a draft implementation that adds support for the IOMMUFD user
interface [1] and the VFIO CDEV user interface [2].

Problem statement:
Linux now includes multiple device-passthrough frameworks (e.g. VFIO and vDPA),
and each framework implements its own logic for managing I/O page tables. This
is hard to scale to modern IOMMU features such as PASID, I/O page faults and
IOMMU dirty page tracking.

To address this, a new standalone IOMMU subsystem called IOMMUFD was introduced
in Linux kernel v6.2. The goal is for Linux subsystems such as VFIO and vDPA to
consume a unified IOMMU framework. Along with IOMMUFD, a new device-centric VFIO
uAPI called VFIO CDEV was introduced in Linux kernel v6.6. vDPA support for
IOMMUFD in the Linux kernel is still a work in progress [3].

Since all new IOMMU features provided by different vendors will be supported
only in the new framework and not in the legacy one, it is important for DPDK
to support IOMMUFD in order to use the latest IOMMU features.

For the VFIO subsystem, mainline Linux supports both the VFIO Container/Group
interface and the VFIO IOMMUFD/CDEV interface. IOMMUFD has no impact on the
existing VFIO Container/Group interface, but the latest IOMMU features (e.g.
PASID/SSID) may be available only through the VFIO IOMMUFD/CDEV interface.
Between VFIO Container and VFIO IOMMUFD, the VFIO device uAPI does not change;
I/O page table management moves from the VFIO Container into the IOMMUFD
interface.
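
To make the split concrete, here is a minimal sketch of the kernel-side call
sequence on the IOMMUFD/CDEV path, modelled on the example in the kernel VFIO
documentation. It is not the code in this series: sketch_cdev_setup() and
struct sketch_cdev are hypothetical names used only for illustration, and the
sketch assumes kernel headers from v6.6 or later (it returns -ENOTSUP when they
are older).

```c
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/* The cdev uAPI needs kernel headers from v6.6 or later. */
#if defined(VFIO_DEVICE_BIND_IOMMUFD) && defined(__has_include)
#if __has_include(<linux/iommufd.h>)
#include <linux/iommufd.h>
#define SKETCH_HAVE_CDEV 1
#endif
#endif

/* Hypothetical context type; not part of this series. */
struct sketch_cdev {
	int iommufd;      /* fd of /dev/iommu */
	int devfd;        /* fd of /dev/vfio/devices/vfioX */
	uint32_t ioas_id; /* I/O address space owned by iommufd */
};

/* Returns 0 on success or a negative errno value on failure. */
static int
sketch_cdev_setup(const char *cdev_path, void *va, uint64_t len,
		  uint64_t iova, struct sketch_cdev *out)
{
#ifdef SKETCH_HAVE_CDEV
	struct vfio_device_bind_iommufd bind = { .argsz = sizeof(bind) };
	struct iommu_ioas_alloc alloc = { .size = sizeof(alloc) };
	struct vfio_device_attach_iommufd_pt attach = { .argsz = sizeof(attach) };
	struct iommu_ioas_map map = {
		.size = sizeof(map),
		.flags = IOMMU_IOAS_MAP_READABLE | IOMMU_IOAS_MAP_WRITEABLE |
			 IOMMU_IOAS_MAP_FIXED_IOVA,
	};
	int iommufd, devfd, ret;

	iommufd = open("/dev/iommu", O_RDWR);
	if (iommufd < 0)
		return -errno;
	devfd = open(cdev_path, O_RDWR);
	if (devfd < 0) {
		ret = -errno;
		goto close_iommufd;
	}

	/* Device side: a VFIO device fd, bound to iommufd instead of a container. */
	bind.iommufd = iommufd;
	if (ioctl(devfd, VFIO_DEVICE_BIND_IOMMUFD, &bind) < 0)
		goto fail;

	/* Page-table side: the IOAS is allocated from iommufd, not from VFIO. */
	if (ioctl(iommufd, IOMMU_IOAS_ALLOC, &alloc) < 0)
		goto fail;
	attach.pt_id = alloc.out_ioas_id;
	if (ioctl(devfd, VFIO_DEVICE_ATTACH_IOMMUFD_PT, &attach) < 0)
		goto fail;

	/* DMA mappings are also created through iommufd. */
	map.ioas_id = alloc.out_ioas_id;
	map.user_va = (uint64_t)(uintptr_t)va;
	map.length = len;
	map.iova = iova;
	if (ioctl(iommufd, IOMMU_IOAS_MAP, &map) < 0)
		goto fail;

	out->iommufd = iommufd;
	out->devfd = devfd;
	out->ioas_id = alloc.out_ioas_id;
	return 0;

fail:
	ret = -errno;
	close(devfd);
close_iommufd:
	close(iommufd);
	return ret;
#else
	(void)cdev_path; (void)va; (void)len; (void)iova; (void)out;
	return -ENOTSUP;
#endif
}
```

Note that only the bind/attach ioctls are issued on the device fd; allocation
of the I/O address space and the DMA mapping both go to the /dev/iommu fd,
which is why that fd can be shared with other consumers.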

Design:
For the DPDK implementation, since VFIO Container/Group and VFIO IOMMUFD/CDEV
may now co-exist, a new VFIO IOMMUFD file/interface will be added in EAL. Since
IOMMUFD is a unified framework that can be consumed by VFIO, vDPA, etc.,
iommufd will be added as a standalone file/interface in EAL. Hence, a DPDK bus
driver (e.g. PCI) has two options for probing a VFIO device.

The diagram below shows the relationship between VFIO Container/Group, IOMMUFD,
VFIO CDEV and the bus driver (e.g. PCI) in DPDK. Comments on each numbered
component follow the diagram.

                     _____________________
                    |        [4]          |
                    |                     |
                    |                     |
                    |PCI BUS              |
                    |_____________________|
                        |             |
                        |             |
        ________________v___       ___v______________      ________________________
       |       [1]          |     |       [2]        |    |                        |
       |vfio container      |     |                  |    |                        |
       |vfio group          |     |vfio cdev         |    |   Other Consumer       |
       |                    |     |                  |    |   (vDPA IOMMUFD,       |
       |VFIO                |     |VFIO IOMMUFD(new) |    |    common memory)      |
       |____________________|     |__________________|    |________________________|
                                                |              |
                                                |              |
                                             ___v______________v___
                                            |        [3]           |
                                            | i/o page table mgmt  |
                                            |                      |
                                            |                      |
                                            |IOMMUFD(new)          |
                                            |______________________|

1. VFIO is the existing, mature framework for device passthrough. No functional
   changes here.
2. VFIO IOMMUFD is a new component that works together with IOMMUFD. It exposes
   functions for PCI BUS to probe a PCI device through the VFIO CDEV interface.
3. IOMMUFD is a new component. It exposes a unified interface for VFIO IOMMUFD
   and other consumers to manage I/O page tables.
4. PCI BUS is an existing component. Since Linux now supports both VFIO
   Container/Group and VFIO IOMMUFD/CDEV, PCI BUS needs to choose the interface
   used to probe the PCI device depending on user configuration.
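
The choice described in item 4 can be sketched as a small pure function. This
is only an illustration under stated assumptions: the enum and function names
below are hypothetical and are not the identifiers or EAL arguments added by
this series, and returning "none" rather than silently falling back to the
other interface is a policy assumption made for the sketch.

```c
#include <stdbool.h>

/* Hypothetical: the VFIO mode requested through user configuration. */
enum sketch_vfio_mode {
	SKETCH_VFIO_MODE_GROUP, /* legacy VFIO Container/Group */
	SKETCH_VFIO_MODE_CDEV,  /* VFIO IOMMUFD/CDEV */
};

/* Hypothetical: the probe path the bus driver ends up using. */
enum sketch_probe_path {
	SKETCH_PROBE_NONE,  /* requested interface is not available */
	SKETCH_PROBE_GROUP, /* box [1] in the diagram */
	SKETCH_PROBE_CDEV,  /* box [2] in the diagram */
};

/*
 * Pick the probe path for one device. have_group and have_cdev say whether
 * the kernel exposes the respective interface for this device.
 */
static enum sketch_probe_path
sketch_select_probe_path(enum sketch_vfio_mode requested,
			 bool have_group, bool have_cdev)
{
	switch (requested) {
	case SKETCH_VFIO_MODE_CDEV:
		return have_cdev ? SKETCH_PROBE_CDEV : SKETCH_PROBE_NONE;
	case SKETCH_VFIO_MODE_GROUP:
		return have_group ? SKETCH_PROBE_GROUP : SKETCH_PROBE_NONE;
	}
	return SKETCH_PROBE_NONE;
}
```

Keeping the decision separate from the actual open/ioctl calls makes the policy
easy to test and leaves the existing Container/Group path untouched when CDEV
is not requested.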

TBD:
Multi-process will be supported in the future.

[1] https://lwn.net/Articles/912515/
[2] https://patchwork.kernel.org/project/kvm/cover/20230718135551.6592-1-yi.l.liu@intel.com/
[3] https://lore.kernel.org/lkml/20231103171641.1703146-1-lulu@redhat.com/

Beilei Xing (3):
  vfio: add VFIO IOMMUFD support
  bus/pci: add VFIO CDEV support
  eal: add new args to choose VFIO mode

Yahui Cao (1):
  iommufd: add IOMMUFD support

 config/meson.build                  |   3 +
 config/rte_config.h                 |   1 +
 drivers/bus/pci/bus_pci_driver.h    |   1 +
 drivers/bus/pci/linux/pci.c         |  21 +-
 drivers/bus/pci/linux/pci_init.h    |   4 +
 drivers/bus/pci/linux/pci_vfio.c    |  52 +++-
 lib/eal/common/eal_common_config.c  |   6 +
 lib/eal/common/eal_common_options.c |  48 +++-
 lib/eal/common/eal_internal_cfg.h   |   1 +
 lib/eal/common/eal_options.h        |   2 +
 lib/eal/include/rte_eal.h           |  18 ++
 lib/eal/include/rte_iommufd.h       |  73 ++++++
 lib/eal/include/rte_vfio.h          |  55 ++++
 lib/eal/linux/eal.c                 |  22 ++
 lib/eal/linux/eal_iommufd.c         | 183 +++++++++++++
 lib/eal/linux/eal_iommufd.h         |  43 ++++
 lib/eal/linux/eal_vfio.h            |   3 +
 lib/eal/linux/eal_vfio_iommufd.c    | 385 ++++++++++++++++++++++++++++
 lib/eal/linux/meson.build           |   2 +
 lib/eal/version.map                 |   6 +
 20 files changed, 918 insertions(+), 11 deletions(-)
 create mode 100644 lib/eal/include/rte_iommufd.h
 create mode 100644 lib/eal/linux/eal_iommufd.c
 create mode 100644 lib/eal/linux/eal_iommufd.h
 create mode 100644 lib/eal/linux/eal_vfio_iommufd.c

-- 
2.34.1


Thread overview: 9+ messages
2023-12-22 19:44 beilei.xing [this message]
2023-12-22 19:44 ` [PATCH 1/4] iommufd: add IOMMUFD support beilei.xing
2023-12-22 19:44 ` [PATCH 2/4] vfio: add VFIO " beilei.xing
2023-12-22 17:17   ` Stephen Hemminger
2023-12-25  6:30     ` Xing, Beilei
2023-12-22 19:44 ` [PATCH 3/4] bus/pci: add VFIO CDEV support beilei.xing
2023-12-22 19:44 ` [PATCH 4/4] eal: add new args to choose VFIO mode beilei.xing
2023-12-22 17:17   ` Stephen Hemminger
2023-12-25  6:06     ` Xing, Beilei