DPDK patches and discussions
From: Anatoly Burakov <anatoly.burakov@intel.com>
To: dev@dpdk.org
Subject: [PATCH v2 00/19] Support VFIO cdev API in DPDK
Date: Fri, 14 Nov 2025 17:40:10 +0000
Message-ID: <cover.1763142007.git.anatoly.burakov@intel.com>
In-Reply-To: <cover.1761669438.git.anatoly.burakov@intel.com>

This patchset introduces a major refactor of the VFIO subsystem in DPDK to
support the character device (cdev) interface introduced in the Linux kernel,
and to make the API more streamlined and useful. The goal is to simplify device
management, improve compatibility, and clarify API responsibilities.

The following sections outline the key issues addressed by this patchset and the
corresponding changes introduced.

1. Only group mode is supported
===============================

Since kernel version 6.6, VFIO supports the new character device (cdev)-based
way of working with VFIO devices (otherwise known as IOMMUFD). This is a
device-centric mode: it does away with all the complexity regarding groups and
IOMMU types, delegating it to the kernel, and exposes a much simpler interface
to userspace.
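
To make the group-centric versus device-centric difference concrete, here is a
minimal sketch of which device nodes each mode opens. The node paths and the
ioctl names in the comments come from the kernel VFIO/IOMMUFD uAPI; the helper
names `group_node_path` and `cdev_node_path` are hypothetical and are not part
of this patchset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Group mode entry point: the shared VFIO container node. */
#define VFIO_CONTAINER_PATH "/dev/vfio/vfio"
/* cdev mode entry point: the IOMMUFD node. */
#define IOMMUFD_PATH "/dev/iommu"

/*
 * Group mode: the device is reached through its IOMMU group node.
 * Typical flow: open container, open group, VFIO_GROUP_SET_CONTAINER,
 * VFIO_SET_IOMMU, then VFIO_GROUP_GET_DEVICE_FD to obtain the device fd.
 */
static int
group_node_path(char *buf, size_t len, int group_id)
{
	assert(buf != NULL && len > 0);
	int n = snprintf(buf, len, "/dev/vfio/%d", group_id);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * cdev mode: each device has its own node (the vfioN name is listed under
 * the device's vfio-dev directory in sysfs). Opening it yields the device
 * fd directly; it is then bound with VFIO_DEVICE_BIND_IOMMUFD and attached
 * to an address space with VFIO_DEVICE_ATTACH_IOMMUFD_PT.
 */
static int
cdev_node_path(char *buf, size_t len, int dev_index)
{
	assert(buf != NULL && len > 0);
	int n = snprintf(buf, len, "/dev/vfio/devices/vfio%d", dev_index);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Both helpers return 0 on success and -1 if the buffer is too small. Note that
in cdev mode no group number is involved at any point.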

The old group interface is still around, and will need to be kept in DPDK both
for compatibility reasons and to support special cases (FSLMC bus, NBL driver,
etc.).

To enable this, VFIO is heavily refactored, so that the code can support both
modes while relying on (mostly) common infrastructure.

Note that the existing `rte_vfio_setup_device`/`rte_vfio_release_device` model
is fundamentally incompatible with cdev mode: for custom container cases, the
expected flow is that the user binds the IOMMU group (and thus, implicitly, the
device itself) to a specific container using `rte_vfio_container_group_bind`,
whereas this step is not needed for cdev, as the device fd is assigned to the
container straight away.

Therefore, we instead introduce a new API for container device assignment
which, semantically, assigns a device to the specified container, so that when
it is mapped using `rte_pci_map_device`, the appropriate container is selected.
Under the hood, though, we essentially transition to getting the device fd
straight away at the assign stage, so that by the time the PCI bus attempts to
map the device, it is already mapped and we just return an fd.
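
The "assign early, look up later" semantics can be illustrated with a toy
model. This is only a sketch: all names (`sketch_container_assign_device`,
`sketch_map_device`, the table layout) are hypothetical and are not the
functions added by this patchset, and a counter stands in for the device fd
that the kernel would return.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_DEVICES 8

struct sketch_device {
	char addr[32];    /* device address, e.g. a PCI BDF string */
	int container_fd; /* container chosen at assign time */
	int device_fd;    /* device fd obtained at assign time */
	int in_use;
};

static struct sketch_device sketch_devices[SKETCH_MAX_DEVICES];
static int sketch_next_fd = 100; /* stand-in for an fd from the kernel */

static struct sketch_device *
sketch_find(const char *addr)
{
	for (size_t i = 0; i < SKETCH_MAX_DEVICES; i++) {
		if (sketch_devices[i].in_use &&
				strcmp(sketch_devices[i].addr, addr) == 0)
			return &sketch_devices[i];
	}
	return NULL;
}

/* Assign stage: record the container and obtain the device fd right away. */
static int
sketch_container_assign_device(int container_fd, const char *addr)
{
	assert(addr != NULL);
	if (strlen(addr) >= sizeof(sketch_devices[0].addr) ||
			sketch_find(addr) != NULL)
		return -1; /* address too long or already assigned */
	for (size_t i = 0; i < SKETCH_MAX_DEVICES; i++) {
		struct sketch_device *dev = &sketch_devices[i];
		if (dev->in_use)
			continue;
		strcpy(dev->addr, addr); /* length checked above */
		dev->container_fd = container_fd;
		dev->device_fd = sketch_next_fd++;
		dev->in_use = 1;
		return dev->device_fd;
	}
	return -1; /* table full */
}

/* Map stage: no setup happens here, the stored fd is simply returned. */
static int
sketch_map_device(const char *addr, int *container_fd)
{
	assert(addr != NULL && container_fd != NULL);
	struct sketch_device *dev = sketch_find(addr);
	if (dev == NULL)
		return -1;
	*container_fd = dev->container_fd;
	return dev->device_fd;
}
```

In this model the map step never needs to know about groups, so the same
lookup works regardless of how the assign step obtained the fd.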

Additionally, a new `rte_vfio_get_mode` API is added for those cases that need
some introspection into VFIO's internals, with three new modes: group
(old-style), no-IOMMU (old-style but without IOMMU), and cdev (the new mode).
Although no-IOMMU is technically a variant of group mode, the distinction is
largely irrelevant to the user, as all no-IOMMU checks in our codebase are for
deciding whether to use IOVA or PA, and have nothing to do with managing
groups. The kernel community's current plan is to *not* introduce a no-IOMMU
cdev implementation, which is why no-IOMMU mode will be kept for compatibility
with these use cases.
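
As an illustration of how callers might consume such a mode value, consider the
sketch below. The enum and its constant names are hypothetical (this cover
letter names the three modes but not their identifiers), and the real
`rte_vfio_get_mode` signature is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical identifiers for the three modes listed in the cover letter. */
enum sketch_vfio_mode {
	SKETCH_VFIO_MODE_GROUP,   /* old-style group interface */
	SKETCH_VFIO_MODE_NOIOMMU, /* group interface without an IOMMU */
	SKETCH_VFIO_MODE_CDEV,    /* new device-centric cdev interface */
};

/*
 * The only question most callers ask: must DMA addresses be physical
 * addresses (PA) because no IOMMU translates IOVAs for the device?
 */
static bool
sketch_needs_physical_addresses(enum sketch_vfio_mode mode)
{
	/* Catch values outside the three modes named above. */
	assert(mode == SKETCH_VFIO_MODE_GROUP ||
			mode == SKETCH_VFIO_MODE_NOIOMMU ||
			mode == SKETCH_VFIO_MODE_CDEV);
	return mode == SKETCH_VFIO_MODE_NOIOMMU;
}

/* No-IOMMU is technically a variant of group mode. */
static bool
sketch_uses_group_interface(enum sketch_vfio_mode mode)
{
	return mode == SKETCH_VFIO_MODE_GROUP ||
			mode == SKETCH_VFIO_MODE_NOIOMMU;
}
```

Keeping no-IOMMU as its own value lets the address-mode check stay a single
comparison, without the caller having to inspect group state.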

There were other users of VFIO which relied on the group API, but only for
convenience; no actual VFIO functionality depended on those APIs. Therefore,
the group APIs are removed and, where appropriate, replaced with the new APIs.

List of removed APIs:

* `rte_vfio_get_group_fd`
* `rte_vfio_clear_group`
* `rte_vfio_container_group_bind`
* `rte_vfio_container_group_unbind`
* `rte_vfio_noiommu_is_enabled` (replaced by new mode API)

2. The API responsibilities aren't clear and bleed into each other
==================================================================

Some APIs do multiple things at once. In particular:

* `rte_vfio_get_device_info` will set up the device
* `rte_vfio_setup_device` will get device info

These APIs have been adjusted to do one thing only.

v2:
- Make the entire API internal
- More aggressive API pruning, complete removal of group API
- Fixed a bug in group mode where device could not be used
- Better documentation and deprecation notice patches
- Moved doc patches to beginning of patchset

Anatoly Burakov (19):
  doc: add deprecation notice for VFIO API
  doc: add deprecation notice for vDPA driver API
  uapi: update to v6.17 and add iommufd.h
  vfio: make all functions internal
  vfio: add container device assignment API
  vfio: split get device info from setup
  net/nbl: do not use VFIO group bind API
  net/ntnic: use container device assignment API
  vdpa/ifc: use container device assignment API
  vdpa/nfp: use container device assignment API
  vdpa/sfc: use container device assignment API
  vhost: remove group-related API from drivers
  vfio: cleanup and refactor
  bus/pci: use the new VFIO mode API
  bus/fslmc: use the new VFIO mode API
  net/hinic3: use the new VFIO mode API
  net/ntnic: use the new VFIO mode API
  vfio: remove group API functions
  vfio: introduce cdev mode

 config/arm/meson.build                    |    1 +
 config/meson.build                        |    1 +
 doc/guides/prog_guide/vhost_lib.rst       |    4 -
 doc/guides/rel_notes/deprecation.rst      |    7 +
 drivers/bus/cdx/cdx_vfio.c                |   21 +-
 drivers/bus/fslmc/fslmc_bus.c             |   10 +-
 drivers/bus/fslmc/fslmc_vfio.c            |    2 +-
 drivers/bus/pci/linux/pci.c               |    2 +-
 drivers/bus/pci/linux/pci_vfio.c          |   29 +-
 drivers/bus/platform/platform.c           |    8 +-
 drivers/crypto/bcmfs/bcmfs_vfio.c         |   14 +-
 drivers/net/hinic3/base/hinic3_hwdev.c    |    2 +-
 drivers/net/nbl/nbl_common/nbl_userdev.c  |   18 +-
 drivers/net/nbl/nbl_include/nbl_include.h |    1 +
 drivers/net/ntnic/ntnic_ethdev.c          |    2 +-
 drivers/net/ntnic/ntnic_vfio.c            |   30 +-
 drivers/vdpa/ifc/ifcvf_vdpa.c             |   34 +-
 drivers/vdpa/mlx5/mlx5_vdpa.c             |    1 -
 drivers/vdpa/nfp/nfp_vdpa.c               |   37 +-
 drivers/vdpa/sfc/sfc_vdpa.c               |   39 +-
 drivers/vdpa/sfc/sfc_vdpa.h               |    2 -
 kernel/linux/uapi/linux/iommufd.h         | 1292 +++++++++++
 kernel/linux/uapi/linux/vduse.h           |    2 +-
 kernel/linux/uapi/linux/vfio.h            |   12 +-
 kernel/linux/uapi/version                 |    2 +-
 lib/eal/freebsd/eal.c                     |   98 +-
 lib/eal/include/rte_vfio.h                |  386 ++--
 lib/eal/linux/eal_vfio.c                  | 2439 ++++++++-------------
 lib/eal/linux/eal_vfio.h                  |  167 +-
 lib/eal/linux/eal_vfio_cdev.c             |  389 ++++
 lib/eal/linux/eal_vfio_group.c            |  983 +++++++++
 lib/eal/linux/eal_vfio_mp_sync.c          |   80 +-
 lib/eal/linux/meson.build                 |    2 +
 lib/eal/windows/eal.c                     |    4 +-
 lib/vhost/vdpa_driver.h                   |    3 -
 35 files changed, 4263 insertions(+), 1861 deletions(-)
 create mode 100644 kernel/linux/uapi/linux/iommufd.h
 create mode 100644 lib/eal/linux/eal_vfio_cdev.c
 create mode 100644 lib/eal/linux/eal_vfio_group.c

-- 
2.47.3


Thread overview: 35+ messages
2025-10-28 16:43 [PATCH v1 0/8] " Anatoly Burakov
2025-10-28 16:43 ` [PATCH v1 1/8] uapi: update to v6.17 and add iommufd.h Anatoly Burakov
2025-10-28 16:43 ` [PATCH v1 2/8] vfio: add container device assignment API Anatoly Burakov
2025-10-28 16:43 ` [PATCH v1 3/8] vhost: remove group-related API from drivers Anatoly Burakov
2025-10-28 16:43 ` [PATCH v1 4/8] vfio: do not setup the device on get device info Anatoly Burakov
2025-10-28 16:43 ` [PATCH v1 5/8] vfio: cleanup and refactor Anatoly Burakov
2025-10-28 16:43 ` [PATCH v1 6/8] vfio: introduce cdev mode Anatoly Burakov
2025-10-28 16:43 ` [PATCH v1 7/8] doc: deprecate VFIO group-based APIs Anatoly Burakov
2025-10-28 16:43 ` [PATCH v1 8/8] vfio: deprecate group-based API Anatoly Burakov
2025-10-29  9:50 ` Re: [PATCH v1 0/8] Support VFIO cdev API in DPDK Dimon
2025-10-29 12:03   ` Burakov, Anatoly
2025-10-30  9:21 ` [PATCH " David Marchand
2025-10-30 10:11   ` Burakov, Anatoly
2025-11-14 17:40 ` Anatoly Burakov [this message]
2025-11-14 17:40   ` [PATCH v2 01/19] doc: add deprecation notice for VFIO API Anatoly Burakov
2025-11-14 17:40   ` [PATCH v2 02/19] doc: add deprecation notice for vDPA driver API Anatoly Burakov
2025-11-14 17:40   ` [PATCH v2 03/19] uapi: update to v6.17 and add iommufd.h Anatoly Burakov
2025-11-14 17:40   ` [PATCH v2 04/19] vfio: make all functions internal Anatoly Burakov
2025-11-14 18:18     ` Stephen Hemminger
2025-11-14 17:40   ` [PATCH v2 05/19] vfio: add container device assignment API Anatoly Burakov
2025-11-14 17:40   ` [PATCH v2 06/19] vfio: split get device info from setup Anatoly Burakov
2025-11-14 17:40   ` [PATCH v2 07/19] net/nbl: do not use VFIO group bind API Anatoly Burakov
2025-11-15  8:31     ` Re: [PATCH " Dimon
2025-11-14 17:40   ` [PATCH v2 08/19] net/ntnic: use container device assignment API Anatoly Burakov
2025-11-14 17:40   ` [PATCH v2 09/19] vdpa/ifc: " Anatoly Burakov
2025-11-14 17:40   ` [PATCH v2 10/19] vdpa/nfp: " Anatoly Burakov
2025-11-14 17:40   ` [PATCH v2 11/19] vdpa/sfc: " Anatoly Burakov
2025-11-14 17:40   ` [PATCH v2 12/19] vhost: remove group-related API from drivers Anatoly Burakov
2025-11-14 17:40   ` [PATCH v2 13/19] vfio: cleanup and refactor Anatoly Burakov
2025-11-14 17:40   ` [PATCH v2 14/19] bus/pci: use the new VFIO mode API Anatoly Burakov
2025-11-14 17:40   ` [PATCH v2 15/19] bus/fslmc: " Anatoly Burakov
2025-11-14 17:40   ` [PATCH v2 16/19] net/hinic3: " Anatoly Burakov
2025-11-14 17:40   ` [PATCH v2 17/19] net/ntnic: " Anatoly Burakov
2025-11-14 17:40   ` [PATCH v2 18/19] vfio: remove group API functions Anatoly Burakov
2025-11-14 17:40   ` [PATCH v2 19/19] vfio: introduce cdev mode Anatoly Burakov
