DPDK patches and discussions
From: Kyo Liu <kyo.liu@nebula-matrix.com>
To: kyo.liu@nebula-matrix.com, dev@dpdk.org
Cc: Chenbo Xia <chenbox@nvidia.com>, Nipun Gupta <nipun.gupta@amd.com>
Subject: [PATCH v1 10/17] net/nbl: bus/pci: introduce get_iova_mode for pci dev
Date: Thu, 12 Jun 2025 08:58:31 +0000
Message-ID: <20250612085840.729830-11-kyo.liu@nebula-matrix.com>
In-Reply-To: <20250612085840.729830-1-kyo.liu@nebula-matrix.com>

This patch lets DPDK coexist with kernel drivers on regular NICs.
It adds a new callback, get_iova_mode, to rte_pci_driver, through
which the PCI bus retrieves the IOVA mode required by the vendor
driver. The callback is needed because the required mode depends
on the IOMMU configuration and operating mode. The scenarios are
analyzed below:

1. When IOMMU is enabled:
1.1 With PT (Pass-Through) enabled:
In this case the domain type is IOMMU_DOMAIN_IDENTITY, which
prevents the vendor driver from setting up IOVA->PA mapping tables.
DPDK must therefore use PA mode. To support this, the vendor kernel
driver registers a character device (cdev) to communicate with DPDK.
The cdev handles device operations (open, mmap, etc.) and ultimately
programs the hardware registers.

1.2 With PT disabled:
Here the vendor driver does not require a specific IOVA mode.
Our implementation integrates a mediated device (mdev) into the
vendor driver. The mdev interacts with DPDK and manages the
IOVA->PA mapping configuration.

2. When IOMMU is disabled:
The vendor driver requires PA mode, which matches DPDK's own PA mode
requirement in this scenario. A character device (cdev) is registered
for DPDK communication, as in case 1.1.
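
The three scenarios above can be condensed into a small decision
function. This is only a sketch for illustration: every name in it
(sketch_plan_for, SKETCH_IOVA_*, SKETCH_CHAN_*) is hypothetical and
exists neither in DPDK nor in the kernel, and mapping "no specific
IOVA mode requirement" to a "don't care" value is an assumption.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; these are not DPDK or kernel definitions. */
enum sketch_iova_mode { SKETCH_IOVA_DC, SKETCH_IOVA_PA, SKETCH_IOVA_VA };
enum sketch_channel { SKETCH_CHAN_CDEV, SKETCH_CHAN_MDEV };

struct sketch_plan {
	enum sketch_iova_mode mode;   /* IOVA mode the vendor driver asks for */
	enum sketch_channel channel;  /* how the vendor driver talks to DPDK */
};

/* Map the IOMMU configuration to the behavior described in the scenarios. */
static struct sketch_plan
sketch_plan_for(bool iommu_enabled, bool passthrough)
{
	struct sketch_plan plan;

	if (iommu_enabled && !passthrough) {
		/* Scenario 1.2: no mode constraint (assumed "don't care"),
		 * the mdev manages the IOVA->PA mappings. */
		plan.mode = SKETCH_IOVA_DC;
		plan.channel = SKETCH_CHAN_MDEV;
	} else {
		/* Scenario 1.1 (identity domain) and scenario 2 (no IOMMU):
		 * PA mode is required and a cdev is used. */
		plan.mode = SKETCH_IOVA_PA;
		plan.channel = SKETCH_CHAN_CDEV;
	}
	return plan;
}
```

For example, sketch_plan_for(true, false) selects the mdev path with
no mode constraint, while both other configurations select PA mode
with the cdev.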

Summary:
The solution combines several mechanisms:
- mdev for IOVA management when the IOMMU is enabled without
  pass-through.
- VFIO for device passthrough operations.
- cdev for register programming coordination.
- A new get_iova_mode callback in rte_pci_driver so that DPDK can
  determine the IOVA mode at run time.
This design lets DPDK and vendor kernel drivers coexist by giving
them defined communication channels under each IOMMU configuration.
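
To illustrate the precedence this patch introduces in the default
branch of pci_device_iova_mode(), here is a minimal, self-contained
sketch. It uses local mock types so it builds without DPDK headers:
all mock_* names, the flag value, and the "don't care" starting value
are assumptions made for illustration and are not the real DPDK
definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Mock stand-ins for the DPDK types touched by this patch. */
enum mock_iova_mode { MOCK_IOVA_DC = 0, MOCK_IOVA_PA, MOCK_IOVA_VA };
#define MOCK_DRV_NEED_IOVA_AS_VA 0x0040u /* arbitrary value for the sketch */

struct mock_pci_device { int unused; };
typedef int (mock_get_iova_mode_t)(const struct mock_pci_device *pdev);

struct mock_pci_driver {
	mock_get_iova_mode_t *get_iova_mode; /* optional, may be NULL */
	uint32_t drv_flags;
};

/* Hypothetical vendor callback that always asks for PA mode. */
static int
mock_vendor_get_iova_mode(const struct mock_pci_device *pdev)
{
	(void)pdev;
	return MOCK_IOVA_PA;
}

/* Mirrors only the default branch: the flag wins, then the callback. */
static enum mock_iova_mode
mock_default_iova_mode(const struct mock_pci_driver *pdrv,
		       const struct mock_pci_device *pdev)
{
	enum mock_iova_mode iova_mode = MOCK_IOVA_DC; /* assumed default */

	if ((pdrv->drv_flags & MOCK_DRV_NEED_IOVA_AS_VA) != 0)
		iova_mode = MOCK_IOVA_VA;
	else if (pdrv->get_iova_mode != NULL)
		iova_mode = (enum mock_iova_mode)pdrv->get_iova_mode(pdev);
	return iova_mode;
}
```

In this sketch a driver that sets the VA flag never has its callback
consulted, and a driver with neither flag nor callback leaves the
mode undecided, so existing drivers keep their current behavior.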

Motivation for the Patch:
This patch prepares for the upcoming open-source contribution of our
NebulaMatrix SNIC driver to DPDK. We want our SNIC to coexist
seamlessly with kernel drivers using this mechanism. The proposed
architecture, which combines dynamic IOVA mode negotiation through
the new callback, mediated devices (mdev), and character device
(cdev) interactions, lets our SNIC operate in hybrid environments
where both DPDK and kernel drivers may manage the same hardware.
This design aligns with DPDK's scalability goals and ensures
compatibility across diverse IOMMU configurations, which is critical
for real-world deployment scenarios.

Signed-off-by: Kyo Liu <kyo.liu@nebula-matrix.com>
---
 doc/guides/rel_notes/release_25_07.rst |  5 +++++
 drivers/bus/pci/bus_pci_driver.h       | 11 +++++++++++
 drivers/bus/pci/linux/pci.c            |  2 ++
 3 files changed, 18 insertions(+)

diff --git a/doc/guides/rel_notes/release_25_07.rst b/doc/guides/rel_notes/release_25_07.rst
index 9afc4520a6..a282b3e5a9 100644
--- a/doc/guides/rel_notes/release_25_07.rst
+++ b/doc/guides/rel_notes/release_25_07.rst
@@ -55,6 +55,11 @@ New Features
      Also, make sure to start the actual text at the margin.
      =======================================================
 
+* **Added get_iova_mode for rte_pci_driver.**
+
+  Added a ``get_iova_mode`` callback to ``rte_pci_driver`` so that PCI
+  drivers can report the IOVA mode they require to the PCI bus.
+
 * **Added PMU library.**
 
   Added a Performance Monitoring Unit (PMU) library which allows Linux applications
diff --git a/drivers/bus/pci/bus_pci_driver.h b/drivers/bus/pci/bus_pci_driver.h
index 2cc1119072..e5e36b0a5c 100644
--- a/drivers/bus/pci/bus_pci_driver.h
+++ b/drivers/bus/pci/bus_pci_driver.h
@@ -125,6 +125,16 @@ typedef int (pci_dma_map_t)(struct rte_pci_device *dev, void *addr,
 typedef int (pci_dma_unmap_t)(struct rte_pci_device *dev, void *addr,
 			      uint64_t iova, size_t len);
 
+/**
+ * Retrieve the IOVA mode required by the vendor driver.
+ *
+ * @param pdev
+ *   Pointer to the PCI device.
+ * @return
+ *   The required IOVA mode, as an enum rte_iova_mode value.
+ */
+typedef int (pci_get_iova_mode)(const struct rte_pci_device *pdev);
+
 /**
  * A structure describing a PCI driver.
  */
@@ -136,6 +146,7 @@ struct rte_pci_driver {
 	pci_dma_map_t *dma_map;		   /**< device dma map function. */
 	pci_dma_unmap_t *dma_unmap;	   /**< device dma unmap function. */
 	const struct rte_pci_id *id_table; /**< ID table, NULL terminated. */
+	pci_get_iova_mode *get_iova_mode;  /**< Device get iova_mode function */
 	uint32_t drv_flags;                /**< Flags RTE_PCI_DRV_*. */
 };
 
diff --git a/drivers/bus/pci/linux/pci.c b/drivers/bus/pci/linux/pci.c
index c20d159218..fd69a02989 100644
--- a/drivers/bus/pci/linux/pci.c
+++ b/drivers/bus/pci/linux/pci.c
@@ -624,6 +624,8 @@ pci_device_iova_mode(const struct rte_pci_driver *pdrv,
 	default:
 		if ((pdrv->drv_flags & RTE_PCI_DRV_NEED_IOVA_AS_VA) != 0)
 			iova_mode = RTE_IOVA_VA;
+		else if (pdrv->get_iova_mode != NULL)
+			iova_mode = pdrv->get_iova_mode(pdev);
 		break;
 	}
 	return iova_mode;
-- 
2.43.0

