From: Shahaf Shuler <shahafs@mellanox.com>
To: anatoly.burakov@intel.com, yskoh@mellanox.com,
thomas@monjalon.net, ferruh.yigit@intel.com,
nhorman@tuxdriver.com, gaetan.rivet@6wind.com
Cc: dev@dpdk.org
Subject: [dpdk-dev] [PATCH 3/6] bus: introduce DMA memory mapping for external memory
Date: Wed, 13 Feb 2019 11:10:23 +0200 [thread overview]
Message-ID: <323319abdbdc238c3586dafe9ad49dab554d6e64.1550048188.git.shahafs@mellanox.com> (raw)
In-Reply-To: <cover.1550048187.git.shahafs@mellanox.com>
In-Reply-To: <cover.1550048187.git.shahafs@mellanox.com>
The DPDK APIs expose 3 different modes to work with memory used for DMA:
1. Use the DPDK owned memory (backed by the DPDK provided hugepages).
This memory is allocated by the DPDK libraries, included in the DPDK
memory system (memseg lists) and automatically DMA mapped by the DPDK
layers.
2. Use memory allocated by the user and register to the DPDK memory
systems. This is also referred as external memory. Upon registration of
the external memory, the DPDK layers will DMA map it to all needed
devices.
3. Use memory allocated by the user and not registered to the DPDK memory
system. This is for users who wants to have tight control on this
memory. The user will need to explicitly call DMA map function in order
to register such memory to the different devices.
The scope of the patch focus on #3 above.
Currently the only way to map external memory is through VFIO
(rte_vfio_dma_map). While VFIO is common, there are other vendors
which use different ways to map memory (e.g. Mellanox and NXP).
The work in this patch moves the DMA mapping to vendor agnostic APIs.
A new map and unmap ops were added to rte_bus structure. Implementation
of those was done currently only on the PCI bus. The implementation takes
the driver map and umap implementation as bypass to the VFIO mapping.
That is, in case of no specific map/unmap from the PCI driver,
VFIO mapping, if possible, will be used.
Application use with those APIs is quite simple:
* allocate memory
* take a device, and query its rte_device.
* call the bus map function for this device.
Future work will deprecate the rte_vfio_dma_map and rte_vfio_dma_unmap
APIs, leaving the PCI device APIs as the preferred option for the user.
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
---
drivers/bus/pci/pci_common.c | 78 ++++++++++++++++++++++++++++
drivers/bus/pci/rte_bus_pci.h | 14 +++++
lib/librte_eal/common/eal_common_bus.c | 22 ++++++++
lib/librte_eal/common/include/rte_bus.h | 57 ++++++++++++++++++++
lib/librte_eal/rte_eal_version.map | 2 +
5 files changed, 173 insertions(+)
diff --git a/drivers/bus/pci/pci_common.c b/drivers/bus/pci/pci_common.c
index 6276e5d695..018080c48b 100644
--- a/drivers/bus/pci/pci_common.c
+++ b/drivers/bus/pci/pci_common.c
@@ -528,6 +528,82 @@ pci_unplug(struct rte_device *dev)
return ret;
}
+/**
+ * DMA Map memory segment to device. After a successful call the device
+ * will be able to read/write from/to this segment.
+ *
+ * @param dev
+ * Pointer to the PCI device.
+ * @param addr
+ * Starting virtual address of memory to be mapped.
+ * @param iova
+ * Starting IOVA address of memory to be mapped.
+ * @param len
+ * Length of memory segment being mapped.
+ * @return
+ * - 0 On success.
+ * - Negative value and rte_errno is set otherwise.
+ */
+static int __rte_experimental
+pci_dma_map(struct rte_device *dev, void *addr, uint64_t iova, size_t len)
+{
+ struct rte_pci_device *pdev = RTE_DEV_TO_PCI(dev);
+
+ if (!pdev || !pdev->driver) {
+ rte_errno = EINVAL;
+ return -rte_errno;
+ }
+ if (pdev->driver->map)
+ return pdev->driver->map(pdev, addr, iova, len);
+ /**
+ * In case driver don't provides any specific mapping
+ * try fallback to VFIO.
+ */
+ if (pdev->kdrv == RTE_KDRV_VFIO)
+ return rte_vfio_container_dma_map(-1, (uintptr_t)addr, iova,
+ len);
+ rte_errno = ENOTSUP;
+ return -rte_errno;
+}
+
+/**
+ * Un-map memory segment to device. After a successful call the device
+ * will not be able to read/write from/to this segment.
+ *
+ * @param dev
+ * Pointer to the PCI device.
+ * @param addr
+ * Starting virtual address of memory to be unmapped.
+ * @param iova
+ * Starting IOVA address of memory to be unmapped.
+ * @param len
+ * Length of memory segment being unmapped.
+ * @return
+ * - 0 On success.
+ * - Negative value and rte_errno is set otherwise.
+ */
+static int __rte_experimental
+pci_dma_unmap(struct rte_device *dev, void *addr, uint64_t iova, size_t len)
+{
+ struct rte_pci_device *pdev = RTE_DEV_TO_PCI(dev);
+
+ if (!pdev || !pdev->driver) {
+ rte_errno = EINVAL;
+ return -rte_errno;
+ }
+ if (pdev->driver->unmap)
+ return pdev->driver->unmap(pdev, addr, iova, len);
+ /**
+ * In case driver don't provides any specific mapping
+ * try fallback to VFIO.
+ */
+ if (pdev->kdrv == RTE_KDRV_VFIO)
+ return rte_vfio_container_dma_unmap(-1, (uintptr_t)addr, iova,
+ len);
+ rte_errno = ENOTSUP;
+ return -rte_errno;
+}
+
struct rte_pci_bus rte_pci_bus = {
.bus = {
.scan = rte_pci_scan,
@@ -536,6 +612,8 @@ struct rte_pci_bus rte_pci_bus = {
.plug = pci_plug,
.unplug = pci_unplug,
.parse = pci_parse,
+ .map = pci_dma_map,
+ .unmap = pci_dma_unmap,
.get_iommu_class = rte_pci_get_iommu_class,
.dev_iterate = rte_pci_dev_iterate,
.hot_unplug_handler = pci_hot_unplug_handler,
diff --git a/drivers/bus/pci/rte_bus_pci.h b/drivers/bus/pci/rte_bus_pci.h
index f0d6d81c00..00b2d412c7 100644
--- a/drivers/bus/pci/rte_bus_pci.h
+++ b/drivers/bus/pci/rte_bus_pci.h
@@ -114,6 +114,18 @@ typedef int (pci_probe_t)(struct rte_pci_driver *, struct rte_pci_device *);
typedef int (pci_remove_t)(struct rte_pci_device *);
/**
+ * Driver-specific DMA mapping.
+ */
+typedef int (pci_dma_map_t)(struct rte_pci_device *dev, void *addr,
+ uint64_t iova, size_t len);
+
+/**
+ * Driver-specific DMA unmapping.
+ */
+typedef int (pci_dma_unmap_t)(struct rte_pci_device *dev, void *addr,
+ uint64_t iova, size_t len);
+
+/**
* A structure describing a PCI driver.
*/
struct rte_pci_driver {
@@ -122,6 +134,8 @@ struct rte_pci_driver {
struct rte_pci_bus *bus; /**< PCI bus reference. */
pci_probe_t *probe; /**< Device Probe function. */
pci_remove_t *remove; /**< Device Remove function. */
+ pci_dma_map_t *map; /**< device dma map function. */
+ pci_dma_unmap_t *unmap; /**< device dma unmap function. */
const struct rte_pci_id *id_table; /**< ID table, NULL terminated. */
uint32_t drv_flags; /**< Flags RTE_PCI_DRV_*. */
};
diff --git a/lib/librte_eal/common/eal_common_bus.c b/lib/librte_eal/common/eal_common_bus.c
index c8f1901f0b..b7911d5ddd 100644
--- a/lib/librte_eal/common/eal_common_bus.c
+++ b/lib/librte_eal/common/eal_common_bus.c
@@ -285,3 +285,25 @@ rte_bus_sigbus_handler(const void *failure_addr)
return ret;
}
+
+int __rte_experimental
+rte_bus_dma_map(struct rte_device *dev, void *addr, uint64_t iova,
+ size_t len)
+{
+ if (dev->bus->map == NULL || len == 0) {
+ rte_errno = EINVAL;
+ return -rte_errno;
+ }
+ return dev->bus->map(dev, addr, iova, len);
+}
+
+int __rte_experimental
+rte_bus_dma_unmap(struct rte_device *dev, void *addr, uint64_t iova,
+ size_t len)
+{
+ if (dev->bus->unmap == NULL || len == 0) {
+ rte_errno = EINVAL;
+ return -rte_errno;
+ }
+ return dev->bus->unmap(dev, addr, iova, len);
+}
diff --git a/lib/librte_eal/common/include/rte_bus.h b/lib/librte_eal/common/include/rte_bus.h
index 6be4b5cabe..90e4bf51b2 100644
--- a/lib/librte_eal/common/include/rte_bus.h
+++ b/lib/librte_eal/common/include/rte_bus.h
@@ -168,6 +168,48 @@ typedef int (*rte_bus_unplug_t)(struct rte_device *dev);
typedef int (*rte_bus_parse_t)(const char *name, void *addr);
/**
+ * Bus specific DMA map function.
+ * After a successful call, the memory segment will be mapped to the
+ * given device.
+ *
+ * @param dev
+ * Device pointer.
+ * @param addr
+ * Virtual address to map.
+ * @param iova
+ * IOVA address to map.
+ * @param len
+ * Length of the memory segment being mapped.
+ *
+ * @return
+ * 0 if mapping was successful.
+ * Negative value and rte_errno is set otherwise.
+ */
+typedef int (*rte_bus_map_t)(struct rte_device *dev, void *addr,
+ uint64_t iova, size_t len);
+
+/**
+ * Bus specific DMA unmap function.
+ * After a successful call, the memory segment will no longer be
+ * accessible by the given device.
+ *
+ * @param dev
+ * Device pointer.
+ * @param addr
+ * Virtual address to unmap.
+ * @param iova
+ * IOVA address to unmap.
+ * @param len
+ * Length of the memory segment being mapped.
+ *
+ * @return
+ * 0 if un-mapping was successful.
+ * Negative value and rte_errno is set otherwise.
+ */
+typedef int (*rte_bus_unmap_t)(struct rte_device *dev, void *addr,
+ uint64_t iova, size_t len);
+
+/**
* Implement a specific hot-unplug handler, which is responsible for
* handle the failure when device be hot-unplugged. When the event of
* hot-unplug be detected, it could call this function to handle
@@ -238,6 +280,8 @@ struct rte_bus {
rte_bus_plug_t plug; /**< Probe single device for drivers */
rte_bus_unplug_t unplug; /**< Remove single device from driver */
rte_bus_parse_t parse; /**< Parse a device name */
+ rte_bus_map_t map; /**< DMA map for device in the bus */
+ rte_bus_unmap_t unmap; /**< DMA unmap for device in the bus */
struct rte_bus_conf conf; /**< Bus configuration */
rte_bus_get_iommu_class_t get_iommu_class; /**< Get iommu class */
rte_dev_iterate_t dev_iterate; /**< Device iterator. */
@@ -356,6 +400,19 @@ struct rte_bus *rte_bus_find_by_name(const char *busname);
enum rte_iova_mode rte_bus_get_iommu_class(void);
/**
+ * Wrapper to call the bus specific DMA map function.
+ */
+int __rte_experimental
+rte_bus_dma_map(struct rte_device *dev, void *addr, uint64_t iova, size_t len);
+
+/**
+ * Wrapper to call the bus specific DMA unmap function.
+ */
+int __rte_experimental
+rte_bus_dma_unmap(struct rte_device *dev, void *addr, uint64_t iova,
+ size_t len);
+
+/**
* Helper for Bus registration.
* The constructor has higher priority than PMD constructors.
*/
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index eb5f7b9cbd..23f3adb73a 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -364,4 +364,6 @@ EXPERIMENTAL {
rte_service_may_be_active;
rte_socket_count;
rte_socket_id_by_idx;
+ rte_bus_dma_map;
+ rte_bus_dma_unmap;
};
--
2.12.0
next prev parent reply other threads:[~2019-02-13 9:11 UTC|newest]
Thread overview: 79+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-02-13 9:10 [dpdk-dev] [PATCH 0/6] " Shahaf Shuler
2019-02-13 9:10 ` [dpdk-dev] [PATCH 1/6] vfio: allow DMA map of memory for the default vfio fd Shahaf Shuler
2019-02-13 9:45 ` Gaëtan Rivet
2019-02-13 11:38 ` Gaëtan Rivet
2019-02-13 15:23 ` Shahaf Shuler
2019-02-13 14:41 ` Burakov, Anatoly
2019-02-13 9:10 ` [dpdk-dev] [PATCH 2/6] vfio: don't fail to DMA map if memory is already mapped Shahaf Shuler
2019-02-13 9:58 ` Gaëtan Rivet
2019-02-13 19:52 ` Shahaf Shuler
2019-02-13 9:10 ` Shahaf Shuler [this message]
2019-02-13 11:17 ` [dpdk-dev] [PATCH 3/6] bus: introduce DMA memory mapping for external memory Gaëtan Rivet
2019-02-13 19:07 ` Shahaf Shuler
2019-02-14 14:00 ` Gaëtan Rivet
2019-02-17 6:23 ` Shahaf Shuler
2019-02-13 9:10 ` [dpdk-dev] [PATCH 4/6] net/mlx5: refactor external memory registration Shahaf Shuler
2019-02-13 9:10 ` [dpdk-dev] [PATCH 5/6] net/mlx5: support PCI device DMA map and unmap Shahaf Shuler
2019-02-13 11:35 ` Gaëtan Rivet
2019-02-13 11:44 ` Gaëtan Rivet
2019-02-13 19:11 ` Shahaf Shuler
2019-02-14 10:21 ` Gaëtan Rivet
2019-02-21 9:21 ` Shahaf Shuler
2019-02-13 9:10 ` [dpdk-dev] [PATCH 6/6] doc: deprecate VFIO DMA map APIs Shahaf Shuler
2019-02-13 11:43 ` [dpdk-dev] [PATCH 0/6] introduce DMA memory mapping for external memory Alejandro Lucero
2019-02-13 19:24 ` Shahaf Shuler
2019-02-14 10:19 ` Burakov, Anatoly
2019-02-14 13:28 ` Shahaf Shuler
2019-02-14 16:19 ` Burakov, Anatoly
2019-02-17 6:18 ` Shahaf Shuler
2019-02-18 12:21 ` Burakov, Anatoly
2019-02-14 12:22 ` Alejandro Lucero
2019-02-14 12:27 ` Alejandro Lucero
2019-02-14 13:41 ` Shahaf Shuler
2019-02-14 16:43 ` Burakov, Anatoly
2019-02-21 14:50 ` [dpdk-dev] [PATCH v2 " Shahaf Shuler
2019-03-05 13:59 ` [dpdk-dev] [PATCH v3 " Shahaf Shuler
2019-03-10 8:27 ` [dpdk-dev] [PATCH v4 " Shahaf Shuler
2019-03-10 8:27 ` [dpdk-dev] [PATCH v4 1/6] vfio: allow DMA map of memory for the default vfio fd Shahaf Shuler
2019-03-30 0:23 ` Thomas Monjalon
2019-03-30 0:23 ` Thomas Monjalon
2019-03-30 14:29 ` Thomas Monjalon
2019-03-30 14:29 ` Thomas Monjalon
2019-03-10 8:27 ` [dpdk-dev] [PATCH v4 2/6] vfio: don't fail to DMA map if memory is already mapped Shahaf Shuler
2019-03-10 8:28 ` [dpdk-dev] [PATCH v4 3/6] bus: introduce device level DMA memory mapping Shahaf Shuler
2019-03-11 10:19 ` Burakov, Anatoly
2019-03-13 9:56 ` Thomas Monjalon
2019-03-13 11:12 ` Shahaf Shuler
2019-03-13 11:19 ` Thomas Monjalon
2019-03-13 11:47 ` Burakov, Anatoly
2019-03-30 14:36 ` Thomas Monjalon
2019-03-30 14:36 ` Thomas Monjalon
2019-03-10 8:28 ` [dpdk-dev] [PATCH v4 4/6] net/mlx5: refactor external memory registration Shahaf Shuler
2019-03-10 8:28 ` [dpdk-dev] [PATCH v4 5/6] net/mlx5: support PCI device DMA map and unmap Shahaf Shuler
2019-03-10 8:28 ` [dpdk-dev] [PATCH v4 6/6] doc: deprecation notice for VFIO DMA map APIs Shahaf Shuler
2019-03-11 10:20 ` Burakov, Anatoly
2019-03-11 17:35 ` Rami Rosen
2019-10-01 15:20 ` David Marchand
2019-10-02 4:53 ` Shahaf Shuler
2019-10-02 7:51 ` David Marchand
2019-03-11 9:27 ` [dpdk-dev] [PATCH v4 0/6] introduce DMA memory mapping for external memory Gaëtan Rivet
2019-03-30 14:40 ` Thomas Monjalon
2019-03-30 14:40 ` Thomas Monjalon
2019-03-05 13:59 ` [dpdk-dev] [PATCH v3 1/6] vfio: allow DMA map of memory for the default vfio fd Shahaf Shuler
2019-03-05 13:59 ` [dpdk-dev] [PATCH v3 2/6] vfio: don't fail to DMA map if memory is already mapped Shahaf Shuler
2019-03-05 13:59 ` [dpdk-dev] [PATCH v3 3/6] bus: introduce device level DMA memory mapping Shahaf Shuler
2019-03-05 16:35 ` Burakov, Anatoly
2019-03-05 13:59 ` [dpdk-dev] [PATCH v3 4/6] net/mlx5: refactor external memory registration Shahaf Shuler
2019-03-05 13:59 ` [dpdk-dev] [PATCH v3 5/6] net/mlx5: support PCI device DMA map and unmap Shahaf Shuler
2019-03-05 13:59 ` [dpdk-dev] [PATCH v3 6/6] doc: deprecation notice for VFIO DMA map APIs Shahaf Shuler
2019-02-21 14:50 ` [dpdk-dev] [PATCH v2 1/6] vfio: allow DMA map of memory for the default vfio fd Shahaf Shuler
2019-02-28 11:56 ` Burakov, Anatoly
2019-02-21 14:50 ` [dpdk-dev] [PATCH v2 2/6] vfio: don't fail to DMA map if memory is already mapped Shahaf Shuler
2019-02-28 11:58 ` Burakov, Anatoly
2019-02-21 14:50 ` [dpdk-dev] [PATCH v2 3/6] bus: introduce device level DMA memory mapping Shahaf Shuler
2019-02-28 12:14 ` Burakov, Anatoly
2019-02-28 14:41 ` Burakov, Anatoly
2019-02-21 14:50 ` [dpdk-dev] [PATCH v2 4/6] net/mlx5: refactor external memory registration Shahaf Shuler
2019-02-21 14:50 ` [dpdk-dev] [PATCH v2 5/6] net/mlx5: support PCI device DMA map and unmap Shahaf Shuler
2019-02-21 14:50 ` [dpdk-dev] [PATCH v2 6/6] doc: deprecate VFIO DMA map APIs Shahaf Shuler
2019-02-21 15:50 ` David Marchand
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=323319abdbdc238c3586dafe9ad49dab554d6e64.1550048188.git.shahafs@mellanox.com \
--to=shahafs@mellanox.com \
--cc=anatoly.burakov@intel.com \
--cc=dev@dpdk.org \
--cc=ferruh.yigit@intel.com \
--cc=gaetan.rivet@6wind.com \
--cc=nhorman@tuxdriver.com \
--cc=thomas@monjalon.net \
--cc=yskoh@mellanox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).