* [dpdk-dev] [PATCH v4 01/12] eal/pci: introduce PCI driver iova as va flag
2017-07-18 5:59 ` [dpdk-dev] [PATCH v4 00/12] Infrastructure to detect iova mapping on the bus Santosh Shukla
@ 2017-07-18 5:59 ` Santosh Shukla
2017-07-18 5:59 ` [dpdk-dev] [PATCH v4 02/12] eal/pci: export match function Santosh Shukla
` (12 subsequent siblings)
13 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-07-18 5:59 UTC (permalink / raw)
To: thomas, dev
Cc: bruce.richardson, jerin.jacob, hemant.agrawal, shreyansh.jain,
gaetan.rivet, sergio.gonzalez.monroy, anatoly.burakov, stephen,
maxime.coquelin, olivier.matz, Santosh Shukla
Introduce the RTE_PCI_DRV_IOVA_AS_VA flag. A driver sets this flag when
it needs to operate in iova=va mode.
Why does a driver need iova=va mapping?
On NPU-style co-processors like Octeontx, buffer recycling is done in
HW, unlike the SW model. Here is the data flow:
1) On the control path, fill the HW mempool with buffers (iova as pa address).
2) On rx_burst, HW gives you an IOVA address (iova as pa address).
3) As the application expects a VA to operate on, rx_burst() needs to
convert from _pa to _va, which is very expensive.
Instead, with iova as va mapping, we can avoid the cost of this
conversion with the help of the IOMMU/SMMU.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
---
v3 --> v4:
- Renamed RTE_PCI_DRV_NEED_IOVA_VA to RTE_PCI_DRV_IOVA_AS_VA.
(Suggested by Maxime)
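For illustration only (not part of the patch): a minimal sketch of how
code could test the new flag. The flag values are copied from the
rte_pci.h hunk in this patch; struct demo_pci_driver and
demo_drv_wants_iova_va() are hypothetical stand-ins, not DPDK API.

```c
#include <stdint.h>

/* Flag values as they appear in the rte_pci.h hunk of this patch. */
#define RTE_PCI_DRV_INTR_RMV        0x0010
#define RTE_PCI_DRV_KEEP_MAPPED_RES 0x0020
#define RTE_PCI_DRV_IOVA_AS_VA      0x0040

/* Hypothetical, trimmed-down descriptor (not the real rte_pci_driver). */
struct demo_pci_driver {
	const char *name;
	uint32_t drv_flags;
};

/* Return 1 if the driver advertises iova=va support, 0 otherwise. */
int
demo_drv_wants_iova_va(const struct demo_pci_driver *drv)
{
	return (drv->drv_flags & RTE_PCI_DRV_IOVA_AS_VA) != 0;
}
```

Since each flag occupies its own bit, a driver can OR this flag with
the existing ones without affecting them.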
lib/librte_eal/common/include/rte_pci.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index 8b123391c..743392f91 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -202,6 +202,8 @@ struct rte_pci_bus {
#define RTE_PCI_DRV_INTR_RMV 0x0010
/** Device driver needs to keep mapped resources if unsupported dev detected */
#define RTE_PCI_DRV_KEEP_MAPPED_RES 0x0020
+/** Device driver supports iova as va */
+#define RTE_PCI_DRV_IOVA_AS_VA 0x0040
/**
* A structure describing a PCI mapping.
--
2.11.0
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v4 02/12] eal/pci: export match function
2017-07-18 5:59 ` [dpdk-dev] [PATCH v4 00/12] Infrastructure to detect iova mapping on the bus Santosh Shukla
2017-07-18 5:59 ` [dpdk-dev] [PATCH v4 01/12] eal/pci: introduce PCI driver iova as va flag Santosh Shukla
@ 2017-07-18 5:59 ` Santosh Shukla
2017-07-18 5:59 ` [dpdk-dev] [PATCH v4 03/12] eal/pci: get iommu class Santosh Shukla
` (11 subsequent siblings)
13 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-07-18 5:59 UTC (permalink / raw)
To: thomas, dev
Cc: bruce.richardson, jerin.jacob, hemant.agrawal, shreyansh.jain,
gaetan.rivet, sergio.gonzalez.monroy, anatoly.burakov, stephen,
maxime.coquelin, olivier.matz, Santosh Shukla
Export the rte_pci_match() function, as it is needed in a follow-up patch.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
lib/librte_eal/bsdapp/eal/rte_eal_version.map | 1 +
lib/librte_eal/common/eal_common_pci.c | 10 +---------
lib/librte_eal/common/include/rte_pci.h | 15 +++++++++++++++
lib/librte_eal/linuxapp/eal/rte_eal_version.map | 1 +
4 files changed, 18 insertions(+), 9 deletions(-)
diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index 480ad234c..e81cbb286 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -200,6 +200,7 @@ DPDK_17.08 {
rte_bus_find;
rte_bus_find_by_device;
rte_bus_find_by_name;
+ rte_pci_match;
} DPDK_17.05;
diff --git a/lib/librte_eal/common/eal_common_pci.c b/lib/librte_eal/common/eal_common_pci.c
index 76bbcc853..8b6ecebd6 100644
--- a/lib/librte_eal/common/eal_common_pci.c
+++ b/lib/librte_eal/common/eal_common_pci.c
@@ -128,16 +128,8 @@ pci_unmap_resource(void *requested_addr, size_t size)
/*
* Match the PCI Driver and Device using the ID Table
- *
- * @param pci_drv
- * PCI driver from which ID table would be extracted
- * @param pci_dev
- * PCI device to match against the driver
- * @return
- * 1 for successful match
- * 0 for unsuccessful match
*/
-static int
+int
rte_pci_match(const struct rte_pci_driver *pci_drv,
const struct rte_pci_device *pci_dev)
{
diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index 743392f91..47f0532e4 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -368,6 +368,21 @@ int rte_pci_scan(void);
int
rte_pci_probe(void);
+/*
+ * Match the PCI Driver and Device using the ID Table
+ *
+ * @param pci_drv
+ * PCI driver from which ID table would be extracted
+ * @param pci_dev
+ * PCI device to match against the driver
+ * @return
+ * 1 for successful match
+ * 0 for unsuccessful match
+ */
+int
+rte_pci_match(const struct rte_pci_driver *pci_drv,
+ const struct rte_pci_device *pci_dev);
+
/**
* Map the PCI device resources in user space virtual memory address
*
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index fbaec39f7..a69bbb599 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -205,6 +205,7 @@ DPDK_17.08 {
rte_bus_find;
rte_bus_find_by_device;
rte_bus_find_by_name;
+ rte_pci_match;
} DPDK_17.05;
--
2.11.0
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v4 03/12] eal/pci: get iommu class
2017-07-18 5:59 ` [dpdk-dev] [PATCH v4 00/12] Infrastructure to detect iova mapping on the bus Santosh Shukla
2017-07-18 5:59 ` [dpdk-dev] [PATCH v4 01/12] eal/pci: introduce PCI driver iova as va flag Santosh Shukla
2017-07-18 5:59 ` [dpdk-dev] [PATCH v4 02/12] eal/pci: export match function Santosh Shukla
@ 2017-07-18 5:59 ` Santosh Shukla
2017-07-18 5:59 ` [dpdk-dev] [PATCH v4 04/12] bsdapp/eal_pci: " Santosh Shukla
` (10 subsequent siblings)
13 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-07-18 5:59 UTC (permalink / raw)
To: thomas, dev
Cc: bruce.richardson, jerin.jacob, hemant.agrawal, shreyansh.jain,
gaetan.rivet, sergio.gonzalez.monroy, anatoly.burakov, stephen,
maxime.coquelin, olivier.matz, Santosh Shukla
Introduce the rte_pci_get_iommu_class() API, which gets the iommu class
of the PCI devices on the bus and returns the preferred iova mapping
mode for the PCI bus.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
---
v3 --> v4:
- Created a separate patch per suggestion from Maxime.
Initially thought to squash this patch into [01/12], but
then [01/12] would have more context, so decided to
keep it as a separate patch.
lib/librte_eal/common/include/rte_bus.h | 10 ++++++++++
lib/librte_eal/common/include/rte_pci.h | 11 +++++++++++
2 files changed, 21 insertions(+)
diff --git a/lib/librte_eal/common/include/rte_bus.h b/lib/librte_eal/common/include/rte_bus.h
index af9f0e13f..e06084253 100644
--- a/lib/librte_eal/common/include/rte_bus.h
+++ b/lib/librte_eal/common/include/rte_bus.h
@@ -55,6 +55,16 @@ extern "C" {
/** Double linked list of buses */
TAILQ_HEAD(rte_bus_list, rte_bus);
+
+/**
+ * IOVA mapping mode.
+ */
+enum rte_iova_mode {
+ RTE_IOVA_DC = 0, /* Don't care mode */
+ RTE_IOVA_PA = (1 << 0),
+ RTE_IOVA_VA = (1 << 1)
+};
+
/**
* Bus specific scan for devices attached on the bus.
* For each bus object, the scan would be responsible for finding devices and
diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index 47f0532e4..a67d77f22 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -383,6 +383,17 @@ int
rte_pci_match(const struct rte_pci_driver *pci_drv,
const struct rte_pci_device *pci_dev);
+
+/**
+ * Get iommu class of PCI devices on the bus.
+ * And return their preferred iova mapping mode.
+ *
+ * @return
+ * - enum rte_iova_mode.
+ */
+enum rte_iova_mode
+rte_pci_get_iommu_class(void);
+
/**
* Map the PCI device resources in user space virtual memory address
*
--
2.11.0
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v4 04/12] bsdapp/eal_pci: get iommu class
2017-07-18 5:59 ` [dpdk-dev] [PATCH v4 00/12] Infrastructure to detect iova mapping on the bus Santosh Shukla
` (2 preceding siblings ...)
2017-07-18 5:59 ` [dpdk-dev] [PATCH v4 03/12] eal/pci: get iommu class Santosh Shukla
@ 2017-07-18 5:59 ` Santosh Shukla
2017-07-18 5:59 ` [dpdk-dev] [PATCH v4 05/12] linuxapp/eal_pci: " Santosh Shukla
` (9 subsequent siblings)
13 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-07-18 5:59 UTC (permalink / raw)
To: thomas, dev
Cc: bruce.richardson, jerin.jacob, hemant.agrawal, shreyansh.jain,
gaetan.rivet, sergio.gonzalez.monroy, anatoly.burakov, stephen,
maxime.coquelin, olivier.matz, Santosh Shukla
The bsdapp case returns the default iova mode (RTE_IOVA_PA).
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
---
v3 --> v4:
- Removed the rte_pci_get_iommu_class api declaration. It now
sits in a separate patch [03/12].
lib/librte_eal/bsdapp/eal/eal_pci.c | 10 ++++++++++
lib/librte_eal/bsdapp/eal/rte_eal_version.map | 1 +
2 files changed, 11 insertions(+)
diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c b/lib/librte_eal/bsdapp/eal/eal_pci.c
index dcb3b51ad..965255f79 100644
--- a/lib/librte_eal/bsdapp/eal/eal_pci.c
+++ b/lib/librte_eal/bsdapp/eal/eal_pci.c
@@ -403,6 +403,16 @@ rte_pci_scan(void)
return -1;
}
+/*
+ * Get iommu class of pci devices on the bus.
+ */
+enum rte_iova_mode
+rte_pci_get_iommu_class(void)
+{
+ /* Supports only RTE_KDRV_NIC_UIO */
+ return RTE_IOVA_PA;
+}
+
int
pci_update_device(const struct rte_pci_addr *addr)
{
diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index e81cbb286..4b25318be 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -201,6 +201,7 @@ DPDK_17.08 {
rte_bus_find_by_device;
rte_bus_find_by_name;
rte_pci_match;
+ rte_pci_get_iommu_class;
} DPDK_17.05;
--
2.11.0
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v4 05/12] linuxapp/eal_pci: get iommu class
2017-07-18 5:59 ` [dpdk-dev] [PATCH v4 00/12] Infrastructure to detect iova mapping on the bus Santosh Shukla
` (3 preceding siblings ...)
2017-07-18 5:59 ` [dpdk-dev] [PATCH v4 04/12] bsdapp/eal_pci: " Santosh Shukla
@ 2017-07-18 5:59 ` Santosh Shukla
2017-07-18 10:55 ` Hemant Agrawal
2017-07-18 5:59 ` [dpdk-dev] [PATCH v4 06/12] bus: " Santosh Shukla
` (8 subsequent siblings)
13 siblings, 1 reply; 248+ messages in thread
From: Santosh Shukla @ 2017-07-18 5:59 UTC (permalink / raw)
To: thomas, dev
Cc: bruce.richardson, jerin.jacob, hemant.agrawal, shreyansh.jain,
gaetan.rivet, sergio.gonzalez.monroy, anatoly.burakov, stephen,
maxime.coquelin, olivier.matz, Santosh Shukla
Get the iommu class of the PCI devices on the bus and return the
preferred iova mapping mode for that bus.
Algorithm for iova scheme selection for the PCI bus:
0. If no device is bound, return the RTE_IOVA_DC mapping mode;
else go to 1).
1. Look for a device attached to the vfio kdrv whose driver has
.drv_flags set to RTE_PCI_DRV_IOVA_AS_VA.
2. Look for any device attached to a UIO class of driver.
3. Check whether vfio-noiommu mode is enabled.
If 2) and 3) are false and 1) is true, then select
RTE_IOVA_VA as the mapping scheme. Otherwise use the default
mapping scheme (RTE_IOVA_PA).
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
---
v3 --> v4 :
- Reworded WARNING message (suggested by Maxime)
- Added pci_device_is_bound func to check for no device case
(suggested by Hemant).
- Added ifdef vfio_present.
v1 --> v2:
- Removed Linux version check in vfio_noiommu func. Refer [1].
- Extending autodetection logic for _iommu_class.
Refer [2].
[1] https://www.mail-archive.com/dev@dpdk.org/msg70108.html
[2] https://www.mail-archive.com/dev@dpdk.org/msg70279.html
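For illustration only (not part of the patch): the selection algorithm
from the commit message, reduced to a pure function over the four
conditions. The enum values mirror rte_bus.h from [03/12];
demo_pci_select_iova_mode() and demo_pci_select_selfcheck() are
hypothetical names, and the real rte_pci_get_iommu_class() derives the
booleans by walking the PCI bus instead of taking them as arguments.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors enum rte_iova_mode as introduced in [03/12]. */
enum rte_iova_mode {
	RTE_IOVA_DC = 0,	/* Don't care mode */
	RTE_IOVA_PA = (1 << 0),
	RTE_IOVA_VA = (1 << 1)
};

/* Hypothetical helper: the decision table only, without any bus walk. */
enum rte_iova_mode
demo_pci_select_iova_mode(bool is_bound, bool has_iova_va,
		bool is_bound_uio, bool is_vfio_noiommu_enabled)
{
	/* Step 0: nothing bound, so the bus has no preference. */
	if (!is_bound)
		return RTE_IOVA_DC;

	/* 1) true, 2) and 3) false: iova as va can be used. */
	if (has_iova_va && !is_bound_uio && !is_vfio_noiommu_enabled)
		return RTE_IOVA_VA;

	/* Every other combination falls back to the default. */
	return RTE_IOVA_PA;
}

/* Walk through the interesting rows of the table. */
void
demo_pci_select_selfcheck(void)
{
	assert(demo_pci_select_iova_mode(false, true, false, false) == RTE_IOVA_DC);
	assert(demo_pci_select_iova_mode(true, true, false, false) == RTE_IOVA_VA);
	assert(demo_pci_select_iova_mode(true, true, true, false) == RTE_IOVA_PA);
	assert(demo_pci_select_iova_mode(true, true, false, true) == RTE_IOVA_PA);
	assert(demo_pci_select_iova_mode(true, false, false, false) == RTE_IOVA_PA);
}
```

A single UIO-bound device or an enabled vfio-noiommu mode is enough to
force RTE_IOVA_PA, even when some driver asks for iova as va.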
lib/librte_eal/linuxapp/eal/eal_pci.c | 95 +++++++++++++++++++++++++
lib/librte_eal/linuxapp/eal/eal_vfio.c | 19 +++++
lib/librte_eal/linuxapp/eal/eal_vfio.h | 4 ++
lib/librte_eal/linuxapp/eal/rte_eal_version.map | 1 +
4 files changed, 119 insertions(+)
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c b/lib/librte_eal/linuxapp/eal/eal_pci.c
index 7d9e1a99b..ecd946250 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -45,6 +45,7 @@
#include "eal_filesystem.h"
#include "eal_private.h"
#include "eal_pci_init.h"
+#include "eal_vfio.h"
/**
* @file
@@ -488,6 +489,100 @@ rte_pci_scan(void)
return -1;
}
+/*
+ * Is pci device bound to any kdrv
+ */
+static inline int
+pci_device_is_bound(void)
+{
+ struct rte_pci_device *dev = NULL;
+ int ret = 0;
+
+ FOREACH_DEVICE_ON_PCIBUS(dev) {
+ if (dev->kdrv == RTE_KDRV_UNKNOWN ||
+ dev->kdrv == RTE_KDRV_NONE) {
+ continue;
+ } else {
+ ret = 1;
+ break;
+ }
+ }
+ return ret;
+}
+
+/*
+ * Any one of the device bound to uio
+ */
+static inline int
+pci_device_bound_uio(void)
+{
+ struct rte_pci_device *dev = NULL;
+
+ FOREACH_DEVICE_ON_PCIBUS(dev) {
+ if (dev->kdrv == RTE_KDRV_IGB_UIO ||
+ dev->kdrv == RTE_KDRV_UIO_GENERIC) {
+ return 1;
+ }
+ }
+ return 0;
+}
+
+/*
+ * Any one of the device has iova as va
+ */
+static inline int
+pci_device_has_iova_va(void)
+{
+ struct rte_pci_device *dev = NULL;
+ struct rte_pci_driver *drv = NULL;
+
+ FOREACH_DRIVER_ON_PCIBUS(drv) {
+ if (drv && drv->drv_flags & RTE_PCI_DRV_IOVA_AS_VA) {
+ FOREACH_DEVICE_ON_PCIBUS(dev) {
+ if (dev->kdrv == RTE_KDRV_VFIO &&
+ rte_pci_match(drv, dev))
+ return 1;
+ }
+ }
+ }
+ return 0;
+}
+
+/*
+ * Get iommu class of PCI devices on the bus.
+ */
+enum rte_iova_mode
+rte_pci_get_iommu_class(void)
+{
+ bool is_bound;
+ bool is_vfio_noiommu_enabled = true;
+ bool has_iova_va;
+ bool is_bound_uio;
+
+ is_bound = pci_device_is_bound();
+ if (!is_bound)
+ return RTE_IOVA_DC;
+
+ has_iova_va = pci_device_has_iova_va();
+ is_bound_uio = pci_device_bound_uio();
+#ifdef VFIO_PRESENT
+ is_vfio_noiommu_enabled = vfio_noiommu_is_enabled() == 1 ? 1 : 0;
+#endif
+
+ if (has_iova_va && !is_bound_uio && !is_vfio_noiommu_enabled)
+ return RTE_IOVA_VA;
+
+ if (has_iova_va) {
+ RTE_LOG(WARNING, EAL, "Some devices want iova as va but pa will be used because.. ");
+ if (is_vfio_noiommu_enabled)
+ RTE_LOG(WARNING, EAL, "vfio-noiommu mode configured\n");
+ if (is_bound_uio)
+ RTE_LOG(WARNING, EAL, "few device bound to UIO\n");
+ }
+
+ return RTE_IOVA_PA;
+}
+
/* Read PCI config space. */
int rte_pci_read_config(const struct rte_pci_device *device,
void *buf, size_t len, off_t offset)
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index 946df7e31..c8a97b7e7 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -816,4 +816,23 @@ vfio_noiommu_dma_map(int __rte_unused vfio_container_fd)
return 0;
}
+int
+vfio_noiommu_is_enabled(void)
+{
+ int fd, ret, cnt __rte_unused;
+ char c;
+
+ ret = -1;
+ fd = open(VFIO_NOIOMMU_MODE, O_RDONLY);
+ if (fd < 0)
+ return -1;
+
+ cnt = read(fd, &c, 1);
+ if (c == 'Y')
+ ret = 1;
+
+ close(fd);
+ return ret;
+}
+
#endif
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.h b/lib/librte_eal/linuxapp/eal/eal_vfio.h
index 5ff63e5d7..26ea8e119 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.h
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.h
@@ -150,6 +150,8 @@ struct vfio_config {
#define VFIO_NOIOMMU_GROUP_FMT "/dev/vfio/noiommu-%u"
#define VFIO_GET_REGION_ADDR(x) ((uint64_t) x << 40ULL)
#define VFIO_GET_REGION_IDX(x) (x >> 40)
+#define VFIO_NOIOMMU_MODE \
+ "/sys/module/vfio/parameters/enable_unsafe_noiommu_mode"
/* DMA mapping function prototype.
* Takes VFIO container fd as a parameter.
@@ -210,6 +212,8 @@ int pci_vfio_is_enabled(void);
int vfio_mp_sync_setup(void);
+int vfio_noiommu_is_enabled(void);
+
#define SOCKET_REQ_CONTAINER 0x100
#define SOCKET_REQ_GROUP 0x200
#define SOCKET_CLR_GROUP 0x300
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index a69bbb599..5dd40f948 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -206,6 +206,7 @@ DPDK_17.08 {
rte_bus_find_by_device;
rte_bus_find_by_name;
rte_pci_match;
+ rte_pci_get_iommu_class;
} DPDK_17.05;
--
2.11.0
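For illustration only (not part of the patch): vfio_noiommu_is_enabled()
above marks the read() result as unused, so 'c' is compared even if
nothing was read. A minimal sketch of the same check with the result
tested is below. demo_param_is_enabled() is a hypothetical helper that
takes the path as an argument, and its return convention (1, 0, -1)
is an assumption that differs from the patch, which returns -1 for
anything other than 'Y'.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: report whether the parameter file at 'path'
 * starts with 'Y'.
 * Returns 1 for 'Y', 0 for any other first byte, and -1 if the file
 * cannot be opened or no byte can be read.
 */
int
demo_param_is_enabled(const char *path)
{
	char c = 0;
	ssize_t cnt;
	int fd;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	cnt = read(fd, &c, 1);
	close(fd);

	/* Only trust 'c' if exactly one byte was read. */
	if (cnt != 1)
		return -1;

	return c == 'Y';
}
```

The caller in rte_pci_get_iommu_class() only tests for "== 1", so a
helper with this convention would slot in without changing that logic.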
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v4 05/12] linuxapp/eal_pci: get iommu class
2017-07-18 5:59 ` [dpdk-dev] [PATCH v4 05/12] linuxapp/eal_pci: " Santosh Shukla
@ 2017-07-18 10:55 ` Hemant Agrawal
0 siblings, 0 replies; 248+ messages in thread
From: Hemant Agrawal @ 2017-07-18 10:55 UTC (permalink / raw)
To: Santosh Shukla, thomas, dev
Cc: bruce.richardson, jerin.jacob, shreyansh.jain, gaetan.rivet,
sergio.gonzalez.monroy, anatoly.burakov, stephen,
maxime.coquelin, olivier.matz
On 7/18/2017 11:29 AM, Santosh Shukla wrote:
> Get iommu class of PCI device on the bus and returns preferred iova
> mapping mode for that bus.
>
> Algorithm for iova scheme selection for PCI bus:
> 0. If no device bound then return with RTE_IOVA_DC mapping mode,
> else goto 1).
> 1. Look for device attached to vfio kdrv and has .drv_flag set
> to RTE_PCI_DRV_IOVA_AS_VA.
> 2. Look for any device attached to UIO class of driver.
> 3. Check for vfio-noiommu mode enabled.
>
> If 2) & 3) is false and 1) is true then select
> mapping scheme as RTE_IOVA_VA. Otherwise use default
> mapping scheme (RTE_IOVA_PA).
>
> Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> ---
> v3 --> v4 :
> - Reworded WARNING message (suggested by Maxime)
> - Added pci_device_is_bound func to check for no device case
> (suggested by Hemant).
> - Added ifdef vfio_present.
>
> v1 --> v2:
> - Removed Linux version check in vfio_noiommu func. Refer [1].
> - Extending autodetection logic for _iommu_class.
> Refer [2].
>
> [1] https://www.mail-archive.com/dev@dpdk.org/msg70108.html
> [2] https://www.mail-archive.com/dev@dpdk.org/msg70279.html
>
> lib/librte_eal/linuxapp/eal/eal_pci.c | 95 +++++++++++++++++++++++++
> lib/librte_eal/linuxapp/eal/eal_vfio.c | 19 +++++
> lib/librte_eal/linuxapp/eal/eal_vfio.h | 4 ++
> lib/librte_eal/linuxapp/eal/rte_eal_version.map | 1 +
> 4 files changed, 119 insertions(+)
>
> diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c b/lib/librte_eal/linuxapp/eal/eal_pci.c
> index 7d9e1a99b..ecd946250 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_pci.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
> @@ -45,6 +45,7 @@
> #include "eal_filesystem.h"
> #include "eal_private.h"
> #include "eal_pci_init.h"
> +#include "eal_vfio.h"
>
> /**
> * @file
> @@ -488,6 +489,100 @@ rte_pci_scan(void)
> return -1;
> }
>
> +/*
> + * Is pci device bound to any kdrv
> + */
> +static inline int
> +pci_device_is_bound(void)
> +{
> + struct rte_pci_device *dev = NULL;
> + int ret = 0;
> +
> + FOREACH_DEVICE_ON_PCIBUS(dev) {
> + if (dev->kdrv == RTE_KDRV_UNKNOWN ||
> + dev->kdrv == RTE_KDRV_NONE) {
> + continue;
> + } else {
> + ret = 1;
> + break;
> + }
> + }
> + return ret;
> +}
> +
> +/*
> + * Any one of the device bound to uio
> + */
> +static inline int
> +pci_device_bound_uio(void)
> +{
> + struct rte_pci_device *dev = NULL;
> +
> + FOREACH_DEVICE_ON_PCIBUS(dev) {
> + if (dev->kdrv == RTE_KDRV_IGB_UIO ||
> + dev->kdrv == RTE_KDRV_UIO_GENERIC) {
> + return 1;
> + }
> + }
> + return 0;
> +}
> +
> +/*
> + * Any one of the device has iova as va
> + */
> +static inline int
> +pci_device_has_iova_va(void)
> +{
> + struct rte_pci_device *dev = NULL;
> + struct rte_pci_driver *drv = NULL;
> +
> + FOREACH_DRIVER_ON_PCIBUS(drv) {
> + if (drv && drv->drv_flags & RTE_PCI_DRV_IOVA_AS_VA) {
> + FOREACH_DEVICE_ON_PCIBUS(dev) {
> + if (dev->kdrv == RTE_KDRV_VFIO &&
> + rte_pci_match(drv, dev))
> + return 1;
> + }
> + }
> + }
> + return 0;
> +}
> +
> +/*
> + * Get iommu class of PCI devices on the bus.
> + */
> +enum rte_iova_mode
> +rte_pci_get_iommu_class(void)
> +{
> + bool is_bound;
> + bool is_vfio_noiommu_enabled = true;
> + bool has_iova_va;
> + bool is_bound_uio;
> +
> + is_bound = pci_device_is_bound();
> + if (!is_bound)
> + return RTE_IOVA_DC;
> +
> + has_iova_va = pci_device_has_iova_va();
> + is_bound_uio = pci_device_bound_uio();
> +#ifdef VFIO_PRESENT
> + is_vfio_noiommu_enabled = vfio_noiommu_is_enabled() == 1 ? 1 : 0;
> +#endif
> +
> + if (has_iova_va && !is_bound_uio && !is_vfio_noiommu_enabled)
> + return RTE_IOVA_VA;
> +
> + if (has_iova_va) {
> + RTE_LOG(WARNING, EAL, "Some devices want iova as va but pa will be used because.. ");
> + if (is_vfio_noiommu_enabled)
> + RTE_LOG(WARNING, EAL, "vfio-noiommu mode configured\n");
> + if (is_bound_uio)
> + RTE_LOG(WARNING, EAL, "few device bound to UIO\n");
> + }
> +
> + return RTE_IOVA_PA;
> +}
> +
> /* Read PCI config space. */
> int rte_pci_read_config(const struct rte_pci_device *device,
> void *buf, size_t len, off_t offset)
> diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
> index 946df7e31..c8a97b7e7 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
> @@ -816,4 +816,23 @@ vfio_noiommu_dma_map(int __rte_unused vfio_container_fd)
> return 0;
> }
>
> +int
> +vfio_noiommu_is_enabled(void)
> +{
> + int fd, ret, cnt __rte_unused;
> + char c;
> +
> + ret = -1;
> + fd = open(VFIO_NOIOMMU_MODE, O_RDONLY);
> + if (fd < 0)
> + return -1;
> +
> + cnt = read(fd, &c, 1);
> + if (c == 'Y')
> + ret = 1;
> +
> + close(fd);
> + return ret;
> +}
> +
> #endif
> diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.h b/lib/librte_eal/linuxapp/eal/eal_vfio.h
> index 5ff63e5d7..26ea8e119 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_vfio.h
> +++ b/lib/librte_eal/linuxapp/eal/eal_vfio.h
> @@ -150,6 +150,8 @@ struct vfio_config {
> #define VFIO_NOIOMMU_GROUP_FMT "/dev/vfio/noiommu-%u"
> #define VFIO_GET_REGION_ADDR(x) ((uint64_t) x << 40ULL)
> #define VFIO_GET_REGION_IDX(x) (x >> 40)
> +#define VFIO_NOIOMMU_MODE \
> + "/sys/module/vfio/parameters/enable_unsafe_noiommu_mode"
>
> /* DMA mapping function prototype.
> * Takes VFIO container fd as a parameter.
> @@ -210,6 +212,8 @@ int pci_vfio_is_enabled(void);
>
> int vfio_mp_sync_setup(void);
>
> +int vfio_noiommu_is_enabled(void);
> +
> #define SOCKET_REQ_CONTAINER 0x100
> #define SOCKET_REQ_GROUP 0x200
> #define SOCKET_CLR_GROUP 0x300
> diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
> index a69bbb599..5dd40f948 100644
> --- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
> +++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
> @@ -206,6 +206,7 @@ DPDK_17.08 {
> rte_bus_find_by_device;
> rte_bus_find_by_name;
> rte_pci_match;
> + rte_pci_get_iommu_class;
>
> } DPDK_17.05;
>
>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v4 06/12] bus: get iommu class
2017-07-18 5:59 ` [dpdk-dev] [PATCH v4 00/12] Infrastructure to detect iova mapping on the bus Santosh Shukla
` (4 preceding siblings ...)
2017-07-18 5:59 ` [dpdk-dev] [PATCH v4 05/12] linuxapp/eal_pci: " Santosh Shukla
@ 2017-07-18 5:59 ` Santosh Shukla
2017-07-18 11:05 ` Hemant Agrawal
2017-07-18 5:59 ` [dpdk-dev] [PATCH v4 07/12] eal: introduce iova mode helper api Santosh Shukla
` (7 subsequent siblings)
13 siblings, 1 reply; 248+ messages in thread
From: Santosh Shukla @ 2017-07-18 5:59 UTC (permalink / raw)
To: thomas, dev
Cc: bruce.richardson, jerin.jacob, hemant.agrawal, shreyansh.jain,
gaetan.rivet, sergio.gonzalez.monroy, anatoly.burakov, stephen,
maxime.coquelin, olivier.matz, Santosh Shukla
The rte_bus_get_iommu_class() API helps to automatically detect and
select the appropriate iova mapping scheme for iommu-capable devices on
the buses.
Algorithm for iova scheme selection across buses:
0. Iterate through bus_list.
1. Collect each bus's iova mode value and OR it into the 'mode' var.
2. Mode selection scheme is:
if mode == 0 then iova mode is _pa,
if mode == 1 then iova mode is _pa,
if mode == 2 then iova mode is _va,
if mode == 3 then iova mode is _pa.
So any mode != 2 selects the default iova mode (_pa).
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
---
v3 --> v4:
- Initialized mode to RTE_IOVA_DC in rte_bus_get_iommu_class.
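For illustration only (not part of the patch): the mode table from the
commit message as a standalone function. The enum values mirror
rte_bus.h from [03/12]; demo_combine_iova_modes() and
demo_combine_selfcheck() are hypothetical names, and the array stands
in for the per-bus get_iommu_class() results that the real code
collects with TAILQ_FOREACH.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors enum rte_iova_mode as introduced in [03/12]. */
enum rte_iova_mode {
	RTE_IOVA_DC = 0,	/* Don't care mode */
	RTE_IOVA_PA = (1 << 0),
	RTE_IOVA_VA = (1 << 1)
};

/* Hypothetical helper: OR the per-bus modes, then apply the table. */
enum rte_iova_mode
demo_combine_iova_modes(const enum rte_iova_mode *modes, size_t n)
{
	int mode = RTE_IOVA_DC;
	size_t i;

	for (i = 0; i < n; i++)
		mode |= modes[i];

	/* Only "VA and nothing else" (mode == 2) selects iova as va. */
	return mode == RTE_IOVA_VA ? RTE_IOVA_VA : RTE_IOVA_PA;
}

/* One case per row of the mode table. */
void
demo_combine_selfcheck(void)
{
	const enum rte_iova_mode dc_only[] = { RTE_IOVA_DC };
	const enum rte_iova_mode pa_only[] = { RTE_IOVA_DC, RTE_IOVA_PA };
	const enum rte_iova_mode va_only[] = { RTE_IOVA_DC, RTE_IOVA_VA };
	const enum rte_iova_mode mixed[] = { RTE_IOVA_PA, RTE_IOVA_VA };

	assert(demo_combine_iova_modes(dc_only, 1) == RTE_IOVA_PA); /* mode 0 */
	assert(demo_combine_iova_modes(pa_only, 2) == RTE_IOVA_PA); /* mode 1 */
	assert(demo_combine_iova_modes(va_only, 2) == RTE_IOVA_VA); /* mode 2 */
	assert(demo_combine_iova_modes(mixed, 2) == RTE_IOVA_PA);   /* mode 3 */
}
```

Because RTE_IOVA_DC is 0, buses with no bound devices do not influence
the result, while a single bus reporting RTE_IOVA_PA turns the combined
value into 1 or 3 and so forces the default.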
lib/librte_eal/bsdapp/eal/rte_eal_version.map | 1 +
lib/librte_eal/common/eal_common_bus.c | 23 +++++++++++++++++++++++
lib/librte_eal/common/eal_common_pci.c | 1 +
lib/librte_eal/common/include/rte_bus.h | 22 ++++++++++++++++++++++
lib/librte_eal/linuxapp/eal/rte_eal_version.map | 1 +
5 files changed, 48 insertions(+)
diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index 4b25318be..b9ee82b6b 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -202,6 +202,7 @@ DPDK_17.08 {
rte_bus_find_by_name;
rte_pci_match;
rte_pci_get_iommu_class;
+ rte_bus_get_iommu_class;
} DPDK_17.05;
diff --git a/lib/librte_eal/common/eal_common_bus.c b/lib/librte_eal/common/eal_common_bus.c
index 08bec2d93..a30a8982e 100644
--- a/lib/librte_eal/common/eal_common_bus.c
+++ b/lib/librte_eal/common/eal_common_bus.c
@@ -222,3 +222,26 @@ rte_bus_find_by_device_name(const char *str)
c[0] = '\0';
return rte_bus_find(NULL, bus_can_parse, name);
}
+
+
+/*
+ * Get iommu class of devices on the bus.
+ */
+enum rte_iova_mode
+rte_bus_get_iommu_class(void)
+{
+ int mode = RTE_IOVA_DC;
+ struct rte_bus *bus;
+
+ TAILQ_FOREACH(bus, &rte_bus_list, next) {
+
+ if (bus->get_iommu_class)
+ mode |= bus->get_iommu_class();
+ }
+
+ if (mode != RTE_IOVA_VA) {
+ /* Use default IOVA mode */
+ mode = RTE_IOVA_PA;
+ }
+ return mode;
+}
diff --git a/lib/librte_eal/common/eal_common_pci.c b/lib/librte_eal/common/eal_common_pci.c
index 8b6ecebd6..bdf2e7c3a 100644
--- a/lib/librte_eal/common/eal_common_pci.c
+++ b/lib/librte_eal/common/eal_common_pci.c
@@ -552,6 +552,7 @@ struct rte_pci_bus rte_pci_bus = {
.plug = pci_plug,
.unplug = pci_unplug,
.parse = pci_parse,
+ .get_iommu_class = rte_pci_get_iommu_class,
},
.device_list = TAILQ_HEAD_INITIALIZER(rte_pci_bus.device_list),
.driver_list = TAILQ_HEAD_INITIALIZER(rte_pci_bus.driver_list),
diff --git a/lib/librte_eal/common/include/rte_bus.h b/lib/librte_eal/common/include/rte_bus.h
index e06084253..94f1fdfca 100644
--- a/lib/librte_eal/common/include/rte_bus.h
+++ b/lib/librte_eal/common/include/rte_bus.h
@@ -182,6 +182,17 @@ struct rte_bus_conf {
enum rte_bus_scan_mode scan_mode; /**< Scan policy. */
};
+
+/**
+ * Get iommu class of devices on the bus.
+ * Check that those devices are attached to iommu driver.
+ *
+ * @return
+ * enum rte_iova_mode value.
+ */
+typedef enum rte_iova_mode (*rte_bus_get_iommu_class_t)(void);
+
+
/**
* A structure describing a generic bus.
*/
@@ -195,6 +206,7 @@ struct rte_bus {
rte_bus_unplug_t unplug; /**< Remove single device from driver */
rte_bus_parse_t parse; /**< Parse a device name */
struct rte_bus_conf conf; /**< Bus configuration */
+ rte_bus_get_iommu_class_t get_iommu_class; /**< Get iommu class */
};
/**
@@ -294,6 +306,16 @@ struct rte_bus *rte_bus_find_by_device(const struct rte_device *dev);
*/
struct rte_bus *rte_bus_find_by_name(const char *busname);
+
+/**
+ * Get iommu class of devices on the bus.
+ * Check that those devices are attached to iommu driver.
+ *
+ * @return
+ * enum rte_iova_mode value.
+ */
+enum rte_iova_mode rte_bus_get_iommu_class(void);
+
/**
* Helper for Bus registration.
* The constructor has higher priority than PMD constructors.
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index 5dd40f948..705af3adc 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -207,6 +207,7 @@ DPDK_17.08 {
rte_bus_find_by_name;
rte_pci_match;
rte_pci_get_iommu_class;
+ rte_bus_get_iommu_class;
} DPDK_17.05;
--
2.11.0
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v4 06/12] bus: get iommu class
2017-07-18 5:59 ` [dpdk-dev] [PATCH v4 06/12] bus: " Santosh Shukla
@ 2017-07-18 11:05 ` Hemant Agrawal
2017-07-18 11:16 ` santosh
0 siblings, 1 reply; 248+ messages in thread
From: Hemant Agrawal @ 2017-07-18 11:05 UTC (permalink / raw)
To: Santosh Shukla, thomas, dev
Cc: bruce.richardson, jerin.jacob, shreyansh.jain, gaetan.rivet,
sergio.gonzalez.monroy, anatoly.burakov, stephen,
maxime.coquelin, olivier.matz
On 7/18/2017 11:29 AM, Santosh Shukla wrote:
> API(rte_bus_get_iommu_class) helps to automatically detect and select
> appropriate iova mapping scheme for iommu capable device on that bus.
>
> Algorithm for iova scheme selection for bus:
> 0. Iterate through bus_list.
> 1. Collect each bus iova mode value and update into 'mode' var.
> 2. Mode selection scheme is:
> if mode == 0 then iova mode is _pa,
> if mode == 1 then iova mode is _pa,
> if mode == 2 then iova mode is _va,
> if mode == 3 then iova mode is _pa.
>
> So mode !=2 will be default iova mode (_pa).
>
> Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> ---
> v3 --> v4:
> - Initialized mode to RTE_IOVA_DC in rte_bus_get_iommu_class.
>
> lib/librte_eal/bsdapp/eal/rte_eal_version.map | 1 +
> lib/librte_eal/common/eal_common_bus.c | 23 +++++++++++++++++++++++
> lib/librte_eal/common/eal_common_pci.c | 1 +
> lib/librte_eal/common/include/rte_bus.h | 22 ++++++++++++++++++++++
> lib/librte_eal/linuxapp/eal/rte_eal_version.map | 1 +
> 5 files changed, 48 insertions(+)
>
> diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
> index 4b25318be..b9ee82b6b 100644
> --- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
> +++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
> @@ -202,6 +202,7 @@ DPDK_17.08 {
> rte_bus_find_by_name;
> rte_pci_match;
> rte_pci_get_iommu_class;
> + rte_bus_get_iommu_class;
>
> } DPDK_17.05;
>
> diff --git a/lib/librte_eal/common/eal_common_bus.c b/lib/librte_eal/common/eal_common_bus.c
> index 08bec2d93..a30a8982e 100644
> --- a/lib/librte_eal/common/eal_common_bus.c
> +++ b/lib/librte_eal/common/eal_common_bus.c
> @@ -222,3 +222,26 @@ rte_bus_find_by_device_name(const char *str)
> c[0] = '\0';
> return rte_bus_find(NULL, bus_can_parse, name);
> }
> +
> +
> +/*
> + * Get iommu class of devices on the bus.
> + */
> +enum rte_iova_mode
> +rte_bus_get_iommu_class(void)
> +{
> + int mode = RTE_IOVA_DC;
> + struct rte_bus *bus;
> +
> + TAILQ_FOREACH(bus, &rte_bus_list, next) {
> +
> + if (bus->get_iommu_class)
> + mode |= bus->get_iommu_class();
> + }
> +
> + if (mode != RTE_IOVA_VA) {
> + /* Use default IOVA mode */
> + mode = RTE_IOVA_PA;
> + }
> + return mode;
> +}
> diff --git a/lib/librte_eal/common/eal_common_pci.c b/lib/librte_eal/common/eal_common_pci.c
> index 8b6ecebd6..bdf2e7c3a 100644
> --- a/lib/librte_eal/common/eal_common_pci.c
> +++ b/lib/librte_eal/common/eal_common_pci.c
> @@ -552,6 +552,7 @@ struct rte_pci_bus rte_pci_bus = {
> .plug = pci_plug,
> .unplug = pci_unplug,
> .parse = pci_parse,
> + .get_iommu_class = rte_pci_get_iommu_class,
> },
> .device_list = TAILQ_HEAD_INITIALIZER(rte_pci_bus.device_list),
> .driver_list = TAILQ_HEAD_INITIALIZER(rte_pci_bus.driver_list),
> diff --git a/lib/librte_eal/common/include/rte_bus.h b/lib/librte_eal/common/include/rte_bus.h
> index e06084253..94f1fdfca 100644
> --- a/lib/librte_eal/common/include/rte_bus.h
> +++ b/lib/librte_eal/common/include/rte_bus.h
> @@ -182,6 +182,17 @@ struct rte_bus_conf {
> enum rte_bus_scan_mode scan_mode; /**< Scan policy. */
> };
>
> +
> +/**
> + * Get iommu class of devices on the bus.
> + * Check that those devices are attached to iommu driver.
Can we try to improve this description?
"Get the common iommu class of all the devices on the bus. The bus may
check that those devices are attached to an iommu driver.
If no devices are attached to the bus, the bus may return the don't care
value."
otherwise
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
> + *
> + * @return
> + * enum rte_iova_mode value.
> + */
> +typedef enum rte_iova_mode (*rte_bus_get_iommu_class_t)(void);
> +
> +
> /**
> * A structure describing a generic bus.
> */
> @@ -195,6 +206,7 @@ struct rte_bus {
> rte_bus_unplug_t unplug; /**< Remove single device from driver */
> rte_bus_parse_t parse; /**< Parse a device name */
> struct rte_bus_conf conf; /**< Bus configuration */
> + rte_bus_get_iommu_class_t get_iommu_class; /**< Get iommu class */
> };
>
> /**
> @@ -294,6 +306,16 @@ struct rte_bus *rte_bus_find_by_device(const struct rte_device *dev);
> */
> struct rte_bus *rte_bus_find_by_name(const char *busname);
>
> +
> +/**
> + * Get iommu class of devices on the bus.
> + * Check that those devices are attached to iommu driver.
Get the common iommu class of devices bound on to buses available in the
system. The default mode is PA.
> + *
> + * @return
> + * enum rte_iova_mode value.
> + */
> +enum rte_iova_mode rte_bus_get_iommu_class(void);
> +
> /**
> * Helper for Bus registration.
> * The constructor has higher priority than PMD constructors.
> diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
> index 5dd40f948..705af3adc 100644
> --- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
> +++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
> @@ -207,6 +207,7 @@ DPDK_17.08 {
> rte_bus_find_by_name;
> rte_pci_match;
> rte_pci_get_iommu_class;
> + rte_bus_get_iommu_class;
>
> } DPDK_17.05;
>
>
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v4 06/12] bus: get iommu class
2017-07-18 11:05 ` Hemant Agrawal
@ 2017-07-18 11:16 ` santosh
0 siblings, 0 replies; 248+ messages in thread
From: santosh @ 2017-07-18 11:16 UTC (permalink / raw)
To: Hemant Agrawal, thomas, dev
Cc: bruce.richardson, jerin.jacob, shreyansh.jain, gaetan.rivet,
sergio.gonzalez.monroy, anatoly.burakov, stephen,
maxime.coquelin, olivier.matz
Hi Hemant,
On Tuesday 18 July 2017 04:35 PM, Hemant Agrawal wrote:
> On 7/18/2017 11:29 AM, Santosh Shukla wrote:
>> API(rte_bus_get_iommu_class) helps to automatically detect and select
>> appropriate iova mapping scheme for iommu capable device on that bus.
>>
>> Algorithm for iova scheme selection for bus:
>> 0. Iterate through bus_list.
>> 1. Collect each bus iova mode value and update into 'mode' var.
>> 2. Mode selection scheme is:
>> if mode == 0 then iova mode is _pa,
>> if mode == 1 then iova mode is _pa,
>> if mode == 2 then iova mode is _va,
>> if mode == 3 then iova mode is _pa.
>>
>> So mode !=2 will be default iova mode (_pa).
>>
>> Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
>> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
>> ---
>> v3 --> v4:
>> - Initialized mode to RTE_IOVA_DC in rte_bus_get_iommu_class.
>>
>> lib/librte_eal/bsdapp/eal/rte_eal_version.map | 1 +
>> lib/librte_eal/common/eal_common_bus.c | 23 +++++++++++++++++++++++
>> lib/librte_eal/common/eal_common_pci.c | 1 +
>> lib/librte_eal/common/include/rte_bus.h | 22 ++++++++++++++++++++++
>> lib/librte_eal/linuxapp/eal/rte_eal_version.map | 1 +
>> 5 files changed, 48 insertions(+)
>>
>> diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
>> index 4b25318be..b9ee82b6b 100644
>> --- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
>> +++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
>> @@ -202,6 +202,7 @@ DPDK_17.08 {
>> rte_bus_find_by_name;
>> rte_pci_match;
>> rte_pci_get_iommu_class;
>> + rte_bus_get_iommu_class;
>>
>> } DPDK_17.05;
>>
>> diff --git a/lib/librte_eal/common/eal_common_bus.c b/lib/librte_eal/common/eal_common_bus.c
>> index 08bec2d93..a30a8982e 100644
>> --- a/lib/librte_eal/common/eal_common_bus.c
>> +++ b/lib/librte_eal/common/eal_common_bus.c
>> @@ -222,3 +222,26 @@ rte_bus_find_by_device_name(const char *str)
>> c[0] = '\0';
>> return rte_bus_find(NULL, bus_can_parse, name);
>> }
>> +
>> +
>> +/*
>> + * Get iommu class of devices on the bus.
>> + */
>> +enum rte_iova_mode
>> +rte_bus_get_iommu_class(void)
>> +{
>> + int mode = RTE_IOVA_DC;
>> + struct rte_bus *bus;
>> +
>> + TAILQ_FOREACH(bus, &rte_bus_list, next) {
>> +
>> + if (bus->get_iommu_class)
>> + mode |= bus->get_iommu_class();
>> + }
>> +
>> + if (mode != RTE_IOVA_VA) {
>> + /* Use default IOVA mode */
>> + mode = RTE_IOVA_PA;
>> + }
>> + return mode;
>> +}
>> diff --git a/lib/librte_eal/common/eal_common_pci.c b/lib/librte_eal/common/eal_common_pci.c
>> index 8b6ecebd6..bdf2e7c3a 100644
>> --- a/lib/librte_eal/common/eal_common_pci.c
>> +++ b/lib/librte_eal/common/eal_common_pci.c
>> @@ -552,6 +552,7 @@ struct rte_pci_bus rte_pci_bus = {
>> .plug = pci_plug,
>> .unplug = pci_unplug,
>> .parse = pci_parse,
>> + .get_iommu_class = rte_pci_get_iommu_class,
>> },
>> .device_list = TAILQ_HEAD_INITIALIZER(rte_pci_bus.device_list),
>> .driver_list = TAILQ_HEAD_INITIALIZER(rte_pci_bus.driver_list),
>> diff --git a/lib/librte_eal/common/include/rte_bus.h b/lib/librte_eal/common/include/rte_bus.h
>> index e06084253..94f1fdfca 100644
>> --- a/lib/librte_eal/common/include/rte_bus.h
>> +++ b/lib/librte_eal/common/include/rte_bus.h
>> @@ -182,6 +182,17 @@ struct rte_bus_conf {
>> enum rte_bus_scan_mode scan_mode; /**< Scan policy. */
>> };
>>
>> +
>> +/**
>> + * Get iommu class of devices on the bus.
>> + * Check that those devices are attached to iommu driver.
>
> Can we try to improve this description?
> "Get the common iommu class of all the devices on the bus. The bus may check that those devices are attached to an iommu driver.
> If no devices are attached to the bus, the bus may return the don't care value."
>
> otherwise
> Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
>
We'll reword description in v5. Thanks for suggestion.
>> + *
>> + * @return
>> + * enum rte_iova_mode value.
>> + */
>> +typedef enum rte_iova_mode (*rte_bus_get_iommu_class_t)(void);
>> +
>> +
>> /**
>> * A structure describing a generic bus.
>> */
>> @@ -195,6 +206,7 @@ struct rte_bus {
>> rte_bus_unplug_t unplug; /**< Remove single device from driver */
>> rte_bus_parse_t parse; /**< Parse a device name */
>> struct rte_bus_conf conf; /**< Bus configuration */
>> + rte_bus_get_iommu_class_t get_iommu_class; /**< Get iommu class */
>> };
>>
>> /**
>> @@ -294,6 +306,16 @@ struct rte_bus *rte_bus_find_by_device(const struct rte_device *dev);
>> */
>> struct rte_bus *rte_bus_find_by_name(const char *busname);
>>
>> +
>> +/**
>> + * Get iommu class of devices on the bus.
>> + * Check that those devices are attached to iommu driver.
>
> Get the common iommu class of devices bound on to buses available in the system. The default mode is PA.
>
ditto... in v5.
>> + *
>> + * @return
>> + * enum rte_iova_mode value.
>> + */
>> +enum rte_iova_mode rte_bus_get_iommu_class(void);
>> +
>> /**
>> * Helper for Bus registration.
>> * The constructor has higher priority than PMD constructors.
>> diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
>> index 5dd40f948..705af3adc 100644
>> --- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
>> +++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
>> @@ -207,6 +207,7 @@ DPDK_17.08 {
>> rte_bus_find_by_name;
>> rte_pci_match;
>> rte_pci_get_iommu_class;
>> + rte_bus_get_iommu_class;
>>
>> } DPDK_17.05;
>>
>>
>
>
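For readers following the thread, the mode selection scheme from the patch 06 commit message can be sketched standalone. This is only an illustration, not the patch code: the enum values (0, 1, 2) are an assumption inferred from the 0/1/2/3 table in the commit message, and the names iova_mode and combine_bus_modes are hypothetical.

```c
/*
 * Assumed values, inferred from the mode table in the commit message.
 * The real enum rte_iova_mode is defined in a header not shown here.
 */
enum iova_mode {
	IOVA_DC = 0,      /* don't care: bus has no opinion */
	IOVA_PA = 1 << 0, /* iova as physical address */
	IOVA_VA = 1 << 1  /* iova as virtual address */
};

/* Hypothetical helper: OR together the per-bus results, then decide. */
static enum iova_mode
combine_bus_modes(const enum iova_mode *bus_modes, int nb_buses)
{
	int mode = IOVA_DC;
	int i;

	for (i = 0; i < nb_buses; i++)
		mode |= bus_modes[i];

	/*
	 * Only a pure VA result (2) selects VA. DC (0), PA (1) and
	 * the mixed PA|VA case (3) all fall back to the default, PA.
	 */
	return mode == IOVA_VA ? IOVA_VA : IOVA_PA;
}
```

Because the results are OR-ed, a single bus reporting PA is enough to force PA for the whole system, while buses reporting DC do not influence the outcome.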
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v4 07/12] eal: introduce iova mode helper api
2017-07-18 5:59 ` [dpdk-dev] [PATCH v4 00/12] Infrastructure to detect iova mapping on the bus Santosh Shukla
` (5 preceding siblings ...)
2017-07-18 5:59 ` [dpdk-dev] [PATCH v4 06/12] bus: " Santosh Shukla
@ 2017-07-18 5:59 ` Santosh Shukla
2017-07-18 5:59 ` [dpdk-dev] [PATCH v4 08/12] linuxapp/eal: auto detect iova mode Santosh Shukla
` (6 subsequent siblings)
13 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-07-18 5:59 UTC (permalink / raw)
To: thomas, dev
Cc: bruce.richardson, jerin.jacob, hemant.agrawal, shreyansh.jain,
gaetan.rivet, sergio.gonzalez.monroy, anatoly.burakov, stephen,
maxime.coquelin, olivier.matz, Santosh Shukla
Introduce the rte_eal_iova_mode() helper API. This API is
used by non-EAL libraries to detect the iova mode.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
---
lib/librte_eal/bsdapp/eal/eal.c | 6 ++++++
lib/librte_eal/bsdapp/eal/rte_eal_version.map | 1 +
lib/librte_eal/common/include/rte_eal.h | 12 ++++++++++++
lib/librte_eal/linuxapp/eal/eal.c | 6 ++++++
lib/librte_eal/linuxapp/eal/rte_eal_version.map | 1 +
5 files changed, 26 insertions(+)
diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 80fe21de3..2a49e9fde 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -119,6 +119,12 @@ rte_eal_get_configuration(void)
return &rte_config;
}
+enum rte_iova_mode
+rte_eal_iova_mode(void)
+{
+ return rte_eal_get_configuration()->iova_mode;
+}
+
/* parse a sysfs (or other) file containing one integer value */
int
eal_parse_sysfs_value(const char *filename, unsigned long *val)
diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index b9ee82b6b..75a86a9d7 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -203,6 +203,7 @@ DPDK_17.08 {
rte_pci_match;
rte_pci_get_iommu_class;
rte_bus_get_iommu_class;
+ rte_eal_iova_mode;
} DPDK_17.05;
diff --git a/lib/librte_eal/common/include/rte_eal.h b/lib/librte_eal/common/include/rte_eal.h
index 0e7363d77..932dc1a96 100644
--- a/lib/librte_eal/common/include/rte_eal.h
+++ b/lib/librte_eal/common/include/rte_eal.h
@@ -45,6 +45,7 @@
#include <rte_per_lcore.h>
#include <rte_config.h>
+#include <rte_bus.h>
#ifdef __cplusplus
extern "C" {
@@ -87,6 +88,9 @@ struct rte_config {
/** Primary or secondary configuration */
enum rte_proc_type_t process_type;
+ /** PA or VA mapping mode */
+ enum rte_iova_mode iova_mode;
+
/**
* Pointer to memory configuration, which may be shared across multiple
* DPDK instances
@@ -287,6 +291,14 @@ static inline int rte_gettid(void)
return RTE_PER_LCORE(_thread_id);
}
+/**
+ * Get the iova mode
+ *
+ * @return
+ * enum rte_iova_mode value.
+ */
+enum rte_iova_mode rte_eal_iova_mode(void);
+
#define RTE_INIT(func) \
static void __attribute__((constructor, used)) func(void)
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index b28bbab54..fffdf0d15 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -128,6 +128,12 @@ rte_eal_get_configuration(void)
return &rte_config;
}
+enum rte_iova_mode
+rte_eal_iova_mode(void)
+{
+ return rte_eal_get_configuration()->iova_mode;
+}
+
/* parse a sysfs (or other) file containing one integer value */
int
eal_parse_sysfs_value(const char *filename, unsigned long *val)
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index 705af3adc..7161d1d83 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -208,6 +208,7 @@ DPDK_17.08 {
rte_pci_match;
rte_pci_get_iommu_class;
rte_bus_get_iommu_class;
+ rte_eal_iova_mode;
} DPDK_17.05;
--
2.11.0
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v4 08/12] linuxapp/eal: auto detect iova mode
2017-07-18 5:59 ` [dpdk-dev] [PATCH v4 00/12] Infrastructure to detect iova mapping on the bus Santosh Shukla
` (6 preceding siblings ...)
2017-07-18 5:59 ` [dpdk-dev] [PATCH v4 07/12] eal: introduce iova mode helper api Santosh Shukla
@ 2017-07-18 5:59 ` Santosh Shukla
2017-07-18 11:34 ` Hemant Agrawal
2017-07-18 5:59 ` [dpdk-dev] [PATCH v4 09/12] bsdapp/eal: auto detect iova mapping mode Santosh Shukla
` (5 subsequent siblings)
13 siblings, 1 reply; 248+ messages in thread
From: Santosh Shukla @ 2017-07-18 5:59 UTC (permalink / raw)
To: thomas, dev
Cc: bruce.richardson, jerin.jacob, hemant.agrawal, shreyansh.jain,
gaetan.rivet, sergio.gonzalez.monroy, anatoly.burakov, stephen,
maxime.coquelin, olivier.matz, Santosh Shukla
- Move the late bus scan up, to just after eal_parsing.
- Auto detect the iova mapping mode, based on the result of
rte_bus_get_iommu_class.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
---
lib/librte_eal/linuxapp/eal/eal.c | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index fffdf0d15..49b52ce4f 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -798,6 +798,15 @@ rte_eal_init(int argc, char **argv)
return -1;
}
+ if (rte_bus_scan()) {
+ rte_eal_init_alert("Cannot scan the buses for devices\n");
+ rte_errno = ENODEV;
+ return -1;
+ }
+
+ /* autodetect the iova mapping mode (default is iova_pa) */
+ rte_eal_get_configuration()->iova_mode = rte_bus_get_iommu_class();
+
if (internal_config.no_hugetlbfs == 0 &&
internal_config.process_type != RTE_PROC_SECONDARY &&
internal_config.xen_dom0_support == 0 &&
@@ -895,12 +904,6 @@ rte_eal_init(int argc, char **argv)
return -1;
}
- if (rte_bus_scan()) {
- rte_eal_init_alert("Cannot scan the buses for devices\n");
- rte_errno = ENODEV;
- return -1;
- }
-
RTE_LCORE_FOREACH_SLAVE(i) {
/*
--
2.11.0
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v4 08/12] linuxapp/eal: auto detect iova mode
2017-07-18 5:59 ` [dpdk-dev] [PATCH v4 08/12] linuxapp/eal: auto detect iova mode Santosh Shukla
@ 2017-07-18 11:34 ` Hemant Agrawal
2017-07-18 11:56 ` santosh
0 siblings, 1 reply; 248+ messages in thread
From: Hemant Agrawal @ 2017-07-18 11:34 UTC (permalink / raw)
To: Santosh Shukla, thomas, dev
Cc: bruce.richardson, jerin.jacob, shreyansh.jain, gaetan.rivet,
sergio.gonzalez.monroy, anatoly.burakov, stephen,
maxime.coquelin, olivier.matz
On 7/18/2017 11:29 AM, Santosh Shukla wrote:
> - Move the late bus scan up, to just after eal_parsing.
> - Auto detect the iova mapping mode, based on the result of
> rte_bus_get_iommu_class.
>
> Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> ---
> lib/librte_eal/linuxapp/eal/eal.c | 15 +++++++++------
> 1 file changed, 9 insertions(+), 6 deletions(-)
>
> diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
> index fffdf0d15..49b52ce4f 100644
> --- a/lib/librte_eal/linuxapp/eal/eal.c
> +++ b/lib/librte_eal/linuxapp/eal/eal.c
> @@ -798,6 +798,15 @@ rte_eal_init(int argc, char **argv)
> return -1;
> }
>
> + if (rte_bus_scan()) {
> + rte_eal_init_alert("Cannot scan the buses for devices\n");
> + rte_errno = ENODEV;
> + return -1;
> + }
> +
> + /* autodetect the iova mapping mode (default is iova_pa) */
> + rte_eal_get_configuration()->iova_mode = rte_bus_get_iommu_class();
> +
Santosh,
With some workarounds in the fslmc bus scanning/probe code, I am able
to test it. It works OK.
Post 17.08, we will be submitting the rework of the fslmc bus so that this
patch will not break the dpaa2 platform support.
Regards,
Hemant
> if (internal_config.no_hugetlbfs == 0 &&
> internal_config.process_type != RTE_PROC_SECONDARY &&
> internal_config.xen_dom0_support == 0 &&
> @@ -895,12 +904,6 @@ rte_eal_init(int argc, char **argv)
> return -1;
> }
>
> - if (rte_bus_scan()) {
> - rte_eal_init_alert("Cannot scan the buses for devices\n");
> - rte_errno = ENODEV;
> - return -1;
> - }
> -
> RTE_LCORE_FOREACH_SLAVE(i) {
>
> /*
>
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v4 08/12] linuxapp/eal: auto detect iova mode
2017-07-18 11:34 ` Hemant Agrawal
@ 2017-07-18 11:56 ` santosh
0 siblings, 0 replies; 248+ messages in thread
From: santosh @ 2017-07-18 11:56 UTC (permalink / raw)
To: Hemant Agrawal, thomas, dev
Cc: bruce.richardson, jerin.jacob, shreyansh.jain, gaetan.rivet,
sergio.gonzalez.monroy, anatoly.burakov, stephen,
maxime.coquelin, olivier.matz
On Tuesday 18 July 2017 05:04 PM, Hemant Agrawal wrote:
> On 7/18/2017 11:29 AM, Santosh Shukla wrote:
>> - Move the late bus scan up, to just after eal_parsing.
>> - Auto detect the iova mapping mode, based on the result of
>> rte_bus_get_iommu_class.
>>
>> Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
>> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
>> ---
>> lib/librte_eal/linuxapp/eal/eal.c | 15 +++++++++------
>> 1 file changed, 9 insertions(+), 6 deletions(-)
>>
>> diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
>> index fffdf0d15..49b52ce4f 100644
>> --- a/lib/librte_eal/linuxapp/eal/eal.c
>> +++ b/lib/librte_eal/linuxapp/eal/eal.c
>> @@ -798,6 +798,15 @@ rte_eal_init(int argc, char **argv)
>> return -1;
>> }
>>
>> + if (rte_bus_scan()) {
>> + rte_eal_init_alert("Cannot scan the buses for devices\n");
>> + rte_errno = ENODEV;
>> + return -1;
>> + }
>> +
>> + /* autodetect the iova mapping mode (default is iova_pa) */
>> + rte_eal_get_configuration()->iova_mode = rte_bus_get_iommu_class();
>> +
> Santosh,
> With some workarounds in the fslmc bus scanning/probe code, I am able to test it. It works OK.
>
> Post 17.08, we will be submitting the rework of the fslmc bus so that this patch will not break the dpaa2 platform support.
>
Cool ;).
> Regards,
> Hemant
>
>> if (internal_config.no_hugetlbfs == 0 &&
>> internal_config.process_type != RTE_PROC_SECONDARY &&
>> internal_config.xen_dom0_support == 0 &&
>> @@ -895,12 +904,6 @@ rte_eal_init(int argc, char **argv)
>> return -1;
>> }
>>
>> - if (rte_bus_scan()) {
>> - rte_eal_init_alert("Cannot scan the buses for devices\n");
>> - rte_errno = ENODEV;
>> - return -1;
>> - }
>> -
>> RTE_LCORE_FOREACH_SLAVE(i) {
>>
>> /*
>>
>
>
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v4 09/12] bsdapp/eal: auto detect iova mapping mode
2017-07-18 5:59 ` [dpdk-dev] [PATCH v4 00/12] Infrastructure to detect iova mapping on the bus Santosh Shukla
` (7 preceding siblings ...)
2017-07-18 5:59 ` [dpdk-dev] [PATCH v4 08/12] linuxapp/eal: auto detect iova mode Santosh Shukla
@ 2017-07-18 5:59 ` Santosh Shukla
2017-07-18 5:59 ` [dpdk-dev] [PATCH v4 10/12] linuxapp/eal_vfio: honor iova mode before mapping Santosh Shukla
` (4 subsequent siblings)
13 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-07-18 5:59 UTC (permalink / raw)
To: thomas, dev
Cc: bruce.richardson, jerin.jacob, hemant.agrawal, shreyansh.jain,
gaetan.rivet, sergio.gonzalez.monroy, anatoly.burakov, stephen,
maxime.coquelin, olivier.matz, Santosh Shukla
- Move the late bus scan up, to just after eal_parsing.
- The mapping mode is the default one for bsdapp, which supports
only one pass-through mode (RTE_KDRV_NIC_UIO).
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
---
lib/librte_eal/bsdapp/eal/eal.c | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 2a49e9fde..3cb1bd22f 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -541,6 +541,15 @@ rte_eal_init(int argc, char **argv)
return -1;
}
+ if (rte_bus_scan()) {
+ rte_eal_init_alert("Cannot scan the buses for devices\n");
+ rte_errno = ENODEV;
+ return -1;
+ }
+
+ /* autodetect the iova mapping mode (default is iova_pa) */
+ rte_eal_get_configuration()->iova_mode = rte_bus_get_iommu_class();
+
if (internal_config.no_hugetlbfs == 0 &&
internal_config.process_type != RTE_PROC_SECONDARY &&
eal_hugepage_info_init() < 0) {
@@ -620,12 +629,6 @@ rte_eal_init(int argc, char **argv)
rte_config.master_lcore, thread_id, cpuset,
ret == 0 ? "" : "...");
- if (rte_bus_scan()) {
- rte_eal_init_alert("Cannot scan the buses for devices\n");
- rte_errno = ENODEV;
- return -1;
- }
-
RTE_LCORE_FOREACH_SLAVE(i) {
/*
--
2.11.0
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v4 10/12] linuxapp/eal_vfio: honor iova mode before mapping
2017-07-18 5:59 ` [dpdk-dev] [PATCH v4 00/12] Infrastructure to detect iova mapping on the bus Santosh Shukla
` (8 preceding siblings ...)
2017-07-18 5:59 ` [dpdk-dev] [PATCH v4 09/12] bsdapp/eal: auto detect iova mapping mode Santosh Shukla
@ 2017-07-18 5:59 ` Santosh Shukla
2017-07-18 5:59 ` [dpdk-dev] [PATCH v4 11/12] linuxapp/eal_memory: honor iova mode in virt2phy Santosh Shukla
` (3 subsequent siblings)
13 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-07-18 5:59 UTC (permalink / raw)
To: thomas, dev
Cc: bruce.richardson, jerin.jacob, hemant.agrawal, shreyansh.jain,
gaetan.rivet, sergio.gonzalez.monroy, anatoly.burakov, stephen,
maxime.coquelin, olivier.matz, Santosh Shukla
Check iova mode and accordingly map iova to pa or va.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
---
lib/librte_eal/linuxapp/eal/eal_vfio.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index c8a97b7e7..b32cd09a2 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -706,7 +706,10 @@ vfio_type1_dma_map(int vfio_container_fd)
dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
dma_map.vaddr = ms[i].addr_64;
dma_map.size = ms[i].len;
- dma_map.iova = ms[i].phys_addr;
+ if (rte_eal_iova_mode() == RTE_IOVA_VA)
+ dma_map.iova = dma_map.vaddr;
+ else
+ dma_map.iova = ms[i].phys_addr;
dma_map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
ret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
@@ -792,7 +795,10 @@ vfio_spapr_dma_map(int vfio_container_fd)
dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
dma_map.vaddr = ms[i].addr_64;
dma_map.size = ms[i].len;
- dma_map.iova = ms[i].phys_addr;
+ if (rte_eal_iova_mode() == RTE_IOVA_VA)
+ dma_map.iova = dma_map.vaddr;
+ else
+ dma_map.iova = ms[i].phys_addr;
dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
VFIO_DMA_MAP_FLAG_WRITE;
--
2.11.0
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v4 11/12] linuxapp/eal_memory: honor iova mode in virt2phy
2017-07-18 5:59 ` [dpdk-dev] [PATCH v4 00/12] Infrastructure to detect iova mapping on the bus Santosh Shukla
` (9 preceding siblings ...)
2017-07-18 5:59 ` [dpdk-dev] [PATCH v4 10/12] linuxapp/eal_vfio: honor iova mode before mapping Santosh Shukla
@ 2017-07-18 5:59 ` Santosh Shukla
2017-07-18 5:59 ` [dpdk-dev] [PATCH v4 12/12] eal/rte_malloc: " Santosh Shukla
` (2 subsequent siblings)
13 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-07-18 5:59 UTC (permalink / raw)
To: thomas, dev
Cc: bruce.richardson, jerin.jacob, hemant.agrawal, shreyansh.jain,
gaetan.rivet, sergio.gonzalez.monroy, anatoly.burakov, stephen,
maxime.coquelin, olivier.matz, Santosh Shukla
Check iova mode and accordingly return phy addr.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
---
lib/librte_eal/linuxapp/eal/eal_memory.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index daead31c2..249740645 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -139,6 +139,9 @@ rte_mem_virt2phy(const void *virtaddr)
int page_size;
off_t offset;
+ if (rte_eal_iova_mode() == RTE_IOVA_VA)
+ return (uintptr_t)virtaddr;
+
/* when using dom0, /proc/self/pagemap always returns 0, check in
* dpdk memory by browsing the memsegs */
if (rte_xen_dom0_supported()) {
--
2.11.0
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v4 12/12] eal/rte_malloc: honor iova mode in virt2phy
2017-07-18 5:59 ` [dpdk-dev] [PATCH v4 00/12] Infrastructure to detect iova mapping on the bus Santosh Shukla
` (10 preceding siblings ...)
2017-07-18 5:59 ` [dpdk-dev] [PATCH v4 11/12] linuxapp/eal_memory: honor iova mode in virt2phy Santosh Shukla
@ 2017-07-18 5:59 ` Santosh Shukla
2017-07-21 8:07 ` [dpdk-dev] [PATCH v4 00/12] Infrastructure to detect iova mapping on the bus Maxime Coquelin
2017-07-24 8:39 ` [dpdk-dev] [PATCH v5 " Santosh Shukla
13 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-07-18 5:59 UTC (permalink / raw)
To: thomas, dev
Cc: bruce.richardson, jerin.jacob, hemant.agrawal, shreyansh.jain,
gaetan.rivet, sergio.gonzalez.monroy, anatoly.burakov, stephen,
maxime.coquelin, olivier.matz, Santosh Shukla
Check iova mode and accordingly return phy addr.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
---
lib/librte_eal/common/rte_malloc.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index 5c0627bf4..d65c05a4d 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -251,10 +251,17 @@ rte_malloc_set_limit(__rte_unused const char *type,
phys_addr_t
rte_malloc_virt2phy(const void *addr)
{
+ phys_addr_t paddr;
const struct malloc_elem *elem = malloc_elem_from_data(addr);
if (elem == NULL)
return RTE_BAD_PHYS_ADDR;
if (elem->ms->phys_addr == RTE_BAD_PHYS_ADDR)
return RTE_BAD_PHYS_ADDR;
- return elem->ms->phys_addr + ((uintptr_t)addr - (uintptr_t)elem->ms->addr);
+
+ if (rte_eal_iova_mode() == RTE_IOVA_VA)
+ paddr = (uintptr_t)addr;
+ else
+ paddr = elem->ms->phys_addr +
+ ((uintptr_t)addr - (uintptr_t)elem->ms->addr);
+ return paddr;
}
--
2.11.0
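The address selection that patches 11 and 12 add can be illustrated standalone. This is a minimal sketch under stated assumptions: struct memseg_sketch and sketch_virt2iova are hypothetical stand-ins, only the two memseg fields used in the diff are modelled, and the RTE_BAD_PHYS_ADDR checks from the real function are left out.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two memseg fields used in the patch. */
struct memseg_sketch {
	uint64_t phys_addr; /* physical address of the segment start */
	void *addr;         /* virtual address of the segment start */
};

/*
 * Return the address a device should use for 'addr'.
 * iova_as_va != 0 models rte_eal_iova_mode() == RTE_IOVA_VA.
 */
static uint64_t
sketch_virt2iova(const struct memseg_sketch *ms, const void *addr,
		int iova_as_va)
{
	/* In VA mode the IOMMU translates, so the VA is used as is. */
	if (iova_as_va)
		return (uintptr_t)addr;

	/* In PA mode: segment physical base plus offset in the segment. */
	return ms->phys_addr + ((uintptr_t)addr - (uintptr_t)ms->addr);
}
```

In VA mode no lookup or arithmetic is needed, which is the cost saving the series is after on the rx path.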
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v4 00/12] Infrastructure to detect iova mapping on the bus
2017-07-18 5:59 ` [dpdk-dev] [PATCH v4 00/12] Infrastructure to detect iova mapping on the bus Santosh Shukla
` (11 preceding siblings ...)
2017-07-18 5:59 ` [dpdk-dev] [PATCH v4 12/12] eal/rte_malloc: " Santosh Shukla
@ 2017-07-21 8:07 ` Maxime Coquelin
2017-07-24 8:39 ` [dpdk-dev] [PATCH v5 " Santosh Shukla
13 siblings, 0 replies; 248+ messages in thread
From: Maxime Coquelin @ 2017-07-21 8:07 UTC (permalink / raw)
To: Santosh Shukla, thomas, dev
Cc: bruce.richardson, jerin.jacob, hemant.agrawal, shreyansh.jain,
gaetan.rivet, sergio.gonzalez.monroy, anatoly.burakov, stephen,
olivier.matz
Hi Santosh,
On 07/18/2017 07:59 AM, Santosh Shukla wrote:
> v4:
> Introducing RTE_PCI_DRV_IOVA_AS_VA flag for autodetection of IOVA as VA mapping.
> If a PCI driver demands the IOVA as VA scheme, then the driver can add the flag in the
> PCI driver registration function.
>
> Algorithm to select IOVA as VA for PCI bus case:
> 0. If no device bound then return with RTE_IOVA_DC mapping mode,
> else goto 1).
> 1. Look for device attached to vfio kdrv and has .drv_flag set
> to RTE_PCI_DRV_IOVA_AS_VA.
> 2. Look for any device attached to UIO class of driver.
> 3. Check for vfio-noiommu mode enabled.
>
> If 2) & 3) are false and 1) is true then select
> mapping scheme as RTE_IOVA_VA. Otherwise use default
> mapping scheme (RTE_IOVA_PA).
>
> That way, the bus can truly autodetect the iova mapping mode for
> a device or a set of devices.
>
>
> Patch series rebased on 'a599eb31f2e477674fc6176cdf989ee17432b552'.
>
> * Re-introduced RTE_IOVA_DC (Don't care mode) for no-device found case.
> (Identified by Hemant [5]).
> * Renamed flag from RTE_PCI_DRV_NEED_IOVA_VA to RTE_PCI_DRV_IOVA_AS_VA
> (Suggested by Maxime[6]).
> * Based on the discussion on the thread [3], [6] and [5].
>
> v3 --> v4:
> - Re-introduced RTE_IOVA_DC mode (Suggested by Hemant [5]).
> - Renamed flag to RTE_PCI_DRV_IOVA_AS_VA (Suggested by Maxime).
> - Reworded WARNING message(suggested by Maxime[7]).
> - Created a separate patch for rte_pci_get_iommu_class (suggested by Maxime[]).
> - Added VFIO_PRESENT ifdef build fix.
>
> v2 --> v3:
> - Removed rte_mempool_virt2phy (suggested by Olivier [4])
>
> v1 --> v2:
> - Removed override eal option i.e. (--iova-mode=<>) Because we have means to
> truly autodetect the iova mode.
> - Introduced RTE_PCI_DRV_NEED_IOVA_VA drv_flag (Suggested by Maxime [3]).
> - Using NEED_IOVA_VA drv_flag in autodetection logic.
> - Removed Linux version check macro in vfio code, As per Maxime feedback.
> - Moved rte_pci_match API from local to global.
>
> Patch Summary:
> 0) 1st: Introducing a new flag in rte_pci_drv
> 1) 2nd: declare rte_pci_match api in pci header. Required for autodetection in
> follow up patches.
> 2) 3rd: declare rte_pci_get_iommu_class.
> 3) 4th - 5th: autodetection mapping infrastructure for Linux/bsdapp.
> 4) 6th: Introduces global bus API named rte_bus_get_iommu_class.
> 5) 7th: iova mode helper API.
> 6) 8th - 9th: Calls rte_bus_get_iommu_class API for Linux/bsdapp and returns
> their iova mode.
> 7) 10th: Check iova mode and accordingly map vfio.dma_map to _pa or _va.
> 8) 11th - 12th: Check for IOVA_VA mode in below APIs
> - rte_mem_virt2phy
> - rte_malloc_virt2phy
>
> Test History:
> - Tested for x86/XL710 40G NIC card for both modes (iova_va/pa).
> - Tested for arm64/thunderx vNIC Integrated NIC for both modes
> - Tested for arm64/Octeontx integrated NICs for only
> Iova_va mode(It supports only one mode.)
> - Ran standalone tests like mempool_autotest, mbuf_autotest.
> - Verified for Doxygen.
>
> Work History:
> For v1, Refer [1].
> For v2, Refer [2].
> For v3, Refer [9].
>
>
> Checkpatch result:
> * Debug message - WARNING: line over 80 characters
>
> Thanks.,
>
> [1] https://www.mail-archive.com/dev@dpdk.org/msg67438.html
> [2] https://www.mail-archive.com/dev@dpdk.org/msg70674.html
> [3] https://www.mail-archive.com/dev@dpdk.org/msg70279.html
> [4] https://www.mail-archive.com/dev@dpdk.org/msg70692.html
> [5] http://dpdk.org/ml/archives/dev/2017-July/071282.html
> [6] http://dpdk.org/ml/archives/dev/2017-July/070951.html
> [7] http://dpdk.org/ml/archives/dev/2017-July/070941.html
> [8] http://dpdk.org/ml/archives/dev/2017-July/070952.html
> [9] http://dpdk.org/ml/archives/dev/2017-July/070918.html
>
>
> Santosh Shukla (12):
> eal/pci: introduce PCI driver iova as va flag
> eal/pci: export match function
> eal/pci: get iommu class
> bsdapp/eal_pci: get iommu class
> linuxapp/eal_pci: get iommu class
> bus: get iommu class
> eal: introduce iova mode helper api
> linuxapp/eal: auto detect iova mode
> bsdapp/eal: auto detect iova mapping mode
> linuxapp/eal_vfio: honor iova mode before mapping
> linuxapp/eal_memory: honor iova mode in virt2phy
> eal/rte_malloc: honor iova mode in virt2phy
>
> lib/librte_eal/bsdapp/eal/eal.c | 21 ++++--
> lib/librte_eal/bsdapp/eal/eal_pci.c | 10 +++
> lib/librte_eal/bsdapp/eal/rte_eal_version.map | 4 ++
> lib/librte_eal/common/eal_common_bus.c | 23 ++++++
> lib/librte_eal/common/eal_common_pci.c | 11 +--
> lib/librte_eal/common/include/rte_bus.h | 32 +++++++++
> lib/librte_eal/common/include/rte_eal.h | 12 ++++
> lib/librte_eal/common/include/rte_pci.h | 28 ++++++++
> lib/librte_eal/common/rte_malloc.c | 9 ++-
> lib/librte_eal/linuxapp/eal/eal.c | 21 ++++--
> lib/librte_eal/linuxapp/eal/eal_memory.c | 3 +
> lib/librte_eal/linuxapp/eal/eal_pci.c | 95 +++++++++++++++++++++++++
> lib/librte_eal/linuxapp/eal/eal_vfio.c | 29 +++++++-
> lib/librte_eal/linuxapp/eal/eal_vfio.h | 4 ++
> lib/librte_eal/linuxapp/eal/rte_eal_version.map | 4 ++
> 15 files changed, 282 insertions(+), 24 deletions(-)
>
With Hemant's comments on patch 6 taken into account, feel free
to add my:
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Thanks,
Maxime
* [dpdk-dev] [PATCH v5 00/12] Infrastructure to detect iova mapping on the bus
2017-07-18 5:59 ` [dpdk-dev] [PATCH v4 00/12] Infrastructure to detect iova mapping on the bus Santosh Shukla
` (12 preceding siblings ...)
2017-07-21 8:07 ` [dpdk-dev] [PATCH v4 00/12] Infrastructure to detect iova mapping on the bus Maxime Coquelin
@ 2017-07-24 8:39 ` Santosh Shukla
2017-07-24 8:39 ` [dpdk-dev] [PATCH v5 01/12] eal/pci: introduce PCI driver iova as va flag Santosh Shukla
` (12 more replies)
13 siblings, 13 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-07-24 8:39 UTC (permalink / raw)
To: thomas, dev
Cc: hemant.agrawal, bruce.richardson, jerin.jacob, shreyansh.jain,
gaetan.rivet, sergio.gonzalez.monroy, anatoly.burakov, stephen,
maxime.coquelin, olivier.matz, Santosh Shukla
v5:
Introduce the RTE_PCI_DRV_IOVA_AS_VA flag for autodetection of iova=va mapping.
If a PCI driver requires the IOVA as VA scheme, the driver can set the flag in
its PCI driver registration.
Algorithm to select IOVA as VA for PCI bus case:
0. If no device bound then return with RTE_IOVA_DC mapping mode,
else goto 1).
1. Look for device attached to vfio kdrv and has .drv_flag set
to RTE_PCI_DRV_IOVA_AS_VA.
2. Look for any device attached to UIO class of driver.
3. Check for vfio-noiommu mode enabled.
If 2) & 3) are false and 1) is true then select
mapping scheme as RTE_IOVA_VA. Otherwise use default
mapping scheme (RTE_IOVA_PA).
That way, the bus can truly autodetect the iova mapping mode for
a device or a set of devices.
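For illustration only, the selection steps above can be sketched as a small
standalone C function. This is a sketch, not the code from patch 05: the enum
is a local stand-in for rte_iova_mode, and select_pci_iova_mode() and its four
boolean parameters are hypothetical names for the results of steps 0 to 3.

```c
#include <stdbool.h>

/* Local stand-in for enum rte_iova_mode (assumption: same values). */
enum iova_mode {
	IOVA_DC = 0,        /* don't care */
	IOVA_PA = (1 << 0),
	IOVA_VA = (1 << 1)
};

/*
 * Hypothetical helper mirroring the described algorithm.
 * any_bound:     step 0, at least one device is bound to a kernel driver
 * vfio_wants_va: step 1, a vfio-bound device whose driver sets IOVA_AS_VA
 * any_uio:       step 2, at least one device is bound to a UIO driver
 * vfio_noiommu:  step 3, vfio-noiommu mode is enabled
 */
enum iova_mode
select_pci_iova_mode(bool any_bound, bool vfio_wants_va,
		     bool any_uio, bool vfio_noiommu)
{
	if (!any_bound)
		return IOVA_DC;

	if (vfio_wants_va && !any_uio && !vfio_noiommu)
		return IOVA_VA;

	/* Default mapping scheme. */
	return IOVA_PA;
}
```

Under these assumptions, a single UIO-bound device or vfio-noiommu mode is
enough to force the default PA scheme, even if a driver asks for IOVA as VA.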
Patch series rebased on version-17.08-rc2:
'67c4b6db68e199247b5dbd63f560582640b180bf'.
v4 --> v5:
- Change DPDK_17.08 to DPDK_17.11 in _version.map.
- Reworded bus api description (suggested by Hemant).
- Added reviewed-by from Maxime in v5.
- Added acked-by from Hemant for pci and bus patches.
v3 --> v4:
- Re-introduced RTE_IOVA_DC mode (Suggested by Hemant [5]).
- Renamed flag to RTE_PCI_DRV_IOVA_AS_VA (Suggested by Maxime).
- Reworded WARNING message(suggested by Maxime[7]).
- Created a separate patch for rte_pci_get_iommu_class (suggested by Maxime[]).
- Added VFIO_PRESENT ifdef build fix.
v2 --> v3:
- Removed rte_mempool_virt2phy (suggested by Olivier [4])
v1 --> v2:
- Removed the override eal option (--iova-mode=<>) because we now have the
means to truly autodetect the iova mode.
- Introduced RTE_PCI_DRV_NEED_IOVA_VA drv_flag (Suggested by Maxime [3]).
- Using NEED_IOVA_VA drv_flag in autodetection logic.
- Removed Linux version check macro in vfio code, As per Maxime feedback.
- Moved rte_pci_match API from local to global.
Patch Summary:
0) 1st: Introducing a new flag in rte_pci_drv
1) 2nd: declare rte_pci_match api in pci header. Required for autodetection in
follow up patches.
2) 3rd: declare rte_pci_get_iommu_class.
3) 4th - 5th: autodetection mapping infrastructure for Linux/bsdapp.
4) 6th: Introduces global bus API named rte_bus_get_iommu_class.
5) 7th: iova mode helper API.
6) 8th - 9th: Calls rte_bus_get_iommu_class API for Linux/bsdapp and returns
their iova mode.
7) 10th: Check iova mode and accordingly map vfio.dma_map to _pa or _va.
8) 11th - 12th: Check for IOVA_VA mode in below APIs
- rte_mem_virt2phy
- rte_malloc_virt2phy
Test History:
- Tested for x86/XL710 40G NIC card for both modes (iova_va/pa).
- Tested for arm64/thunderx vNIC Integrated NIC for both modes
- Tested for arm64/Octeontx integrated NICs for only
Iova_va mode(It supports only one mode.)
- Ran standalone tests like mempool_autotest, mbuf_autotest.
- Verified for Doxygen.
Work History:
For v1, Refer [1].
For v2, Refer [2].
For v3, Refer [9].
For v4, refer [10].
Checkpatch result:
* Debug message - WARNING: line over 80 characters
Thanks.,
[1] https://www.mail-archive.com/dev@dpdk.org/msg67438.html
[2] https://www.mail-archive.com/dev@dpdk.org/msg70674.html
[3] https://www.mail-archive.com/dev@dpdk.org/msg70279.html
[4] https://www.mail-archive.com/dev@dpdk.org/msg70692.html
[5] http://dpdk.org/ml/archives/dev/2017-July/071282.html
[6] http://dpdk.org/ml/archives/dev/2017-July/070951.html
[7] http://dpdk.org/ml/archives/dev/2017-July/070941.html
[8] http://dpdk.org/ml/archives/dev/2017-July/070952.html
[9] http://dpdk.org/ml/archives/dev/2017-July/070918.html
[10] http://dpdk.org/ml/archives/dev/2017-July/071754.html
Santosh Shukla (12):
eal/pci: introduce PCI driver iova as va flag
eal/pci: export match function
eal/pci: get iommu class
bsdapp/eal_pci: get iommu class
linuxapp/eal_pci: get iommu class
bus: get iommu class
eal: introduce iova mode helper api
linuxapp/eal: auto detect iova mode
bsdapp/eal: auto detect iova mapping mode
linuxapp/eal_vfio: honor iova mode before mapping
linuxapp/eal_memory: honor iova mode in virt2phy
eal/rte_malloc: honor iova mode in virt2phy
lib/librte_eal/bsdapp/eal/eal.c | 21 ++++--
lib/librte_eal/bsdapp/eal/eal_pci.c | 10 +++
lib/librte_eal/bsdapp/eal/rte_eal_version.map | 10 +++
lib/librte_eal/common/eal_common_bus.c | 23 ++++++
lib/librte_eal/common/eal_common_pci.c | 11 +--
lib/librte_eal/common/include/rte_bus.h | 35 +++++++++
lib/librte_eal/common/include/rte_eal.h | 12 ++++
lib/librte_eal/common/include/rte_pci.h | 28 ++++++++
lib/librte_eal/common/rte_malloc.c | 9 ++-
lib/librte_eal/linuxapp/eal/eal.c | 21 ++++--
lib/librte_eal/linuxapp/eal/eal_memory.c | 3 +
lib/librte_eal/linuxapp/eal/eal_pci.c | 95 +++++++++++++++++++++++++
lib/librte_eal/linuxapp/eal/eal_vfio.c | 29 +++++++-
lib/librte_eal/linuxapp/eal/eal_vfio.h | 4 ++
lib/librte_eal/linuxapp/eal/rte_eal_version.map | 10 +++
15 files changed, 297 insertions(+), 24 deletions(-)
--
2.11.0
* [dpdk-dev] [PATCH v5 01/12] eal/pci: introduce PCI driver iova as va flag
2017-07-24 8:39 ` [dpdk-dev] [PATCH v5 " Santosh Shukla
@ 2017-07-24 8:39 ` Santosh Shukla
2017-07-24 8:39 ` [dpdk-dev] [PATCH v5 02/12] eal/pci: export match function Santosh Shukla
` (11 subsequent siblings)
12 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-07-24 8:39 UTC (permalink / raw)
To: thomas, dev
Cc: hemant.agrawal, bruce.richardson, jerin.jacob, shreyansh.jain,
gaetan.rivet, sergio.gonzalez.monroy, anatoly.burakov, stephen,
maxime.coquelin, olivier.matz, Santosh Shukla
Introduce the RTE_PCI_DRV_IOVA_AS_VA flag. The flag is used when a driver
needs to operate in iova=va mode.
Why does a driver need iova=va mapping?
On NPU style co-processors like Octeontx, buffer recycling is done in HW,
unlike the SW model. Here is the data flow:
1) On the control path, fill the HW mempool with buffers (iova as pa address).
2) On rx_burst, HW gives you an IOVA address (iova as pa address).
3) As the application expects a VA to operate on, rx_burst() needs to
convert from _pa to _va, which is very expensive.
With iova as va mapping instead, the IOMMU/SMMU does the translation and
the conversion cost is avoided.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
v3 --> v4:
- Renamed RTE_PCI_DRV_NEED_IOVA_VA to RTE_PCI_DRV_IOVA_AS_VA.
(Suggested by Maxime)
lib/librte_eal/common/include/rte_pci.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index 8b123391c..743392f91 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -202,6 +202,8 @@ struct rte_pci_bus {
#define RTE_PCI_DRV_INTR_RMV 0x0010
/** Device driver needs to keep mapped resources if unsupported dev detected */
#define RTE_PCI_DRV_KEEP_MAPPED_RES 0x0020
+/** Device driver supports iova as va */
+#define RTE_PCI_DRV_IOVA_AS_VA 0X0040
/**
* A structure describing a PCI mapping.
--
2.11.0
* [dpdk-dev] [PATCH v5 02/12] eal/pci: export match function
2017-07-24 8:39 ` [dpdk-dev] [PATCH v5 " Santosh Shukla
2017-07-24 8:39 ` [dpdk-dev] [PATCH v5 01/12] eal/pci: introduce PCI driver iova as va flag Santosh Shukla
@ 2017-07-24 8:39 ` Santosh Shukla
2017-07-24 8:39 ` [dpdk-dev] [PATCH v5 03/12] eal/pci: get iommu class Santosh Shukla
` (10 subsequent siblings)
12 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-07-24 8:39 UTC (permalink / raw)
To: thomas, dev
Cc: hemant.agrawal, bruce.richardson, jerin.jacob, shreyansh.jain,
gaetan.rivet, sergio.gonzalez.monroy, anatoly.burakov, stephen,
maxime.coquelin, olivier.matz, Santosh Shukla
Export the rte_pci_match() function as it is needed in a follow-up patch.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
v4 --> v5:
- Changed DPDK_17.08 to DPDK_17.11 in _version.map
lib/librte_eal/bsdapp/eal/rte_eal_version.map | 7 +++++++
lib/librte_eal/common/eal_common_pci.c | 10 +---------
lib/librte_eal/common/include/rte_pci.h | 15 +++++++++++++++
lib/librte_eal/linuxapp/eal/rte_eal_version.map | 7 +++++++
4 files changed, 30 insertions(+), 9 deletions(-)
diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index f689f0c8f..3d3c70a88 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -236,3 +236,10 @@ EXPERIMENTAL {
rte_service_unregister;
} DPDK_17.08;
+
+DPDK_17.11 {
+ global:
+
+ rte_pci_match;
+
+} DPDK_17.08;
diff --git a/lib/librte_eal/common/eal_common_pci.c b/lib/librte_eal/common/eal_common_pci.c
index 52fd38cdd..3b7d0a0ee 100644
--- a/lib/librte_eal/common/eal_common_pci.c
+++ b/lib/librte_eal/common/eal_common_pci.c
@@ -150,16 +150,8 @@ pci_unmap_resource(void *requested_addr, size_t size)
/*
* Match the PCI Driver and Device using the ID Table
- *
- * @param pci_drv
- * PCI driver from which ID table would be extracted
- * @param pci_dev
- * PCI device to match against the driver
- * @return
- * 1 for successful match
- * 0 for unsuccessful match
*/
-static int
+int
rte_pci_match(const struct rte_pci_driver *pci_drv,
const struct rte_pci_device *pci_dev)
{
diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index 743392f91..47f0532e4 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -368,6 +368,21 @@ int rte_pci_scan(void);
int
rte_pci_probe(void);
+/*
+ * Match the PCI Driver and Device using the ID Table
+ *
+ * @param pci_drv
+ * PCI driver from which ID table would be extracted
+ * @param pci_dev
+ * PCI device to match against the driver
+ * @return
+ * 1 for successful match
+ * 0 for unsuccessful match
+ */
+int
+rte_pci_match(const struct rte_pci_driver *pci_drv,
+ const struct rte_pci_device *pci_dev);
+
/**
* Map the PCI device resources in user space virtual memory address
*
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index 202072189..7d7fff496 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -241,3 +241,10 @@ EXPERIMENTAL {
rte_service_unregister;
} DPDK_17.08;
+
+DPDK_17.11 {
+ global:
+
+ rte_pci_match;
+
+} DPDK_17.08;
--
2.11.0
* [dpdk-dev] [PATCH v5 03/12] eal/pci: get iommu class
2017-07-24 8:39 ` [dpdk-dev] [PATCH v5 " Santosh Shukla
2017-07-24 8:39 ` [dpdk-dev] [PATCH v5 01/12] eal/pci: introduce PCI driver iova as va flag Santosh Shukla
2017-07-24 8:39 ` [dpdk-dev] [PATCH v5 02/12] eal/pci: export match function Santosh Shukla
@ 2017-07-24 8:39 ` Santosh Shukla
2017-07-24 8:39 ` [dpdk-dev] [PATCH v5 04/12] bsdapp/eal_pci: " Santosh Shukla
` (9 subsequent siblings)
12 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-07-24 8:39 UTC (permalink / raw)
To: thomas, dev
Cc: hemant.agrawal, bruce.richardson, jerin.jacob, shreyansh.jain,
gaetan.rivet, sergio.gonzalez.monroy, anatoly.burakov, stephen,
maxime.coquelin, olivier.matz, Santosh Shukla
Introducing rte_pci_get_iommu_class API which helps to get iommu class
of PCI device on the bus and returns preferred iova mapping mode for
PCI bus.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
v3 --> v4:
- Created a separate patch per suggestion from Maxime.
Initially thought to squash patch into [01/12] but
then [01/12] will have more context so decided to
keep it as separate patch.
lib/librte_eal/common/include/rte_bus.h | 10 ++++++++++
lib/librte_eal/common/include/rte_pci.h | 11 +++++++++++
2 files changed, 21 insertions(+)
diff --git a/lib/librte_eal/common/include/rte_bus.h b/lib/librte_eal/common/include/rte_bus.h
index c79368d3c..9e40687e5 100644
--- a/lib/librte_eal/common/include/rte_bus.h
+++ b/lib/librte_eal/common/include/rte_bus.h
@@ -55,6 +55,16 @@ extern "C" {
/** Double linked list of buses */
TAILQ_HEAD(rte_bus_list, rte_bus);
+
+/**
+ * IOVA mapping mode.
+ */
+enum rte_iova_mode {
+ RTE_IOVA_DC = 0, /* Don't care mode */
+ RTE_IOVA_PA = (1 << 0),
+ RTE_IOVA_VA = (1 << 1)
+};
+
/**
* Bus specific scan for devices attached on the bus.
* For each bus object, the scan would be responsible for finding devices and
diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index 47f0532e4..a67d77f22 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -383,6 +383,17 @@ int
rte_pci_match(const struct rte_pci_driver *pci_drv,
const struct rte_pci_device *pci_dev);
+
+/**
+ * Get iommu class of PCI devices on the bus.
+ * And return their preferred iova mapping mode.
+ *
+ * @return
+ * - enum rte_iova_mode.
+ */
+enum rte_iova_mode
+rte_pci_get_iommu_class(void);
+
/**
* Map the PCI device resources in user space virtual memory address
*
--
2.11.0
* [dpdk-dev] [PATCH v5 04/12] bsdapp/eal_pci: get iommu class
2017-07-24 8:39 ` [dpdk-dev] [PATCH v5 " Santosh Shukla
` (2 preceding siblings ...)
2017-07-24 8:39 ` [dpdk-dev] [PATCH v5 03/12] eal/pci: get iommu class Santosh Shukla
@ 2017-07-24 8:39 ` Santosh Shukla
2017-07-24 8:39 ` [dpdk-dev] [PATCH v5 05/12] linuxapp/eal_pci: " Santosh Shukla
` (8 subsequent siblings)
12 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-07-24 8:39 UTC (permalink / raw)
To: thomas, dev
Cc: hemant.agrawal, bruce.richardson, jerin.jacob, shreyansh.jain,
gaetan.rivet, sergio.gonzalez.monroy, anatoly.burakov, stephen,
maxime.coquelin, olivier.matz, Santosh Shukla
The bsdapp implementation returns the default iova mode (RTE_IOVA_PA).
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
v3 --> v4:
- Removed rte_pci_get_iommu_class api declaration. Now that
sits into separate patch [03/12].
lib/librte_eal/bsdapp/eal/eal_pci.c | 10 ++++++++++
lib/librte_eal/bsdapp/eal/rte_eal_version.map | 1 +
2 files changed, 11 insertions(+)
diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c b/lib/librte_eal/bsdapp/eal/eal_pci.c
index d3fb3c2d0..b45649428 100644
--- a/lib/librte_eal/bsdapp/eal/eal_pci.c
+++ b/lib/librte_eal/bsdapp/eal/eal_pci.c
@@ -403,6 +403,16 @@ rte_pci_scan(void)
return -1;
}
+/*
+ * Get iommu class of pci devices on the bus.
+ */
+enum rte_iova_mode
+rte_pci_get_iommu_class(void)
+{
+ /* Supports only RTE_KDRV_NIC_UIO */
+ return RTE_IOVA_PA;
+}
+
int
pci_update_device(const struct rte_pci_addr *addr)
{
diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index 3d3c70a88..8d5bc5000 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -241,5 +241,6 @@ DPDK_17.11 {
global:
rte_pci_match;
+ rte_pci_get_iommu_class;
} DPDK_17.08;
--
2.11.0
* [dpdk-dev] [PATCH v5 05/12] linuxapp/eal_pci: get iommu class
2017-07-24 8:39 ` [dpdk-dev] [PATCH v5 " Santosh Shukla
` (3 preceding siblings ...)
2017-07-24 8:39 ` [dpdk-dev] [PATCH v5 04/12] bsdapp/eal_pci: " Santosh Shukla
@ 2017-07-24 8:39 ` Santosh Shukla
2017-07-24 8:39 ` [dpdk-dev] [PATCH v5 06/12] bus: " Santosh Shukla
` (7 subsequent siblings)
12 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-07-24 8:39 UTC (permalink / raw)
To: thomas, dev
Cc: hemant.agrawal, bruce.richardson, jerin.jacob, shreyansh.jain,
gaetan.rivet, sergio.gonzalez.monroy, anatoly.burakov, stephen,
maxime.coquelin, olivier.matz, Santosh Shukla
Get the iommu class of PCI devices on the bus and return the preferred iova
mapping mode for that bus.
Algorithm for iova scheme selection for PCI bus:
0. If no device bound then return with RTE_IOVA_DC mapping mode,
else goto 1).
1. Look for device attached to vfio kdrv and has .drv_flag set
to RTE_PCI_DRV_IOVA_AS_VA.
2. Look for any device attached to UIO class of driver.
3. Check for vfio-noiommu mode enabled.
If 2) & 3) are false and 1) is true then select
mapping scheme as RTE_IOVA_VA. Otherwise use default
mapping scheme (RTE_IOVA_PA).
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
v3 --> v4 :
- Reworded WARNING message (suggested by Maxime)
- Added pci_device_is_bound func to check for no device case
(suggested by Hemant).
- Added ifdef vfio_present.
v1 --> v2:
- Removed Linux version check in vfio_noiommu func. Refer [1].
- Extending autodetection logic for _iommu_class.
Refer [2].
[1] https://www.mail-archive.com/dev@dpdk.org/msg70108.html
[2] https://www.mail-archive.com/dev@dpdk.org/msg70279.html
lib/librte_eal/linuxapp/eal/eal_pci.c | 95 +++++++++++++++++++++++++
lib/librte_eal/linuxapp/eal/eal_vfio.c | 19 +++++
lib/librte_eal/linuxapp/eal/eal_vfio.h | 4 ++
lib/librte_eal/linuxapp/eal/rte_eal_version.map | 1 +
4 files changed, 119 insertions(+)
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c b/lib/librte_eal/linuxapp/eal/eal_pci.c
index 2041d5f34..81d980817 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -45,6 +45,7 @@
#include "eal_filesystem.h"
#include "eal_private.h"
#include "eal_pci_init.h"
+#include "eal_vfio.h"
/**
* @file
@@ -483,6 +484,100 @@ rte_pci_scan(void)
return -1;
}
+/*
+ * Is pci device bound to any kdrv
+ */
+static inline int
+pci_device_is_bound(void)
+{
+ struct rte_pci_device *dev = NULL;
+ int ret = 0;
+
+ FOREACH_DEVICE_ON_PCIBUS(dev) {
+ if (dev->kdrv == RTE_KDRV_UNKNOWN ||
+ dev->kdrv == RTE_KDRV_NONE) {
+ continue;
+ } else {
+ ret = 1;
+ break;
+ }
+ }
+ return ret;
+}
+
+/*
+ * Any one of the device bound to uio
+ */
+static inline int
+pci_device_bound_uio(void)
+{
+ struct rte_pci_device *dev = NULL;
+
+ FOREACH_DEVICE_ON_PCIBUS(dev) {
+ if (dev->kdrv == RTE_KDRV_IGB_UIO ||
+ dev->kdrv == RTE_KDRV_UIO_GENERIC) {
+ return 1;
+ }
+ }
+ return 0;
+}
+
+/*
+ * Any one of the device has iova as va
+ */
+static inline int
+pci_device_has_iova_va(void)
+{
+ struct rte_pci_device *dev = NULL;
+ struct rte_pci_driver *drv = NULL;
+
+ FOREACH_DRIVER_ON_PCIBUS(drv) {
+ if (drv && drv->drv_flags & RTE_PCI_DRV_IOVA_AS_VA) {
+ FOREACH_DEVICE_ON_PCIBUS(dev) {
+ if (dev->kdrv == RTE_KDRV_VFIO &&
+ rte_pci_match(drv, dev))
+ return 1;
+ }
+ }
+ }
+ return 0;
+}
+
+/*
+ * Get iommu class of PCI devices on the bus.
+ */
+enum rte_iova_mode
+rte_pci_get_iommu_class(void)
+{
+ bool is_bound;
+ bool is_vfio_noiommu_enabled = true;
+ bool has_iova_va;
+ bool is_bound_uio;
+
+ is_bound = pci_device_is_bound();
+ if (!is_bound)
+ return RTE_IOVA_DC;
+
+ has_iova_va = pci_device_has_iova_va();
+ is_bound_uio = pci_device_bound_uio();
+#ifdef VFIO_PRESENT
+ is_vfio_noiommu_enabled = vfio_noiommu_is_enabled() == 1 ? 1 : 0;
+#endif
+
+ if (has_iova_va && !is_bound_uio && !is_vfio_noiommu_enabled)
+ return RTE_IOVA_VA;
+
+ if (has_iova_va) {
+ RTE_LOG(WARNING, EAL, "Some devices want iova as va but pa will be used because.. ");
+ if (is_vfio_noiommu_enabled)
+ RTE_LOG(WARNING, EAL, "vfio-noiommu mode configured\n");
+ if (is_bound_uio)
+ RTE_LOG(WARNING, EAL, "few device bound to UIO\n");
+ }
+
+ return RTE_IOVA_PA;
+}
+
/* Read PCI config space. */
int rte_pci_read_config(const struct rte_pci_device *device,
void *buf, size_t len, off_t offset)
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index 946df7e31..c8a97b7e7 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -816,4 +816,23 @@ vfio_noiommu_dma_map(int __rte_unused vfio_container_fd)
return 0;
}
+int
+vfio_noiommu_is_enabled(void)
+{
+ int fd, ret, cnt __rte_unused;
+ char c;
+
+ ret = -1;
+ fd = open(VFIO_NOIOMMU_MODE, O_RDONLY);
+ if (fd < 0)
+ return -1;
+
+ cnt = read(fd, &c, 1);
+ if (c == 'Y')
+ ret = 1;
+
+ close(fd);
+ return ret;
+}
+
#endif
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.h b/lib/librte_eal/linuxapp/eal/eal_vfio.h
index 5ff63e5d7..26ea8e119 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.h
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.h
@@ -150,6 +150,8 @@ struct vfio_config {
#define VFIO_NOIOMMU_GROUP_FMT "/dev/vfio/noiommu-%u"
#define VFIO_GET_REGION_ADDR(x) ((uint64_t) x << 40ULL)
#define VFIO_GET_REGION_IDX(x) (x >> 40)
+#define VFIO_NOIOMMU_MODE \
+ "/sys/module/vfio/parameters/enable_unsafe_noiommu_mode"
/* DMA mapping function prototype.
* Takes VFIO container fd as a parameter.
@@ -210,6 +212,8 @@ int pci_vfio_is_enabled(void);
int vfio_mp_sync_setup(void);
+int vfio_noiommu_is_enabled(void);
+
#define SOCKET_REQ_CONTAINER 0x100
#define SOCKET_REQ_GROUP 0x200
#define SOCKET_CLR_GROUP 0x300
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index 7d7fff496..bf68f02bc 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -246,5 +246,6 @@ DPDK_17.11 {
global:
rte_pci_match;
+ rte_pci_get_iommu_class;
} DPDK_17.08;
--
2.11.0
* [dpdk-dev] [PATCH v5 06/12] bus: get iommu class
2017-07-24 8:39 ` [dpdk-dev] [PATCH v5 " Santosh Shukla
` (4 preceding siblings ...)
2017-07-24 8:39 ` [dpdk-dev] [PATCH v5 05/12] linuxapp/eal_pci: " Santosh Shukla
@ 2017-07-24 8:39 ` Santosh Shukla
2017-07-24 8:39 ` [dpdk-dev] [PATCH v5 07/12] eal: introduce iova mode helper api Santosh Shukla
` (6 subsequent siblings)
12 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-07-24 8:39 UTC (permalink / raw)
To: thomas, dev
Cc: hemant.agrawal, bruce.richardson, jerin.jacob, shreyansh.jain,
gaetan.rivet, sergio.gonzalez.monroy, anatoly.burakov, stephen,
maxime.coquelin, olivier.matz, Santosh Shukla
The rte_bus_get_iommu_class() API helps to automatically detect and select
the appropriate iova mapping scheme for iommu capable devices on the bus.
Algorithm for iova scheme selection for bus:
0. Iterate through bus_list.
1. Collect each bus iova mode value and update into 'mode' var.
2. Mode selection scheme is:
if mode == 0 then iova mode is _pa,
if mode == 1 then iova mode is _pa,
if mode == 2 then iova mode is _va,
if mode == 3 then iova mode is _pa.
So mode != 2 selects the default iova mode (_pa).
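As a rough standalone sketch of the mode table above: OR together the value
reported by each bus, then keep _va only if the result is exactly 2. The enum
is a local stand-in for rte_iova_mode, and combine_bus_iova_modes() with its
array argument is a hypothetical replacement for walking rte_bus_list.

```c
#include <stddef.h>

/* Local stand-in for enum rte_iova_mode (assumption: same values). */
enum iova_mode {
	IOVA_DC = 0,        /* don't care */
	IOVA_PA = (1 << 0),
	IOVA_VA = (1 << 1)
};

/*
 * Hypothetical helper: per_bus holds the mode reported by each of the
 * n buses. Only a combined value of exactly IOVA_VA (2) selects _va;
 * 0, 1 and 3 all fall back to the default _pa mode.
 */
enum iova_mode
combine_bus_iova_modes(const enum iova_mode *per_bus, size_t n)
{
	int mode = IOVA_DC;
	size_t i;

	for (i = 0; i < n; i++)
		mode |= per_bus[i];

	return mode == IOVA_VA ? IOVA_VA : IOVA_PA;
}
```

In this sketch a bus reporting don't care (0) does not disturb another bus
that wants _va, while a single bus reporting _pa forces the default.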
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
v4 --> v5:
- Reworded bus API description (Suggested by Hemant).
v3 --> v4:
- Initialized mode to RTE_IOVA_DC in rte_bus_get_iommu_class.
lib/librte_eal/bsdapp/eal/rte_eal_version.map | 1 +
lib/librte_eal/common/eal_common_bus.c | 23 +++++++++++++++++++++++
lib/librte_eal/common/eal_common_pci.c | 1 +
lib/librte_eal/common/include/rte_bus.h | 25 +++++++++++++++++++++++++
lib/librte_eal/linuxapp/eal/rte_eal_version.map | 1 +
5 files changed, 51 insertions(+)
diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index 8d5bc5000..a30085a32 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -242,5 +242,6 @@ DPDK_17.11 {
rte_pci_match;
rte_pci_get_iommu_class;
+ rte_bus_get_iommu_class;
} DPDK_17.08;
diff --git a/lib/librte_eal/common/eal_common_bus.c b/lib/librte_eal/common/eal_common_bus.c
index 08bec2d93..a30a8982e 100644
--- a/lib/librte_eal/common/eal_common_bus.c
+++ b/lib/librte_eal/common/eal_common_bus.c
@@ -222,3 +222,26 @@ rte_bus_find_by_device_name(const char *str)
c[0] = '\0';
return rte_bus_find(NULL, bus_can_parse, name);
}
+
+
+/*
+ * Get iommu class of devices on the bus.
+ */
+enum rte_iova_mode
+rte_bus_get_iommu_class(void)
+{
+ int mode = RTE_IOVA_DC;
+ struct rte_bus *bus;
+
+ TAILQ_FOREACH(bus, &rte_bus_list, next) {
+
+ if (bus->get_iommu_class)
+ mode |= bus->get_iommu_class();
+ }
+
+ if (mode != RTE_IOVA_VA) {
+ /* Use default IOVA mode */
+ mode = RTE_IOVA_PA;
+ }
+ return mode;
+}
diff --git a/lib/librte_eal/common/eal_common_pci.c b/lib/librte_eal/common/eal_common_pci.c
index 3b7d0a0ee..0f0e4b93b 100644
--- a/lib/librte_eal/common/eal_common_pci.c
+++ b/lib/librte_eal/common/eal_common_pci.c
@@ -564,6 +564,7 @@ struct rte_pci_bus rte_pci_bus = {
.plug = pci_plug,
.unplug = pci_unplug,
.parse = pci_parse,
+ .get_iommu_class = rte_pci_get_iommu_class,
},
.device_list = TAILQ_HEAD_INITIALIZER(rte_pci_bus.device_list),
.driver_list = TAILQ_HEAD_INITIALIZER(rte_pci_bus.driver_list),
diff --git a/lib/librte_eal/common/include/rte_bus.h b/lib/librte_eal/common/include/rte_bus.h
index 9e40687e5..70a291a4d 100644
--- a/lib/librte_eal/common/include/rte_bus.h
+++ b/lib/librte_eal/common/include/rte_bus.h
@@ -178,6 +178,20 @@ struct rte_bus_conf {
enum rte_bus_scan_mode scan_mode; /**< Scan policy. */
};
+
+/**
+ * Get common iommu class of the all the devices on the bus. The bus may
+ * check that those devices are attached to iommu driver.
+ * If no devices are attached to the bus. The bus may return with don't care
+ * (_DC) value.
+ * Otherwise, The bus will return appropriate _pa or _va iova mode.
+ *
+ * @return
+ * enum rte_iova_mode value.
+ */
+typedef enum rte_iova_mode (*rte_bus_get_iommu_class_t)(void);
+
+
/**
* A structure describing a generic bus.
*/
@@ -191,6 +205,7 @@ struct rte_bus {
rte_bus_unplug_t unplug; /**< Remove single device from driver */
rte_bus_parse_t parse; /**< Parse a device name */
struct rte_bus_conf conf; /**< Bus configuration */
+ rte_bus_get_iommu_class_t get_iommu_class; /**< Get iommu class */
};
/**
@@ -290,6 +305,16 @@ struct rte_bus *rte_bus_find_by_device(const struct rte_device *dev);
*/
struct rte_bus *rte_bus_find_by_name(const char *busname);
+
+/**
+ * Get the common iommu class of devices bound on to buses available in the
+ * system. The default mode is PA.
+ *
+ * @return
+ * enum rte_iova_mode value.
+ */
+enum rte_iova_mode rte_bus_get_iommu_class(void);
+
/**
* Helper for Bus registration.
* The constructor has higher priority than PMD constructors.
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index bf68f02bc..780539dc7 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -247,5 +247,6 @@ DPDK_17.11 {
rte_pci_match;
rte_pci_get_iommu_class;
+ rte_bus_get_iommu_class;
} DPDK_17.08;
--
2.11.0
* [dpdk-dev] [PATCH v5 07/12] eal: introduce iova mode helper api
2017-07-24 8:39 ` [dpdk-dev] [PATCH v5 " Santosh Shukla
` (5 preceding siblings ...)
2017-07-24 8:39 ` [dpdk-dev] [PATCH v5 06/12] bus: " Santosh Shukla
@ 2017-07-24 8:39 ` Santosh Shukla
2017-07-24 8:40 ` [dpdk-dev] [PATCH v5 08/12] linuxapp/eal: auto detect iova mode Santosh Shukla
` (5 subsequent siblings)
12 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-07-24 8:39 UTC (permalink / raw)
To: thomas, dev
Cc: hemant.agrawal, bruce.richardson, jerin.jacob, shreyansh.jain,
gaetan.rivet, sergio.gonzalez.monroy, anatoly.burakov, stephen,
maxime.coquelin, olivier.matz, Santosh Shukla
Introduce the rte_eal_iova_mode() helper API. This API is
used by non-EAL libraries to detect the iova mode.
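A hedged sketch of how a caller might act on the detected mode, assuming (as
the virt2phy patches later in this series suggest) that in _va mode the
virtual address is used directly as the IOVA. Everything here is a stand-in:
the enum replaces rte_iova_mode, and buffer_iova() and toy_virt2phys() are
hypothetical names, not DPDK functions.

```c
#include <stdint.h>

/* Local stand-in for enum rte_iova_mode (assumption: same values). */
enum iova_mode {
	IOVA_DC = 0,        /* don't care */
	IOVA_PA = (1 << 0),
	IOVA_VA = (1 << 1)
};

/* Hypothetical translation callback standing in for a virt2phy lookup. */
typedef uint64_t (*virt2phys_fn)(const void *va);

/* Toy translation for illustration only: adds a fixed offset. */
uint64_t
toy_virt2phys(const void *va)
{
	return (uint64_t)(uintptr_t)va + 0x100000u;
}

/*
 * Hypothetical helper: pick the address to hand to the device.
 * In _va mode the virtual address is the IOVA, so no lookup is needed;
 * otherwise fall back to the (expensive) translation.
 */
uint64_t
buffer_iova(enum iova_mode mode, const void *va, virt2phys_fn virt2phys)
{
	if (mode == IOVA_VA)
		return (uint64_t)(uintptr_t)va;

	return virt2phys(va);
}
```

In a real caller the mode argument would come from rte_eal_iova_mode().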
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
lib/librte_eal/bsdapp/eal/eal.c | 6 ++++++
lib/librte_eal/bsdapp/eal/rte_eal_version.map | 1 +
lib/librte_eal/common/include/rte_eal.h | 12 ++++++++++++
lib/librte_eal/linuxapp/eal/eal.c | 6 ++++++
lib/librte_eal/linuxapp/eal/rte_eal_version.map | 1 +
5 files changed, 26 insertions(+)
diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 80fe21de3..2a49e9fde 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -119,6 +119,12 @@ rte_eal_get_configuration(void)
return &rte_config;
}
+enum rte_iova_mode
+rte_eal_iova_mode(void)
+{
+ return rte_eal_get_configuration()->iova_mode;
+}
+
/* parse a sysfs (or other) file containing one integer value */
int
eal_parse_sysfs_value(const char *filename, unsigned long *val)
diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index a30085a32..2a3a592b2 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -243,5 +243,6 @@ DPDK_17.11 {
rte_pci_match;
rte_pci_get_iommu_class;
rte_bus_get_iommu_class;
+ rte_eal_iova_mode;
} DPDK_17.08;
diff --git a/lib/librte_eal/common/include/rte_eal.h b/lib/librte_eal/common/include/rte_eal.h
index 0e7363d77..932dc1a96 100644
--- a/lib/librte_eal/common/include/rte_eal.h
+++ b/lib/librte_eal/common/include/rte_eal.h
@@ -45,6 +45,7 @@
#include <rte_per_lcore.h>
#include <rte_config.h>
+#include <rte_bus.h>
#ifdef __cplusplus
extern "C" {
@@ -87,6 +88,9 @@ struct rte_config {
/** Primary or secondary configuration */
enum rte_proc_type_t process_type;
+ /** PA or VA mapping mode */
+ enum rte_iova_mode iova_mode;
+
/**
* Pointer to memory configuration, which may be shared across multiple
* DPDK instances
@@ -287,6 +291,14 @@ static inline int rte_gettid(void)
return RTE_PER_LCORE(_thread_id);
}
+/**
+ * Get the iova mode
+ *
+ * @return
+ * enum rte_iova_mode value.
+ */
+enum rte_iova_mode rte_eal_iova_mode(void);
+
#define RTE_INIT(func) \
static void __attribute__((constructor, used)) func(void)
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index b28bbab54..fffdf0d15 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -128,6 +128,12 @@ rte_eal_get_configuration(void)
return &rte_config;
}
+enum rte_iova_mode
+rte_eal_iova_mode(void)
+{
+ return rte_eal_get_configuration()->iova_mode;
+}
+
/* parse a sysfs (or other) file containing one integer value */
int
eal_parse_sysfs_value(const char *filename, unsigned long *val)
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index 780539dc7..8b9a13fd8 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -248,5 +248,6 @@ DPDK_17.11 {
rte_pci_match;
rte_pci_get_iommu_class;
rte_bus_get_iommu_class;
+ rte_eal_iova_mode;
} DPDK_17.08;
--
2.11.0
* [dpdk-dev] [PATCH v5 08/12] linuxapp/eal: auto detect iova mode
2017-07-24 8:39 ` [dpdk-dev] [PATCH v5 " Santosh Shukla
` (6 preceding siblings ...)
2017-07-24 8:39 ` [dpdk-dev] [PATCH v5 07/12] eal: introduce iova mode helper api Santosh Shukla
@ 2017-07-24 8:40 ` Santosh Shukla
2017-07-24 8:40 ` [dpdk-dev] [PATCH v5 09/12] bsdapp/eal: auto detect iova mapping mode Santosh Shukla
` (4 subsequent siblings)
12 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-07-24 8:40 UTC (permalink / raw)
To: thomas, dev
Cc: hemant.agrawal, bruce.richardson, jerin.jacob, shreyansh.jain,
gaetan.rivet, sergio.gonzalez.monroy, anatoly.burakov, stephen,
maxime.coquelin, olivier.matz, Santosh Shukla
- Move the late bus scan up, to just after EAL argument parsing.
- Auto detect the iova mapping mode, based on the result of
rte_bus_get_iommu_class.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
lib/librte_eal/linuxapp/eal/eal.c | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index fffdf0d15..49b52ce4f 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -798,6 +798,15 @@ rte_eal_init(int argc, char **argv)
return -1;
}
+ if (rte_bus_scan()) {
+ rte_eal_init_alert("Cannot scan the buses for devices\n");
+ rte_errno = ENODEV;
+ return -1;
+ }
+
+ /* autodetect the iova mapping mode (default is iova_pa) */
+ rte_eal_get_configuration()->iova_mode = rte_bus_get_iommu_class();
+
if (internal_config.no_hugetlbfs == 0 &&
internal_config.process_type != RTE_PROC_SECONDARY &&
internal_config.xen_dom0_support == 0 &&
@@ -895,12 +904,6 @@ rte_eal_init(int argc, char **argv)
return -1;
}
- if (rte_bus_scan()) {
- rte_eal_init_alert("Cannot scan the buses for devices\n");
- rte_errno = ENODEV;
- return -1;
- }
-
RTE_LCORE_FOREACH_SLAVE(i) {
/*
--
2.11.0
* [dpdk-dev] [PATCH v5 09/12] bsdapp/eal: auto detect iova mapping mode
2017-07-24 8:39 ` [dpdk-dev] [PATCH v5 " Santosh Shukla
` (7 preceding siblings ...)
2017-07-24 8:40 ` [dpdk-dev] [PATCH v5 08/12] linuxapp/eal: auto detect iova mode Santosh Shukla
@ 2017-07-24 8:40 ` Santosh Shukla
2017-07-24 8:40 ` [dpdk-dev] [PATCH v5 10/12] linuxapp/eal_vfio: honor iova mode before mapping Santosh Shukla
` (3 subsequent siblings)
12 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-07-24 8:40 UTC (permalink / raw)
To: thomas, dev
Cc: hemant.agrawal, bruce.richardson, jerin.jacob, shreyansh.jain,
gaetan.rivet, sergio.gonzalez.monroy, anatoly.burakov, stephen,
maxime.coquelin, olivier.matz, Santosh Shukla
- Move the late bus scan up, to just after EAL argument parsing.
- bsdapp uses the default mapping mode, since it supports
only one pass-through mode (RTE_KDRV_NIC_UIO).
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
lib/librte_eal/bsdapp/eal/eal.c | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 2a49e9fde..3cb1bd22f 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -541,6 +541,15 @@ rte_eal_init(int argc, char **argv)
return -1;
}
+ if (rte_bus_scan()) {
+ rte_eal_init_alert("Cannot scan the buses for devices\n");
+ rte_errno = ENODEV;
+ return -1;
+ }
+
+ /* autodetect the iova mapping mode (default is iova_pa) */
+ rte_eal_get_configuration()->iova_mode = rte_bus_get_iommu_class();
+
if (internal_config.no_hugetlbfs == 0 &&
internal_config.process_type != RTE_PROC_SECONDARY &&
eal_hugepage_info_init() < 0) {
@@ -620,12 +629,6 @@ rte_eal_init(int argc, char **argv)
rte_config.master_lcore, thread_id, cpuset,
ret == 0 ? "" : "...");
- if (rte_bus_scan()) {
- rte_eal_init_alert("Cannot scan the buses for devices\n");
- rte_errno = ENODEV;
- return -1;
- }
-
RTE_LCORE_FOREACH_SLAVE(i) {
/*
--
2.11.0
* [dpdk-dev] [PATCH v5 10/12] linuxapp/eal_vfio: honor iova mode before mapping
2017-07-24 8:39 ` [dpdk-dev] [PATCH v5 " Santosh Shukla
` (8 preceding siblings ...)
2017-07-24 8:40 ` [dpdk-dev] [PATCH v5 09/12] bsdapp/eal: auto detect iova mapping mode Santosh Shukla
@ 2017-07-24 8:40 ` Santosh Shukla
2017-07-24 8:40 ` [dpdk-dev] [PATCH v5 11/12] linuxapp/eal_memory: honor iova mode in virt2phy Santosh Shukla
` (2 subsequent siblings)
12 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-07-24 8:40 UTC (permalink / raw)
To: thomas, dev
Cc: hemant.agrawal, bruce.richardson, jerin.jacob, shreyansh.jain,
gaetan.rivet, sergio.gonzalez.monroy, anatoly.burakov, stephen,
maxime.coquelin, olivier.matz, Santosh Shukla
Check iova mode and accordingly map iova to pa or va.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
lib/librte_eal/linuxapp/eal/eal_vfio.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index c8a97b7e7..b32cd09a2 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -706,7 +706,10 @@ vfio_type1_dma_map(int vfio_container_fd)
dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
dma_map.vaddr = ms[i].addr_64;
dma_map.size = ms[i].len;
- dma_map.iova = ms[i].phys_addr;
+ if (rte_eal_iova_mode() == RTE_IOVA_VA)
+ dma_map.iova = dma_map.vaddr;
+ else
+ dma_map.iova = ms[i].phys_addr;
dma_map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
ret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
@@ -792,7 +795,10 @@ vfio_spapr_dma_map(int vfio_container_fd)
dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
dma_map.vaddr = ms[i].addr_64;
dma_map.size = ms[i].len;
- dma_map.iova = ms[i].phys_addr;
+ if (rte_eal_iova_mode() == RTE_IOVA_VA)
+ dma_map.iova = dma_map.vaddr;
+ else
+ dma_map.iova = ms[i].phys_addr;
dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
VFIO_DMA_MAP_FLAG_WRITE;
--
2.11.0
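The choice made in both DMA map functions can be isolated as a small
helper. The following is only a sketch under stated assumptions: the enum
is a local stand-in for rte_iova_mode, and struct seg and seg_iova() are
hypothetical names that do not exist in DPDK.

```c
#include <stdint.h>

/* Local stand-in for enum rte_iova_mode from this series. */
enum iova_mode {
	IOVA_DC = 0,        /* don't care */
	IOVA_PA = (1 << 0), /* IOVA equals physical address */
	IOVA_VA = (1 << 1)  /* IOVA equals virtual address */
};

/* Hypothetical, trimmed-down view of one memory segment. */
struct seg {
	uint64_t vaddr; /* process virtual address of the segment */
	uint64_t paddr; /* physical address of the segment */
	uint64_t len;   /* segment length in bytes */
};

/*
 * IOVA that would be handed to the IOMMU for this segment:
 * the virtual address in VA mode, the physical address otherwise.
 */
static uint64_t
seg_iova(enum iova_mode mode, const struct seg *s)
{
	return mode == IOVA_VA ? s->vaddr : s->paddr;
}
```

Any mode other than IOVA_VA falls back to the physical address, which
mirrors the else branch in the patch.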
* [dpdk-dev] [PATCH v5 11/12] linuxapp/eal_memory: honor iova mode in virt2phy
2017-07-24 8:39 ` [dpdk-dev] [PATCH v5 " Santosh Shukla
` (9 preceding siblings ...)
2017-07-24 8:40 ` [dpdk-dev] [PATCH v5 10/12] linuxapp/eal_vfio: honor iova mode before mapping Santosh Shukla
@ 2017-07-24 8:40 ` Santosh Shukla
2017-07-24 8:40 ` [dpdk-dev] [PATCH v5 12/12] eal/rte_malloc: " Santosh Shukla
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 00/12] Infrastructure to detect iova mapping on the bus Santosh Shukla
12 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-07-24 8:40 UTC (permalink / raw)
To: thomas, dev
Cc: hemant.agrawal, bruce.richardson, jerin.jacob, shreyansh.jain,
gaetan.rivet, sergio.gonzalez.monroy, anatoly.burakov, stephen,
maxime.coquelin, olivier.matz, Santosh Shukla
Check the iova mode and return the address accordingly: the virtual
address in RTE_IOVA_VA mode, the physical address otherwise.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
lib/librte_eal/linuxapp/eal/eal_memory.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index daead31c2..249740645 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -139,6 +139,9 @@ rte_mem_virt2phy(const void *virtaddr)
int page_size;
off_t offset;
+ if (rte_eal_iova_mode() == RTE_IOVA_VA)
+ return (uintptr_t)virtaddr;
+
/* when using dom0, /proc/self/pagemap always returns 0, check in
* dpdk memory by browsing the memsegs */
if (rte_xen_dom0_supported()) {
--
2.11.0
* [dpdk-dev] [PATCH v5 12/12] eal/rte_malloc: honor iova mode in virt2phy
2017-07-24 8:39 ` [dpdk-dev] [PATCH v5 " Santosh Shukla
` (10 preceding siblings ...)
2017-07-24 8:40 ` [dpdk-dev] [PATCH v5 11/12] linuxapp/eal_memory: honor iova mode in virt2phy Santosh Shukla
@ 2017-07-24 8:40 ` Santosh Shukla
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 00/12] Infrastructure to detect iova mapping on the bus Santosh Shukla
12 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-07-24 8:40 UTC (permalink / raw)
To: thomas, dev
Cc: hemant.agrawal, bruce.richardson, jerin.jacob, shreyansh.jain,
gaetan.rivet, sergio.gonzalez.monroy, anatoly.burakov, stephen,
maxime.coquelin, olivier.matz, Santosh Shukla
Check the iova mode and return the address accordingly: the virtual
address in RTE_IOVA_VA mode, the physical address otherwise.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
lib/librte_eal/common/rte_malloc.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index 5c0627bf4..d65c05a4d 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -251,10 +251,17 @@ rte_malloc_set_limit(__rte_unused const char *type,
phys_addr_t
rte_malloc_virt2phy(const void *addr)
{
+ phys_addr_t paddr;
const struct malloc_elem *elem = malloc_elem_from_data(addr);
if (elem == NULL)
return RTE_BAD_PHYS_ADDR;
if (elem->ms->phys_addr == RTE_BAD_PHYS_ADDR)
return RTE_BAD_PHYS_ADDR;
- return elem->ms->phys_addr + ((uintptr_t)addr - (uintptr_t)elem->ms->addr);
+
+ if (rte_eal_iova_mode() == RTE_IOVA_VA)
+ paddr = (uintptr_t)addr;
+ else
+ paddr = elem->ms->phys_addr +
+ ((uintptr_t)addr - (uintptr_t)elem->ms->addr);
+ return paddr;
}
--
2.11.0
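The address arithmetic in rte_malloc_virt2phy() can be tried out in
isolation. This is a minimal sketch, not the DPDK code: struct seg,
BAD_IOVA and virt2iova() are hypothetical names, and the bounds check is
an addition of the sketch that the patch does not perform.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for enum rte_iova_mode from this series. */
enum iova_mode {
	IOVA_DC = 0,
	IOVA_PA = (1 << 0),
	IOVA_VA = (1 << 1)
};

/* Hypothetical marker for "no valid address". */
#define BAD_IOVA UINT64_MAX

/* Hypothetical, trimmed-down view of one memory segment. */
struct seg {
	uintptr_t vaddr; /* start of the segment in process address space */
	uint64_t paddr;  /* physical address of the segment start */
	size_t len;      /* segment length in bytes */
};

static uint64_t
virt2iova(enum iova_mode mode, const struct seg *s, uintptr_t addr)
{
	/* Reject addresses outside the segment (sketch-only check). */
	if (addr < s->vaddr || addr - s->vaddr >= s->len)
		return BAD_IOVA;

	/* VA mode: the device sees the same address as the CPU. */
	if (mode == IOVA_VA)
		return (uint64_t)addr;

	/* PA mode: segment base plus offset inside the segment. */
	return s->paddr + (addr - s->vaddr);
}
```

For a segment at virtual 0x200000 and physical 0x140000000, the address
0x201000 maps to 0x140001000 in PA mode and stays 0x201000 in VA mode.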
* [dpdk-dev] [PATCH v6 00/12] Infrastructure to detect iova mapping on the bus
2017-07-24 8:39 ` [dpdk-dev] [PATCH v5 " Santosh Shukla
` (11 preceding siblings ...)
2017-07-24 8:40 ` [dpdk-dev] [PATCH v5 12/12] eal/rte_malloc: " Santosh Shukla
@ 2017-08-14 16:10 ` Santosh Shukla
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 01/12] eal/pci: introduce PCI driver iova as va flag Santosh Shukla
` (12 more replies)
12 siblings, 13 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-08-14 16:10 UTC (permalink / raw)
To: dev
Cc: olivier.matz, thomas, jerin.jacob, hemant.agrawal,
maxime.coquelin, sergio.gonzalez.monroy, bruce.richardson,
shreyansh.jain, gaetan.rivet, anatoly.burakov, stephen,
Santosh Shukla
v6:
Sending v5 series rebased on top of version: 17.11-rc0.
v5:
Introduce the RTE_PCI_DRV_IOVA_AS_VA flag for autodetection of iova=va
mapping. A PCI driver that requires the IOVA as VA scheme can set this flag
in its drv_flags at PCI driver registration.
Algorithm to select IOVA as VA for the PCI bus case:
0. If no device is bound, return the RTE_IOVA_DC mapping mode;
otherwise go to 1).
1. Look for a device that is attached to the vfio kdrv and whose driver
has RTE_PCI_DRV_IOVA_AS_VA set in .drv_flags.
2. Look for any device attached to a UIO class of driver.
3. Check whether vfio-noiommu mode is enabled.
If 2) and 3) are false and 1) is true, select RTE_IOVA_VA as the
mapping scheme. Otherwise use the default
mapping scheme (RTE_IOVA_PA).
That way, the bus can truly autodetect the iova mapping mode for
a device or a set of devices.
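The selection steps above can be sketched as a pure function. This is an
illustration only: the enum is a local stand-in for rte_iova_mode, and
select_iova_mode() with its four boolean inputs is a hypothetical helper,
not part of the series.

```c
#include <stdbool.h>

/* Local stand-in for enum rte_iova_mode from this series. */
enum iova_mode {
	IOVA_DC = 0,        /* don't care: no device is bound */
	IOVA_PA = (1 << 0), /* IOVA equals physical address */
	IOVA_VA = (1 << 1)  /* IOVA equals virtual address */
};

/*
 * Hypothetical helper. The inputs correspond to:
 *   any_bound - step 0, at least one device is bound to a kernel driver
 *   wants_va  - step 1, a VFIO-bound device whose driver sets
 *               RTE_PCI_DRV_IOVA_AS_VA
 *   any_uio   - step 2, at least one device is bound to a UIO driver
 *   noiommu   - step 3, vfio-noiommu mode is enabled
 */
static enum iova_mode
select_iova_mode(bool any_bound, bool wants_va, bool any_uio, bool noiommu)
{
	if (!any_bound)
		return IOVA_DC;
	if (wants_va && !any_uio && !noiommu)
		return IOVA_VA;
	return IOVA_PA;
}
```

A single UIO-bound device or an enabled noiommu mode is enough to force
RTE_IOVA_PA, since neither setup has an IOMMU that could translate
virtual addresses for the device.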
Patch series rebased on version-17.08-rc2:
'67c4b6db68e199247b5dbd63f560582640b180bf'.
v5 --> v6:
- Added api info in eal's version.map (release DPDK_v17.11).
v4 --> v5:
- Change DPDK_17.08 to DPDK_17.11 in _version.map.
- Reworded bus api description (suggested by Hemant).
- Added reviewed-by from Maxime in v5.
- Added acked-by from Hemant for pci and bus patches.
v3 --> v4:
- Re-introduced RTE_IOVA_DC mode (suggested by Hemant [5]).
- Renamed flag to RTE_PCI_DRV_IOVA_AS_VA (suggested by Maxime).
- Reworded WARNING message (suggested by Maxime [7]).
- Created a separate patch for rte_pci_get_iommu_class (suggested by Maxime[]).
- Added VFIO_PRESENT ifdef build fix.
v2 --> v3:
- Removed rte_mempool_virt2phy (suggested by Olivier [4])
v1 --> v2:
- Removed override eal option i.e. (--iova-mode=<>) Because we have means to
truly autodetect the iova mode.
- Introduced RTE_PCI_DRV_NEED_IOVA_VA drv_flag (Suggested by Maxime [3]).
- Using NEED_IOVA_VA drv_flag in autodetection logic.
- Removed Linux version check macro in vfio code, As per Maxime feedback.
- Moved rte_pci_match API from local to global.
Patch Summary:
0) 1st: Introducing a new flag in rte_pci_drv
1) 2nd: declare rte_pci_match api in pci header. Required for autodetection in
follow up patches.
2) 3rd: declare rte_pci_get_iommu_class.
3) 4th - 5th: autodetection mapping infrastructure for Linux/bsdapp.
4) 6th: Introduces global bus API named rte_bus_get_iommu_class.
5) 7th: iova mode helper API.
6) 8th - 9th: Calls rte_bus_get_iommu_class API for Linux/bsdapp and returns
their iova mode.
7) 10th: Check iova mode and accordingly map vfio.dma_map to _pa or _va.
8) 11th - 12th: Check for IOVA_VA mode in below APIs
- rte_mem_virt2phy
- rte_malloc_virt2phy
Test History:
- Tested for x86/XL710 40G NIC card for both modes (iova_va/pa).
- Tested for arm64/thunderx vNIC Integrated NIC for both modes
- Tested for arm64/Octeontx integrated NICs for iova_va mode only
(it supports only one mode).
- Ran standalone tests like mempool_autotest, mbuf_autotest.
- Verified for Doxygen.
Work History:
For v1, Refer [1].
For v2, Refer [2].
For v3, Refer [9].
For v4, refer [10].
Checkpatch result:
* Debug message - WARNING: line over 80 characters
Thanks,
[1] https://www.mail-archive.com/dev@dpdk.org/msg67438.html
[2] https://www.mail-archive.com/dev@dpdk.org/msg70674.html
[3] https://www.mail-archive.com/dev@dpdk.org/msg70279.html
[4] https://www.mail-archive.com/dev@dpdk.org/msg70692.html
[5] http://dpdk.org/ml/archives/dev/2017-July/071282.html
[6] http://dpdk.org/ml/archives/dev/2017-July/070951.html
[7] http://dpdk.org/ml/archives/dev/2017-July/070941.html
[8] http://dpdk.org/ml/archives/dev/2017-July/070952.html
[9] http://dpdk.org/ml/archives/dev/2017-July/070918.html
[10] http://dpdk.org/ml/archives/dev/2017-July/071754.html
Santosh Shukla (12):
eal/pci: introduce PCI driver iova as va flag
eal/pci: export match function
eal/pci: get iommu class
bsdapp/eal_pci: get iommu class
linuxapp/eal_pci: get iommu class
bus: get iommu class
eal: introduce iova mode helper api
linuxapp/eal: auto detect iova mode
bsdapp/eal: auto detect iova mapping mode
linuxapp/eal_vfio: honor iova mode before mapping
linuxapp/eal_memory: honor iova mode in virt2phy
eal/rte_malloc: honor iova mode in virt2phy
lib/librte_eal/bsdapp/eal/eal.c | 20 ++++--
lib/librte_eal/bsdapp/eal/eal_pci.c | 10 +++
lib/librte_eal/bsdapp/eal/rte_eal_version.map | 10 +++
lib/librte_eal/common/eal_common_bus.c | 23 ++++++
lib/librte_eal/common/eal_common_pci.c | 11 +--
lib/librte_eal/common/include/rte_bus.h | 35 +++++++++
lib/librte_eal/common/include/rte_eal.h | 12 ++++
lib/librte_eal/common/include/rte_pci.h | 28 ++++++++
lib/librte_eal/common/rte_malloc.c | 9 ++-
lib/librte_eal/linuxapp/eal/eal.c | 21 ++++--
lib/librte_eal/linuxapp/eal/eal_memory.c | 3 +
lib/librte_eal/linuxapp/eal/eal_pci.c | 95 +++++++++++++++++++++++++
lib/librte_eal/linuxapp/eal/eal_vfio.c | 29 +++++++-
lib/librte_eal/linuxapp/eal/eal_vfio.h | 4 ++
lib/librte_eal/linuxapp/eal/rte_eal_version.map | 10 +++
15 files changed, 296 insertions(+), 24 deletions(-)
--
2.11.0
* [dpdk-dev] [PATCH v6 01/12] eal/pci: introduce PCI driver iova as va flag
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 00/12] Infrastructure to detect iova mapping on the bus Santosh Shukla
@ 2017-08-14 16:10 ` Santosh Shukla
2017-08-17 12:35 ` Aaron Conole
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 02/12] eal/pci: export match function Santosh Shukla
` (11 subsequent siblings)
12 siblings, 1 reply; 248+ messages in thread
From: Santosh Shukla @ 2017-08-14 16:10 UTC (permalink / raw)
To: dev
Cc: olivier.matz, thomas, jerin.jacob, hemant.agrawal,
maxime.coquelin, sergio.gonzalez.monroy, bruce.richardson,
shreyansh.jain, gaetan.rivet, anatoly.burakov, stephen,
Santosh Shukla
Introduce the RTE_PCI_DRV_IOVA_AS_VA flag. A driver sets this flag when it
needs to operate in iova=va mode.
Why does a driver need iova=va mapping?
On NPU style co-processors like Octeontx, buffer recycling is done
in HW, unlike the SW model. Here is the data flow:
1) On the control path, fill the HW mempool with buffers (iova as pa address).
2) On rx_burst, HW gives you an IOVA address (iova as pa address).
3) As the application expects a VA to operate on, rx_burst() needs to
convert from _pa to _va, which is very expensive.
With iova as va mapping instead, the IOMMU/SMMU does the translation and
we avoid the cost of converting.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
lib/librte_eal/common/include/rte_pci.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index 8b123391c..743392f91 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -202,6 +202,8 @@ struct rte_pci_bus {
#define RTE_PCI_DRV_INTR_RMV 0x0010
/** Device driver needs to keep mapped resources if unsupported dev detected */
#define RTE_PCI_DRV_KEEP_MAPPED_RES 0x0020
+/** Device driver supports iova as va */
+#define RTE_PCI_DRV_IOVA_AS_VA 0x0040
/**
* A structure describing a PCI mapping.
--
2.11.0
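How a driver might advertise the new flag, and how the bus code might
test for it, can be sketched as follows. The flag values are the ones
from the hunk above; struct pci_driver_sketch and
driver_wants_iova_as_va() are hypothetical stand-ins for the real
rte_pci_driver structure and are not DPDK API.

```c
#include <stdint.h>

/* Flag values copied from the rte_pci.h hunk; names shortened locally. */
#define DRV_INTR_RMV        0x0010
#define DRV_KEEP_MAPPED_RES 0x0020
#define DRV_IOVA_AS_VA      0x0040

/* Hypothetical, trimmed-down driver descriptor. */
struct pci_driver_sketch {
	const char *name;
	uint32_t drv_flags; /* bitwise OR of DRV_* flags */
};

/* Returns 1 when the driver asked for iova=va mode, 0 otherwise. */
static int
driver_wants_iova_as_va(const struct pci_driver_sketch *drv)
{
	return (drv->drv_flags & DRV_IOVA_AS_VA) != 0;
}
```

Because each flag occupies its own bit, a driver can combine
DRV_IOVA_AS_VA with the existing flags without affecting them.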
* Re: [dpdk-dev] [PATCH v6 01/12] eal/pci: introduce PCI driver iova as va flag
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 01/12] eal/pci: introduce PCI driver iova as va flag Santosh Shukla
@ 2017-08-17 12:35 ` Aaron Conole
0 siblings, 0 replies; 248+ messages in thread
From: Aaron Conole @ 2017-08-17 12:35 UTC (permalink / raw)
To: Santosh Shukla
Cc: dev, olivier.matz, thomas, jerin.jacob, hemant.agrawal,
maxime.coquelin, sergio.gonzalez.monroy, bruce.richardson,
shreyansh.jain, gaetan.rivet, anatoly.burakov, stephen
Santosh Shukla <santosh.shukla@caviumnetworks.com> writes:
> Introducing RTE_PCI_DRV_IOVA_AS_VA flag. Flag used when driver needs
> to operate in iova=va mode.
>
> Why driver need iova=va mapping?
>
> On NPU style co-processors like Octeontx, the buffer recycling has been
> done in HW, unlike SW model. Here is the data flow:
> 1) On control path, Fill the HW mempool with buffers(iova as pa address)
> 2) on rx_burst, HW gives you IOVA address(iova as pa address)
> 3) As application expects VA to operate on it, rx_burst() needs to
> convert to _va from _pa. Which is very expensive.
> Instead of that if iova as va mapping, we can avoid the cost of
> converting with help of IOMMU/SMMU.
>
> Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> ---
This should be folded into patch 5; there's no clear need for it until
then.
* [dpdk-dev] [PATCH v6 02/12] eal/pci: export match function
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 00/12] Infrastructure to detect iova mapping on the bus Santosh Shukla
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 01/12] eal/pci: introduce PCI driver iova as va flag Santosh Shukla
@ 2017-08-14 16:10 ` Santosh Shukla
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 03/12] eal/pci: get iommu class Santosh Shukla
` (10 subsequent siblings)
12 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-08-14 16:10 UTC (permalink / raw)
To: dev
Cc: olivier.matz, thomas, jerin.jacob, hemant.agrawal,
maxime.coquelin, sergio.gonzalez.monroy, bruce.richardson,
shreyansh.jain, gaetan.rivet, anatoly.burakov, stephen,
Santosh Shukla
Export the rte_pci_match() function, as it is needed in a follow-up patch.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
lib/librte_eal/bsdapp/eal/rte_eal_version.map | 7 +++++++
lib/librte_eal/common/eal_common_pci.c | 10 +---------
lib/librte_eal/common/include/rte_pci.h | 15 +++++++++++++++
lib/librte_eal/linuxapp/eal/rte_eal_version.map | 7 +++++++
4 files changed, 30 insertions(+), 9 deletions(-)
diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index aac6fd776..c819e3084 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -237,3 +237,10 @@ EXPERIMENTAL {
rte_service_unregister;
} DPDK_17.08;
+
+DPDK_17.11 {
+ global:
+
+ rte_pci_match;
+
+} DPDK_17.08;
diff --git a/lib/librte_eal/common/eal_common_pci.c b/lib/librte_eal/common/eal_common_pci.c
index 52fd38cdd..3b7d0a0ee 100644
--- a/lib/librte_eal/common/eal_common_pci.c
+++ b/lib/librte_eal/common/eal_common_pci.c
@@ -150,16 +150,8 @@ pci_unmap_resource(void *requested_addr, size_t size)
/*
* Match the PCI Driver and Device using the ID Table
- *
- * @param pci_drv
- * PCI driver from which ID table would be extracted
- * @param pci_dev
- * PCI device to match against the driver
- * @return
- * 1 for successful match
- * 0 for unsuccessful match
*/
-static int
+int
rte_pci_match(const struct rte_pci_driver *pci_drv,
const struct rte_pci_device *pci_dev)
{
diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index 743392f91..47f0532e4 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -368,6 +368,21 @@ int rte_pci_scan(void);
int
rte_pci_probe(void);
+/**
+ * Match the PCI Driver and Device using the ID Table
+ *
+ * @param pci_drv
+ * PCI driver from which ID table would be extracted
+ * @param pci_dev
+ * PCI device to match against the driver
+ * @return
+ * 1 for successful match
+ * 0 for unsuccessful match
+ */
+int
+rte_pci_match(const struct rte_pci_driver *pci_drv,
+ const struct rte_pci_device *pci_dev);
+
/**
* Map the PCI device resources in user space virtual memory address
*
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index 3a8f15406..a15b382ff 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -242,3 +242,10 @@ EXPERIMENTAL {
rte_service_unregister;
} DPDK_17.08;
+
+DPDK_17.11 {
+ global:
+
+ rte_pci_match;
+
+} DPDK_17.08;
--
2.11.0
* [dpdk-dev] [PATCH v6 03/12] eal/pci: get iommu class
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 00/12] Infrastructure to detect iova mapping on the bus Santosh Shukla
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 01/12] eal/pci: introduce PCI driver iova as va flag Santosh Shukla
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 02/12] eal/pci: export match function Santosh Shukla
@ 2017-08-14 16:10 ` Santosh Shukla
2017-08-17 12:38 ` Aaron Conole
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 04/12] bsdapp/eal_pci: " Santosh Shukla
` (9 subsequent siblings)
12 siblings, 1 reply; 248+ messages in thread
From: Santosh Shukla @ 2017-08-14 16:10 UTC (permalink / raw)
To: dev
Cc: olivier.matz, thomas, jerin.jacob, hemant.agrawal,
maxime.coquelin, sergio.gonzalez.monroy, bruce.richardson,
shreyansh.jain, gaetan.rivet, anatoly.burakov, stephen,
Santosh Shukla
Introduce the rte_pci_get_iommu_class API, which gets the iommu class
of the PCI devices on the bus and returns the preferred iova mapping
mode for the PCI bus.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
lib/librte_eal/common/include/rte_bus.h | 10 ++++++++++
lib/librte_eal/common/include/rte_pci.h | 11 +++++++++++
2 files changed, 21 insertions(+)
diff --git a/lib/librte_eal/common/include/rte_bus.h b/lib/librte_eal/common/include/rte_bus.h
index c79368d3c..9e40687e5 100644
--- a/lib/librte_eal/common/include/rte_bus.h
+++ b/lib/librte_eal/common/include/rte_bus.h
@@ -55,6 +55,16 @@ extern "C" {
/** Double linked list of buses */
TAILQ_HEAD(rte_bus_list, rte_bus);
+
+/**
+ * IOVA mapping mode.
+ */
+enum rte_iova_mode {
+ RTE_IOVA_DC = 0, /* Don't care mode */
+ RTE_IOVA_PA = (1 << 0),
+ RTE_IOVA_VA = (1 << 1)
+};
+
/**
* Bus specific scan for devices attached on the bus.
* For each bus object, the scan would be responsible for finding devices and
diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index 47f0532e4..a67d77f22 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -383,6 +383,17 @@ int
rte_pci_match(const struct rte_pci_driver *pci_drv,
const struct rte_pci_device *pci_dev);
+
+/**
+ * Get iommu class of PCI devices on the bus.
+ * And return their preferred iova mapping mode.
+ *
+ * @return
+ * - enum rte_iova_mode.
+ */
+enum rte_iova_mode
+rte_pci_get_iommu_class(void);
+
/**
* Map the PCI device resources in user space virtual memory address
*
--
2.11.0
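The enum values are distinct bits, which allows the preferences of several
buses to be merged with a bitwise OR. How the bus layer actually merges
them is not shown in this part of the thread, so the rule below is an
assumption made for illustration, and combine_bus_modes() is a
hypothetical name.

```c
#include <stddef.h>

/* Local stand-in for enum rte_iova_mode from the hunk above. */
enum iova_mode {
	IOVA_DC = 0,        /* don't care */
	IOVA_PA = (1 << 0),
	IOVA_VA = (1 << 1)
};

/*
 * Assumed merge rule: OR all per-bus answers together. Only when the
 * result is exactly IOVA_VA (every bus said VA or "don't care", and at
 * least one said VA) is VA selected; everything else falls back to PA.
 */
static enum iova_mode
combine_bus_modes(const enum iova_mode *modes, size_t n)
{
	int acc = IOVA_DC;
	size_t i;

	for (i = 0; i < n; i++)
		acc |= modes[i];

	return acc == IOVA_VA ? IOVA_VA : IOVA_PA;
}
```

With RTE_IOVA_DC defined as 0, a bus with no bound devices contributes
nothing to the OR and so cannot veto another bus's request for VA.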
* Re: [dpdk-dev] [PATCH v6 03/12] eal/pci: get iommu class
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 03/12] eal/pci: get iommu class Santosh Shukla
@ 2017-08-17 12:38 ` Aaron Conole
0 siblings, 0 replies; 248+ messages in thread
From: Aaron Conole @ 2017-08-17 12:38 UTC (permalink / raw)
To: Santosh Shukla
Cc: dev, olivier.matz, thomas, jerin.jacob, hemant.agrawal,
maxime.coquelin, sergio.gonzalez.monroy, bruce.richardson,
shreyansh.jain, gaetan.rivet, anatoly.burakov, stephen
Santosh Shukla <santosh.shukla@caviumnetworks.com> writes:
> Introducing rte_pci_get_iommu_class API which helps to get iommu class
> of PCI device on the bus and returns preferred iova mapping mode for
> PCI bus.
>
> Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> ---
I think 3/12 and 4/12 should be combined with 5/12. At the very least,
3/12 and 4/12 should be combined.
* [dpdk-dev] [PATCH v6 04/12] bsdapp/eal_pci: get iommu class
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 00/12] Infrastructure to detect iova mapping on the bus Santosh Shukla
` (2 preceding siblings ...)
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 03/12] eal/pci: get iommu class Santosh Shukla
@ 2017-08-14 16:10 ` Santosh Shukla
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 05/12] linuxapp/eal_pci: " Santosh Shukla
` (8 subsequent siblings)
12 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-08-14 16:10 UTC (permalink / raw)
To: dev
Cc: olivier.matz, thomas, jerin.jacob, hemant.agrawal,
maxime.coquelin, sergio.gonzalez.monroy, bruce.richardson,
shreyansh.jain, gaetan.rivet, anatoly.burakov, stephen,
Santosh Shukla
The bsdapp implementation returns the default iova mode (RTE_IOVA_PA).
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
lib/librte_eal/bsdapp/eal/eal_pci.c | 10 ++++++++++
lib/librte_eal/bsdapp/eal/rte_eal_version.map | 1 +
2 files changed, 11 insertions(+)
diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c b/lib/librte_eal/bsdapp/eal/eal_pci.c
index 04eacdcc7..e2c252320 100644
--- a/lib/librte_eal/bsdapp/eal/eal_pci.c
+++ b/lib/librte_eal/bsdapp/eal/eal_pci.c
@@ -403,6 +403,16 @@ rte_pci_scan(void)
return -1;
}
+/*
+ * Get iommu class of pci devices on the bus.
+ */
+enum rte_iova_mode
+rte_pci_get_iommu_class(void)
+{
+ /* Supports only RTE_KDRV_NIC_UIO */
+ return RTE_IOVA_PA;
+}
+
int
pci_update_device(const struct rte_pci_addr *addr)
{
diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index c819e3084..1fdcfb460 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -242,5 +242,6 @@ DPDK_17.11 {
global:
rte_pci_match;
+ rte_pci_get_iommu_class;
} DPDK_17.08;
--
2.11.0
* [dpdk-dev] [PATCH v6 05/12] linuxapp/eal_pci: get iommu class
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 00/12] Infrastructure to detect iova mapping on the bus Santosh Shukla
` (3 preceding siblings ...)
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 04/12] bsdapp/eal_pci: " Santosh Shukla
@ 2017-08-14 16:10 ` Santosh Shukla
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 06/12] bus: " Santosh Shukla
` (7 subsequent siblings)
12 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-08-14 16:10 UTC (permalink / raw)
To: dev
Cc: olivier.matz, thomas, jerin.jacob, hemant.agrawal,
maxime.coquelin, sergio.gonzalez.monroy, bruce.richardson,
shreyansh.jain, gaetan.rivet, anatoly.burakov, stephen,
Santosh Shukla
Get the iommu class of the PCI devices on the bus and return the preferred
iova mapping mode for that bus.
Algorithm for iova scheme selection for the PCI bus:
0. If no device is bound, return the RTE_IOVA_DC mapping mode;
otherwise go to 1).
1. Look for a device that is attached to the vfio kdrv and whose driver
has RTE_PCI_DRV_IOVA_AS_VA set in .drv_flags.
2. Look for any device attached to a UIO class of driver.
3. Check whether vfio-noiommu mode is enabled.
If 2) and 3) are false and 1) is true, select RTE_IOVA_VA as the
mapping scheme. Otherwise use the default
mapping scheme (RTE_IOVA_PA).
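Step 3 comes down to reading a single character from the file named by
VFIO_NOIOMMU_MODE. A minimal sketch of such a read, with the result of
read() checked before the byte is inspected, could look like this. The
helper name read_bool_param() is hypothetical and the path is supplied
by the caller.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Returns 1 if the file starts with 'Y', 0 if it starts with any other
 * character, and -1 if it cannot be opened or is empty.
 */
static int
read_bool_param(const char *path)
{
	char c;
	ssize_t n;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;

	n = read(fd, &c, 1);
	close(fd);

	/* Only trust c when exactly one byte was read. */
	if (n != 1)
		return -1;

	return c == 'Y';
}
```

Separating "cannot tell" (-1) from "disabled" (0) lets the caller decide
how to treat a kernel that does not expose the parameter at all.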
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
lib/librte_eal/linuxapp/eal/eal_pci.c | 95 +++++++++++++++++++++++++
lib/librte_eal/linuxapp/eal/eal_vfio.c | 19 +++++
lib/librte_eal/linuxapp/eal/eal_vfio.h | 4 ++
lib/librte_eal/linuxapp/eal/rte_eal_version.map | 1 +
4 files changed, 119 insertions(+)
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c b/lib/librte_eal/linuxapp/eal/eal_pci.c
index 8951ce742..9725fd493 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -45,6 +45,7 @@
#include "eal_filesystem.h"
#include "eal_private.h"
#include "eal_pci_init.h"
+#include "eal_vfio.h"
/**
* @file
@@ -487,6 +488,100 @@ rte_pci_scan(void)
return -1;
}
+/*
+ * Is pci device bound to any kdrv
+ */
+static inline int
+pci_device_is_bound(void)
+{
+ struct rte_pci_device *dev = NULL;
+ int ret = 0;
+
+ FOREACH_DEVICE_ON_PCIBUS(dev) {
+ if (dev->kdrv == RTE_KDRV_UNKNOWN ||
+ dev->kdrv == RTE_KDRV_NONE) {
+ continue;
+ } else {
+ ret = 1;
+ break;
+ }
+ }
+ return ret;
+}
+
+/*
+ * Any one of the device bound to uio
+ */
+static inline int
+pci_device_bound_uio(void)
+{
+ struct rte_pci_device *dev = NULL;
+
+ FOREACH_DEVICE_ON_PCIBUS(dev) {
+ if (dev->kdrv == RTE_KDRV_IGB_UIO ||
+ dev->kdrv == RTE_KDRV_UIO_GENERIC) {
+ return 1;
+ }
+ }
+ return 0;
+}
+
+/*
+ * Any one of the device has iova as va
+ */
+static inline int
+pci_device_has_iova_va(void)
+{
+ struct rte_pci_device *dev = NULL;
+ struct rte_pci_driver *drv = NULL;
+
+ FOREACH_DRIVER_ON_PCIBUS(drv) {
+ if (drv && drv->drv_flags & RTE_PCI_DRV_IOVA_AS_VA) {
+ FOREACH_DEVICE_ON_PCIBUS(dev) {
+ if (dev->kdrv == RTE_KDRV_VFIO &&
+ rte_pci_match(drv, dev))
+ return 1;
+ }
+ }
+ }
+ return 0;
+}
+
+/*
+ * Get iommu class of PCI devices on the bus.
+ */
+enum rte_iova_mode
+rte_pci_get_iommu_class(void)
+{
+ bool is_bound;
+ bool is_vfio_noiommu_enabled = true;
+ bool has_iova_va;
+ bool is_bound_uio;
+
+ is_bound = pci_device_is_bound();
+ if (!is_bound)
+ return RTE_IOVA_DC;
+
+ has_iova_va = pci_device_has_iova_va();
+ is_bound_uio = pci_device_bound_uio();
+#ifdef VFIO_PRESENT
+ is_vfio_noiommu_enabled = vfio_noiommu_is_enabled() == 1 ? 1 : 0;
+#endif
+
+ if (has_iova_va && !is_bound_uio && !is_vfio_noiommu_enabled)
+ return RTE_IOVA_VA;
+
+ if (has_iova_va) {
+ RTE_LOG(WARNING, EAL, "Some devices want iova as va but pa will be used because.. ");
+ if (is_vfio_noiommu_enabled)
+ RTE_LOG(WARNING, EAL, "vfio-noiommu mode configured\n");
+ if (is_bound_uio)
+ RTE_LOG(WARNING, EAL, "few device bound to UIO\n");
+ }
+
+ return RTE_IOVA_PA;
+}
+
/* Read PCI config space. */
int rte_pci_read_config(const struct rte_pci_device *device,
void *buf, size_t len, off_t offset)
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index 946df7e31..c8a97b7e7 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -816,4 +816,23 @@ vfio_noiommu_dma_map(int __rte_unused vfio_container_fd)
return 0;
}
+int
+vfio_noiommu_is_enabled(void)
+{
+ int fd, ret, cnt __rte_unused;
+ char c;
+
+ ret = -1;
+ fd = open(VFIO_NOIOMMU_MODE, O_RDONLY);
+ if (fd < 0)
+ return -1;
+
+ cnt = read(fd, &c, 1);
+ if (c == 'Y')
+ ret = 1;
+
+ close(fd);
+ return ret;
+}
+
#endif
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.h b/lib/librte_eal/linuxapp/eal/eal_vfio.h
index 5ff63e5d7..26ea8e119 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.h
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.h
@@ -150,6 +150,8 @@ struct vfio_config {
#define VFIO_NOIOMMU_GROUP_FMT "/dev/vfio/noiommu-%u"
#define VFIO_GET_REGION_ADDR(x) ((uint64_t) x << 40ULL)
#define VFIO_GET_REGION_IDX(x) (x >> 40)
+#define VFIO_NOIOMMU_MODE \
+ "/sys/module/vfio/parameters/enable_unsafe_noiommu_mode"
/* DMA mapping function prototype.
* Takes VFIO container fd as a parameter.
@@ -210,6 +212,8 @@ int pci_vfio_is_enabled(void);
int vfio_mp_sync_setup(void);
+int vfio_noiommu_is_enabled(void);
+
#define SOCKET_REQ_CONTAINER 0x100
#define SOCKET_REQ_GROUP 0x200
#define SOCKET_CLR_GROUP 0x300
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index a15b382ff..40420ded3 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -247,5 +247,6 @@ DPDK_17.11 {
global:
rte_pci_match;
+ rte_pci_get_iommu_class;
} DPDK_17.08;
--
2.11.0
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v6 06/12] bus: get iommu class
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 00/12] Infrastructure to detect iova mapping on the bus Santosh Shukla
` (4 preceding siblings ...)
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 05/12] linuxapp/eal_pci: " Santosh Shukla
@ 2017-08-14 16:10 ` Santosh Shukla
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 07/12] eal: introduce iova mode helper api Santosh Shukla
` (6 subsequent siblings)
12 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-08-14 16:10 UTC (permalink / raw)
To: dev
Cc: olivier.matz, thomas, jerin.jacob, hemant.agrawal,
maxime.coquelin, sergio.gonzalez.monroy, bruce.richardson,
shreyansh.jain, gaetan.rivet, anatoly.burakov, stephen,
Santosh Shukla
The rte_bus_get_iommu_class() API automatically detects and selects the
appropriate iova mapping scheme for the iommu capable devices on the buses.
Algorithm for iova scheme selection across buses:
0. Iterate through bus_list.
1. Collect each bus's iova mode value and OR it into the 'mode' var.
2. Mode selection scheme is:
if mode == 0 then iova mode is _pa,
if mode == 1 then iova mode is _pa,
if mode == 2 then iova mode is _va,
if mode == 3 then iova mode is _pa.
So any mode != 2 results in the default iova mode (_pa).
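The selection scheme above can be sketched standalone as below. The enum values (DC = 0, PA = 1, VA = 2) are assumed from the mode numbers listed above, and the names iova_mode and combine_bus_modes are hypothetical, chosen so the sketch does not clash with the real EAL symbols.

```c
/* Assumed values, mirroring the mode numbers in the list above. */
enum iova_mode {
	IOVA_DC = 0,		/* don't care */
	IOVA_PA = 1 << 0,	/* iova as physical address */
	IOVA_VA = 1 << 1,	/* iova as virtual address */
};

/*
 * Hypothetical helper: OR together the mode reported by each bus and
 * fall back to PA unless the combined result is exactly VA.
 */
static enum iova_mode
combine_bus_modes(const enum iova_mode *bus_modes, int nb_buses)
{
	int mode = IOVA_DC;
	int i;

	for (i = 0; i < nb_buses; i++)
		mode |= bus_modes[i];

	/* 0 (DC), 1 (PA) and 3 (PA | VA) all end up as PA. */
	return mode == IOVA_VA ? IOVA_VA : IOVA_PA;
}
```

The OR means a single bus reporting PA is enough to force PA for the whole system, while buses reporting DC do not influence the outcome.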
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
lib/librte_eal/bsdapp/eal/rte_eal_version.map | 1 +
lib/librte_eal/common/eal_common_bus.c | 23 +++++++++++++++++++++++
lib/librte_eal/common/eal_common_pci.c | 1 +
lib/librte_eal/common/include/rte_bus.h | 25 +++++++++++++++++++++++++
lib/librte_eal/linuxapp/eal/rte_eal_version.map | 1 +
5 files changed, 51 insertions(+)
diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index 1fdcfb460..9942f47aa 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -243,5 +243,6 @@ DPDK_17.11 {
rte_pci_match;
rte_pci_get_iommu_class;
+ rte_bus_get_iommu_class;
} DPDK_17.08;
diff --git a/lib/librte_eal/common/eal_common_bus.c b/lib/librte_eal/common/eal_common_bus.c
index 08bec2d93..a30a8982e 100644
--- a/lib/librte_eal/common/eal_common_bus.c
+++ b/lib/librte_eal/common/eal_common_bus.c
@@ -222,3 +222,26 @@ rte_bus_find_by_device_name(const char *str)
c[0] = '\0';
return rte_bus_find(NULL, bus_can_parse, name);
}
+
+
+/*
+ * Get iommu class of devices on the bus.
+ */
+enum rte_iova_mode
+rte_bus_get_iommu_class(void)
+{
+ int mode = RTE_IOVA_DC;
+ struct rte_bus *bus;
+
+ TAILQ_FOREACH(bus, &rte_bus_list, next) {
+
+ if (bus->get_iommu_class)
+ mode |= bus->get_iommu_class();
+ }
+
+ if (mode != RTE_IOVA_VA) {
+ /* Use default IOVA mode */
+ mode = RTE_IOVA_PA;
+ }
+ return mode;
+}
diff --git a/lib/librte_eal/common/eal_common_pci.c b/lib/librte_eal/common/eal_common_pci.c
index 3b7d0a0ee..0f0e4b93b 100644
--- a/lib/librte_eal/common/eal_common_pci.c
+++ b/lib/librte_eal/common/eal_common_pci.c
@@ -564,6 +564,7 @@ struct rte_pci_bus rte_pci_bus = {
.plug = pci_plug,
.unplug = pci_unplug,
.parse = pci_parse,
+ .get_iommu_class = rte_pci_get_iommu_class,
},
.device_list = TAILQ_HEAD_INITIALIZER(rte_pci_bus.device_list),
.driver_list = TAILQ_HEAD_INITIALIZER(rte_pci_bus.driver_list),
diff --git a/lib/librte_eal/common/include/rte_bus.h b/lib/librte_eal/common/include/rte_bus.h
index 9e40687e5..70a291a4d 100644
--- a/lib/librte_eal/common/include/rte_bus.h
+++ b/lib/librte_eal/common/include/rte_bus.h
@@ -178,6 +178,20 @@ struct rte_bus_conf {
enum rte_bus_scan_mode scan_mode; /**< Scan policy. */
};
+
+/**
+ * Get the common iommu class of all the devices on the bus. The bus may
+ * check that those devices are attached to an iommu driver.
+ * If no devices are attached to the bus, the bus may return the don't care
+ * (_DC) value.
+ * Otherwise, the bus will return the appropriate _pa or _va iova mode.
+ *
+ * @return
+ * enum rte_iova_mode value.
+ */
+typedef enum rte_iova_mode (*rte_bus_get_iommu_class_t)(void);
+
+
/**
* A structure describing a generic bus.
*/
@@ -191,6 +205,7 @@ struct rte_bus {
rte_bus_unplug_t unplug; /**< Remove single device from driver */
rte_bus_parse_t parse; /**< Parse a device name */
struct rte_bus_conf conf; /**< Bus configuration */
+ rte_bus_get_iommu_class_t get_iommu_class; /**< Get iommu class */
};
/**
@@ -290,6 +305,16 @@ struct rte_bus *rte_bus_find_by_device(const struct rte_device *dev);
*/
struct rte_bus *rte_bus_find_by_name(const char *busname);
+
+/**
+ * Get the common iommu class of devices bound on to buses available in the
+ * system. The default mode is PA.
+ *
+ * @return
+ * enum rte_iova_mode value.
+ */
+enum rte_iova_mode rte_bus_get_iommu_class(void);
+
/**
* Helper for Bus registration.
* The constructor has higher priority than PMD constructors.
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index 40420ded3..f35031746 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -248,5 +248,6 @@ DPDK_17.11 {
rte_pci_match;
rte_pci_get_iommu_class;
+ rte_bus_get_iommu_class;
} DPDK_17.08;
--
2.11.0
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v6 07/12] eal: introduce iova mode helper api
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 00/12] Infrastructure to detect iova mapping on the bus Santosh Shukla
` (5 preceding siblings ...)
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 06/12] bus: " Santosh Shukla
@ 2017-08-14 16:10 ` Santosh Shukla
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 08/12] linuxapp/eal: auto detect iova mode Santosh Shukla
` (5 subsequent siblings)
12 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-08-14 16:10 UTC (permalink / raw)
To: dev
Cc: olivier.matz, thomas, jerin.jacob, hemant.agrawal,
maxime.coquelin, sergio.gonzalez.monroy, bruce.richardson,
shreyansh.jain, gaetan.rivet, anatoly.burakov, stephen,
Santosh Shukla
Introduce the rte_eal_iova_mode() helper API. This API is
used by non-eal libraries to detect the iova mode.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
lib/librte_eal/bsdapp/eal/eal.c | 6 ++++++
lib/librte_eal/bsdapp/eal/rte_eal_version.map | 1 +
lib/librte_eal/common/include/rte_eal.h | 12 ++++++++++++
lib/librte_eal/linuxapp/eal/eal.c | 6 ++++++
lib/librte_eal/linuxapp/eal/rte_eal_version.map | 1 +
5 files changed, 26 insertions(+)
diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 5fa598842..07e72203f 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -119,6 +119,12 @@ rte_eal_get_configuration(void)
return &rte_config;
}
+enum rte_iova_mode
+rte_eal_iova_mode(void)
+{
+ return rte_eal_get_configuration()->iova_mode;
+}
+
/* parse a sysfs (or other) file containing one integer value */
int
eal_parse_sysfs_value(const char *filename, unsigned long *val)
diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index 9942f47aa..1a63f3f05 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -244,5 +244,6 @@ DPDK_17.11 {
rte_pci_match;
rte_pci_get_iommu_class;
rte_bus_get_iommu_class;
+ rte_eal_iova_mode;
} DPDK_17.08;
diff --git a/lib/librte_eal/common/include/rte_eal.h b/lib/librte_eal/common/include/rte_eal.h
index 0e7363d77..932dc1a96 100644
--- a/lib/librte_eal/common/include/rte_eal.h
+++ b/lib/librte_eal/common/include/rte_eal.h
@@ -45,6 +45,7 @@
#include <rte_per_lcore.h>
#include <rte_config.h>
+#include <rte_bus.h>
#ifdef __cplusplus
extern "C" {
@@ -87,6 +88,9 @@ struct rte_config {
/** Primary or secondary configuration */
enum rte_proc_type_t process_type;
+ /** PA or VA mapping mode */
+ enum rte_iova_mode iova_mode;
+
/**
* Pointer to memory configuration, which may be shared across multiple
* DPDK instances
@@ -287,6 +291,14 @@ static inline int rte_gettid(void)
return RTE_PER_LCORE(_thread_id);
}
+/**
+ * Get the iova mode
+ *
+ * @return
+ * enum rte_iova_mode value.
+ */
+enum rte_iova_mode rte_eal_iova_mode(void);
+
#define RTE_INIT(func) \
static void __attribute__((constructor, used)) func(void)
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 48f12f44c..febbafdb3 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -128,6 +128,12 @@ rte_eal_get_configuration(void)
return &rte_config;
}
+enum rte_iova_mode
+rte_eal_iova_mode(void)
+{
+ return rte_eal_get_configuration()->iova_mode;
+}
+
/* parse a sysfs (or other) file containing one integer value */
int
eal_parse_sysfs_value(const char *filename, unsigned long *val)
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index f35031746..c99f1ed44 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -249,5 +249,6 @@ DPDK_17.11 {
rte_pci_match;
rte_pci_get_iommu_class;
rte_bus_get_iommu_class;
+ rte_eal_iova_mode;
} DPDK_17.08;
--
2.11.0
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v6 08/12] linuxapp/eal: auto detect iova mode
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 00/12] Infrastructure to detect iova mapping on the bus Santosh Shukla
` (6 preceding siblings ...)
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 07/12] eal: introduce iova mode helper api Santosh Shukla
@ 2017-08-14 16:10 ` Santosh Shukla
2017-08-16 17:38 ` Aaron Conole
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 09/12] bsdapp/eal: auto detect iova mapping mode Santosh Shukla
` (4 subsequent siblings)
12 siblings, 1 reply; 248+ messages in thread
From: Santosh Shukla @ 2017-08-14 16:10 UTC (permalink / raw)
To: dev
Cc: olivier.matz, thomas, jerin.jacob, hemant.agrawal,
maxime.coquelin, sergio.gonzalez.monroy, bruce.richardson,
shreyansh.jain, gaetan.rivet, anatoly.burakov, stephen,
Santosh Shukla
- Move the late bus scanning up, to just after eal parsing.
- Auto detect the iova mapping mode, based on the result of
rte_bus_get_iommu_class.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
lib/librte_eal/linuxapp/eal/eal.c | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index febbafdb3..5382f6c00 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -798,6 +798,15 @@ rte_eal_init(int argc, char **argv)
return -1;
}
+ if (rte_bus_scan()) {
+ rte_eal_init_alert("Cannot scan the buses for devices\n");
+ rte_errno = ENODEV;
+ return -1;
+ }
+
+ /* autodetect the iova mapping mode (default is iova_pa) */
+ rte_eal_get_configuration()->iova_mode = rte_bus_get_iommu_class();
+
if (internal_config.no_hugetlbfs == 0 &&
internal_config.process_type != RTE_PROC_SECONDARY &&
internal_config.xen_dom0_support == 0 &&
@@ -900,12 +909,6 @@ rte_eal_init(int argc, char **argv)
return -1;
}
- if (rte_bus_scan()) {
- rte_eal_init_alert("Cannot scan the buses for devices\n");
- rte_errno = ENODEV;
- return -1;
- }
-
RTE_LCORE_FOREACH_SLAVE(i) {
/*
--
2.11.0
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v6 08/12] linuxapp/eal: auto detect iova mode
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 08/12] linuxapp/eal: auto detect iova mode Santosh Shukla
@ 2017-08-16 17:38 ` Aaron Conole
2017-08-17 14:43 ` santosh
0 siblings, 1 reply; 248+ messages in thread
From: Aaron Conole @ 2017-08-16 17:38 UTC (permalink / raw)
To: Santosh Shukla
Cc: dev, olivier.matz, thomas, jerin.jacob, hemant.agrawal,
maxime.coquelin, sergio.gonzalez.monroy, bruce.richardson,
shreyansh.jain, gaetan.rivet, anatoly.burakov, stephen
Santosh Shukla <santosh.shukla@caviumnetworks.com> writes:
> - Moving late bus scanning to up..just after eal_parsing.
> - Auto detect iova mapping mode, based on the result of
> rte_bus_scan_iommu_class.
>
> Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> ---
> lib/librte_eal/linuxapp/eal/eal.c | 15 +++++++++------
> 1 file changed, 9 insertions(+), 6 deletions(-)
>
> diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
> index febbafdb3..5382f6c00 100644
> --- a/lib/librte_eal/linuxapp/eal/eal.c
> +++ b/lib/librte_eal/linuxapp/eal/eal.c
> @@ -798,6 +798,15 @@ rte_eal_init(int argc, char **argv)
> return -1;
> }
>
> + if (rte_bus_scan()) {
> + rte_eal_init_alert("Cannot scan the buses for devices\n");
> + rte_errno = ENODEV;
Since this now happens before hugetlbs are allocated, is it possible to
retry? If so, then I would say to clear the run_once variable.
> + return -1;
> + }
> +
> + /* autodetect the iova mapping mode (default is iova_pa) */
> + rte_eal_get_configuration()->iova_mode = rte_bus_get_iommu_class();
> +
> if (internal_config.no_hugetlbfs == 0 &&
> internal_config.process_type != RTE_PROC_SECONDARY &&
> internal_config.xen_dom0_support == 0 &&
> @@ -900,12 +909,6 @@ rte_eal_init(int argc, char **argv)
> return -1;
> }
>
> - if (rte_bus_scan()) {
> - rte_eal_init_alert("Cannot scan the buses for devices\n");
> - rte_errno = ENODEV;
> - return -1;
> - }
> -
> RTE_LCORE_FOREACH_SLAVE(i) {
>
> /*
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v6 08/12] linuxapp/eal: auto detect iova mode
2017-08-16 17:38 ` Aaron Conole
@ 2017-08-17 14:43 ` santosh
0 siblings, 0 replies; 248+ messages in thread
From: santosh @ 2017-08-17 14:43 UTC (permalink / raw)
To: Aaron Conole
Cc: dev, olivier.matz, thomas, jerin.jacob, hemant.agrawal,
maxime.coquelin, sergio.gonzalez.monroy, bruce.richardson,
shreyansh.jain, gaetan.rivet, anatoly.burakov, stephen
On Wednesday 16 August 2017 11:08 PM, Aaron Conole wrote:
> Santosh Shukla <santosh.shukla@caviumnetworks.com> writes:
>
>> - Moving late bus scanning to up..just after eal_parsing.
>> - Auto detect iova mapping mode, based on the result of
>> rte_bus_scan_iommu_class.
>>
>> Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
>> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
>> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>> ---
>> lib/librte_eal/linuxapp/eal/eal.c | 15 +++++++++------
>> 1 file changed, 9 insertions(+), 6 deletions(-)
>>
>> diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
>> index febbafdb3..5382f6c00 100644
>> --- a/lib/librte_eal/linuxapp/eal/eal.c
>> +++ b/lib/librte_eal/linuxapp/eal/eal.c
>> @@ -798,6 +798,15 @@ rte_eal_init(int argc, char **argv)
>> return -1;
>> }
>>
>> + if (rte_bus_scan()) {
>> + rte_eal_init_alert("Cannot scan the buses for devices\n");
>> + rte_errno = ENODEV;
> Since this now happens before hugetlbs are allocated, is it possible to
> retry? If so, then I would say to clear the run_once variable.
Yes, Change queued for v7. Thanks.
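For readers following the run_once point: a minimal sketch of the retry idea, using C11 atomic_flag in place of the rte_atomic32 calls used in eal.c. The function name init_sketch and the scan_ok parameter (a stand-in for the bus scan result) are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_flag run_once = ATOMIC_FLAG_INIT;

/*
 * Hypothetical init routine: 'scan_ok' stands in for the result of the
 * bus scan. Returns 0 on success, -1 on failure or repeated init.
 */
static int
init_sketch(bool scan_ok)
{
	/* A second caller sees the flag already set and bails out. */
	if (atomic_flag_test_and_set(&run_once))
		return -1;

	if (!scan_ok) {
		/* Nothing irreversible has happened yet: allow a retry. */
		atomic_flag_clear(&run_once);
		return -1;
	}

	return 0;
}
```

A failed first call followed by a second call with a working scan succeeds, which is the retry behaviour asked about above.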
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v6 09/12] bsdapp/eal: auto detect iova mapping mode
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 00/12] Infrastructure to detect iova mapping on the bus Santosh Shukla
` (7 preceding siblings ...)
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 08/12] linuxapp/eal: auto detect iova mode Santosh Shukla
@ 2017-08-14 16:10 ` Santosh Shukla
2017-08-17 12:41 ` Aaron Conole
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 10/12] linuxapp/eal_vfio: honor iova mode before mapping Santosh Shukla
` (3 subsequent siblings)
12 siblings, 1 reply; 248+ messages in thread
From: Santosh Shukla @ 2017-08-14 16:10 UTC (permalink / raw)
To: dev
Cc: olivier.matz, thomas, jerin.jacob, hemant.agrawal,
maxime.coquelin, sergio.gonzalez.monroy, bruce.richardson,
shreyansh.jain, gaetan.rivet, anatoly.burakov, stephen,
Santosh Shukla
- Move the late bus scanning up, to just after eal parsing.
- The mapping mode is the default (iova_pa) for bsdapp, which supports
only one pass-through mode (RTE_KDRV_NIC_UIO).
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
lib/librte_eal/bsdapp/eal/eal.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 07e72203f..53ad87b95 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -540,6 +540,14 @@ rte_eal_init(int argc, char **argv)
rte_atomic32_clear(&run_once);
return -1;
}
+ if (rte_bus_scan()) {
+ rte_eal_init_alert("Cannot scan the buses for devices\n");
+ rte_errno = ENODEV;
+ return -1;
+ }
+
+ /* autodetect the iova mapping mode (default is iova_pa) */
+ rte_eal_get_configuration()->iova_mode = rte_bus_get_iommu_class();
if (internal_config.no_hugetlbfs == 0 &&
internal_config.process_type != RTE_PROC_SECONDARY &&
@@ -625,12 +633,6 @@ rte_eal_init(int argc, char **argv)
return -1;
}
- if (rte_bus_scan()) {
- rte_eal_init_alert("Cannot scan the buses for devices\n");
- rte_errno = ENODEV;
- return -1;
- }
-
RTE_LCORE_FOREACH_SLAVE(i) {
/*
--
2.11.0
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v6 09/12] bsdapp/eal: auto detect iova mapping mode
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 09/12] bsdapp/eal: auto detect iova mapping mode Santosh Shukla
@ 2017-08-17 12:41 ` Aaron Conole
0 siblings, 0 replies; 248+ messages in thread
From: Aaron Conole @ 2017-08-17 12:41 UTC (permalink / raw)
To: Santosh Shukla
Cc: dev, olivier.matz, thomas, jerin.jacob, hemant.agrawal,
maxime.coquelin, sergio.gonzalez.monroy, bruce.richardson,
shreyansh.jain, gaetan.rivet, anatoly.burakov, stephen
Santosh Shukla <santosh.shukla@caviumnetworks.com> writes:
> - Moving late bus scanning to up..just after eal_parsing.
> - Mapping mode would be default for bsdapp. It supports
> only one pass through mode (RTE_KDRV_NIC_UIO)
>
> Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> ---
Same comments as 8/12; also I think 8/12 and 9/12 can be folded
together.
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v6 10/12] linuxapp/eal_vfio: honor iova mode before mapping
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 00/12] Infrastructure to detect iova mapping on the bus Santosh Shukla
` (8 preceding siblings ...)
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 09/12] bsdapp/eal: auto detect iova mapping mode Santosh Shukla
@ 2017-08-14 16:10 ` Santosh Shukla
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 11/12] linuxapp/eal_memory: honor iova mode in virt2phy Santosh Shukla
` (2 subsequent siblings)
12 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-08-14 16:10 UTC (permalink / raw)
To: dev
Cc: olivier.matz, thomas, jerin.jacob, hemant.agrawal,
maxime.coquelin, sergio.gonzalez.monroy, bruce.richardson,
shreyansh.jain, gaetan.rivet, anatoly.burakov, stephen,
Santosh Shukla
Check iova mode and accordingly map iova to pa or va.
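As a rough illustration of the choice made per memory segment, here is a standalone sketch. The struct dma_map_sketch is a hypothetical stand-in for the vaddr/iova/size fields of the VFIO map request, and the enum values are assumed (DC = 0, PA = 1, VA = 2); no ioctl is issued.

```c
#include <stdint.h>

/* Assumed values for the iova modes. */
enum iova_mode { IOVA_DC = 0, IOVA_PA = 1 << 0, IOVA_VA = 1 << 1 };

/* Hypothetical stand-in for the fields of a DMA map request. */
struct dma_map_sketch {
	uint64_t vaddr;	/* process virtual address of the segment */
	uint64_t iova;	/* address the device will use */
	uint64_t size;	/* segment length in bytes */
};

/* Fill a map request, picking the iova according to the mode. */
static struct dma_map_sketch
make_dma_map(enum iova_mode mode, uint64_t vaddr, uint64_t phys_addr,
	     uint64_t len)
{
	struct dma_map_sketch m;

	m.vaddr = vaddr;
	m.size = len;
	/* In VA mode the device sees the same address as the process. */
	m.iova = (mode == IOVA_VA) ? vaddr : phys_addr;
	return m;
}
```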
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
lib/librte_eal/linuxapp/eal/eal_vfio.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index c8a97b7e7..b32cd09a2 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -706,7 +706,10 @@ vfio_type1_dma_map(int vfio_container_fd)
dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
dma_map.vaddr = ms[i].addr_64;
dma_map.size = ms[i].len;
- dma_map.iova = ms[i].phys_addr;
+ if (rte_eal_iova_mode() == RTE_IOVA_VA)
+ dma_map.iova = dma_map.vaddr;
+ else
+ dma_map.iova = ms[i].phys_addr;
dma_map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
ret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
@@ -792,7 +795,10 @@ vfio_spapr_dma_map(int vfio_container_fd)
dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
dma_map.vaddr = ms[i].addr_64;
dma_map.size = ms[i].len;
- dma_map.iova = ms[i].phys_addr;
+ if (rte_eal_iova_mode() == RTE_IOVA_VA)
+ dma_map.iova = dma_map.vaddr;
+ else
+ dma_map.iova = ms[i].phys_addr;
dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
VFIO_DMA_MAP_FLAG_WRITE;
--
2.11.0
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v6 11/12] linuxapp/eal_memory: honor iova mode in virt2phy
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 00/12] Infrastructure to detect iova mapping on the bus Santosh Shukla
` (9 preceding siblings ...)
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 10/12] linuxapp/eal_vfio: honor iova mode before mapping Santosh Shukla
@ 2017-08-14 16:10 ` Santosh Shukla
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 12/12] eal/rte_malloc: " Santosh Shukla
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 0/9] Infrastructure to detect iova mapping on the bus Santosh Shukla
12 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-08-14 16:10 UTC (permalink / raw)
To: dev
Cc: olivier.matz, thomas, jerin.jacob, hemant.agrawal,
maxime.coquelin, sergio.gonzalez.monroy, bruce.richardson,
shreyansh.jain, gaetan.rivet, anatoly.burakov, stephen,
Santosh Shukla
Check iova mode and accordingly return phy addr.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
lib/librte_eal/linuxapp/eal/eal_memory.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 52791282f..2d9d7c2dc 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -139,6 +139,9 @@ rte_mem_virt2phy(const void *virtaddr)
int page_size;
off_t offset;
+ if (rte_eal_iova_mode() == RTE_IOVA_VA)
+ return (uintptr_t)virtaddr;
+
/* when using dom0, /proc/self/pagemap always returns 0, check in
* dpdk memory by browsing the memsegs */
if (rte_xen_dom0_supported()) {
--
2.11.0
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v6 12/12] eal/rte_malloc: honor iova mode in virt2phy
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 00/12] Infrastructure to detect iova mapping on the bus Santosh Shukla
` (10 preceding siblings ...)
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 11/12] linuxapp/eal_memory: honor iova mode in virt2phy Santosh Shukla
@ 2017-08-14 16:10 ` Santosh Shukla
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 0/9] Infrastructure to detect iova mapping on the bus Santosh Shukla
12 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-08-14 16:10 UTC (permalink / raw)
To: dev
Cc: olivier.matz, thomas, jerin.jacob, hemant.agrawal,
maxime.coquelin, sergio.gonzalez.monroy, bruce.richardson,
shreyansh.jain, gaetan.rivet, anatoly.burakov, stephen,
Santosh Shukla
Check iova mode and accordingly return phy addr.
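A small worked sketch of the address computation, under assumptions: memseg_sketch is a hypothetical stand-in for the memseg fields used in the diff below (addr, phys_addr), virt2iova is a made-up name, and the enum values are assumed (DC = 0, PA = 1, VA = 2).

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed values for the iova modes. */
enum iova_mode { IOVA_DC = 0, IOVA_PA = 1 << 0, IOVA_VA = 1 << 1 };

/* Hypothetical stand-in for the memseg fields used here. */
struct memseg_sketch {
	void *addr;		/* start virtual address of the segment */
	uint64_t phys_addr;	/* start physical address of the segment */
	size_t len;		/* segment length in bytes */
};

/*
 * Translate a virtual address inside 'ms' to the address a device uses.
 * In VA mode that is the virtual address itself; otherwise it is the
 * segment's physical base plus the offset of 'virt' within the segment.
 */
static uint64_t
virt2iova(enum iova_mode mode, const struct memseg_sketch *ms,
	  const void *virt)
{
	if (mode == IOVA_VA)
		return (uintptr_t)virt;

	return ms->phys_addr + ((uintptr_t)virt - (uintptr_t)ms->addr);
}
```

For example, with a segment whose physical base is 0x100000, an address 8 bytes into the segment translates to 0x100008 in PA mode and to the unchanged pointer value in VA mode.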
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
lib/librte_eal/common/rte_malloc.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index 5c0627bf4..d65c05a4d 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -251,10 +251,17 @@ rte_malloc_set_limit(__rte_unused const char *type,
phys_addr_t
rte_malloc_virt2phy(const void *addr)
{
+ phys_addr_t paddr;
const struct malloc_elem *elem = malloc_elem_from_data(addr);
if (elem == NULL)
return RTE_BAD_PHYS_ADDR;
if (elem->ms->phys_addr == RTE_BAD_PHYS_ADDR)
return RTE_BAD_PHYS_ADDR;
- return elem->ms->phys_addr + ((uintptr_t)addr - (uintptr_t)elem->ms->addr);
+
+ if (rte_eal_iova_mode() == RTE_IOVA_VA)
+ paddr = (uintptr_t)addr;
+ else
+ paddr = elem->ms->phys_addr +
+ ((uintptr_t)addr - (uintptr_t)elem->ms->addr);
+ return paddr;
}
--
2.11.0
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v7 0/9] Infrastructure to detect iova mapping on the bus
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 00/12] Infrastructure to detect iova mapping on the bus Santosh Shukla
` (11 preceding siblings ...)
2017-08-14 16:10 ` [dpdk-dev] [PATCH v6 12/12] eal/rte_malloc: " Santosh Shukla
@ 2017-08-31 3:26 ` Santosh Shukla
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 1/9] eal/pci: export match function Santosh Shukla
` (10 more replies)
12 siblings, 11 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-08-31 3:26 UTC (permalink / raw)
To: dev
Cc: thomas, jerin.jacob, hemant.agrawal, olivier.matz,
maxime.coquelin, sergio.gonzalez.monroy, bruce.richardson,
shreyansh.jain, gaetan.rivet, anatoly.burakov, stephen, aconole,
Santosh Shukla
v7:
No major changes. Minor changes in detail:
- Patch squashing (Aaron's suggestion).
- Added run_once for device_parse() and bus_scan() in eal init
(Aaron's suggestion).
- Moved rte_eal_device_parse() up in the eal initialization order.
- Patches rebased on top of version: 17.11-rc0
For v6 info, refer to [11].
v6:
Sending v5 series rebased on top of version: 17.11-rc0.
v5:
Introducing the RTE_PCI_DRV_IOVA_AS_VA flag for autodetection of iova as va
mapping. If a PCI driver requires the IOVA as VA scheme, the driver can set
the flag in its PCI driver registration.
Algorithm to select IOVA as VA for the PCI bus case:
0. If no device is bound, return the RTE_IOVA_DC mapping mode,
else go to 1).
1. Look for a device that is attached to the vfio kdrv and whose driver
has RTE_PCI_DRV_IOVA_AS_VA set in .drv_flags.
2. Look for any device attached to a UIO class of driver.
3. Check whether vfio-noiommu mode is enabled.
If 2) and 3) are false and 1) is true, then select
RTE_IOVA_VA as the mapping scheme. Otherwise use the default
mapping scheme (RTE_IOVA_PA).
That way, the bus can truly autodetect the iova mapping mode for
a device or a set of devices.
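The PCI-side decision above reduces to a function of four booleans. A minimal sketch follows, with hypothetical names (pci_pick_iova_mode and its parameters) and assumed enum values (DC = 0, PA = 1, VA = 2); the real code derives the booleans by walking the PCI device and driver lists.

```c
#include <stdbool.h>

/* Assumed values for the iova modes. */
enum iova_mode { IOVA_DC = 0, IOVA_PA = 1 << 0, IOVA_VA = 1 << 1 };

/*
 * Hypothetical helper mirroring steps 0-3 above:
 *  any_bound         - step 0: at least one device is bound to a kdrv
 *  vfio_dev_wants_va - step 1: a vfio-bound device's driver sets IOVA_AS_VA
 *  any_bound_uio     - step 2: some device is bound to a UIO driver
 *  noiommu_enabled   - step 3: vfio-noiommu mode is enabled
 */
static enum iova_mode
pci_pick_iova_mode(bool any_bound, bool vfio_dev_wants_va,
		   bool any_bound_uio, bool noiommu_enabled)
{
	if (!any_bound)
		return IOVA_DC;

	if (vfio_dev_wants_va && !any_bound_uio && !noiommu_enabled)
		return IOVA_VA;

	return IOVA_PA;
}
```

Note that a single UIO-bound device, or noiommu mode, is enough to fall back to PA even when a driver asks for VA.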
v6 --> v7:
- Patches squashed per v6.
- Added run_once in eal per v6.
- Moved rte_eal_device_parse() up in eal init order.
v5 --> v6:
- Added api info in eal's version.map (release DPDK_v17.11).
v4 --> v5:
- Change DPDK_17.08 to DPDK_17.11 in _version.map.
- Reworded bus api description (suggested by Hemant).
- Added reviewed-by from Maxime in v5.
- Added acked-by from Hemant for pci and bus patches.
v3 --> v4:
- Re-introduced RTE_IOVA_DC mode (Suggested by Hemant [5]).
- Renamed flag to RTE_PCI_DRV_IOVA_AS_VA (Suggested by Maxime).
- Reworded WARNING message (suggested by Maxime [7]).
- Created a separate patch for rte_pci_get_iommu_class (suggested by Maxime[]).
- Added VFIO_PRESENT ifdef build fix.
v2 --> v3:
- Removed rte_mempool_virt2phy (suggested by Olivier [4])
v1 --> v2:
- Removed the override eal option (--iova-mode=<>), because we have the means
to truly autodetect the iova mode.
- Introduced RTE_PCI_DRV_NEED_IOVA_VA drv_flag (Suggested by Maxime [3]).
- Using NEED_IOVA_VA drv_flag in autodetection logic.
- Removed the Linux version check macro in vfio code, as per Maxime's feedback.
- Moved rte_pci_match API from local to global.
Patch Summary:
1) 1st: declare rte_pci_match api in pci header. Required for autodetection in
follow up patches.
2) 2nd - 3rd - 4th : autodetection mapping infrastructure for Linux/bsdapp.
3) 5th: iova mode helper API.
4) 6th: Infra to detect iova mode.
5) 7th: make vfio mapping iova aware.
6) 8th - 9th : Check for IOVA_VA mode in below APIs
- rte_mem_virt2phy
- rte_malloc_virt2phy
Test History:
- Tested for x86/XL710 40G NIC card for both modes (iova_va/pa).
- Tested for arm64/thunderx vNIC Integrated NIC for both modes
- Tested for arm64/Octeontx integrated NICs for iova_va mode only
(it supports only one mode).
- Ran standalone tests like mempool_autotest, mbuf_autotest.
- Verified for Doxygen.
Work History:
For v1, Refer [1].
For v2, Refer [2].
For v3, Refer [9].
For v4, refer [10].
for v6, refer [11].
Checkpatch result:
* Debug message - WARNING: line over 80 characters
Thanks,
[1] https://www.mail-archive.com/dev@dpdk.org/msg67438.html
[2] https://www.mail-archive.com/dev@dpdk.org/msg70674.html
[3] https://www.mail-archive.com/dev@dpdk.org/msg70279.html
[4] https://www.mail-archive.com/dev@dpdk.org/msg70692.html
[5] http://dpdk.org/ml/archives/dev/2017-July/071282.html
[6] http://dpdk.org/ml/archives/dev/2017-July/070951.html
[7] http://dpdk.org/ml/archives/dev/2017-July/070941.html
[8] http://dpdk.org/ml/archives/dev/2017-July/070952.html
[9] http://dpdk.org/ml/archives/dev/2017-July/070918.html
[10] http://dpdk.org/ml/archives/dev/2017-July/071754.html
[11] http://dpdk.org/ml/archives/dev/2017-August/072871.html
Santosh Shukla (9):
eal/pci: export match function
eal/pci: get iommu class
linuxapp/eal_pci: get iommu class
bus: get iommu class
eal: introduce iova mode helper api
eal: auto detect iova mode
linuxapp/eal_vfio: honor iova mode before mapping
linuxapp/eal_memory: honor iova mode in virt2phy
eal/rte_malloc: honor iova mode in virt2phy
lib/librte_eal/bsdapp/eal/eal.c | 33 ++++++---
lib/librte_eal/bsdapp/eal/eal_pci.c | 10 +++
lib/librte_eal/bsdapp/eal/rte_eal_version.map | 10 +++
lib/librte_eal/common/eal_common_bus.c | 23 ++++++
lib/librte_eal/common/eal_common_pci.c | 11 +--
lib/librte_eal/common/include/rte_bus.h | 35 +++++++++
lib/librte_eal/common/include/rte_eal.h | 12 ++++
lib/librte_eal/common/include/rte_pci.h | 28 ++++++++
lib/librte_eal/common/rte_malloc.c | 9 ++-
lib/librte_eal/linuxapp/eal/eal.c | 33 ++++++---
lib/librte_eal/linuxapp/eal/eal_memory.c | 3 +
lib/librte_eal/linuxapp/eal/eal_pci.c | 95 +++++++++++++++++++++++++
lib/librte_eal/linuxapp/eal/eal_vfio.c | 29 +++++++-
lib/librte_eal/linuxapp/eal/eal_vfio.h | 4 ++
lib/librte_eal/linuxapp/eal/rte_eal_version.map | 10 +++
15 files changed, 311 insertions(+), 34 deletions(-)
--
2.13.0
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v7 1/9] eal/pci: export match function
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 0/9] Infrastructure to detect iova mapping on the bus Santosh Shukla
@ 2017-08-31 3:26 ` Santosh Shukla
2017-09-04 14:49 ` Burakov, Anatoly
2017-09-06 15:39 ` Ferruh Yigit
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 2/9] eal/pci: get iommu class Santosh Shukla
` (9 subsequent siblings)
10 siblings, 2 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-08-31 3:26 UTC (permalink / raw)
To: dev
Cc: thomas, jerin.jacob, hemant.agrawal, olivier.matz,
maxime.coquelin, sergio.gonzalez.monroy, bruce.richardson,
shreyansh.jain, gaetan.rivet, anatoly.burakov, stephen, aconole,
Santosh Shukla
Export the rte_pci_match() function, as it is needed in the follow-up patch.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
lib/librte_eal/bsdapp/eal/rte_eal_version.map | 7 +++++++
lib/librte_eal/common/eal_common_pci.c | 10 +---------
lib/librte_eal/common/include/rte_pci.h | 15 +++++++++++++++
lib/librte_eal/linuxapp/eal/rte_eal_version.map | 7 +++++++
4 files changed, 30 insertions(+), 9 deletions(-)
diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index aac6fd776..c819e3084 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -237,3 +237,10 @@ EXPERIMENTAL {
rte_service_unregister;
} DPDK_17.08;
+
+DPDK_17.11 {
+ global:
+
+ rte_pci_match;
+
+} DPDK_17.08;
diff --git a/lib/librte_eal/common/eal_common_pci.c b/lib/librte_eal/common/eal_common_pci.c
index 52fd38cdd..3b7d0a0ee 100644
--- a/lib/librte_eal/common/eal_common_pci.c
+++ b/lib/librte_eal/common/eal_common_pci.c
@@ -150,16 +150,8 @@ pci_unmap_resource(void *requested_addr, size_t size)
/*
* Match the PCI Driver and Device using the ID Table
- *
- * @param pci_drv
- * PCI driver from which ID table would be extracted
- * @param pci_dev
- * PCI device to match against the driver
- * @return
- * 1 for successful match
- * 0 for unsuccessful match
*/
-static int
+int
rte_pci_match(const struct rte_pci_driver *pci_drv,
const struct rte_pci_device *pci_dev)
{
diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index 8b123391c..eab84c7a4 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -366,6 +366,21 @@ int rte_pci_scan(void);
int
rte_pci_probe(void);
+/*
+ * Match the PCI Driver and Device using the ID Table
+ *
+ * @param pci_drv
+ * PCI driver from which ID table would be extracted
+ * @param pci_dev
+ * PCI device to match against the driver
+ * @return
+ * 1 for successful match
+ * 0 for unsuccessful match
+ */
+int
+rte_pci_match(const struct rte_pci_driver *pci_drv,
+ const struct rte_pci_device *pci_dev);
+
/**
* Map the PCI device resources in user space virtual memory address
*
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index 3a8f15406..a15b382ff 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -242,3 +242,10 @@ EXPERIMENTAL {
rte_service_unregister;
} DPDK_17.08;
+
+DPDK_17.11 {
+ global:
+
+ rte_pci_match;
+
+} DPDK_17.08;
--
2.13.0
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v7 1/9] eal/pci: export match function
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 1/9] eal/pci: export match function Santosh Shukla
@ 2017-09-04 14:49 ` Burakov, Anatoly
2017-09-06 15:39 ` Ferruh Yigit
1 sibling, 0 replies; 248+ messages in thread
From: Burakov, Anatoly @ 2017-09-04 14:49 UTC (permalink / raw)
To: Santosh Shukla, dev
Cc: thomas, jerin.jacob, hemant.agrawal, olivier.matz,
maxime.coquelin, Gonzalez Monroy, Sergio, Richardson, Bruce,
shreyansh.jain, gaetan.rivet, stephen, aconole
> From: Santosh Shukla [mailto:santosh.shukla@caviumnetworks.com]
> Sent: Thursday, August 31, 2017 4:26 AM
> To: dev@dpdk.org
> Cc: thomas@monjalon.net; jerin.jacob@caviumnetworks.com;
> hemant.agrawal@nxp.com; olivier.matz@6wind.com;
> maxime.coquelin@redhat.com; Gonzalez Monroy, Sergio
> <sergio.gonzalez.monroy@intel.com>; Richardson, Bruce
> <bruce.richardson@intel.com>; shreyansh.jain@nxp.com;
> gaetan.rivet@6wind.com; Burakov, Anatoly <anatoly.burakov@intel.com>;
> stephen@networkplumber.org; aconole@redhat.com; Santosh Shukla
> <santosh.shukla@caviumnetworks.com>
> Subject: [PATCH v7 1/9] eal/pci: export match function
>
> Export rte_pci_match() function as it needed in the followup patch.
>
> Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> ---
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v7 1/9] eal/pci: export match function
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 1/9] eal/pci: export match function Santosh Shukla
2017-09-04 14:49 ` Burakov, Anatoly
@ 2017-09-06 15:39 ` Ferruh Yigit
2017-09-18 10:07 ` santosh
1 sibling, 1 reply; 248+ messages in thread
From: Ferruh Yigit @ 2017-09-06 15:39 UTC (permalink / raw)
To: Santosh Shukla, dev
Cc: thomas, jerin.jacob, hemant.agrawal, olivier.matz,
maxime.coquelin, sergio.gonzalez.monroy, bruce.richardson,
shreyansh.jain, gaetan.rivet, anatoly.burakov, stephen, aconole
On 8/31/2017 4:26 AM, Santosh Shukla wrote:
> Export rte_pci_match() function as it needed in the followup patch.
>
> Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> ---
> lib/librte_eal/bsdapp/eal/rte_eal_version.map | 7 +++++++
> lib/librte_eal/common/eal_common_pci.c | 10 +---------
> lib/librte_eal/common/include/rte_pci.h | 15 +++++++++++++++
> lib/librte_eal/linuxapp/eal/rte_eal_version.map | 7 +++++++
> 4 files changed, 30 insertions(+), 9 deletions(-)
>
> diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
> index aac6fd776..c819e3084 100644
> --- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
> +++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
> @@ -237,3 +237,10 @@ EXPERIMENTAL {
> rte_service_unregister;
>
> } DPDK_17.08;
> +
> +DPDK_17.11 {
> + global:
> +
> + rte_pci_match;
> +
> +} DPDK_17.08;
Is updating .map file required? As far as I can see rte_pci_match()
calls are within the same library, and no need to expose the API out of
library.
<...>
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v7 1/9] eal/pci: export match function
2017-09-06 15:39 ` Ferruh Yigit
@ 2017-09-18 10:07 ` santosh
0 siblings, 0 replies; 248+ messages in thread
From: santosh @ 2017-09-18 10:07 UTC (permalink / raw)
To: Ferruh Yigit, dev
Cc: thomas, jerin.jacob, hemant.agrawal, olivier.matz,
maxime.coquelin, sergio.gonzalez.monroy, bruce.richardson,
shreyansh.jain, gaetan.rivet, anatoly.burakov, stephen, aconole
Hi Ferruh,
On Wednesday 06 September 2017 09:09 PM, Ferruh Yigit wrote:
> On 8/31/2017 4:26 AM, Santosh Shukla wrote:
>> Export rte_pci_match() function as it needed in the followup patch.
>>
>> Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
>> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
>> Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>> ---
>> lib/librte_eal/bsdapp/eal/rte_eal_version.map | 7 +++++++
>> lib/librte_eal/common/eal_common_pci.c | 10 +---------
>> lib/librte_eal/common/include/rte_pci.h | 15 +++++++++++++++
>> lib/librte_eal/linuxapp/eal/rte_eal_version.map | 7 +++++++
>> 4 files changed, 30 insertions(+), 9 deletions(-)
>>
>> diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
>> index aac6fd776..c819e3084 100644
>> --- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
>> +++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
>> @@ -237,3 +237,10 @@ EXPERIMENTAL {
>> rte_service_unregister;
>>
>> } DPDK_17.08;
>> +
>> +DPDK_17.11 {
>> + global:
>> +
>> + rte_pci_match;
>> +
>> +} DPDK_17.08;
> Is updating .map file required? As far as I can see rte_pci_match()
> calls are within the same library, and no need to expose the API out of
> library.
>
> <...>
>
It's used in file eal/eal_pci.c in the following patch.
Thanks.
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v7 2/9] eal/pci: get iommu class
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 0/9] Infrastructure to detect iova mapping on the bus Santosh Shukla
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 1/9] eal/pci: export match function Santosh Shukla
@ 2017-08-31 3:26 ` Santosh Shukla
2017-09-04 14:53 ` Burakov, Anatoly
2017-09-04 15:30 ` Burakov, Anatoly
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 3/9] linuxapp/eal_pci: " Santosh Shukla
` (8 subsequent siblings)
10 siblings, 2 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-08-31 3:26 UTC (permalink / raw)
To: dev
Cc: thomas, jerin.jacob, hemant.agrawal, olivier.matz,
maxime.coquelin, sergio.gonzalez.monroy, bruce.richardson,
shreyansh.jain, gaetan.rivet, anatoly.burakov, stephen, aconole,
Santosh Shukla
Introduce the rte_pci_get_iommu_class API, which gets the iommu class
of the PCI devices on the bus and returns the preferred iova mapping
mode for the PCI bus.
The patch also adds an rte_pci_get_iommu_class definition for bsdapp;
in the bsdapp case the API returns the default iova mode.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
v6 --> v7:
- squashed v6 series patch [02/12] & [03/12] (Aaron comment).
lib/librte_eal/bsdapp/eal/eal_pci.c | 10 ++++++++++
lib/librte_eal/bsdapp/eal/rte_eal_version.map | 1 +
lib/librte_eal/common/include/rte_bus.h | 10 ++++++++++
lib/librte_eal/common/include/rte_pci.h | 11 +++++++++++
4 files changed, 32 insertions(+)
diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c b/lib/librte_eal/bsdapp/eal/eal_pci.c
index 04eacdcc7..e2c252320 100644
--- a/lib/librte_eal/bsdapp/eal/eal_pci.c
+++ b/lib/librte_eal/bsdapp/eal/eal_pci.c
@@ -403,6 +403,16 @@ rte_pci_scan(void)
return -1;
}
+/*
+ * Get iommu class of pci devices on the bus.
+ */
+enum rte_iova_mode
+rte_pci_get_iommu_class(void)
+{
+ /* Supports only RTE_KDRV_NIC_UIO */
+ return RTE_IOVA_PA;
+}
+
int
pci_update_device(const struct rte_pci_addr *addr)
{
diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index c819e3084..1fdcfb460 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -242,5 +242,6 @@ DPDK_17.11 {
global:
rte_pci_match;
+ rte_pci_get_iommu_class;
} DPDK_17.08;
diff --git a/lib/librte_eal/common/include/rte_bus.h b/lib/librte_eal/common/include/rte_bus.h
index c79368d3c..9e40687e5 100644
--- a/lib/librte_eal/common/include/rte_bus.h
+++ b/lib/librte_eal/common/include/rte_bus.h
@@ -55,6 +55,16 @@ extern "C" {
/** Double linked list of buses */
TAILQ_HEAD(rte_bus_list, rte_bus);
+
+/**
+ * IOVA mapping mode.
+ */
+enum rte_iova_mode {
+ RTE_IOVA_DC = 0, /* Don't care mode */
+ RTE_IOVA_PA = (1 << 0),
+ RTE_IOVA_VA = (1 << 1)
+};
+
/**
* Bus specific scan for devices attached on the bus.
* For each bus object, the scan would be responsible for finding devices and
diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index eab84c7a4..0e36de093 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -381,6 +381,17 @@ int
rte_pci_match(const struct rte_pci_driver *pci_drv,
const struct rte_pci_device *pci_dev);
+
+/**
+ * Get iommu class of PCI devices on the bus.
+ * And return their preferred iova mapping mode.
+ *
+ * @return
+ * - enum rte_iova_mode.
+ */
+enum rte_iova_mode
+rte_pci_get_iommu_class(void);
+
/**
* Map the PCI device resources in user space virtual memory address
*
--
2.13.0
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v7 2/9] eal/pci: get iommu class
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 2/9] eal/pci: get iommu class Santosh Shukla
@ 2017-09-04 14:53 ` Burakov, Anatoly
2017-09-04 15:13 ` santosh
2017-09-04 15:30 ` Burakov, Anatoly
1 sibling, 1 reply; 248+ messages in thread
From: Burakov, Anatoly @ 2017-09-04 14:53 UTC (permalink / raw)
To: Santosh Shukla, dev
Cc: thomas, jerin.jacob, hemant.agrawal, olivier.matz,
maxime.coquelin, Gonzalez Monroy, Sergio, Richardson, Bruce,
shreyansh.jain, gaetan.rivet, stephen, aconole
> From: Santosh Shukla [mailto:santosh.shukla@caviumnetworks.com]
> Sent: Thursday, August 31, 2017 4:26 AM
> To: dev@dpdk.org
> Cc: thomas@monjalon.net; jerin.jacob@caviumnetworks.com;
> hemant.agrawal@nxp.com; olivier.matz@6wind.com;
> maxime.coquelin@redhat.com; Gonzalez Monroy, Sergio
> <sergio.gonzalez.monroy@intel.com>; Richardson, Bruce
> <bruce.richardson@intel.com>; shreyansh.jain@nxp.com;
> gaetan.rivet@6wind.com; Burakov, Anatoly <anatoly.burakov@intel.com>;
> stephen@networkplumber.org; aconole@redhat.com; Santosh Shukla
> <santosh.shukla@caviumnetworks.com>
> Subject: [PATCH v7 2/9] eal/pci: get iommu class
>
> Introducing rte_pci_get_iommu_class API which helps to get iommu class of
> PCI device on the bus and returns preferred iova mapping mode for PCI bus.
>
> Patch also add rte_pci_get_iommu_class definition for bsdapp, in bsdapp
> case - api returns default iova mode.
>
> Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> ---
> v6 --> v7:
> - squashed v6 series patch [02/12] & [03/12] (Aaron comment).
>
> lib/librte_eal/bsdapp/eal/eal_pci.c | 10 ++++++++++
> lib/librte_eal/bsdapp/eal/rte_eal_version.map | 1 +
> lib/librte_eal/common/include/rte_bus.h | 10 ++++++++++
> lib/librte_eal/common/include/rte_pci.h | 11 +++++++++++
> 4 files changed, 32 insertions(+)
>
> diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c
> b/lib/librte_eal/bsdapp/eal/eal_pci.c
> index 04eacdcc7..e2c252320 100644
> --- a/lib/librte_eal/bsdapp/eal/eal_pci.c
> +++ b/lib/librte_eal/bsdapp/eal/eal_pci.c
> @@ -403,6 +403,16 @@ rte_pci_scan(void)
> return -1;
> }
>
> +/*
> + * Get iommu class of pci devices on the bus.
> + */
> +enum rte_iova_mode
> +rte_pci_get_iommu_class(void)
> +{
> + /* Supports only RTE_KDRV_NIC_UIO */
> + return RTE_IOVA_PA;
> +}
> +
> int
> pci_update_device(const struct rte_pci_addr *addr) { diff --git
> a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
> b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
> index c819e3084..1fdcfb460 100644
> --- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
> +++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
> @@ -242,5 +242,6 @@ DPDK_17.11 {
> global:
>
> rte_pci_match;
> + rte_pci_get_iommu_class;
>
> } DPDK_17.08;
> diff --git a/lib/librte_eal/common/include/rte_bus.h
> b/lib/librte_eal/common/include/rte_bus.h
> index c79368d3c..9e40687e5 100644
> --- a/lib/librte_eal/common/include/rte_bus.h
> +++ b/lib/librte_eal/common/include/rte_bus.h
> @@ -55,6 +55,16 @@ extern "C" {
> /** Double linked list of buses */
> TAILQ_HEAD(rte_bus_list, rte_bus);
>
> +
> +/**
> + * IOVA mapping mode.
> + */
> +enum rte_iova_mode {
> + RTE_IOVA_DC = 0, /* Don't care mode */
> + RTE_IOVA_PA = (1 << 0),
> + RTE_IOVA_VA = (1 << 1)
Hi Santosh,
No need to set values explicitly, standard C will take care of it.
I wonder about the purpose of the "don't care" mode. It's not used for anything but cases when no driver is bound. All the libraries (e.g. rte_malloc) will only check for IOVA_VA mode. Can't we just use PA in all cases where IOVA_DC would be applicable?
Thanks,
Anatoly
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v7 2/9] eal/pci: get iommu class
2017-09-04 14:53 ` Burakov, Anatoly
@ 2017-09-04 15:13 ` santosh
2017-09-04 15:16 ` Burakov, Anatoly
0 siblings, 1 reply; 248+ messages in thread
From: santosh @ 2017-09-04 15:13 UTC (permalink / raw)
To: Burakov, Anatoly, dev
Cc: thomas, jerin.jacob, hemant.agrawal, olivier.matz,
maxime.coquelin, Gonzalez Monroy, Sergio, Richardson, Bruce,
shreyansh.jain, gaetan.rivet, stephen, aconole
Hi Anatoly,
On Monday 04 September 2017 08:23 PM, Burakov, Anatoly wrote:
>> From: Santosh Shukla [mailto:santosh.shukla@caviumnetworks.com]
>> Sent: Thursday, August 31, 2017 4:26 AM
>> To: dev@dpdk.org
>> Cc: thomas@monjalon.net; jerin.jacob@caviumnetworks.com;
>> hemant.agrawal@nxp.com; olivier.matz@6wind.com;
>> maxime.coquelin@redhat.com; Gonzalez Monroy, Sergio
>> <sergio.gonzalez.monroy@intel.com>; Richardson, Bruce
>> <bruce.richardson@intel.com>; shreyansh.jain@nxp.com;
>> gaetan.rivet@6wind.com; Burakov, Anatoly <anatoly.burakov@intel.com>;
>> stephen@networkplumber.org; aconole@redhat.com; Santosh Shukla
>> <santosh.shukla@caviumnetworks.com>
>> Subject: [PATCH v7 2/9] eal/pci: get iommu class
>>
>> Introducing rte_pci_get_iommu_class API which helps to get iommu class of
>> PCI device on the bus and returns preferred iova mapping mode for PCI bus.
>>
>> Patch also add rte_pci_get_iommu_class definition for bsdapp, in bsdapp
>> case - api returns default iova mode.
>>
>> Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
>> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
>> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>> ---
>> v6 --> v7:
>> - squashed v6 series patch [02/12] & [03/12] (Aaron comment).
>>
>> lib/librte_eal/bsdapp/eal/eal_pci.c | 10 ++++++++++
>> lib/librte_eal/bsdapp/eal/rte_eal_version.map | 1 +
>> lib/librte_eal/common/include/rte_bus.h | 10 ++++++++++
>> lib/librte_eal/common/include/rte_pci.h | 11 +++++++++++
>> 4 files changed, 32 insertions(+)
>>
>> diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c
>> b/lib/librte_eal/bsdapp/eal/eal_pci.c
>> index 04eacdcc7..e2c252320 100644
>> --- a/lib/librte_eal/bsdapp/eal/eal_pci.c
>> +++ b/lib/librte_eal/bsdapp/eal/eal_pci.c
>> @@ -403,6 +403,16 @@ rte_pci_scan(void)
>> return -1;
>> }
>>
>> +/*
>> + * Get iommu class of pci devices on the bus.
>> + */
>> +enum rte_iova_mode
>> +rte_pci_get_iommu_class(void)
>> +{
>> + /* Supports only RTE_KDRV_NIC_UIO */
>> + return RTE_IOVA_PA;
>> +}
>> +
>> int
>> pci_update_device(const struct rte_pci_addr *addr) { diff --git
>> a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
>> b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
>> index c819e3084..1fdcfb460 100644
>> --- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
>> +++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
>> @@ -242,5 +242,6 @@ DPDK_17.11 {
>> global:
>>
>> rte_pci_match;
>> + rte_pci_get_iommu_class;
>>
>> } DPDK_17.08;
>> diff --git a/lib/librte_eal/common/include/rte_bus.h
>> b/lib/librte_eal/common/include/rte_bus.h
>> index c79368d3c..9e40687e5 100644
>> --- a/lib/librte_eal/common/include/rte_bus.h
>> +++ b/lib/librte_eal/common/include/rte_bus.h
>> @@ -55,6 +55,16 @@ extern "C" {
>> /** Double linked list of buses */
>> TAILQ_HEAD(rte_bus_list, rte_bus);
>>
>> +
>> +/**
>> + * IOVA mapping mode.
>> + */
>> +enum rte_iova_mode {
>> + RTE_IOVA_DC = 0, /* Don't care mode */
>> + RTE_IOVA_PA = (1 << 0),
>> + RTE_IOVA_VA = (1 << 1)
> Hi Santosh,
>
> No need to set values explicitly, standard C will take care of it.
no strong opinion, change queued for v8.
> I wonder the purpose of "don't care" mode. It's not used for anything but cases when no driver is bound. All the libraries (e.g. rte_malloc) will only check for IOVA_VA mode. Can't we just used PA in all cases where IOVA_DC would be applicable?
Indeed, the policy is to use iova_pa for the _dc case,
but we need a way to distinguish between no device found vs. device attached
(if attached, then decide upon its iova scheme).
For more detailed info, please refer to [1].
[1] http://dpdk.org/dev/patchwork/patch/26762/
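To make the distinction concrete, here is a minimal sketch of how per-bus
answers could be combined so that a bus reporting _dc does not influence
the result. It is an assumption for illustration, not the code from the
series; combine_iova_modes is a hypothetical name and the enum is repeated
only to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Same values as the enum in this patch. */
enum rte_iova_mode {
	RTE_IOVA_DC = 0,	/* Don't care mode */
	RTE_IOVA_PA = (1 << 0),
	RTE_IOVA_VA = (1 << 1)
};

/* Hypothetical helper: fold the modes reported by each bus into one. */
static enum rte_iova_mode
combine_iova_modes(const enum rte_iova_mode *bus_modes, size_t n)
{
	int mode = RTE_IOVA_DC;
	size_t i;

	assert(bus_modes != NULL || n == 0);

	/* A bus answering _dc contributes no bits. */
	for (i = 0; i < n; i++)
		mode |= bus_modes[i];

	/* Only a pure _va result selects _va; _dc or mixed falls back to _pa. */
	if (mode != RTE_IOVA_VA)
		mode = RTE_IOVA_PA;

	return (enum rte_iova_mode)mode;
}
```

With this shape, {_dc, _va} yields _va, while {_dc, _dc} and {_pa, _va}
both yield _pa.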
> Thanks,
> Anatoly
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v7 2/9] eal/pci: get iommu class
2017-09-04 15:13 ` santosh
@ 2017-09-04 15:16 ` Burakov, Anatoly
2017-09-04 15:31 ` santosh
0 siblings, 1 reply; 248+ messages in thread
From: Burakov, Anatoly @ 2017-09-04 15:16 UTC (permalink / raw)
To: santosh, dev
Cc: thomas, jerin.jacob, hemant.agrawal, olivier.matz,
maxime.coquelin, Gonzalez Monroy, Sergio, Richardson, Bruce,
shreyansh.jain, gaetan.rivet, stephen, aconole
Hi Santosh,
> From: santosh [mailto:santosh.shukla@caviumnetworks.com]
> Sent: Monday, September 4, 2017 4:14 PM
> To: Burakov, Anatoly <anatoly.burakov@intel.com>; dev@dpdk.org
> Cc: thomas@monjalon.net; jerin.jacob@caviumnetworks.com;
> hemant.agrawal@nxp.com; olivier.matz@6wind.com;
> maxime.coquelin@redhat.com; Gonzalez Monroy, Sergio
> <sergio.gonzalez.monroy@intel.com>; Richardson, Bruce
> <bruce.richardson@intel.com>; shreyansh.jain@nxp.com;
> gaetan.rivet@6wind.com; stephen@networkplumber.org;
> aconole@redhat.com
> Subject: Re: [PATCH v7 2/9] eal/pci: get iommu class
>
> Hi Anatoly,
>
>
> On Monday 04 September 2017 08:23 PM, Burakov, Anatoly wrote:
> >> From: Santosh Shukla [mailto:santosh.shukla@caviumnetworks.com]
> >> Sent: Thursday, August 31, 2017 4:26 AM
> >> To: dev@dpdk.org
> >> Cc: thomas@monjalon.net; jerin.jacob@caviumnetworks.com;
> >> hemant.agrawal@nxp.com; olivier.matz@6wind.com;
> >> maxime.coquelin@redhat.com; Gonzalez Monroy, Sergio
> >> <sergio.gonzalez.monroy@intel.com>; Richardson, Bruce
> >> <bruce.richardson@intel.com>; shreyansh.jain@nxp.com;
> >> gaetan.rivet@6wind.com; Burakov, Anatoly
> <anatoly.burakov@intel.com>;
> >> stephen@networkplumber.org; aconole@redhat.com; Santosh Shukla
> >> <santosh.shukla@caviumnetworks.com>
> >> Subject: [PATCH v7 2/9] eal/pci: get iommu class
> >>
> >> Introducing rte_pci_get_iommu_class API which helps to get iommu
> >> class of PCI device on the bus and returns preferred iova mapping mode
> for PCI bus.
> >>
> >> Patch also add rte_pci_get_iommu_class definition for bsdapp, in
> >> bsdapp case - api returns default iova mode.
> >>
> >> Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
> >> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> >> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> >> ---
> >> v6 --> v7:
> >> - squashed v6 series patch [02/12] & [03/12] (Aaron comment).
> >>
> >> lib/librte_eal/bsdapp/eal/eal_pci.c | 10 ++++++++++
> >> lib/librte_eal/bsdapp/eal/rte_eal_version.map | 1 +
> >> lib/librte_eal/common/include/rte_bus.h | 10 ++++++++++
> >> lib/librte_eal/common/include/rte_pci.h | 11 +++++++++++
> >> 4 files changed, 32 insertions(+)
> >>
> >> diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c
> >> b/lib/librte_eal/bsdapp/eal/eal_pci.c
> >> index 04eacdcc7..e2c252320 100644
> >> --- a/lib/librte_eal/bsdapp/eal/eal_pci.c
> >> +++ b/lib/librte_eal/bsdapp/eal/eal_pci.c
> >> @@ -403,6 +403,16 @@ rte_pci_scan(void)
> >> return -1;
> >> }
> >>
> >> +/*
> >> + * Get iommu class of pci devices on the bus.
> >> + */
> >> +enum rte_iova_mode
> >> +rte_pci_get_iommu_class(void)
> >> +{
> >> + /* Supports only RTE_KDRV_NIC_UIO */
> >> + return RTE_IOVA_PA;
> >> +}
> >> +
> >> int
> >> pci_update_device(const struct rte_pci_addr *addr) { diff --git
> >> a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
> >> b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
> >> index c819e3084..1fdcfb460 100644
> >> --- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
> >> +++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
> >> @@ -242,5 +242,6 @@ DPDK_17.11 {
> >> global:
> >>
> >> rte_pci_match;
> >> + rte_pci_get_iommu_class;
> >>
> >> } DPDK_17.08;
> >> diff --git a/lib/librte_eal/common/include/rte_bus.h
> >> b/lib/librte_eal/common/include/rte_bus.h
> >> index c79368d3c..9e40687e5 100644
> >> --- a/lib/librte_eal/common/include/rte_bus.h
> >> +++ b/lib/librte_eal/common/include/rte_bus.h
> >> @@ -55,6 +55,16 @@ extern "C" {
> >> /** Double linked list of buses */
> >> TAILQ_HEAD(rte_bus_list, rte_bus);
> >>
> >> +
> >> +/**
> >> + * IOVA mapping mode.
> >> + */
> >> +enum rte_iova_mode {
> >> + RTE_IOVA_DC = 0, /* Don't care mode */
> >> + RTE_IOVA_PA = (1 << 0),
> >> + RTE_IOVA_VA = (1 << 1)
> > Hi Santosh,
> >
> > No need to set values explicitly, standard C will take care of it.
>
> no strong opinion, change queued for v8.
>
> > I wonder the purpose of "don't care" mode. It's not used for anything but
> cases when no driver is bound. All the libraries (e.g. rte_malloc) will only
> check for IOVA_VA mode. Can't we just used PA in all cases where IOVA_DC
> would be applicable?
>
> Indeed policy is to use iova_pa for _dc case, but we need a way to distinguish
> between no device found vs device attached (if attached then decide upon
> its iova scheme).
>
> For more detailed info pl. refer [1].
>
> [1] http://dpdk.org/dev/patchwork/patch/26762/
>
Maybe make your intentions more explicit then? I.e. instead of "don't care" use "no device" or some such. No strong opinion either way though, I'm fine with leaving it as is.
Thanks,
Anatoly
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v7 2/9] eal/pci: get iommu class
2017-09-04 15:16 ` Burakov, Anatoly
@ 2017-09-04 15:31 ` santosh
2017-09-04 15:35 ` Burakov, Anatoly
0 siblings, 1 reply; 248+ messages in thread
From: santosh @ 2017-09-04 15:31 UTC (permalink / raw)
To: Burakov, Anatoly, dev
Cc: thomas, jerin.jacob, hemant.agrawal, olivier.matz,
maxime.coquelin, Gonzalez Monroy, Sergio, Richardson, Bruce,
shreyansh.jain, gaetan.rivet, stephen, aconole
Hi Anatoly,
On Monday 04 September 2017 08:46 PM, Burakov, Anatoly wrote:
> Hi Santosh,
>
>> From: santosh [mailto:santosh.shukla@caviumnetworks.com]
>> Sent: Monday, September 4, 2017 4:14 PM
>> To: Burakov, Anatoly <anatoly.burakov@intel.com>; dev@dpdk.org
>> Cc: thomas@monjalon.net; jerin.jacob@caviumnetworks.com;
>> hemant.agrawal@nxp.com; olivier.matz@6wind.com;
>> maxime.coquelin@redhat.com; Gonzalez Monroy, Sergio
>> <sergio.gonzalez.monroy@intel.com>; Richardson, Bruce
>> <bruce.richardson@intel.com>; shreyansh.jain@nxp.com;
>> gaetan.rivet@6wind.com; stephen@networkplumber.org;
>> aconole@redhat.com
>> Subject: Re: [PATCH v7 2/9] eal/pci: get iommu class
>>
>> Hi Anatoly,
>>
>>
>> On Monday 04 September 2017 08:23 PM, Burakov, Anatoly wrote:
>>>> From: Santosh Shukla [mailto:santosh.shukla@caviumnetworks.com]
>>>> Sent: Thursday, August 31, 2017 4:26 AM
>>>> To: dev@dpdk.org
>>>> Cc: thomas@monjalon.net; jerin.jacob@caviumnetworks.com;
>>>> hemant.agrawal@nxp.com; olivier.matz@6wind.com;
>>>> maxime.coquelin@redhat.com; Gonzalez Monroy, Sergio
>>>> <sergio.gonzalez.monroy@intel.com>; Richardson, Bruce
>>>> <bruce.richardson@intel.com>; shreyansh.jain@nxp.com;
>>>> gaetan.rivet@6wind.com; Burakov, Anatoly
>> <anatoly.burakov@intel.com>;
>>>> stephen@networkplumber.org; aconole@redhat.com; Santosh Shukla
>>>> <santosh.shukla@caviumnetworks.com>
>>>> Subject: [PATCH v7 2/9] eal/pci: get iommu class
>>>>
>>>> Introducing rte_pci_get_iommu_class API which helps to get iommu
>>>> class of PCI device on the bus and returns preferred iova mapping mode
>> for PCI bus.
>>>> Patch also add rte_pci_get_iommu_class definition for bsdapp, in
>>>> bsdapp case - api returns default iova mode.
>>>>
>>>> Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
>>>> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
>>>> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>> ---
>>>> v6 --> v7:
>>>> - squashed v6 series patch [02/12] & [03/12] (Aaron comment).
>>>>
>>>> lib/librte_eal/bsdapp/eal/eal_pci.c | 10 ++++++++++
>>>> lib/librte_eal/bsdapp/eal/rte_eal_version.map | 1 +
>>>> lib/librte_eal/common/include/rte_bus.h | 10 ++++++++++
>>>> lib/librte_eal/common/include/rte_pci.h | 11 +++++++++++
>>>> 4 files changed, 32 insertions(+)
>>>>
>>>> diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c
>>>> b/lib/librte_eal/bsdapp/eal/eal_pci.c
>>>> index 04eacdcc7..e2c252320 100644
>>>> --- a/lib/librte_eal/bsdapp/eal/eal_pci.c
>>>> +++ b/lib/librte_eal/bsdapp/eal/eal_pci.c
>>>> @@ -403,6 +403,16 @@ rte_pci_scan(void)
>>>> return -1;
>>>> }
>>>>
>>>> +/*
>>>> + * Get iommu class of pci devices on the bus.
>>>> + */
>>>> +enum rte_iova_mode
>>>> +rte_pci_get_iommu_class(void)
>>>> +{
>>>> + /* Supports only RTE_KDRV_NIC_UIO */
>>>> + return RTE_IOVA_PA;
>>>> +}
>>>> +
>>>> int
>>>> pci_update_device(const struct rte_pci_addr *addr) { diff --git
>>>> a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
>>>> b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
>>>> index c819e3084..1fdcfb460 100644
>>>> --- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
>>>> +++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
>>>> @@ -242,5 +242,6 @@ DPDK_17.11 {
>>>> global:
>>>>
>>>> rte_pci_match;
>>>> + rte_pci_get_iommu_class;
>>>>
>>>> } DPDK_17.08;
>>>> diff --git a/lib/librte_eal/common/include/rte_bus.h
>>>> b/lib/librte_eal/common/include/rte_bus.h
>>>> index c79368d3c..9e40687e5 100644
>>>> --- a/lib/librte_eal/common/include/rte_bus.h
>>>> +++ b/lib/librte_eal/common/include/rte_bus.h
>>>> @@ -55,6 +55,16 @@ extern "C" {
>>>> /** Double linked list of buses */
>>>> TAILQ_HEAD(rte_bus_list, rte_bus);
>>>>
>>>> +
>>>> +/**
>>>> + * IOVA mapping mode.
>>>> + */
>>>> +enum rte_iova_mode {
>>>> + RTE_IOVA_DC = 0, /* Don't care mode */
>>>> + RTE_IOVA_PA = (1 << 0),
>>>> + RTE_IOVA_VA = (1 << 1)
>>> Hi Santosh,
>>>
>>> No need to set values explicitly, standard C will take care of it.
>> no strong opinion, change queued for v8.
Recalling why I expressed RTE_IOVA_PA/_VA as 1 << 0 and 1 << 1:
if a user in future adds (by mistake) a new entry, for example RTE_IOVA_XX = 3, it
would enable both _pa and _va. To avoid such a programming error, I deliberately
kept _pa = 1 << 0 and _va = 1 << 1; if a new entry comes (highly unlikely), it
should be programmed as _xx = 1 << 2.
If you are convinced, then I think I don't need to spin this change for v8.
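A minimal sketch of the aliasing concern, for illustration only. The helper
names iova_mode_has and iova_mode_is_single are hypothetical and not part of
the patch; the enum is repeated to keep the example self-contained.

```c
#include <assert.h>
#include <stdbool.h>

/* Same values as the enum in this patch. */
enum rte_iova_mode {
	RTE_IOVA_DC = 0,	/* Don't care mode */
	RTE_IOVA_PA = (1 << 0),
	RTE_IOVA_VA = (1 << 1)
};

/* Hypothetical helper: true if 'mode' has the bit of 'flag' set. */
static bool
iova_mode_has(int mode, enum rte_iova_mode flag)
{
	/* _dc has no bit to test for. */
	assert(flag != RTE_IOVA_DC);
	return (mode & flag) != 0;
}

/* Hypothetical helper: true if 'mode' has at most one bit set. */
static bool
iova_mode_is_single(int mode)
{
	return (mode & (mode - 1)) == 0;
}
```

A value of 3 passes iova_mode_has() for both _pa and _va, whereas 1 << 2
matches neither and still counts as a single mode.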
>>> I wonder the purpose of "don't care" mode. It's not used for anything but
>> cases when no driver is bound. All the libraries (e.g. rte_malloc) will only
>> check for IOVA_VA mode. Can't we just used PA in all cases where IOVA_DC
>> would be applicable?
>>
>> Indeed policy is to use iova_pa for _dc case, but we need a way to distinguish
>> between no device found vs device attached (if attached then decide upon
>> its iova scheme).
>>
>> For more detailed info pl. refer [1].
>>
>> [1] http://dpdk.org/dev/patchwork/patch/26762/
>>
> Maybe make your intentions more explicit then? I.e. instead of "don't care" use "no device" or some such. No strong opinion either way though, I'm fine with leaving it as is.
I prefer keeping _DC, if you don't mind; it sounds more appropriate to me.
> Thanks,
> Anatoly
* Re: [dpdk-dev] [PATCH v7 2/9] eal/pci: get iommu class
2017-09-04 15:31 ` santosh
@ 2017-09-04 15:35 ` Burakov, Anatoly
0 siblings, 0 replies; 248+ messages in thread
From: Burakov, Anatoly @ 2017-09-04 15:35 UTC (permalink / raw)
To: santosh, dev
Cc: thomas, jerin.jacob, hemant.agrawal, olivier.matz,
maxime.coquelin, Gonzalez Monroy, Sergio, Richardson, Bruce,
shreyansh.jain, gaetan.rivet, stephen, aconole
> From: santosh [mailto:santosh.shukla@caviumnetworks.com]
> Sent: Monday, September 4, 2017 4:32 PM
> To: Burakov, Anatoly <anatoly.burakov@intel.com>; dev@dpdk.org
> Cc: thomas@monjalon.net; jerin.jacob@caviumnetworks.com;
> hemant.agrawal@nxp.com; olivier.matz@6wind.com;
> maxime.coquelin@redhat.com; Gonzalez Monroy, Sergio
> <sergio.gonzalez.monroy@intel.com>; Richardson, Bruce
> <bruce.richardson@intel.com>; shreyansh.jain@nxp.com;
> gaetan.rivet@6wind.com; stephen@networkplumber.org;
> aconole@redhat.com
> Subject: Re: [PATCH v7 2/9] eal/pci: get iommu class
>
> Hi Anatoly,
>
>
> On Monday 04 September 2017 08:46 PM, Burakov, Anatoly wrote:
> > Hi Santosh,
> >
> >> From: santosh [mailto:santosh.shukla@caviumnetworks.com]
> >> Sent: Monday, September 4, 2017 4:14 PM
> >> To: Burakov, Anatoly <anatoly.burakov@intel.com>; dev@dpdk.org
> >> Cc: thomas@monjalon.net; jerin.jacob@caviumnetworks.com;
> >> hemant.agrawal@nxp.com; olivier.matz@6wind.com;
> >> maxime.coquelin@redhat.com; Gonzalez Monroy, Sergio
> >> <sergio.gonzalez.monroy@intel.com>; Richardson, Bruce
> >> <bruce.richardson@intel.com>; shreyansh.jain@nxp.com;
> >> gaetan.rivet@6wind.com; stephen@networkplumber.org;
> >> aconole@redhat.com
> >> Subject: Re: [PATCH v7 2/9] eal/pci: get iommu class
> >>
> >> Hi Anatoly,
> >>
> >>
> >> On Monday 04 September 2017 08:23 PM, Burakov, Anatoly wrote:
> >>>> From: Santosh Shukla [mailto:santosh.shukla@caviumnetworks.com]
> >>>> Sent: Thursday, August 31, 2017 4:26 AM
> >>>> To: dev@dpdk.org
> >>>> Cc: thomas@monjalon.net; jerin.jacob@caviumnetworks.com;
> >>>> hemant.agrawal@nxp.com; olivier.matz@6wind.com;
> >>>> maxime.coquelin@redhat.com; Gonzalez Monroy, Sergio
> >>>> <sergio.gonzalez.monroy@intel.com>; Richardson, Bruce
> >>>> <bruce.richardson@intel.com>; shreyansh.jain@nxp.com;
> >>>> gaetan.rivet@6wind.com; Burakov, Anatoly
> >> <anatoly.burakov@intel.com>;
> >>>> stephen@networkplumber.org; aconole@redhat.com; Santosh Shukla
> >>>> <santosh.shukla@caviumnetworks.com>
> >>>> Subject: [PATCH v7 2/9] eal/pci: get iommu class
> >>>>
> >>>> Introducing rte_pci_get_iommu_class API which helps to get iommu
> >>>> class of PCI device on the bus and returns preferred iova mapping
> >>>> mode
> >> for PCI bus.
> >>>> Patch also add rte_pci_get_iommu_class definition for bsdapp, in
> >>>> bsdapp case - api returns default iova mode.
> >>>>
> >>>> Signed-off-by: Santosh Shukla
> <santosh.shukla@caviumnetworks.com>
> >>>> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> >>>> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> >>>> ---
> >>>> v6 --> v7:
> >>>> - squashed v6 series patch [02/12] & [03/12] (Aaron comment).
> >>>>
> >>>> lib/librte_eal/bsdapp/eal/eal_pci.c | 10 ++++++++++
> >>>> lib/librte_eal/bsdapp/eal/rte_eal_version.map | 1 +
> >>>> lib/librte_eal/common/include/rte_bus.h | 10 ++++++++++
> >>>> lib/librte_eal/common/include/rte_pci.h | 11 +++++++++++
> >>>> 4 files changed, 32 insertions(+)
> >>>>
> >>>> diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c
> >>>> b/lib/librte_eal/bsdapp/eal/eal_pci.c
> >>>> index 04eacdcc7..e2c252320 100644
> >>>> --- a/lib/librte_eal/bsdapp/eal/eal_pci.c
> >>>> +++ b/lib/librte_eal/bsdapp/eal/eal_pci.c
> >>>> @@ -403,6 +403,16 @@ rte_pci_scan(void)
> >>>> return -1;
> >>>> }
> >>>>
> >>>> +/*
> >>>> + * Get iommu class of pci devices on the bus.
> >>>> + */
> >>>> +enum rte_iova_mode
> >>>> +rte_pci_get_iommu_class(void)
> >>>> +{
> >>>> + /* Supports only RTE_KDRV_NIC_UIO */
> >>>> + return RTE_IOVA_PA;
> >>>> +}
> >>>> +
> >>>> int
> >>>> pci_update_device(const struct rte_pci_addr *addr)
> >>>> {
> >>>> diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
> >>>> b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
> >>>> index c819e3084..1fdcfb460 100644
> >>>> --- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
> >>>> +++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
> >>>> @@ -242,5 +242,6 @@ DPDK_17.11 {
> >>>> global:
> >>>>
> >>>> rte_pci_match;
> >>>> + rte_pci_get_iommu_class;
> >>>>
> >>>> } DPDK_17.08;
> >>>> diff --git a/lib/librte_eal/common/include/rte_bus.h
> >>>> b/lib/librte_eal/common/include/rte_bus.h
> >>>> index c79368d3c..9e40687e5 100644
> >>>> --- a/lib/librte_eal/common/include/rte_bus.h
> >>>> +++ b/lib/librte_eal/common/include/rte_bus.h
> >>>> @@ -55,6 +55,16 @@ extern "C" {
> >>>> /** Double linked list of buses */
> >>>> TAILQ_HEAD(rte_bus_list, rte_bus);
> >>>>
> >>>> +
> >>>> +/**
> >>>> + * IOVA mapping mode.
> >>>> + */
> >>>> +enum rte_iova_mode {
> >>>> + RTE_IOVA_DC = 0, /* Don't care mode */
> >>>> + RTE_IOVA_PA = (1 << 0),
> >>>> + RTE_IOVA_VA = (1 << 1)
> >>> Hi Santosh,
> >>>
> >>> No need to set values explicitly, standard C will take care of it.
> >> no strong opinion, change queued for v8.
>
> recalling myself on why expressed RTE_IOVA_PA/_VA as 1 << 0/1.
> Since user in future (by mistake) may add new entry example: RTE_IOVA_XX
> = 3 then it will enable both _pa and _va both, So to avoid such programming
> error, deliberately kept _pa = 1 << 0 and _va = 1 << 1, if new entry comes
> (highly unlikely) then should be programmed as _xx = 1 << 2;
>
> If you convinced then I think - i don;t need to spin this change for v8.
Hi Santosh,
Fair enough (on both issues).
>
> >>> I wonder the purpose of "don't care" mode. It's not used for
> >>> anything but
> >> cases when no driver is bound. All the libraries (e.g. rte_malloc)
> >> will only check for IOVA_VA mode. Can't we just used PA in all cases
> >> where IOVA_DC would be applicable?
> >>
> >> Indeed policy is to use iova_pa for _dc case, but we need a way to
> >> distinguish between no device found vs device attached (if attached
> >> then decide upon its iova scheme).
> >>
> >> For more detailed info pl. refer [1].
> >>
> >> [1] http://dpdk.org/dev/patchwork/patch/26762/
> >>
> > Maybe make your intentions more explicit then? I.e. instead of "don't
> care" use "no device" or some such. No strong opinion either way though,
> I'm fine with leaving it as is.
>
> prefer keeping _DC, if you don;t mind, sounds more appropriate to me.
>
> > Thanks,
> > Anatoly
* Re: [dpdk-dev] [PATCH v7 2/9] eal/pci: get iommu class
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 2/9] eal/pci: get iommu class Santosh Shukla
2017-09-04 14:53 ` Burakov, Anatoly
@ 2017-09-04 15:30 ` Burakov, Anatoly
1 sibling, 0 replies; 248+ messages in thread
From: Burakov, Anatoly @ 2017-09-04 15:30 UTC (permalink / raw)
To: Santosh Shukla, dev
Cc: thomas, jerin.jacob, hemant.agrawal, olivier.matz,
maxime.coquelin, Gonzalez Monroy, Sergio, Richardson, Bruce,
shreyansh.jain, gaetan.rivet, stephen, aconole
> From: Santosh Shukla [mailto:santosh.shukla@caviumnetworks.com]
> Sent: Thursday, August 31, 2017 4:26 AM
> To: dev@dpdk.org
> Cc: thomas@monjalon.net; jerin.jacob@caviumnetworks.com;
> hemant.agrawal@nxp.com; olivier.matz@6wind.com;
> maxime.coquelin@redhat.com; Gonzalez Monroy, Sergio
> <sergio.gonzalez.monroy@intel.com>; Richardson, Bruce
> <bruce.richardson@intel.com>; shreyansh.jain@nxp.com;
> gaetan.rivet@6wind.com; Burakov, Anatoly <anatoly.burakov@intel.com>;
> stephen@networkplumber.org; aconole@redhat.com; Santosh Shukla
> <santosh.shukla@caviumnetworks.com>
> Subject: [PATCH v7 2/9] eal/pci: get iommu class
>
> Introducing rte_pci_get_iommu_class API which helps to get iommu class of
> PCI device on the bus and returns preferred iova mapping mode for PCI bus.
>
> Patch also add rte_pci_get_iommu_class definition for bsdapp, in bsdapp
> case - api returns default iova mode.
>
> Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> ---
>
> +/*
> + * Get iommu class of pci devices on the bus.
> + */
> +enum rte_iova_mode
> +rte_pci_get_iommu_class(void)
> +{
> + /* Supports only RTE_KDRV_NIC_UIO */
> + return RTE_IOVA_PA;
> +}
> +
Hi Santosh,
I think you should add a linuxapp stub in this commit as well.
Thanks,
Anatoly
* [dpdk-dev] [PATCH v7 3/9] linuxapp/eal_pci: get iommu class
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 0/9] Infrastructure to detect iova mapping on the bus Santosh Shukla
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 1/9] eal/pci: export match function Santosh Shukla
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 2/9] eal/pci: get iommu class Santosh Shukla
@ 2017-08-31 3:26 ` Santosh Shukla
2017-09-04 15:08 ` Burakov, Anatoly
2017-09-05 9:01 ` Burakov, Anatoly
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 4/9] bus: " Santosh Shukla
` (7 subsequent siblings)
10 siblings, 2 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-08-31 3:26 UTC (permalink / raw)
To: dev
Cc: thomas, jerin.jacob, hemant.agrawal, olivier.matz,
maxime.coquelin, sergio.gonzalez.monroy, bruce.richardson,
shreyansh.jain, gaetan.rivet, anatoly.burakov, stephen, aconole,
Santosh Shukla
Get the iommu class of the PCI devices on the bus and return the preferred
iova mapping mode for that bus.
The patch also introduces the RTE_PCI_DRV_IOVA_AS_VA driver flag.
The flag is used when a driver needs to operate in iova=va mode.
Algorithm for iova scheme selection for the PCI bus:
0. If no device is bound, return RTE_IOVA_DC mapping mode;
otherwise go to 1).
1. Look for a device attached to the vfio kdrv whose driver has
RTE_PCI_DRV_IOVA_AS_VA set in .drv_flags.
2. Look for any device attached to a UIO class of driver.
3. Check whether vfio-noiommu mode is enabled.
If 2) and 3) are false and 1) is true, select RTE_IOVA_VA as the
mapping scheme. Otherwise use the default
mapping scheme (RTE_IOVA_PA).
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
v6 --> v7:
- squashed v6 series patches [01/12] & [05/12],
i.e. moved the RTE_PCI_DRV_IOVA_AS_VA flag into this patch (Aaron's comment).
lib/librte_eal/common/include/rte_pci.h | 2 +
lib/librte_eal/linuxapp/eal/eal_pci.c | 95 +++++++++++++++++++++++++
lib/librte_eal/linuxapp/eal/eal_vfio.c | 19 +++++
lib/librte_eal/linuxapp/eal/eal_vfio.h | 4 ++
lib/librte_eal/linuxapp/eal/rte_eal_version.map | 1 +
5 files changed, 121 insertions(+)
diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index 0e36de093..a67d77f22 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -202,6 +202,8 @@ struct rte_pci_bus {
#define RTE_PCI_DRV_INTR_RMV 0x0010
/** Device driver needs to keep mapped resources if unsupported dev detected */
#define RTE_PCI_DRV_KEEP_MAPPED_RES 0x0020
+/** Device driver supports iova as va */
+#define RTE_PCI_DRV_IOVA_AS_VA 0X0040
/**
* A structure describing a PCI mapping.
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c b/lib/librte_eal/linuxapp/eal/eal_pci.c
index 8951ce742..9725fd493 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -45,6 +45,7 @@
#include "eal_filesystem.h"
#include "eal_private.h"
#include "eal_pci_init.h"
+#include "eal_vfio.h"
/**
* @file
@@ -487,6 +488,100 @@ rte_pci_scan(void)
return -1;
}
+/*
+ * Is pci device bound to any kdrv
+ */
+static inline int
+pci_device_is_bound(void)
+{
+ struct rte_pci_device *dev = NULL;
+ int ret = 0;
+
+ FOREACH_DEVICE_ON_PCIBUS(dev) {
+ if (dev->kdrv == RTE_KDRV_UNKNOWN ||
+ dev->kdrv == RTE_KDRV_NONE) {
+ continue;
+ } else {
+ ret = 1;
+ break;
+ }
+ }
+ return ret;
+}
+
+/*
+ * Any one of the device bound to uio
+ */
+static inline int
+pci_device_bound_uio(void)
+{
+ struct rte_pci_device *dev = NULL;
+
+ FOREACH_DEVICE_ON_PCIBUS(dev) {
+ if (dev->kdrv == RTE_KDRV_IGB_UIO ||
+ dev->kdrv == RTE_KDRV_UIO_GENERIC) {
+ return 1;
+ }
+ }
+ return 0;
+}
+
+/*
+ * Any one of the device has iova as va
+ */
+static inline int
+pci_device_has_iova_va(void)
+{
+ struct rte_pci_device *dev = NULL;
+ struct rte_pci_driver *drv = NULL;
+
+ FOREACH_DRIVER_ON_PCIBUS(drv) {
+ if (drv && drv->drv_flags & RTE_PCI_DRV_IOVA_AS_VA) {
+ FOREACH_DEVICE_ON_PCIBUS(dev) {
+ if (dev->kdrv == RTE_KDRV_VFIO &&
+ rte_pci_match(drv, dev))
+ return 1;
+ }
+ }
+ }
+ return 0;
+}
+
+/*
+ * Get iommu class of PCI devices on the bus.
+ */
+enum rte_iova_mode
+rte_pci_get_iommu_class(void)
+{
+ bool is_bound;
+ bool is_vfio_noiommu_enabled = true;
+ bool has_iova_va;
+ bool is_bound_uio;
+
+ is_bound = pci_device_is_bound();
+ if (!is_bound)
+ return RTE_IOVA_DC;
+
+ has_iova_va = pci_device_has_iova_va();
+ is_bound_uio = pci_device_bound_uio();
+#ifdef VFIO_PRESENT
+ is_vfio_noiommu_enabled = vfio_noiommu_is_enabled() == 1 ? 1 : 0;
+#endif
+
+ if (has_iova_va && !is_bound_uio && !is_vfio_noiommu_enabled)
+ return RTE_IOVA_VA;
+
+ if (has_iova_va) {
+ RTE_LOG(WARNING, EAL, "Some devices want iova as va but pa will be used because.. ");
+ if (is_vfio_noiommu_enabled)
+ RTE_LOG(WARNING, EAL, "vfio-noiommu mode configured\n");
+ if (is_bound_uio)
+ RTE_LOG(WARNING, EAL, "few device bound to UIO\n");
+ }
+
+ return RTE_IOVA_PA;
+}
+
/* Read PCI config space. */
int rte_pci_read_config(const struct rte_pci_device *device,
void *buf, size_t len, off_t offset)
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index 946df7e31..c8a97b7e7 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -816,4 +816,23 @@ vfio_noiommu_dma_map(int __rte_unused vfio_container_fd)
return 0;
}
+int
+vfio_noiommu_is_enabled(void)
+{
+ int fd, ret, cnt __rte_unused;
+ char c;
+
+ ret = -1;
+ fd = open(VFIO_NOIOMMU_MODE, O_RDONLY);
+ if (fd < 0)
+ return -1;
+
+ cnt = read(fd, &c, 1);
+ if (c == 'Y')
+ ret = 1;
+
+ close(fd);
+ return ret;
+}
+
#endif
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.h b/lib/librte_eal/linuxapp/eal/eal_vfio.h
index 5ff63e5d7..26ea8e119 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.h
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.h
@@ -150,6 +150,8 @@ struct vfio_config {
#define VFIO_NOIOMMU_GROUP_FMT "/dev/vfio/noiommu-%u"
#define VFIO_GET_REGION_ADDR(x) ((uint64_t) x << 40ULL)
#define VFIO_GET_REGION_IDX(x) (x >> 40)
+#define VFIO_NOIOMMU_MODE \
+ "/sys/module/vfio/parameters/enable_unsafe_noiommu_mode"
/* DMA mapping function prototype.
* Takes VFIO container fd as a parameter.
@@ -210,6 +212,8 @@ int pci_vfio_is_enabled(void);
int vfio_mp_sync_setup(void);
+int vfio_noiommu_is_enabled(void);
+
#define SOCKET_REQ_CONTAINER 0x100
#define SOCKET_REQ_GROUP 0x200
#define SOCKET_CLR_GROUP 0x300
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index a15b382ff..40420ded3 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -247,5 +247,6 @@ DPDK_17.11 {
global:
rte_pci_match;
+ rte_pci_get_iommu_class;
} DPDK_17.08;
--
2.13.0
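The selection algorithm from the commit message can be sketched as a pure function over the four probe results. This is only an illustration under stated assumptions: struct pci_bus_state and select_iova_mode() are hypothetical names that do not exist in DPDK, and the enum is a local stand-in for enum rte_iova_mode.

```c
#include <stdbool.h>

/* Local stand-in for enum rte_iova_mode from the patch. */
enum iova_mode { IOVA_DC = 0, IOVA_PA = (1 << 0), IOVA_VA = (1 << 1) };

/* Hypothetical container for the probe results; not a DPDK type. */
struct pci_bus_state {
	bool any_bound;        /* step 0: some device is bound to a kdrv */
	bool vfio_iova_as_va;  /* step 1: vfio device whose driver sets IOVA_AS_VA */
	bool any_uio;          /* step 2: some device is bound to a UIO driver */
	bool vfio_noiommu;     /* step 3: vfio-noiommu mode is enabled */
};

static enum iova_mode
select_iova_mode(const struct pci_bus_state *s)
{
	/* Step 0: nothing bound, so the bus has no preference. */
	if (!s->any_bound)
		return IOVA_DC;

	/* VA only when 1) holds and neither 2) nor 3) does. */
	if (s->vfio_iova_as_va && !s->any_uio && !s->vfio_noiommu)
		return IOVA_VA;

	/* Default mapping scheme. */
	return IOVA_PA;
}
```

Keeping the decision separate from the bus scans makes each combination easy to check in isolation, for example that a single UIO-bound device forces IOVA_PA even when another device asks for iova as va.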
* Re: [dpdk-dev] [PATCH v7 3/9] linuxapp/eal_pci: get iommu class
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 3/9] linuxapp/eal_pci: " Santosh Shukla
@ 2017-09-04 15:08 ` Burakov, Anatoly
2017-09-05 8:47 ` santosh
2017-09-05 9:01 ` Burakov, Anatoly
1 sibling, 1 reply; 248+ messages in thread
From: Burakov, Anatoly @ 2017-09-04 15:08 UTC (permalink / raw)
To: Santosh Shukla, dev
Cc: thomas, jerin.jacob, hemant.agrawal, olivier.matz,
maxime.coquelin, Gonzalez Monroy, Sergio, Richardson, Bruce,
shreyansh.jain, gaetan.rivet, stephen, aconole
> From: Santosh Shukla [mailto:santosh.shukla@caviumnetworks.com]
> Sent: Thursday, August 31, 2017 4:26 AM
> To: dev@dpdk.org
> Cc: thomas@monjalon.net; jerin.jacob@caviumnetworks.com;
> hemant.agrawal@nxp.com; olivier.matz@6wind.com;
> maxime.coquelin@redhat.com; Gonzalez Monroy, Sergio
> <sergio.gonzalez.monroy@intel.com>; Richardson, Bruce
> <bruce.richardson@intel.com>; shreyansh.jain@nxp.com;
> gaetan.rivet@6wind.com; Burakov, Anatoly <anatoly.burakov@intel.com>;
> stephen@networkplumber.org; aconole@redhat.com; Santosh Shukla
> <santosh.shukla@caviumnetworks.com>
> Subject: [PATCH v7 3/9] linuxapp/eal_pci: get iommu class
>
> Get iommu class of PCI device on the bus and returns preferred iova
> mapping mode for that bus.
>
> Patch also introduces RTE_PCI_DRV_IOVA_AS_VA drv flag.
> Flag used when driver needs to operate in iova=va mode.
>
> Algorithm for iova scheme selection for PCI bus:
> 0. If no device bound then return with RTE_IOVA_DC mapping mode, else
> goto 1).
> 1. Look for device attached to vfio kdrv and has .drv_flag set to
> RTE_PCI_DRV_IOVA_AS_VA.
> 2. Look for any device attached to UIO class of driver.
> 3. Check for vfio-noiommu mode enabled.
>
> If 2) & 3) is false and 1) is true then select mapping scheme as RTE_IOVA_VA.
> Otherwise use default mapping scheme (RTE_IOVA_PA).
>
> Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
> ---
> v6 --> v7:
> - squashed v6 series patch no [01/12] & [05/12]..
> i.e.. moved RTE_PCI_DRV_IOVA_AS_VA flag into this patch. (Aaron
> comment).
>
> lib/librte_eal/common/include/rte_pci.h | 2 +
> lib/librte_eal/linuxapp/eal/eal_pci.c | 95 +++++++++++++++++++++++++
> lib/librte_eal/linuxapp/eal/eal_vfio.c | 19 +++++
> lib/librte_eal/linuxapp/eal/eal_vfio.h | 4 ++
> lib/librte_eal/linuxapp/eal/rte_eal_version.map | 1 +
> 5 files changed, 121 insertions(+)
>
> diff --git a/lib/librte_eal/common/include/rte_pci.h
> b/lib/librte_eal/common/include/rte_pci.h
> index 0e36de093..a67d77f22 100644
> --- a/lib/librte_eal/common/include/rte_pci.h
> +++ b/lib/librte_eal/common/include/rte_pci.h
> @@ -202,6 +202,8 @@ struct rte_pci_bus {
> #define RTE_PCI_DRV_INTR_RMV 0x0010
> /** Device driver needs to keep mapped resources if unsupported dev detected */
> #define RTE_PCI_DRV_KEEP_MAPPED_RES 0x0020
> +/** Device driver supports iova as va */
> +#define RTE_PCI_DRV_IOVA_AS_VA 0X0040
>
> /**
> * A structure describing a PCI mapping.
> diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c
> b/lib/librte_eal/linuxapp/eal/eal_pci.c
> index 8951ce742..9725fd493 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_pci.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
> @@ -45,6 +45,7 @@
> #include "eal_filesystem.h"
> #include "eal_private.h"
> #include "eal_pci_init.h"
> +#include "eal_vfio.h"
>
> /**
> * @file
> @@ -487,6 +488,100 @@ rte_pci_scan(void)
> return -1;
> }
>
> +/*
> + * Is pci device bound to any kdrv
> + */
> +static inline int
> +pci_device_is_bound(void)
> +{
> + struct rte_pci_device *dev = NULL;
> + int ret = 0;
> +
> + FOREACH_DEVICE_ON_PCIBUS(dev) {
> + if (dev->kdrv == RTE_KDRV_UNKNOWN ||
> + dev->kdrv == RTE_KDRV_NONE) {
> + continue;
> + } else {
> + ret = 1;
> + break;
> + }
> + }
> + return ret;
> +}
> +
> +/*
> + * Any one of the device bound to uio
> + */
> +static inline int
> +pci_device_bound_uio(void)
> +{
> + struct rte_pci_device *dev = NULL;
> +
> + FOREACH_DEVICE_ON_PCIBUS(dev) {
> + if (dev->kdrv == RTE_KDRV_IGB_UIO ||
> + dev->kdrv == RTE_KDRV_UIO_GENERIC) {
> + return 1;
> + }
> + }
> + return 0;
> +}
> +
> +/*
> + * Any one of the device has iova as va
> + */
> +static inline int
> +pci_device_has_iova_va(void)
> +{
> + struct rte_pci_device *dev = NULL;
> + struct rte_pci_driver *drv = NULL;
> +
> + FOREACH_DRIVER_ON_PCIBUS(drv) {
> + if (drv && drv->drv_flags & RTE_PCI_DRV_IOVA_AS_VA) {
> + FOREACH_DEVICE_ON_PCIBUS(dev) {
> + if (dev->kdrv == RTE_KDRV_VFIO &&
> + rte_pci_match(drv, dev))
> + return 1;
> + }
> + }
> + }
> + return 0;
> +}
> +
> +/*
> + * Get iommu class of PCI devices on the bus.
> + */
> +enum rte_iova_mode
> +rte_pci_get_iommu_class(void)
> +{
> + bool is_bound;
> + bool is_vfio_noiommu_enabled = true;
> + bool has_iova_va;
> + bool is_bound_uio;
> +
> + is_bound = pci_device_is_bound();
> + if (!is_bound)
> + return RTE_IOVA_DC;
> +
> + has_iova_va = pci_device_has_iova_va();
> + is_bound_uio = pci_device_bound_uio();
> +#ifdef VFIO_PRESENT
> + is_vfio_noiommu_enabled = vfio_noiommu_is_enabled() == 1 ? 1 : 0;
If you specify is_vfio_noiommu_enabled as bool, you should probably treat it as such, and assign true/false.
Other than that, I'm curious why is it always set to "true" by default? If we don't have VFIO compiled, it seems like the error message would always complain about vfio-noiommu mode being enabled, which is confusing.
Thanks,
Anatoly
* Re: [dpdk-dev] [PATCH v7 3/9] linuxapp/eal_pci: get iommu class
2017-09-04 15:08 ` Burakov, Anatoly
@ 2017-09-05 8:47 ` santosh
2017-09-05 8:55 ` Burakov, Anatoly
0 siblings, 1 reply; 248+ messages in thread
From: santosh @ 2017-09-05 8:47 UTC (permalink / raw)
To: Burakov, Anatoly, dev
Cc: thomas, jerin.jacob, hemant.agrawal, olivier.matz,
maxime.coquelin, Gonzalez Monroy, Sergio, Richardson, Bruce,
shreyansh.jain, gaetan.rivet, stephen, aconole
Hi Anatoly,
On Monday 04 September 2017 08:38 PM, Burakov, Anatoly wrote:
>> From: Santosh Shukla [mailto:santosh.shukla@caviumnetworks.com]
>> Sent: Thursday, August 31, 2017 4:26 AM
>> To: dev@dpdk.org
>> Cc: thomas@monjalon.net; jerin.jacob@caviumnetworks.com;
>> hemant.agrawal@nxp.com; olivier.matz@6wind.com;
>> maxime.coquelin@redhat.com; Gonzalez Monroy, Sergio
>> <sergio.gonzalez.monroy@intel.com>; Richardson, Bruce
>> <bruce.richardson@intel.com>; shreyansh.jain@nxp.com;
>> gaetan.rivet@6wind.com; Burakov, Anatoly <anatoly.burakov@intel.com>;
>> stephen@networkplumber.org; aconole@redhat.com; Santosh Shukla
>> <santosh.shukla@caviumnetworks.com>
>> Subject: [PATCH v7 3/9] linuxapp/eal_pci: get iommu class
>>
>> Get iommu class of PCI device on the bus and returns preferred iova
>> mapping mode for that bus.
>>
>> Patch also introduces RTE_PCI_DRV_IOVA_AS_VA drv flag.
>> Flag used when driver needs to operate in iova=va mode.
>>
>> Algorithm for iova scheme selection for PCI bus:
>> 0. If no device bound then return with RTE_IOVA_DC mapping mode, else
>> goto 1).
>> 1. Look for device attached to vfio kdrv and has .drv_flag set to
>> RTE_PCI_DRV_IOVA_AS_VA.
>> 2. Look for any device attached to UIO class of driver.
>> 3. Check for vfio-noiommu mode enabled.
>>
>> If 2) & 3) is false and 1) is true then select mapping scheme as RTE_IOVA_VA.
>> Otherwise use default mapping scheme (RTE_IOVA_PA).
>>
>> Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
>> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
>> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>> Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
>> ---
>> v6 --> v7:
>> - squashed v6 series patch no [01/12] & [05/12]..
>> i.e.. moved RTE_PCI_DRV_IOVA_AS_VA flag into this patch. (Aaron
>> comment).
>>
>> lib/librte_eal/common/include/rte_pci.h | 2 +
>> lib/librte_eal/linuxapp/eal/eal_pci.c | 95 +++++++++++++++++++++++++
>> lib/librte_eal/linuxapp/eal/eal_vfio.c | 19 +++++
>> lib/librte_eal/linuxapp/eal/eal_vfio.h | 4 ++
>> lib/librte_eal/linuxapp/eal/rte_eal_version.map | 1 +
>> 5 files changed, 121 insertions(+)
>>
>> diff --git a/lib/librte_eal/common/include/rte_pci.h
>> b/lib/librte_eal/common/include/rte_pci.h
>> index 0e36de093..a67d77f22 100644
>> --- a/lib/librte_eal/common/include/rte_pci.h
>> +++ b/lib/librte_eal/common/include/rte_pci.h
>> @@ -202,6 +202,8 @@ struct rte_pci_bus {
>> #define RTE_PCI_DRV_INTR_RMV 0x0010
>> /** Device driver needs to keep mapped resources if unsupported dev detected */
>> #define RTE_PCI_DRV_KEEP_MAPPED_RES 0x0020
>> +/** Device driver supports iova as va */
>> +#define RTE_PCI_DRV_IOVA_AS_VA 0X0040
>>
>> /**
>> * A structure describing a PCI mapping.
>> diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c
>> b/lib/librte_eal/linuxapp/eal/eal_pci.c
>> index 8951ce742..9725fd493 100644
>> --- a/lib/librte_eal/linuxapp/eal/eal_pci.c
>> +++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
>> @@ -45,6 +45,7 @@
>> #include "eal_filesystem.h"
>> #include "eal_private.h"
>> #include "eal_pci_init.h"
>> +#include "eal_vfio.h"
>>
>> /**
>> * @file
>> @@ -487,6 +488,100 @@ rte_pci_scan(void)
>> return -1;
>> }
>>
>> +/*
>> + * Is pci device bound to any kdrv
>> + */
>> +static inline int
>> +pci_device_is_bound(void)
>> +{
>> + struct rte_pci_device *dev = NULL;
>> + int ret = 0;
>> +
>> + FOREACH_DEVICE_ON_PCIBUS(dev) {
>> + if (dev->kdrv == RTE_KDRV_UNKNOWN ||
>> + dev->kdrv == RTE_KDRV_NONE) {
>> + continue;
>> + } else {
>> + ret = 1;
>> + break;
>> + }
>> + }
>> + return ret;
>> +}
>> +
>> +/*
>> + * Any one of the device bound to uio
>> + */
>> +static inline int
>> +pci_device_bound_uio(void)
>> +{
>> + struct rte_pci_device *dev = NULL;
>> +
>> + FOREACH_DEVICE_ON_PCIBUS(dev) {
>> + if (dev->kdrv == RTE_KDRV_IGB_UIO ||
>> + dev->kdrv == RTE_KDRV_UIO_GENERIC) {
>> + return 1;
>> + }
>> + }
>> + return 0;
>> +}
>> +
>> +/*
>> + * Any one of the device has iova as va
>> + */
>> +static inline int
>> +pci_device_has_iova_va(void)
>> +{
>> + struct rte_pci_device *dev = NULL;
>> + struct rte_pci_driver *drv = NULL;
>> +
>> + FOREACH_DRIVER_ON_PCIBUS(drv) {
>> + if (drv && drv->drv_flags & RTE_PCI_DRV_IOVA_AS_VA) {
>> + FOREACH_DEVICE_ON_PCIBUS(dev) {
>> + if (dev->kdrv == RTE_KDRV_VFIO &&
>> + rte_pci_match(drv, dev))
>> + return 1;
>> + }
>> + }
>> + }
>> + return 0;
>> +}
>> +
>> +/*
>> + * Get iommu class of PCI devices on the bus.
>> + */
>> +enum rte_iova_mode
>> +rte_pci_get_iommu_class(void)
>> +{
>> + bool is_bound;
>> + bool is_vfio_noiommu_enabled = true;
>> + bool has_iova_va;
>> + bool is_bound_uio;
>> +
>> + is_bound = pci_device_is_bound();
>> + if (!is_bound)
>> + return RTE_IOVA_DC;
>> +
>> + has_iova_va = pci_device_has_iova_va();
>> + is_bound_uio = pci_device_bound_uio();
>> +#ifdef VFIO_PRESENT
>> + is_vfio_noiommu_enabled = vfio_noiommu_is_enabled() == 1 ? 1 : 0;
> If you specify is_vfio_noiommu_enabled as bool, you should probably treat it as such, and assign true/false.
queued for v8.
> Other than that, I'm curious why is it always set to "true" by default? If we don't have VFIO compiled, it seems like the error message would always complain about vfio-noiommu mode being enabled, which is confusing.
It is set to 'true' for the case where VFIO_PRESENT is unset, meaning the
platform doesn't support VFIO (Linux version < 3.6),
i.e. it is using UIO. In that case the flag makes sure the _pa policy is selected.
On the error message: it won't appear in the non-vfio case, as 'has_iova_va' will be 0.
The error message will show in cases where a few devices out of many are bound to UIO;
the message pops up and the iova policy is _pa in that case.
Thanks.
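On the bool comment, one possible shape for the check is sketched below. It is an assumption for illustration, not the v8 code: flag_file_is_enabled() is a hypothetical helper that takes the path as a parameter so it can be exercised without the vfio module loaded, returns a plain true/false, and checks the result of read() before looking at the byte.

```c
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns true only if the file at 'path' can be
 * opened and its first byte is 'Y'. Any failure is reported as false.
 */
static bool
flag_file_is_enabled(const char *path)
{
	char c = 'N';
	ssize_t cnt;
	int fd;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return false;

	/* Only trust 'c' when exactly one byte was read. */
	cnt = read(fd, &c, 1);
	close(fd);

	return cnt == 1 && c == 'Y';
}
```

In the context of this patch it would be called with the VFIO_NOIOMMU_MODE path, "/sys/module/vfio/parameters/enable_unsafe_noiommu_mode".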
> Thanks,
> Anatoly
* Re: [dpdk-dev] [PATCH v7 3/9] linuxapp/eal_pci: get iommu class
2017-09-05 8:47 ` santosh
@ 2017-09-05 8:55 ` Burakov, Anatoly
2017-09-05 8:59 ` santosh
0 siblings, 1 reply; 248+ messages in thread
From: Burakov, Anatoly @ 2017-09-05 8:55 UTC (permalink / raw)
To: santosh, dev
Cc: thomas, jerin.jacob, hemant.agrawal, olivier.matz,
maxime.coquelin, Gonzalez Monroy, Sergio, Richardson, Bruce,
shreyansh.jain, gaetan.rivet, stephen, aconole
> From: santosh [mailto:santosh.shukla@caviumnetworks.com]
> Sent: Tuesday, September 5, 2017 9:48 AM
> To: Burakov, Anatoly <anatoly.burakov@intel.com>; dev@dpdk.org
> Cc: thomas@monjalon.net; jerin.jacob@caviumnetworks.com;
> hemant.agrawal@nxp.com; olivier.matz@6wind.com;
> maxime.coquelin@redhat.com; Gonzalez Monroy, Sergio
> <sergio.gonzalez.monroy@intel.com>; Richardson, Bruce
> <bruce.richardson@intel.com>; shreyansh.jain@nxp.com;
> gaetan.rivet@6wind.com; stephen@networkplumber.org;
> aconole@redhat.com
> Subject: Re: [PATCH v7 3/9] linuxapp/eal_pci: get iommu class
>
> Hi Anatoly,
>
>
> On Monday 04 September 2017 08:38 PM, Burakov, Anatoly wrote:
> >> From: Santosh Shukla [mailto:santosh.shukla@caviumnetworks.com]
> >> Sent: Thursday, August 31, 2017 4:26 AM
> >> To: dev@dpdk.org
> >> Cc: thomas@monjalon.net; jerin.jacob@caviumnetworks.com;
> >> hemant.agrawal@nxp.com; olivier.matz@6wind.com;
> >> maxime.coquelin@redhat.com; Gonzalez Monroy, Sergio
> >> <sergio.gonzalez.monroy@intel.com>; Richardson, Bruce
> >> <bruce.richardson@intel.com>; shreyansh.jain@nxp.com;
> >> gaetan.rivet@6wind.com; Burakov, Anatoly
> <anatoly.burakov@intel.com>;
> >> stephen@networkplumber.org; aconole@redhat.com; Santosh Shukla
> >> <santosh.shukla@caviumnetworks.com>
> >> Subject: [PATCH v7 3/9] linuxapp/eal_pci: get iommu class
> >>
> >> Get iommu class of PCI device on the bus and returns preferred iova
> >> mapping mode for that bus.
> >>
> >> Patch also introduces RTE_PCI_DRV_IOVA_AS_VA drv flag.
> >> Flag used when driver needs to operate in iova=va mode.
> >>
> >> Algorithm for iova scheme selection for PCI bus:
> >> 0. If no device bound then return with RTE_IOVA_DC mapping mode, else
> >> goto 1).
> >> 1. Look for device attached to vfio kdrv and has .drv_flag set to
> >> RTE_PCI_DRV_IOVA_AS_VA.
> >> 2. Look for any device attached to UIO class of driver.
> >> 3. Check for vfio-noiommu mode enabled.
> >>
> >> If 2) & 3) is false and 1) is true then select mapping scheme as
> RTE_IOVA_VA.
> >> Otherwise use default mapping scheme (RTE_IOVA_PA).
> >>
> >> Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
> >> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> >> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> >> Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
> >> ---
> >> v6 --> v7:
> >> - squashed v6 series patch no [01/12] & [05/12]..
> >> i.e.. moved RTE_PCI_DRV_IOVA_AS_VA flag into this patch. (Aaron
> >> comment).
> >>
> >> lib/librte_eal/common/include/rte_pci.h | 2 +
> >> lib/librte_eal/linuxapp/eal/eal_pci.c | 95 +++++++++++++++++++++++++
> >> lib/librte_eal/linuxapp/eal/eal_vfio.c | 19 +++++
> >> lib/librte_eal/linuxapp/eal/eal_vfio.h | 4 ++
> >> lib/librte_eal/linuxapp/eal/rte_eal_version.map | 1 +
> >> 5 files changed, 121 insertions(+)
> >>
> >> diff --git a/lib/librte_eal/common/include/rte_pci.h
> >> b/lib/librte_eal/common/include/rte_pci.h
> >> index 0e36de093..a67d77f22 100644
> >> --- a/lib/librte_eal/common/include/rte_pci.h
> >> +++ b/lib/librte_eal/common/include/rte_pci.h
> >> @@ -202,6 +202,8 @@ struct rte_pci_bus {
> >> #define RTE_PCI_DRV_INTR_RMV 0x0010
> >> /** Device driver needs to keep mapped resources if unsupported dev detected */
> >> #define RTE_PCI_DRV_KEEP_MAPPED_RES 0x0020
> >> +/** Device driver supports iova as va */
> >> +#define RTE_PCI_DRV_IOVA_AS_VA 0X0040
> >>
> >> /**
> >> * A structure describing a PCI mapping.
> >> diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c
> >> b/lib/librte_eal/linuxapp/eal/eal_pci.c
> >> index 8951ce742..9725fd493 100644
> >> --- a/lib/librte_eal/linuxapp/eal/eal_pci.c
> >> +++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
> >> @@ -45,6 +45,7 @@
> >> #include "eal_filesystem.h"
> >> #include "eal_private.h"
> >> #include "eal_pci_init.h"
> >> +#include "eal_vfio.h"
> >>
> >> /**
> >> * @file
> >> @@ -487,6 +488,100 @@ rte_pci_scan(void)
> >> return -1;
> >> }
> >>
> >> +/*
> >> + * Is pci device bound to any kdrv
> >> + */
> >> +static inline int
> >> +pci_device_is_bound(void)
> >> +{
> >> + struct rte_pci_device *dev = NULL;
> >> + int ret = 0;
> >> +
> >> + FOREACH_DEVICE_ON_PCIBUS(dev) {
> >> + if (dev->kdrv == RTE_KDRV_UNKNOWN ||
> >> + dev->kdrv == RTE_KDRV_NONE) {
> >> + continue;
> >> + } else {
> >> + ret = 1;
> >> + break;
> >> + }
> >> + }
> >> + return ret;
> >> +}
> >> +
> >> +/*
> >> + * Any one of the device bound to uio
> >> + */
> >> +static inline int
> >> +pci_device_bound_uio(void)
> >> +{
> >> + struct rte_pci_device *dev = NULL;
> >> +
> >> + FOREACH_DEVICE_ON_PCIBUS(dev) {
> >> + if (dev->kdrv == RTE_KDRV_IGB_UIO ||
> >> + dev->kdrv == RTE_KDRV_UIO_GENERIC) {
> >> + return 1;
> >> + }
> >> + }
> >> + return 0;
> >> +}
> >> +
> >> +/*
> >> + * Any one of the device has iova as va
> >> + */
> >> +static inline int
> >> +pci_device_has_iova_va(void)
> >> +{
> >> + struct rte_pci_device *dev = NULL;
> >> + struct rte_pci_driver *drv = NULL;
> >> +
> >> + FOREACH_DRIVER_ON_PCIBUS(drv) {
> >> + if (drv && drv->drv_flags & RTE_PCI_DRV_IOVA_AS_VA) {
> >> + FOREACH_DEVICE_ON_PCIBUS(dev) {
> >> + if (dev->kdrv == RTE_KDRV_VFIO &&
> >> + rte_pci_match(drv, dev))
> >> + return 1;
> >> + }
> >> + }
> >> + }
> >> + return 0;
> >> +}
> >> +
> >> +/*
> >> + * Get iommu class of PCI devices on the bus.
> >> + */
> >> +enum rte_iova_mode
> >> +rte_pci_get_iommu_class(void)
> >> +{
> >> + bool is_bound;
> >> + bool is_vfio_noiommu_enabled = true;
> >> + bool has_iova_va;
> >> + bool is_bound_uio;
> >> +
> >> + is_bound = pci_device_is_bound();
> >> + if (!is_bound)
> >> + return RTE_IOVA_DC;
> >> +
> >> + has_iova_va = pci_device_has_iova_va();
> >> + is_bound_uio = pci_device_bound_uio();
> >> +#ifdef VFIO_PRESENT
> >> + is_vfio_noiommu_enabled = vfio_noiommu_is_enabled() == 1 ? 1 : 0;
> > If you specify is_vfio_noiommu_enabled as bool, you should probably treat
> > it as such, and assign true/false.
>
> queued for v8.
>
> > Other than that, I'm curious why is it always set to "true" by default? If we
> > don't have VFIO compiled, it seems like the error message would always
> > complain about vfio-noiommu mode being enabled, which is confusing.
>
> Set to 'true' for case when VFIO_PRESENT unset.. meaning platform doesn't
> support VFIO (linux versioned < 3.6) i.e.. using UIO - In that case, flag makes
> sure _pa policy selected.
>
> On error message: It won't come in non-vfio case, as 'has_iova_va' will set to
> 0.
> Error message will show for those case where few device out of many bind
> to uio, so message will pop-up and iova policy would be _pa in that case.
>
> Thanks.
Right. My apologies, I misunderstood the meaning of "has_iova_va" flag.
Thanks,
Anatoly
>
> > Thanks,
> > Anatoly
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v7 3/9] linuxapp/eal_pci: get iommu class
2017-09-05 8:55 ` Burakov, Anatoly
@ 2017-09-05 8:59 ` santosh
0 siblings, 0 replies; 248+ messages in thread
From: santosh @ 2017-09-05 8:59 UTC (permalink / raw)
To: Burakov, Anatoly, dev
Cc: thomas, jerin.jacob, hemant.agrawal, olivier.matz,
maxime.coquelin, Gonzalez Monroy, Sergio, Richardson, Bruce,
shreyansh.jain, gaetan.rivet, stephen, aconole
Hi Anatoly,
On Tuesday 05 September 2017 02:25 PM, Burakov, Anatoly wrote:
>> From: santosh [mailto:santosh.shukla@caviumnetworks.com]
>> Sent: Tuesday, September 5, 2017 9:48 AM
>> To: Burakov, Anatoly <anatoly.burakov@intel.com>; dev@dpdk.org
>> Cc: thomas@monjalon.net; jerin.jacob@caviumnetworks.com;
>> hemant.agrawal@nxp.com; olivier.matz@6wind.com;
>> maxime.coquelin@redhat.com; Gonzalez Monroy, Sergio
>> <sergio.gonzalez.monroy@intel.com>; Richardson, Bruce
>> <bruce.richardson@intel.com>; shreyansh.jain@nxp.com;
>> gaetan.rivet@6wind.com; stephen@networkplumber.org;
>> aconole@redhat.com
>> Subject: Re: [PATCH v7 3/9] linuxapp/eal_pci: get iommu class
>>
>> Hi Anatoly,
>>
>>
>> On Monday 04 September 2017 08:38 PM, Burakov, Anatoly wrote:
>>>> From: Santosh Shukla [mailto:santosh.shukla@caviumnetworks.com]
>>>> Sent: Thursday, August 31, 2017 4:26 AM
>>>> To: dev@dpdk.org
>>>> Cc: thomas@monjalon.net; jerin.jacob@caviumnetworks.com;
>>>> hemant.agrawal@nxp.com; olivier.matz@6wind.com;
>>>> maxime.coquelin@redhat.com; Gonzalez Monroy, Sergio
>>>> <sergio.gonzalez.monroy@intel.com>; Richardson, Bruce
>>>> <bruce.richardson@intel.com>; shreyansh.jain@nxp.com;
>>>> gaetan.rivet@6wind.com; Burakov, Anatoly <anatoly.burakov@intel.com>;
>>>> stephen@networkplumber.org; aconole@redhat.com; Santosh Shukla
>>>> <santosh.shukla@caviumnetworks.com>
>>>> Subject: [PATCH v7 3/9] linuxapp/eal_pci: get iommu class
>>>>
>>>> Get iommu class of PCI device on the bus and returns preferred iova
>>>> mapping mode for that bus.
>>>>
>>>> Patch also introduces RTE_PCI_DRV_IOVA_AS_VA drv flag.
>>>> Flag used when driver needs to operate in iova=va mode.
>>>>
>>>> Algorithm for iova scheme selection for PCI bus:
>>>> 0. If no device bound then return with RTE_IOVA_DC mapping mode, else
>>>> goto 1).
>>>> 1. Look for device attached to vfio kdrv and has .drv_flag set to
>>>> RTE_PCI_DRV_IOVA_AS_VA.
>>>> 2. Look for any device attached to UIO class of driver.
>>>> 3. Check for vfio-noiommu mode enabled.
>>>>
>>>> If 2) & 3) is false and 1) is true then select mapping scheme as RTE_IOVA_VA.
>>>> Otherwise use default mapping scheme (RTE_IOVA_PA).
>>>>
>>>> Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
>>>> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
>>>> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>> Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
>>>> ---
>>>> v6 --> v7:
>>>> - squashed v6 series patch no [01/12] & [05/12]..
>>>> i.e.. moved RTE_PCI_DRV_IOVA_AS_VA flag into this patch. (Aaron
>>>> comment).
>>>>
>>>> lib/librte_eal/common/include/rte_pci.h | 2 +
>>>> lib/librte_eal/linuxapp/eal/eal_pci.c | 95 +++++++++++++++++++++++++
>>>> lib/librte_eal/linuxapp/eal/eal_vfio.c | 19 +++++
>>>> lib/librte_eal/linuxapp/eal/eal_vfio.h | 4 ++
>>>> lib/librte_eal/linuxapp/eal/rte_eal_version.map | 1 +
>>>> 5 files changed, 121 insertions(+)
>>>>
>>>> diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
>>>> index 0e36de093..a67d77f22 100644
>>>> --- a/lib/librte_eal/common/include/rte_pci.h
>>>> +++ b/lib/librte_eal/common/include/rte_pci.h
>>>> @@ -202,6 +202,8 @@ struct rte_pci_bus {
>>>> #define RTE_PCI_DRV_INTR_RMV 0x0010
>>>> /** Device driver needs to keep mapped resources if unsupported dev detected */
>>>> #define RTE_PCI_DRV_KEEP_MAPPED_RES 0x0020
>>>> +/** Device driver supports iova as va */
>>>> +#define RTE_PCI_DRV_IOVA_AS_VA 0X0040
>>>>
>>>> /**
>>>> * A structure describing a PCI mapping.
>>>> diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c b/lib/librte_eal/linuxapp/eal/eal_pci.c
>>>> index 8951ce742..9725fd493 100644
>>>> --- a/lib/librte_eal/linuxapp/eal/eal_pci.c
>>>> +++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
>>>> @@ -45,6 +45,7 @@
>>>> #include "eal_filesystem.h"
>>>> #include "eal_private.h"
>>>> #include "eal_pci_init.h"
>>>> +#include "eal_vfio.h"
>>>>
>>>> /**
>>>> * @file
>>>> @@ -487,6 +488,100 @@ rte_pci_scan(void)
>>>> return -1;
>>>> }
>>>>
>>>> +/*
>>>> + * Is pci device bound to any kdrv
>>>> + */
>>>> +static inline int
>>>> +pci_device_is_bound(void)
>>>> +{
>>>> + struct rte_pci_device *dev = NULL;
>>>> + int ret = 0;
>>>> +
>>>> + FOREACH_DEVICE_ON_PCIBUS(dev) {
>>>> + if (dev->kdrv == RTE_KDRV_UNKNOWN ||
>>>> + dev->kdrv == RTE_KDRV_NONE) {
>>>> + continue;
>>>> + } else {
>>>> + ret = 1;
>>>> + break;
>>>> + }
>>>> + }
>>>> + return ret;
>>>> +}
>>>> +
>>>> +/*
>>>> + * Any one of the device bound to uio
>>>> + */
>>>> +static inline int
>>>> +pci_device_bound_uio(void)
>>>> +{
>>>> + struct rte_pci_device *dev = NULL;
>>>> +
>>>> + FOREACH_DEVICE_ON_PCIBUS(dev) {
>>>> + if (dev->kdrv == RTE_KDRV_IGB_UIO ||
>>>> + dev->kdrv == RTE_KDRV_UIO_GENERIC) {
>>>> + return 1;
>>>> + }
>>>> + }
>>>> + return 0;
>>>> +}
>>>> +
>>>> +/*
>>>> + * Any one of the device has iova as va
>>>> + */
>>>> +static inline int
>>>> +pci_device_has_iova_va(void)
>>>> +{
>>>> + struct rte_pci_device *dev = NULL;
>>>> + struct rte_pci_driver *drv = NULL;
>>>> +
>>>> + FOREACH_DRIVER_ON_PCIBUS(drv) {
>>>> + if (drv && drv->drv_flags & RTE_PCI_DRV_IOVA_AS_VA) {
>>>> + FOREACH_DEVICE_ON_PCIBUS(dev) {
>>>> + if (dev->kdrv == RTE_KDRV_VFIO &&
>>>> + rte_pci_match(drv, dev))
>>>> + return 1;
>>>> + }
>>>> + }
>>>> + }
>>>> + return 0;
>>>> +}
>>>> +
>>>> +/*
>>>> + * Get iommu class of PCI devices on the bus.
>>>> + */
>>>> +enum rte_iova_mode
>>>> +rte_pci_get_iommu_class(void)
>>>> +{
>>>> + bool is_bound;
>>>> + bool is_vfio_noiommu_enabled = true;
>>>> + bool has_iova_va;
>>>> + bool is_bound_uio;
>>>> +
>>>> + is_bound = pci_device_is_bound();
>>>> + if (!is_bound)
>>>> + return RTE_IOVA_DC;
>>>> +
>>>> + has_iova_va = pci_device_has_iova_va();
>>>> + is_bound_uio = pci_device_bound_uio();
>>>> +#ifdef VFIO_PRESENT
>>>> + is_vfio_noiommu_enabled = vfio_noiommu_is_enabled() == 1 ? 1 : 0;
>>> If you specify is_vfio_noiommu_enabled as bool, you should probably treat
>>> it as such, and assign true/false.
>>
>> queued for v8.
>>
>>> Other than that, I'm curious why is it always set to "true" by default? If we
>>> don't have VFIO compiled, it seems like the error message would always
>>> complain about vfio-noiommu mode being enabled, which is confusing.
>>
>> Set to 'true' for case when VFIO_PRESENT unset.. meaning platform doesn't
>> support VFIO (linux versioned < 3.6) i.e.. using UIO - In that case, flag makes
>> sure _pa policy selected.
>>
>> On error message: It won't come in non-vfio case, as 'has_iova_va' will set to
>> 0.
>> Error message will show for those case where few device out of many bind
>> to uio, so message will pop-up and iova policy would be _pa in that case.
>>
>> Thanks.
> Right. My apologies, I misunderstood the meaning of "has_iova_va" flag.
No worries ;). Thanks for the review feedback and for looking into the v7 series.
Can I collect your Reviewed-by: for [3/9]?
Thanks.
> Thanks,
> Anatoly
>
>>> Thanks,
>>> Anatoly
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v7 3/9] linuxapp/eal_pci: get iommu class
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 3/9] linuxapp/eal_pci: " Santosh Shukla
2017-09-04 15:08 ` Burakov, Anatoly
@ 2017-09-05 9:01 ` Burakov, Anatoly
1 sibling, 0 replies; 248+ messages in thread
From: Burakov, Anatoly @ 2017-09-05 9:01 UTC (permalink / raw)
To: Santosh Shukla, dev
Cc: thomas, jerin.jacob, hemant.agrawal, olivier.matz,
maxime.coquelin, Gonzalez Monroy, Sergio, Richardson, Bruce,
shreyansh.jain, gaetan.rivet, stephen, aconole
> From: Santosh Shukla [mailto:santosh.shukla@caviumnetworks.com]
> Sent: Thursday, August 31, 2017 4:26 AM
> To: dev@dpdk.org
> Cc: thomas@monjalon.net; jerin.jacob@caviumnetworks.com;
> hemant.agrawal@nxp.com; olivier.matz@6wind.com;
> maxime.coquelin@redhat.com; Gonzalez Monroy, Sergio
> <sergio.gonzalez.monroy@intel.com>; Richardson, Bruce
> <bruce.richardson@intel.com>; shreyansh.jain@nxp.com;
> gaetan.rivet@6wind.com; Burakov, Anatoly <anatoly.burakov@intel.com>;
> stephen@networkplumber.org; aconole@redhat.com; Santosh Shukla
> <santosh.shukla@caviumnetworks.com>
> Subject: [PATCH v7 3/9] linuxapp/eal_pci: get iommu class
>
> Get iommu class of PCI device on the bus and returns preferred iova
> mapping mode for that bus.
>
> Patch also introduces RTE_PCI_DRV_IOVA_AS_VA drv flag.
> Flag used when driver needs to operate in iova=va mode.
>
> Algorithm for iova scheme selection for PCI bus:
> 0. If no device bound then return with RTE_IOVA_DC mapping mode, else
> goto 1).
> 1. Look for device attached to vfio kdrv and has .drv_flag set to
> RTE_PCI_DRV_IOVA_AS_VA.
> 2. Look for any device attached to UIO class of driver.
> 3. Check for vfio-noiommu mode enabled.
>
> If 2) & 3) is false and 1) is true then select mapping scheme as RTE_IOVA_VA.
> Otherwise use default mapping scheme (RTE_IOVA_PA).
>
> Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
> ---
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
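For reference, the PCI selection rule quoted above (steps 0 to 3) reduces to a small truth table. The following is a minimal sketch, assuming the DC = 0, PA = 1, VA = 2 encoding described in patch 4/9; the enum and the function name pci_pick_iova_mode() are hypothetical stand-ins for enum rte_iova_mode and rte_pci_get_iommu_class(), and the four booleans stand in for the results of the pci_device_*() helpers and vfio_noiommu_is_enabled().

```c
#include <stdbool.h>

/* Local stand-in for enum rte_iova_mode (assumed values: DC=0, PA=1, VA=2). */
enum pci_iova_sketch {
	PCI_SKETCH_DC = 0,
	PCI_SKETCH_PA = 1,
	PCI_SKETCH_VA = 2
};

/*
 * Hypothetical helper mirroring steps 0-3 of the commit message.
 * In the patch, is_vfio_noiommu_enabled starts out true and is only
 * refreshed when VFIO_PRESENT is defined, so a build without VFIO
 * always ends up on the PA branch.
 */
static enum pci_iova_sketch
pci_pick_iova_mode(bool is_bound, bool has_iova_va,
		   bool is_bound_uio, bool is_vfio_noiommu_enabled)
{
	/* Step 0: no device bound to a kernel driver, the bus does not care. */
	if (!is_bound)
		return PCI_SKETCH_DC;

	/* Step 1 is true while steps 2 and 3 are false: VA can be used. */
	if (has_iova_va && !is_bound_uio && !is_vfio_noiommu_enabled)
		return PCI_SKETCH_VA;

	/* Every other combination falls back to the default. */
	return PCI_SKETCH_PA;
}
```

Under these assumptions a single UIO-bound device, or vfio-noiommu mode, is enough to force PA even when a VFIO-bound device advertises RTE_PCI_DRV_IOVA_AS_VA.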
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v7 4/9] bus: get iommu class
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 0/9] Infrastructure to detect iova mapping on the bus Santosh Shukla
` (2 preceding siblings ...)
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 3/9] linuxapp/eal_pci: " Santosh Shukla
@ 2017-08-31 3:26 ` Santosh Shukla
2017-09-04 15:25 ` Burakov, Anatoly
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 5/9] eal: introduce iova mode helper api Santosh Shukla
` (6 subsequent siblings)
10 siblings, 1 reply; 248+ messages in thread
From: Santosh Shukla @ 2017-08-31 3:26 UTC (permalink / raw)
To: dev
Cc: thomas, jerin.jacob, hemant.agrawal, olivier.matz,
maxime.coquelin, sergio.gonzalez.monroy, bruce.richardson,
shreyansh.jain, gaetan.rivet, anatoly.burakov, stephen, aconole,
Santosh Shukla
The rte_bus_get_iommu_class() API automatically detects and selects the
appropriate iova mapping scheme for iommu-capable devices across all registered buses.
Algorithm for iova scheme selection for bus:
0. Iterate through bus_list.
1. Collect each bus iova mode value and update into 'mode' var.
2. Mode selection scheme is:
if mode == 0 then iova mode is _pa,
if mode == 1 then iova mode is _pa,
if mode == 2 then iova mode is _va,
if mode == 3 then iova mode is _pa.
So mode !=2 will be default iova mode (_pa).
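The OR-based combination above can be sketched as a standalone function. This is a minimal sketch, assuming the encoding DC = 0, PA = 1, VA = 2 implied by the mode table; the enum and the helper name combine_bus_modes() are hypothetical and only stand in for enum rte_iova_mode and the TAILQ walk over rte_bus_list in the patch.

```c
#include <stddef.h>

/* Local stand-in for enum rte_iova_mode (assumed values: DC=0, PA=1, VA=2). */
enum bus_iova_sketch {
	BUS_SKETCH_DC = 0,
	BUS_SKETCH_PA = 1,
	BUS_SKETCH_VA = 2
};

/* Hypothetical helper: OR the per-bus answers, then apply the fallback. */
static enum bus_iova_sketch
combine_bus_modes(const enum bus_iova_sketch *per_bus, size_t n)
{
	int mode = BUS_SKETCH_DC;
	size_t i;

	for (i = 0; i < n; i++)
		mode |= per_bus[i];

	/*
	 * Only the value 2 survives as VA: every bus answered VA or
	 * don't-care, and at least one answered VA. The values 0, 1 and 3
	 * all map to the default PA mode.
	 */
	return mode == BUS_SKETCH_VA ? BUS_SKETCH_VA : BUS_SKETCH_PA;
}
```

With these assumptions, {DC, VA} yields VA, while {PA, VA} ORs to 3 and therefore falls back to PA.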
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
lib/librte_eal/bsdapp/eal/rte_eal_version.map | 1 +
lib/librte_eal/common/eal_common_bus.c | 23 +++++++++++++++++++++++
lib/librte_eal/common/eal_common_pci.c | 1 +
lib/librte_eal/common/include/rte_bus.h | 25 +++++++++++++++++++++++++
lib/librte_eal/linuxapp/eal/rte_eal_version.map | 1 +
5 files changed, 51 insertions(+)
diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index 1fdcfb460..9942f47aa 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -243,5 +243,6 @@ DPDK_17.11 {
rte_pci_match;
rte_pci_get_iommu_class;
+ rte_bus_get_iommu_class;
} DPDK_17.08;
diff --git a/lib/librte_eal/common/eal_common_bus.c b/lib/librte_eal/common/eal_common_bus.c
index 08bec2d93..a30a8982e 100644
--- a/lib/librte_eal/common/eal_common_bus.c
+++ b/lib/librte_eal/common/eal_common_bus.c
@@ -222,3 +222,26 @@ rte_bus_find_by_device_name(const char *str)
c[0] = '\0';
return rte_bus_find(NULL, bus_can_parse, name);
}
+
+
+/*
+ * Get iommu class of devices on the bus.
+ */
+enum rte_iova_mode
+rte_bus_get_iommu_class(void)
+{
+ int mode = RTE_IOVA_DC;
+ struct rte_bus *bus;
+
+ TAILQ_FOREACH(bus, &rte_bus_list, next) {
+
+ if (bus->get_iommu_class)
+ mode |= bus->get_iommu_class();
+ }
+
+ if (mode != RTE_IOVA_VA) {
+ /* Use default IOVA mode */
+ mode = RTE_IOVA_PA;
+ }
+ return mode;
+}
diff --git a/lib/librte_eal/common/eal_common_pci.c b/lib/librte_eal/common/eal_common_pci.c
index 3b7d0a0ee..0f0e4b93b 100644
--- a/lib/librte_eal/common/eal_common_pci.c
+++ b/lib/librte_eal/common/eal_common_pci.c
@@ -564,6 +564,7 @@ struct rte_pci_bus rte_pci_bus = {
.plug = pci_plug,
.unplug = pci_unplug,
.parse = pci_parse,
+ .get_iommu_class = rte_pci_get_iommu_class,
},
.device_list = TAILQ_HEAD_INITIALIZER(rte_pci_bus.device_list),
.driver_list = TAILQ_HEAD_INITIALIZER(rte_pci_bus.driver_list),
diff --git a/lib/librte_eal/common/include/rte_bus.h b/lib/librte_eal/common/include/rte_bus.h
index 9e40687e5..70a291a4d 100644
--- a/lib/librte_eal/common/include/rte_bus.h
+++ b/lib/librte_eal/common/include/rte_bus.h
@@ -178,6 +178,20 @@ struct rte_bus_conf {
enum rte_bus_scan_mode scan_mode; /**< Scan policy. */
};
+
+/**
+ * Get the common iommu class of all the devices on the bus. The bus may
+ * check that those devices are attached to an iommu driver.
+ * If no devices are attached to the bus, the bus may return the don't care
+ * (_DC) value.
+ * Otherwise, the bus will return the appropriate _pa or _va iova mode.
+ *
+ * @return
+ * enum rte_iova_mode value.
+ */
+typedef enum rte_iova_mode (*rte_bus_get_iommu_class_t)(void);
+
+
/**
* A structure describing a generic bus.
*/
@@ -191,6 +205,7 @@ struct rte_bus {
rte_bus_unplug_t unplug; /**< Remove single device from driver */
rte_bus_parse_t parse; /**< Parse a device name */
struct rte_bus_conf conf; /**< Bus configuration */
+ rte_bus_get_iommu_class_t get_iommu_class; /**< Get iommu class */
};
/**
@@ -290,6 +305,16 @@ struct rte_bus *rte_bus_find_by_device(const struct rte_device *dev);
*/
struct rte_bus *rte_bus_find_by_name(const char *busname);
+
+/**
+ * Get the common iommu class of devices bound on to buses available in the
+ * system. The default mode is PA.
+ *
+ * @return
+ * enum rte_iova_mode value.
+ */
+enum rte_iova_mode rte_bus_get_iommu_class(void);
+
/**
* Helper for Bus registration.
* The constructor has higher priority than PMD constructors.
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index 40420ded3..f35031746 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -248,5 +248,6 @@ DPDK_17.11 {
rte_pci_match;
rte_pci_get_iommu_class;
+ rte_bus_get_iommu_class;
} DPDK_17.08;
--
2.13.0
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v7 4/9] bus: get iommu class
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 4/9] bus: " Santosh Shukla
@ 2017-09-04 15:25 ` Burakov, Anatoly
0 siblings, 0 replies; 248+ messages in thread
From: Burakov, Anatoly @ 2017-09-04 15:25 UTC (permalink / raw)
To: Santosh Shukla, dev
Cc: thomas, jerin.jacob, hemant.agrawal, olivier.matz,
maxime.coquelin, Gonzalez Monroy, Sergio, Richardson, Bruce,
shreyansh.jain, gaetan.rivet, stephen, aconole
> From: Santosh Shukla [mailto:santosh.shukla@caviumnetworks.com]
> Sent: Thursday, August 31, 2017 4:26 AM
> To: dev@dpdk.org
> Cc: thomas@monjalon.net; jerin.jacob@caviumnetworks.com;
> hemant.agrawal@nxp.com; olivier.matz@6wind.com;
> maxime.coquelin@redhat.com; Gonzalez Monroy, Sergio
> <sergio.gonzalez.monroy@intel.com>; Richardson, Bruce
> <bruce.richardson@intel.com>; shreyansh.jain@nxp.com;
> gaetan.rivet@6wind.com; Burakov, Anatoly <anatoly.burakov@intel.com>;
> stephen@networkplumber.org; aconole@redhat.com; Santosh Shukla
> <santosh.shukla@caviumnetworks.com>
> Subject: [PATCH v7 4/9] bus: get iommu class
>
> API(rte_bus_get_iommu_class) helps to automatically detect and select
> appropriate iova mapping scheme for iommu capable device on that bus.
>
> Algorithm for iova scheme selection for bus:
> 0. Iterate through bus_list.
> 1. Collect each bus iova mode value and update into 'mode' var.
> 2. Mode selection scheme is:
> if mode == 0 then iova mode is _pa,
> if mode == 1 then iova mode is _pa,
> if mode == 2 then iova mode is _va,
> if mode == 3 then iova mode is _pa.
>
> So mode !=2 will be default iova mode (_pa).
>
> Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
> ---
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v7 5/9] eal: introduce iova mode helper api
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 0/9] Infrastructure to detect iova mapping on the bus Santosh Shukla
` (3 preceding siblings ...)
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 4/9] bus: " Santosh Shukla
@ 2017-08-31 3:26 ` Santosh Shukla
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 6/9] eal: auto detect iova mode Santosh Shukla
` (5 subsequent siblings)
10 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-08-31 3:26 UTC (permalink / raw)
To: dev
Cc: thomas, jerin.jacob, hemant.agrawal, olivier.matz,
maxime.coquelin, sergio.gonzalez.monroy, bruce.richardson,
shreyansh.jain, gaetan.rivet, anatoly.burakov, stephen, aconole,
Santosh Shukla
Introduce the rte_eal_iova_mode() helper API. Non-EAL libraries use
this API to detect the iova mode.
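A caller outside EAL might use the helper along these lines. This is a sketch under stated assumptions: the enum, the sketch_* names and the mutable mode variable are hypothetical stand-ins so the example compiles without DPDK; in a real build the query would be rte_eal_iova_mode().

```c
#include <stdint.h>

/* Local stand-in for enum rte_iova_mode (assumed values: DC=0, PA=1, VA=2). */
enum sketch_iova_mode {
	SKETCH_IOVA_DC = 0,
	SKETCH_IOVA_PA = 1,
	SKETCH_IOVA_VA = 2
};

/* Stand-in for the iova_mode field this patch adds to struct rte_config. */
static enum sketch_iova_mode sketch_cfg_mode = SKETCH_IOVA_PA;

/* Stand-in for rte_eal_iova_mode(). */
static enum sketch_iova_mode
sketch_eal_iova_mode(void)
{
	return sketch_cfg_mode;
}

/*
 * Hypothetical library helper: pick the address handed to the device.
 * In VA mode the virtual address is used directly as the IOVA;
 * otherwise the caller-supplied physical address is used.
 */
static uint64_t
sketch_buf_iova(const void *va, uint64_t phys_addr)
{
	if (sketch_eal_iova_mode() == SKETCH_IOVA_VA)
		return (uint64_t)(uintptr_t)va;
	return phys_addr;
}
```

The point of the design is that a library only needs the one query to choose between the two address spaces; it does not need to know which bus or driver caused the mode to be selected.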
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
lib/librte_eal/bsdapp/eal/eal.c | 6 ++++++
lib/librte_eal/bsdapp/eal/rte_eal_version.map | 1 +
lib/librte_eal/common/include/rte_eal.h | 12 ++++++++++++
lib/librte_eal/linuxapp/eal/eal.c | 6 ++++++
lib/librte_eal/linuxapp/eal/rte_eal_version.map | 1 +
5 files changed, 26 insertions(+)
diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 5fa598842..07e72203f 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -119,6 +119,12 @@ rte_eal_get_configuration(void)
return &rte_config;
}
+enum rte_iova_mode
+rte_eal_iova_mode(void)
+{
+ return rte_eal_get_configuration()->iova_mode;
+}
+
/* parse a sysfs (or other) file containing one integer value */
int
eal_parse_sysfs_value(const char *filename, unsigned long *val)
diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index 9942f47aa..1a63f3f05 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -244,5 +244,6 @@ DPDK_17.11 {
rte_pci_match;
rte_pci_get_iommu_class;
rte_bus_get_iommu_class;
+ rte_eal_iova_mode;
} DPDK_17.08;
diff --git a/lib/librte_eal/common/include/rte_eal.h b/lib/librte_eal/common/include/rte_eal.h
index 0e7363d77..932dc1a96 100644
--- a/lib/librte_eal/common/include/rte_eal.h
+++ b/lib/librte_eal/common/include/rte_eal.h
@@ -45,6 +45,7 @@
#include <rte_per_lcore.h>
#include <rte_config.h>
+#include <rte_bus.h>
#ifdef __cplusplus
extern "C" {
@@ -87,6 +88,9 @@ struct rte_config {
/** Primary or secondary configuration */
enum rte_proc_type_t process_type;
+ /** PA or VA mapping mode */
+ enum rte_iova_mode iova_mode;
+
/**
* Pointer to memory configuration, which may be shared across multiple
* DPDK instances
@@ -287,6 +291,14 @@ static inline int rte_gettid(void)
return RTE_PER_LCORE(_thread_id);
}
+/**
+ * Get the iova mode
+ *
+ * @return
+ * enum rte_iova_mode value.
+ */
+enum rte_iova_mode rte_eal_iova_mode(void);
+
#define RTE_INIT(func) \
static void __attribute__((constructor, used)) func(void)
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 48f12f44c..febbafdb3 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -128,6 +128,12 @@ rte_eal_get_configuration(void)
return &rte_config;
}
+enum rte_iova_mode
+rte_eal_iova_mode(void)
+{
+ return rte_eal_get_configuration()->iova_mode;
+}
+
/* parse a sysfs (or other) file containing one integer value */
int
eal_parse_sysfs_value(const char *filename, unsigned long *val)
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index f35031746..c99f1ed44 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -249,5 +249,6 @@ DPDK_17.11 {
rte_pci_match;
rte_pci_get_iommu_class;
rte_bus_get_iommu_class;
+ rte_eal_iova_mode;
} DPDK_17.08;
--
2.13.0
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v7 6/9] eal: auto detect iova mode
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 0/9] Infrastructure to detect iova mapping on the bus Santosh Shukla
` (4 preceding siblings ...)
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 5/9] eal: introduce iova mode helper api Santosh Shukla
@ 2017-08-31 3:26 ` Santosh Shukla
2017-09-04 15:32 ` Burakov, Anatoly
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 7/9] linuxapp/eal_vfio: honor iova mode before mapping Santosh Shukla
` (4 subsequent siblings)
10 siblings, 1 reply; 248+ messages in thread
From: Santosh Shukla @ 2017-08-31 3:26 UTC (permalink / raw)
To: dev
Cc: thomas, jerin.jacob, hemant.agrawal, olivier.matz,
maxime.coquelin, sergio.gonzalez.monroy, bruce.richardson,
shreyansh.jain, gaetan.rivet, anatoly.burakov, stephen, aconole,
Santosh Shukla
To support auto-detection:
* The calls below are moved up in the EAL initialization order:
- eal_option_device_parse
- rte_bus_scan
Based on the result of rte_bus_get_iommu_class, select the iova
mapping mode.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
v6 --> v7:
- Moved eal_option_device_parse() up in the order of eal init.
- Added run_once. (aaron suggestion).
- squashed v6 series patch no. [08/12] & [09/12] into one patch (Aaron comment)
lib/librte_eal/bsdapp/eal/eal.c | 27 ++++++++++++++++-----------
lib/librte_eal/linuxapp/eal/eal.c | 27 ++++++++++++++++-----------
2 files changed, 32 insertions(+), 22 deletions(-)
diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 07e72203f..f003f4c04 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -541,6 +541,22 @@ rte_eal_init(int argc, char **argv)
return -1;
}
+ if (eal_option_device_parse()) {
+ rte_errno = ENODEV;
+ rte_atomic32_clear(&run_once);
+ return -1;
+ }
+
+ if (rte_bus_scan()) {
+ rte_eal_init_alert("Cannot scan the buses for devices\n");
+ rte_errno = ENODEV;
+ rte_atomic32_clear(&run_once);
+ return -1;
+ }
+
+ /* autodetect the iova mapping mode (default is iova_pa) */
+ rte_eal_get_configuration()->iova_mode = rte_bus_get_iommu_class();
+
if (internal_config.no_hugetlbfs == 0 &&
internal_config.process_type != RTE_PROC_SECONDARY &&
eal_hugepage_info_init() < 0) {
@@ -620,17 +636,6 @@ rte_eal_init(int argc, char **argv)
rte_config.master_lcore, thread_id, cpuset,
ret == 0 ? "" : "...");
- if (eal_option_device_parse()) {
- rte_errno = ENODEV;
- return -1;
- }
-
- if (rte_bus_scan()) {
- rte_eal_init_alert("Cannot scan the buses for devices\n");
- rte_errno = ENODEV;
- return -1;
- }
-
RTE_LCORE_FOREACH_SLAVE(i) {
/*
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index febbafdb3..f4901ffb6 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -798,6 +798,22 @@ rte_eal_init(int argc, char **argv)
return -1;
}
+ if (eal_option_device_parse()) {
+ rte_errno = ENODEV;
+ rte_atomic32_clear(&run_once);
+ return -1;
+ }
+
+ if (rte_bus_scan()) {
+ rte_eal_init_alert("Cannot scan the buses for devices\n");
+ rte_errno = ENODEV;
+ rte_atomic32_clear(&run_once);
+ return -1;
+ }
+
+ /* autodetect the iova mapping mode (default is iova_pa) */
+ rte_eal_get_configuration()->iova_mode = rte_bus_get_iommu_class();
+
if (internal_config.no_hugetlbfs == 0 &&
internal_config.process_type != RTE_PROC_SECONDARY &&
internal_config.xen_dom0_support == 0 &&
@@ -895,17 +911,6 @@ rte_eal_init(int argc, char **argv)
return -1;
}
- if (eal_option_device_parse()) {
- rte_errno = ENODEV;
- return -1;
- }
-
- if (rte_bus_scan()) {
- rte_eal_init_alert("Cannot scan the buses for devices\n");
- rte_errno = ENODEV;
- return -1;
- }
-
RTE_LCORE_FOREACH_SLAVE(i) {
/*
--
2.13.0
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v7 6/9] eal: auto detect iova mode
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 6/9] eal: auto detect iova mode Santosh Shukla
@ 2017-09-04 15:32 ` Burakov, Anatoly
0 siblings, 0 replies; 248+ messages in thread
From: Burakov, Anatoly @ 2017-09-04 15:32 UTC (permalink / raw)
To: Santosh Shukla, dev
Cc: thomas, jerin.jacob, hemant.agrawal, olivier.matz,
maxime.coquelin, Gonzalez Monroy, Sergio, Richardson, Bruce,
shreyansh.jain, gaetan.rivet, stephen, aconole
> From: Santosh Shukla [mailto:santosh.shukla@caviumnetworks.com]
> Sent: Thursday, August 31, 2017 4:26 AM
> To: dev@dpdk.org
> Cc: thomas@monjalon.net; jerin.jacob@caviumnetworks.com;
> hemant.agrawal@nxp.com; olivier.matz@6wind.com;
> maxime.coquelin@redhat.com; Gonzalez Monroy, Sergio
> <sergio.gonzalez.monroy@intel.com>; Richardson, Bruce
> <bruce.richardson@intel.com>; shreyansh.jain@nxp.com;
> gaetan.rivet@6wind.com; Burakov, Anatoly <anatoly.burakov@intel.com>;
> stephen@networkplumber.org; aconole@redhat.com; Santosh Shukla
> <santosh.shukla@caviumnetworks.com>
> Subject: [PATCH v7 6/9] eal: auto detect iova mode
>
> For auto detection purpose:
> * Below calls moved up in the eal initialization order:
> - eal_option_device_parse
> - rte_bus_scan
>
> Based on the result of rte_bus_get_iommu_class - select iova mapping
> mode.
>
> Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> ---
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v7 7/9] linuxapp/eal_vfio: honor iova mode before mapping
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 0/9] Infrastructure to detect iova mapping on the bus Santosh Shukla
` (5 preceding siblings ...)
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 6/9] eal: auto detect iova mode Santosh Shukla
@ 2017-08-31 3:26 ` Santosh Shukla
2017-09-04 15:40 ` Burakov, Anatoly
2017-10-26 12:57 ` Jonas Pfefferle1
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 8/9] linuxapp/eal_memory: honor iova mode in virt2phy Santosh Shukla
` (3 subsequent siblings)
10 siblings, 2 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-08-31 3:26 UTC (permalink / raw)
To: dev
Cc: thomas, jerin.jacob, hemant.agrawal, olivier.matz,
maxime.coquelin, sergio.gonzalez.monroy, bruce.richardson,
shreyansh.jain, gaetan.rivet, anatoly.burakov, stephen, aconole,
Santosh Shukla
Check iova mode and accordingly map iova to pa or va.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
lib/librte_eal/linuxapp/eal/eal_vfio.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index c8a97b7e7..b32cd09a2 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -706,7 +706,10 @@ vfio_type1_dma_map(int vfio_container_fd)
dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
dma_map.vaddr = ms[i].addr_64;
dma_map.size = ms[i].len;
- dma_map.iova = ms[i].phys_addr;
+ if (rte_eal_iova_mode() == RTE_IOVA_VA)
+ dma_map.iova = dma_map.vaddr;
+ else
+ dma_map.iova = ms[i].phys_addr;
dma_map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
ret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
@@ -792,7 +795,10 @@ vfio_spapr_dma_map(int vfio_container_fd)
dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
dma_map.vaddr = ms[i].addr_64;
dma_map.size = ms[i].len;
- dma_map.iova = ms[i].phys_addr;
+ if (rte_eal_iova_mode() == RTE_IOVA_VA)
+ dma_map.iova = dma_map.vaddr;
+ else
+ dma_map.iova = ms[i].phys_addr;
dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
VFIO_DMA_MAP_FLAG_WRITE;
--
2.13.0
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v7 7/9] linuxapp/eal_vfio: honor iova mode before mapping
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 7/9] linuxapp/eal_vfio: honor iova mode before mapping Santosh Shukla
@ 2017-09-04 15:40 ` Burakov, Anatoly
2017-10-26 12:57 ` Jonas Pfefferle1
1 sibling, 0 replies; 248+ messages in thread
From: Burakov, Anatoly @ 2017-09-04 15:40 UTC (permalink / raw)
To: Santosh Shukla, dev
Cc: thomas, jerin.jacob, hemant.agrawal, olivier.matz,
maxime.coquelin, Gonzalez Monroy, Sergio, Richardson, Bruce,
shreyansh.jain, gaetan.rivet, stephen, aconole
> From: Santosh Shukla [mailto:santosh.shukla@caviumnetworks.com]
> Sent: Thursday, August 31, 2017 4:26 AM
> To: dev@dpdk.org
> Cc: thomas@monjalon.net; jerin.jacob@caviumnetworks.com;
> hemant.agrawal@nxp.com; olivier.matz@6wind.com;
> maxime.coquelin@redhat.com; Gonzalez Monroy, Sergio
> <sergio.gonzalez.monroy@intel.com>; Richardson, Bruce
> <bruce.richardson@intel.com>; shreyansh.jain@nxp.com;
> gaetan.rivet@6wind.com; Burakov, Anatoly <anatoly.burakov@intel.com>;
> stephen@networkplumber.org; aconole@redhat.com; Santosh Shukla
> <santosh.shukla@caviumnetworks.com>
> Subject: [PATCH v7 7/9] linuxapp/eal_vfio: honor iova mode before mapping
>
> Check iova mode and accordingly map iova to pa or va.
>
> Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> ---
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v7 7/9] linuxapp/eal_vfio: honor iova mode before mapping
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 7/9] linuxapp/eal_vfio: honor iova mode before mapping Santosh Shukla
2017-09-04 15:40 ` Burakov, Anatoly
@ 2017-10-26 12:57 ` Jonas Pfefferle1
2017-11-02 10:17 ` Thomas Monjalon
1 sibling, 1 reply; 248+ messages in thread
From: Jonas Pfefferle1 @ 2017-10-26 12:57 UTC (permalink / raw)
To: Santosh Shukla
Cc: dev, thomas, jerin.jacob, hemant.agrawal, olivier.matz,
maxime.coquelin, sergio.gonzalez.monroy, bruce.richardson,
shreyansh.jain, gaetan.rivet, anatoly.burakov, stephen, aconole
Hi @all
I just stumbled upon this patch while testing on POWER. RTE_IOVA_VA will
not work for the sPAPR code since the dma window size is currently
determined by the physical address only. I'm preparing a patch to address
this.
Thanks,
Jonas
"dev" <dev-bounces@dpdk.org> wrote on 08/31/2017 05:26:16 AM:
> From: Santosh Shukla <santosh.shukla@caviumnetworks.com>
> To: dev@dpdk.org
> Cc: thomas@monjalon.net, jerin.jacob@caviumnetworks.com,
> hemant.agrawal@nxp.com, olivier.matz@6wind.com,
> maxime.coquelin@redhat.com, sergio.gonzalez.monroy@intel.com,
> bruce.richardson@intel.com, shreyansh.jain@nxp.com,
> gaetan.rivet@6wind.com, anatoly.burakov@intel.com,
> stephen@networkplumber.org, aconole@redhat.com, Santosh Shukla
> <santosh.shukla@caviumnetworks.com>
> Date: 08/31/2017 05:28 AM
> Subject: [dpdk-dev] [PATCH v7 7/9] linuxapp/eal_vfio: honor iova
> mode before mapping
> Sent by: "dev" <dev-bounces@dpdk.org>
>
> Check iova mode and accordingly map iova to pa or va.
>
> Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> ---
> lib/librte_eal/linuxapp/eal/eal_vfio.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/
> librte_eal/linuxapp/eal/eal_vfio.c
> index c8a97b7e7..b32cd09a2 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
> @@ -706,7 +706,10 @@ vfio_type1_dma_map(int vfio_container_fd)
> dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
> dma_map.vaddr = ms[i].addr_64;
> dma_map.size = ms[i].len;
> - dma_map.iova = ms[i].phys_addr;
> + if (rte_eal_iova_mode() == RTE_IOVA_VA)
> + dma_map.iova = dma_map.vaddr;
> + else
> + dma_map.iova = ms[i].phys_addr;
> dma_map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
>
> ret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
> @@ -792,7 +795,10 @@ vfio_spapr_dma_map(int vfio_container_fd)
> dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
> dma_map.vaddr = ms[i].addr_64;
> dma_map.size = ms[i].len;
> - dma_map.iova = ms[i].phys_addr;
> + if (rte_eal_iova_mode() == RTE_IOVA_VA)
> + dma_map.iova = dma_map.vaddr;
> + else
> + dma_map.iova = ms[i].phys_addr;
> dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
> VFIO_DMA_MAP_FLAG_WRITE;
>
> --
> 2.13.0
>
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v7 7/9] linuxapp/eal_vfio: honor iova mode before mapping
2017-10-26 12:57 ` Jonas Pfefferle1
@ 2017-11-02 10:17 ` Thomas Monjalon
2017-11-02 10:26 ` Jonas Pfefferle1
0 siblings, 1 reply; 248+ messages in thread
From: Thomas Monjalon @ 2017-11-02 10:17 UTC (permalink / raw)
To: Jonas Pfefferle1
Cc: dev, Santosh Shukla, jerin.jacob, hemant.agrawal, olivier.matz,
maxime.coquelin, sergio.gonzalez.monroy, bruce.richardson,
shreyansh.jain, gaetan.rivet, anatoly.burakov, stephen, aconole
Hi
26/10/2017 14:57, Jonas Pfefferle1:
>
> Hi @all
>
> I just stumbled upon this patch while testing on POWER. RTE_IOVA_VA will
> not work for the sPAPR code since the dma window size is currently
> determined by the physical address only.
Is it affecting POWER8?
> I'm preparing a patch to address this.
Any news?
Can you use virtual addresses?
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v7 7/9] linuxapp/eal_vfio: honor iova mode before mapping
2017-11-02 10:17 ` Thomas Monjalon
@ 2017-11-02 10:26 ` Jonas Pfefferle1
2017-11-03 9:56 ` Jonas Pfefferle1
0 siblings, 1 reply; 248+ messages in thread
From: Jonas Pfefferle1 @ 2017-11-02 10:26 UTC (permalink / raw)
To: Thomas Monjalon
Cc: aconole, anatoly.burakov, bruce.richardson, dev, gaetan.rivet,
hemant.agrawal, jerin.jacob, maxime.coquelin, olivier.matz,
Santosh Shukla, sergio.gonzalez.monroy, shreyansh.jain, stephen,
Alexey Kardashevskiy
Thomas Monjalon <thomas@monjalon.net> wrote on 11/02/2017 11:17:10 AM:
> From: Thomas Monjalon <thomas@monjalon.net>
> To: Jonas Pfefferle1 <JPF@zurich.ibm.com>
> Cc: dev@dpdk.org, Santosh Shukla
> <santosh.shukla@caviumnetworks.com>, jerin.jacob@caviumnetworks.com,
> hemant.agrawal@nxp.com, olivier.matz@6wind.com,
> maxime.coquelin@redhat.com, sergio.gonzalez.monroy@intel.com,
> bruce.richardson@intel.com, shreyansh.jain@nxp.com,
> gaetan.rivet@6wind.com, anatoly.burakov@intel.com,
> stephen@networkplumber.org, aconole@redhat.com
> Date: 11/02/2017 11:17 AM
> Subject: Re: [dpdk-dev] [PATCH v7 7/9] linuxapp/eal_vfio: honor iova
> mode before mapping
>
> Hi
>
> 26/10/2017 14:57, Jonas Pfefferle1:
> >
> > Hi @all
> >
> > I just stumbled upon this patch while testing on POWER. RTE_IOVA_VA will
> > not work for the sPAPR code since the dma window size is currently
> > determined by the physical address only.
>
> Is it affecting POWER8?
It is.
>
> > I'm preparing a patch to address this.
>
> Any news?
> Can you use virtual addresses?
After a long discussion with Alexey (CC) we came to the conclusion that
with the current sPAPR iommu driver we cannot use virtual addresses since
the iova is restricted to lie in the DMA window, which itself is restricted
to physical RAM addresses, i.e. with the current code 0 to hotplug memory
max. However, Alexey is working on a patch to lift this restriction on the
DMA window size, which should allow us to do VA:VA mappings in the future.
For now we should fall back to PA in the dynamic iova mode check. I will
send a corresponding patch later today.
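
Editorial sketch of the constraint described above, assuming a single
DMA window given by a start address and a size. The names dma_window,
iova_fits_window and choose_mode are hypothetical; nothing here is taken
from the sPAPR driver or from DPDK.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for RTE_IOVA_PA / RTE_IOVA_VA. */
enum iova_mode { IOVA_MODE_PA, IOVA_MODE_VA };

/* Hypothetical description of the window the IOMMU will accept. */
struct dma_window {
	uint64_t start; /* first IOVA inside the window */
	uint64_t size;  /* window size in bytes */
};

/* True if [iova, iova + len) lies entirely inside the window. */
static bool
iova_fits_window(const struct dma_window *w, uint64_t iova, uint64_t len)
{
	uint64_t off;

	assert(w != NULL);
	if (iova < w->start)
		return false;
	off = iova - w->start;
	/* Written without iova + len so the check cannot overflow. */
	return off <= w->size && len <= w->size - off;
}

/* Use VA only if the virtual range is mappable; otherwise fall back to PA. */
static enum iova_mode
choose_mode(const struct dma_window *w, uint64_t va, uint64_t len)
{
	return iova_fits_window(w, va, len) ? IOVA_MODE_VA : IOVA_MODE_PA;
}
```

With a window that only spans physical RAM, a typical user-space virtual
address falls far outside it, so the sketch ends up in PA mode.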
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v7 7/9] linuxapp/eal_vfio: honor iova mode before mapping
2017-11-02 10:26 ` Jonas Pfefferle1
@ 2017-11-03 9:56 ` Jonas Pfefferle1
2017-11-03 10:28 ` Thomas Monjalon
0 siblings, 1 reply; 248+ messages in thread
From: Jonas Pfefferle1 @ 2017-11-03 9:56 UTC (permalink / raw)
To: Jonas Pfefferle1
Cc: Thomas Monjalon, aconole, anatoly.burakov, bruce.richardson, dev,
gaetan.rivet, hemant.agrawal, jerin.jacob, maxime.coquelin,
olivier.matz, Santosh Shukla, sergio.gonzalez.monroy,
shreyansh.jain, stephen, Alexey Kardashevskiy
"dev" <dev-bounces@dpdk.org> wrote on 11/02/2017 11:26:57 AM:
> From: "Jonas Pfefferle1" <JPF@zurich.ibm.com>
> To: Thomas Monjalon <thomas@monjalon.net>
> Cc: aconole@redhat.com, anatoly.burakov@intel.com,
> bruce.richardson@intel.com, dev@dpdk.org, gaetan.rivet@6wind.com,
> hemant.agrawal@nxp.com, jerin.jacob@caviumnetworks.com,
> maxime.coquelin@redhat.com, olivier.matz@6wind.com, Santosh Shukla
> <santosh.shukla@caviumnetworks.com>,
> sergio.gonzalez.monroy@intel.com, shreyansh.jain@nxp.com,
> stephen@networkplumber.org, "Alexey Kardashevskiy" <aik@ozlabs.ru>
> Date: 11/02/2017 11:27 AM
> Subject: Re: [dpdk-dev] [PATCH v7 7/9] linuxapp/eal_vfio: honor iova
> mode before mapping
> Sent by: "dev" <dev-bounces@dpdk.org>
>
>
> Thomas Monjalon <thomas@monjalon.net> wrote on 11/02/2017 11:17:10 AM:
>
> > From: Thomas Monjalon <thomas@monjalon.net>
> > To: Jonas Pfefferle1 <JPF@zurich.ibm.com>
> > Cc: dev@dpdk.org, Santosh Shukla
> > <santosh.shukla@caviumnetworks.com>, jerin.jacob@caviumnetworks.com,
> > hemant.agrawal@nxp.com, olivier.matz@6wind.com,
> > maxime.coquelin@redhat.com, sergio.gonzalez.monroy@intel.com,
> > bruce.richardson@intel.com, shreyansh.jain@nxp.com,
> > gaetan.rivet@6wind.com, anatoly.burakov@intel.com,
> > stephen@networkplumber.org, aconole@redhat.com
> > Date: 11/02/2017 11:17 AM
> > Subject: Re: [dpdk-dev] [PATCH v7 7/9] linuxapp/eal_vfio: honor iova
> > mode before mapping
> >
> > Hi
> >
> > 26/10/2017 14:57, Jonas Pfefferle1:
> > >
> > > Hi @all
> > >
> > > I just stumbled upon this patch while testing on POWER. RTE_IOVA_VA will
> > > not work for the sPAPR code since the dma window size is currently
> > > determined by the physical address only.
> >
> > Is it affecting POWER8?
>
> It is.
>
> >
> > > I'm preparing a patch to address this.
> >
> > Any news?
> > Can you use virtual addresses?
>
> After a long discussion with Alexey (CC) we came to the conclusion that
> with the current sPAPR iommu driver we cannot use virtual addresses since
> the iova is restricted to lie in the DMA window, which itself is restricted
> to physical RAM addresses, i.e. with the current code 0 to hotplug memory
> max. However, Alexey is working on a patch to lift this restriction on the
> DMA window size, which should allow us to do VA:VA mappings in the future.
> For now we should fall back to PA in the dynamic iova mode check. I will
> send a corresponding patch later today.
I looked into this yesterday but I'm not sure what the right solution is
here.
At the time rte_pci_get_iommu_class is called we already know which IOMMU
types are supported, because vfio_get_container_fd (or rather
vfio_has_supported_extensions) has been called. However, we do not know
which one is going to be used; that is decided later in vfio_setup_device
(or rather vfio_set_iommu_type). We can choose an iova mode which is
supported by all types, but if the modes are exclusive to the types we have
to guess which one is going to be used. Or let the user decide?
Thanks,
Jonas
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v7 7/9] linuxapp/eal_vfio: honor iova mode before mapping
2017-11-03 9:56 ` Jonas Pfefferle1
@ 2017-11-03 10:28 ` Thomas Monjalon
2017-11-03 10:44 ` Jonas Pfefferle1
0 siblings, 1 reply; 248+ messages in thread
From: Thomas Monjalon @ 2017-11-03 10:28 UTC (permalink / raw)
To: Jonas Pfefferle1
Cc: aconole, anatoly.burakov, bruce.richardson, dev, gaetan.rivet,
hemant.agrawal, jerin.jacob, maxime.coquelin, olivier.matz,
Santosh Shukla, sergio.gonzalez.monroy, shreyansh.jain, stephen,
Alexey Kardashevskiy
03/11/2017 10:56, Jonas Pfefferle1:
> Thomas Monjalon <thomas@monjalon.net> wrote on 11/02/2017 11:17:10 AM:
> > > 26/10/2017 14:57, Jonas Pfefferle1:
> > > >
> > > > Hi @all
> > > >
> > > > I just stumbled upon this patch while testing on POWER. RTE_IOVA_VA will
> > > > not work for the sPAPR code since the dma window size is currently
> > > > determined by the physical address only.
> > >
> > > Is it affecting POWER8?
> >
> > It is.
> >
> > >
> > > > I'm preparing a patch to address this.
> > >
> > > Any news?
> > > Can you use virtual addresses?
> >
> > After a long discussion with Alexey (CC) we came to the conclusion that
> > with the current sPAPR iommu driver we cannot use virtual addresses since
> > the iova is restricted to lie in the DMA window, which itself is restricted
> > to physical RAM addresses, i.e. with the current code 0 to hotplug memory
> > max. However, Alexey is working on a patch to lift this restriction on the
> > DMA window size, which should allow us to do VA:VA mappings in the future.
> > For now we should fall back to PA in the dynamic iova mode check. I will
> > send a corresponding patch later today.
>
> I looked into this yesterday but I'm not sure what the right solution is
> here.
> At the time rte_pci_get_iommu_class is called we already know which IOMMU
> types are supported, because vfio_get_container_fd (or rather
> vfio_has_supported_extensions) has been called. However, we do not know
> which one is going to be used; that is decided later in vfio_setup_device
> (or rather vfio_set_iommu_type). We can choose an iova mode which is
> supported by all types, but if the modes are exclusive to the types we have
> to guess which one is going to be used. Or let the user decide?
You can keep the old behaviour, restricting to physical memory,
until you support virtual addressing.
It can be just a #ifdef RTE_ARCH_PPC_64.
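
Editorial sketch of what such an #ifdef could look like. Only
RTE_ARCH_PPC_64 comes from the mail above; the enum and the helper
effective_iova_mode are hypothetical, and the fix actually sent to the
list may be placed and shaped differently.

```c
#include <assert.h>

/* Stand-ins for RTE_IOVA_DC / RTE_IOVA_PA / RTE_IOVA_VA. */
enum iova_mode { IOVA_MODE_DC, IOVA_MODE_PA, IOVA_MODE_VA };

/* Return the mode to use, given the mode the bus detection asked for. */
static enum iova_mode
effective_iova_mode(enum iova_mode requested)
{
	assert(requested == IOVA_MODE_DC || requested == IOVA_MODE_PA ||
	       requested == IOVA_MODE_VA);
#ifdef RTE_ARCH_PPC_64
	/* Old behaviour on POWER: always map with physical addresses. */
	(void)requested;
	return IOVA_MODE_PA;
#else
	return requested;
#endif
}
```

On a non-POWER build the helper is a pass-through, so the behaviour
introduced by this series is unchanged elsewhere.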
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v7 7/9] linuxapp/eal_vfio: honor iova mode before mapping
2017-11-03 10:28 ` Thomas Monjalon
@ 2017-11-03 10:44 ` Jonas Pfefferle1
2017-11-03 10:54 ` Thomas Monjalon
0 siblings, 1 reply; 248+ messages in thread
From: Jonas Pfefferle1 @ 2017-11-03 10:44 UTC (permalink / raw)
To: Thomas Monjalon
Cc: aconole, Alexey Kardashevskiy, anatoly.burakov, bruce.richardson,
dev, gaetan.rivet, hemant.agrawal, jerin.jacob, maxime.coquelin,
olivier.matz, Santosh Shukla, sergio.gonzalez.monroy,
shreyansh.jain, stephen
Thomas Monjalon <thomas@monjalon.net> wrote on 11/03/2017 11:28:10 AM:
> From: Thomas Monjalon <thomas@monjalon.net>
> To: Jonas Pfefferle1 <JPF@zurich.ibm.com>
> Cc: aconole@redhat.com, anatoly.burakov@intel.com,
> bruce.richardson@intel.com, dev@dpdk.org, gaetan.rivet@6wind.com,
> hemant.agrawal@nxp.com, jerin.jacob@caviumnetworks.com,
> maxime.coquelin@redhat.com, olivier.matz@6wind.com, Santosh Shukla
> <santosh.shukla@caviumnetworks.com>,
> sergio.gonzalez.monroy@intel.com, shreyansh.jain@nxp.com,
> stephen@networkplumber.org, Alexey Kardashevskiy <aik@ozlabs.ru>
> Date: 11/03/2017 11:28 AM
> Subject: Re: [dpdk-dev] [PATCH v7 7/9] linuxapp/eal_vfio: honor iova
> mode before mapping
>
> 03/11/2017 10:56, Jonas Pfefferle1:
> > Thomas Monjalon <thomas@monjalon.net> wrote on 11/02/2017 11:17:10 AM:
> > > > 26/10/2017 14:57, Jonas Pfefferle1:
> > > > >
> > > > > Hi @all
> > > > >
> > > > > I just stumbled upon this patch while testing on POWER. RTE_IOVA_VA will
> > > > > not work for the sPAPR code since the dma window size is currently
> > > > > determined by the physical address only.
> > > >
> > > > Is it affecting POWER8?
> > >
> > > It is.
> > >
> > > >
> > > > > I'm preparing a patch to address this.
> > > >
> > > > Any news?
> > > > Can you use virtual addresses?
> > >
> > > After a long discussion with Alexey (CC) we came to the conclusion that
> > > with the current sPAPR iommu driver we cannot use virtual addresses since
> > > the iova is restricted to lie in the DMA window, which itself is restricted
> > > to physical RAM addresses, i.e. with the current code 0 to hotplug memory
> > > max. However, Alexey is working on a patch to lift this restriction on the
> > > DMA window size, which should allow us to do VA:VA mappings in the future.
> > > For now we should fall back to PA in the dynamic iova mode check. I will
> > > send a corresponding patch later today.
> >
> > I looked into this yesterday but I'm not sure what the right solution is
> > here.
> > At the time rte_pci_get_iommu_class is called we already know which IOMMU
> > types are supported, because vfio_get_container_fd (or rather
> > vfio_has_supported_extensions) has been called. However, we do not know
> > which one is going to be used; that is decided later in vfio_setup_device
> > (or rather vfio_set_iommu_type). We can choose an iova mode which is
> > supported by all types, but if the modes are exclusive to the types we have
> > to guess which one is going to be used. Or let the user decide?
>
> You can keep the old behaviour, restricting to physical memory,
> until you support virtual addressing.
> It can be just a #ifdef RTE_ARCH_PPC_64.
>
Ok but we might want to refine this in the future. IMO it looks much
cleaner to decide this on the iommu type, plus this would also cover the
noiommu case without having this extra check reading the sysfs variable.
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v7 7/9] linuxapp/eal_vfio: honor iova mode before mapping
2017-11-03 10:44 ` Jonas Pfefferle1
@ 2017-11-03 10:54 ` Thomas Monjalon
2017-11-03 11:28 ` Jonas Pfefferle1
0 siblings, 1 reply; 248+ messages in thread
From: Thomas Monjalon @ 2017-11-03 10:54 UTC (permalink / raw)
To: Jonas Pfefferle1
Cc: aconole, Alexey Kardashevskiy, anatoly.burakov, bruce.richardson,
dev, gaetan.rivet, hemant.agrawal, jerin.jacob, maxime.coquelin,
olivier.matz, Santosh Shukla, sergio.gonzalez.monroy,
shreyansh.jain, stephen
03/11/2017 11:44, Jonas Pfefferle1:
> Thomas Monjalon <thomas@monjalon.net> wrote on 11/03/2017 11:28:10 AM:
> > 03/11/2017 10:56, Jonas Pfefferle1:
> > > Thomas Monjalon <thomas@monjalon.net> wrote on 11/02/2017 11:17:10 AM:
> > > > > 26/10/2017 14:57, Jonas Pfefferle1:
> > > > > >
> > > > > > Hi @all
> > > > > >
> > > > > > I just stumbled upon this patch while testing on POWER. RTE_IOVA_VA will
> > > > > > not work for the sPAPR code since the dma window size is currently
> > > > > > determined by the physical address only.
> > > > >
> > > > > Is it affecting POWER8?
> > > >
> > > > It is.
> > > >
> > > > >
> > > > > > I'm preparing a patch to address this.
> > > > >
> > > > > Any news?
> > > > > Can you use virtual addresses?
> > > >
> > > > After a long discussion with Alexey (CC) we came to the conclusion that
> > > > with the current sPAPR iommu driver we cannot use virtual addresses since
> > > > the iova is restricted to lie in the DMA window, which itself is restricted
> > > > to physical RAM addresses, i.e. with the current code 0 to hotplug memory
> > > > max. However, Alexey is working on a patch to lift this restriction on the
> > > > DMA window size, which should allow us to do VA:VA mappings in the future.
> > > > For now we should fall back to PA in the dynamic iova mode check. I will
> > > > send a corresponding patch later today.
> > >
> > > I looked into this yesterday but I'm not sure what the right solution is
> > > here.
> > > At the time rte_pci_get_iommu_class is called we already know which IOMMU
> > > types are supported, because vfio_get_container_fd (or rather
> > > vfio_has_supported_extensions) has been called. However, we do not know
> > > which one is going to be used; that is decided later in vfio_setup_device
> > > (or rather vfio_set_iommu_type). We can choose an iova mode which is
> > > supported by all types, but if the modes are exclusive to the types we have
> > > to guess which one is going to be used. Or let the user decide?
> >
> > You can keep the old behaviour, restricting to physical memory,
> > until you support virtual addressing.
> > It can be just a #ifdef RTE_ARCH_PPC_64.
> >
>
> Ok but we might want to refine this in the future. IMO it looks much
> cleaner to decide this on the iommu type, plus this would also cover the
> noiommu case without having this extra check reading the sysfs variable.
You are using the word "this" too many times to help me understand :)
Anyway, please send a quick fix today for 17.11.
The RC3 will be probably closed before Monday.
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v7 7/9] linuxapp/eal_vfio: honor iova mode before mapping
2017-11-03 10:54 ` Thomas Monjalon
@ 2017-11-03 11:28 ` Jonas Pfefferle1
0 siblings, 0 replies; 248+ messages in thread
From: Jonas Pfefferle1 @ 2017-11-03 11:28 UTC (permalink / raw)
To: Thomas Monjalon
Cc: aconole, Alexey Kardashevskiy, anatoly.burakov, bruce.richardson,
dev, gaetan.rivet, hemant.agrawal, jerin.jacob, maxime.coquelin,
olivier.matz, Santosh Shukla, sergio.gonzalez.monroy,
shreyansh.jain, stephen
Thomas Monjalon <thomas@monjalon.net> wrote on 11/03/2017 11:54:45 AM:
> From: Thomas Monjalon <thomas@monjalon.net>
> To: Jonas Pfefferle1 <JPF@zurich.ibm.com>
> Cc: aconole@redhat.com, Alexey Kardashevskiy <aik@ozlabs.ru>,
> anatoly.burakov@intel.com, bruce.richardson@intel.com, dev@dpdk.org,
> gaetan.rivet@6wind.com, hemant.agrawal@nxp.com,
> jerin.jacob@caviumnetworks.com, maxime.coquelin@redhat.com,
> olivier.matz@6wind.com, Santosh Shukla
> <santosh.shukla@caviumnetworks.com>,
> sergio.gonzalez.monroy@intel.com, shreyansh.jain@nxp.com,
> stephen@networkplumber.org
> Date: 11/03/2017 11:55 AM
> Subject: Re: [dpdk-dev] [PATCH v7 7/9] linuxapp/eal_vfio: honor iova
> mode before mapping
>
> 03/11/2017 11:44, Jonas Pfefferle1:
> > Thomas Monjalon <thomas@monjalon.net> wrote on 11/03/2017 11:28:10 AM:
> > > 03/11/2017 10:56, Jonas Pfefferle1:
> > > > Thomas Monjalon <thomas@monjalon.net> wrote on 11/02/2017 11:17:10 AM:
> > > > > > 26/10/2017 14:57, Jonas Pfefferle1:
> > > > > > >
> > > > > > > Hi @all
> > > > > > >
> > > > > > > I just stumbled upon this patch while testing on POWER. RTE_IOVA_VA will
> > > > > > > not work for the sPAPR code since the dma window size is currently
> > > > > > > determined by the physical address only.
> > > > > >
> > > > > > Is it affecting POWER8?
> > > > >
> > > > > It is.
> > > > >
> > > > > >
> > > > > > > I'm preparing a patch to address this.
> > > > > >
> > > > > > Any news?
> > > > > > Can you use virtual addresses?
> > > > >
> > > > > After a long discussion with Alexey (CC) we came to the conclusion that
> > > > > with the current sPAPR iommu driver we cannot use virtual addresses since
> > > > > the iova is restricted to lie in the DMA window, which itself is restricted
> > > > > to physical RAM addresses, i.e. with the current code 0 to hotplug memory
> > > > > max. However, Alexey is working on a patch to lift this restriction on the
> > > > > DMA window size, which should allow us to do VA:VA mappings in the future.
> > > > > For now we should fall back to PA in the dynamic iova mode check. I will
> > > > > send a corresponding patch later today.
> > > >
> > > > I looked into this yesterday but I'm not sure what the right solution is
> > > > here.
> > > > At the time rte_pci_get_iommu_class is called we already know which IOMMU
> > > > types are supported, because vfio_get_container_fd (or rather
> > > > vfio_has_supported_extensions) has been called. However, we do not know
> > > > which one is going to be used; that is decided later in vfio_setup_device
> > > > (or rather vfio_set_iommu_type). We can choose an iova mode which is
> > > > supported by all types, but if the modes are exclusive to the types we have
> > > > to guess which one is going to be used. Or let the user decide?
> > >
> > > You can keep the old behaviour, restricting to physical memory,
> > > until you support virtual addressing.
> > > It can be just a #ifdef RTE_ARCH_PPC_64.
> > >
> >
> > Ok but we might want to refine this in the future. IMO it looks much
> > cleaner to decide this on the iommu type, plus this would also cover the
> > noiommu case without having this extra check reading the sysfs variable.
>
> You are using the word "this" too many times to help me understand :)
What I meant is a fix that decides which iova mode to use based on the
iommu types supported (determined by vfio_get_container_fd) instead of
adding yet another special case for PPC, much like the noiommu check. Both
should be covered by a check based on the supported types.
IMO this is much cleaner and makes it simpler to support new iommu types.
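
Editorial sketch of such a supported-types based check. Everything in it
is hypothetical: the type table, the capability bits and the helper name
are made up for illustration, and the per-type capabilities are only
what this thread suggests (type1 can map PA or VA; sPAPR and noiommu are
limited to PA for now).

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for RTE_IOVA_DC / RTE_IOVA_PA / RTE_IOVA_VA. */
enum iova_mode { IOVA_MODE_DC, IOVA_MODE_PA, IOVA_MODE_VA };

#define MODE_PA_BIT 0x1u
#define MODE_VA_BIT 0x2u

/* One entry per IOMMU type, with the iova modes it can map. */
struct iommu_type_sketch {
	const char *name;
	unsigned int modes;
};

/* Assumed capabilities, per the discussion above. */
static const struct iommu_type_sketch type1 = { "type1", MODE_PA_BIT | MODE_VA_BIT };
static const struct iommu_type_sketch spapr = { "spapr", MODE_PA_BIT };
static const struct iommu_type_sketch noiommu = { "noiommu", MODE_PA_BIT };

/* Pick a mode that every supported type can handle. */
static enum iova_mode
mode_for_supported_types(const struct iommu_type_sketch *const *types, size_t n)
{
	unsigned int common = MODE_PA_BIT | MODE_VA_BIT;
	size_t i;

	assert(types != NULL || n == 0);
	if (n == 0)
		return IOVA_MODE_DC; /* nothing detected: no preference */

	for (i = 0; i < n; i++)
		common &= types[i]->modes;

	if (common & MODE_VA_BIT)
		return IOVA_MODE_VA;
	if (common & MODE_PA_BIT)
		return IOVA_MODE_PA;
	return IOVA_MODE_DC;
}
```

Supporting a new iommu type would then mean adding one table entry
rather than another architecture-specific branch.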
>
> Anyway, please send a quick fix today for 17.11.
> The RC3 will be probably closed before Monday.
>
Will do.
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v7 8/9] linuxapp/eal_memory: honor iova mode in virt2phy
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 0/9] Infrastructure to detect iova mapping on the bus Santosh Shukla
` (6 preceding siblings ...)
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 7/9] linuxapp/eal_vfio: honor iova mode before mapping Santosh Shukla
@ 2017-08-31 3:26 ` Santosh Shukla
2017-09-04 15:42 ` Burakov, Anatoly
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 9/9] eal/rte_malloc: " Santosh Shukla
` (2 subsequent siblings)
10 siblings, 1 reply; 248+ messages in thread
From: Santosh Shukla @ 2017-08-31 3:26 UTC (permalink / raw)
To: dev
Cc: thomas, jerin.jacob, hemant.agrawal, olivier.matz,
maxime.coquelin, sergio.gonzalez.monroy, bruce.richardson,
shreyansh.jain, gaetan.rivet, anatoly.burakov, stephen, aconole,
Santosh Shukla
Check iova mode and accordingly return phy addr.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
lib/librte_eal/linuxapp/eal/eal_memory.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 52791282f..2d9d7c2dc 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -139,6 +139,9 @@ rte_mem_virt2phy(const void *virtaddr)
int page_size;
off_t offset;
+ if (rte_eal_iova_mode() == RTE_IOVA_VA)
+ return (uintptr_t)virtaddr;
+
/* when using dom0, /proc/self/pagemap always returns 0, check in
* dpdk memory by browsing the memsegs */
if (rte_xen_dom0_supported()) {
--
2.13.0
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v7 8/9] linuxapp/eal_memory: honor iova mode in virt2phy
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 8/9] linuxapp/eal_memory: honor iova mode in virt2phy Santosh Shukla
@ 2017-09-04 15:42 ` Burakov, Anatoly
0 siblings, 0 replies; 248+ messages in thread
From: Burakov, Anatoly @ 2017-09-04 15:42 UTC (permalink / raw)
To: Santosh Shukla, dev
Cc: thomas, jerin.jacob, hemant.agrawal, olivier.matz,
maxime.coquelin, Gonzalez Monroy, Sergio, Richardson, Bruce,
shreyansh.jain, gaetan.rivet, stephen, aconole
> From: Santosh Shukla [mailto:santosh.shukla@caviumnetworks.com]
> Sent: Thursday, August 31, 2017 4:26 AM
> To: dev@dpdk.org
> Cc: thomas@monjalon.net; jerin.jacob@caviumnetworks.com;
> hemant.agrawal@nxp.com; olivier.matz@6wind.com;
> maxime.coquelin@redhat.com; Gonzalez Monroy, Sergio
> <sergio.gonzalez.monroy@intel.com>; Richardson, Bruce
> <bruce.richardson@intel.com>; shreyansh.jain@nxp.com;
> gaetan.rivet@6wind.com; Burakov, Anatoly <anatoly.burakov@intel.com>;
> stephen@networkplumber.org; aconole@redhat.com; Santosh Shukla
> <santosh.shukla@caviumnetworks.com>
> Subject: [PATCH v7 8/9] linuxapp/eal_memory: honor iova mode in virt2phy
>
> Check iova mode and accordingly return phy addr.
>
> Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> ---
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v7 9/9] eal/rte_malloc: honor iova mode in virt2phy
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 0/9] Infrastructure to detect iova mapping on the bus Santosh Shukla
` (7 preceding siblings ...)
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 8/9] linuxapp/eal_memory: honor iova mode in virt2phy Santosh Shukla
@ 2017-08-31 3:26 ` Santosh Shukla
2017-09-04 15:44 ` Burakov, Anatoly
2017-09-05 12:28 ` [dpdk-dev] [PATCH v7 0/9] Infrastructure to detect iova mapping on the bus Hemant Agrawal
2017-09-18 10:42 ` [dpdk-dev] [PATCH v8 " Santosh Shukla
10 siblings, 1 reply; 248+ messages in thread
From: Santosh Shukla @ 2017-08-31 3:26 UTC (permalink / raw)
To: dev
Cc: thomas, jerin.jacob, hemant.agrawal, olivier.matz,
maxime.coquelin, sergio.gonzalez.monroy, bruce.richardson,
shreyansh.jain, gaetan.rivet, anatoly.burakov, stephen, aconole,
Santosh Shukla
Check iova mode and accordingly return phy addr.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
lib/librte_eal/common/rte_malloc.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index 5c0627bf4..d65c05a4d 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -251,10 +251,17 @@ rte_malloc_set_limit(__rte_unused const char *type,
phys_addr_t
rte_malloc_virt2phy(const void *addr)
{
+ phys_addr_t paddr;
const struct malloc_elem *elem = malloc_elem_from_data(addr);
if (elem == NULL)
return RTE_BAD_PHYS_ADDR;
if (elem->ms->phys_addr == RTE_BAD_PHYS_ADDR)
return RTE_BAD_PHYS_ADDR;
- return elem->ms->phys_addr + ((uintptr_t)addr - (uintptr_t)elem->ms->addr);
+
+ if (rte_eal_iova_mode() == RTE_IOVA_VA)
+ paddr = (uintptr_t)addr;
+ else
+ paddr = elem->ms->phys_addr +
+ ((uintptr_t)addr - (uintptr_t)elem->ms->addr);
+ return paddr;
}
--
2.13.0
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v7 9/9] eal/rte_malloc: honor iova mode in virt2phy
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 9/9] eal/rte_malloc: " Santosh Shukla
@ 2017-09-04 15:44 ` Burakov, Anatoly
0 siblings, 0 replies; 248+ messages in thread
From: Burakov, Anatoly @ 2017-09-04 15:44 UTC (permalink / raw)
To: Santosh Shukla, dev
Cc: thomas, jerin.jacob, hemant.agrawal, olivier.matz,
maxime.coquelin, Gonzalez Monroy, Sergio, Richardson, Bruce,
shreyansh.jain, gaetan.rivet, stephen, aconole
> From: Santosh Shukla [mailto:santosh.shukla@caviumnetworks.com]
> Sent: Thursday, August 31, 2017 4:26 AM
> To: dev@dpdk.org
> Cc: thomas@monjalon.net; jerin.jacob@caviumnetworks.com;
> hemant.agrawal@nxp.com; olivier.matz@6wind.com;
> maxime.coquelin@redhat.com; Gonzalez Monroy, Sergio
> <sergio.gonzalez.monroy@intel.com>; Richardson, Bruce
> <bruce.richardson@intel.com>; shreyansh.jain@nxp.com;
> gaetan.rivet@6wind.com; Burakov, Anatoly <anatoly.burakov@intel.com>;
> stephen@networkplumber.org; aconole@redhat.com; Santosh Shukla
> <santosh.shukla@caviumnetworks.com>
> Subject: [PATCH v7 9/9] eal/rte_malloc: honor iova mode in virt2phy
>
> Check iova mode and accordingly return phy addr.
>
> Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> ---
> lib/librte_eal/common/rte_malloc.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/lib/librte_eal/common/rte_malloc.c
> b/lib/librte_eal/common/rte_malloc.c
> index 5c0627bf4..d65c05a4d 100644
> --- a/lib/librte_eal/common/rte_malloc.c
> +++ b/lib/librte_eal/common/rte_malloc.c
> @@ -251,10 +251,17 @@ rte_malloc_set_limit(__rte_unused const char
> *type, phys_addr_t rte_malloc_virt2phy(const void *addr) {
> + phys_addr_t paddr;
> const struct malloc_elem *elem = malloc_elem_from_data(addr);
> if (elem == NULL)
> return RTE_BAD_PHYS_ADDR;
> if (elem->ms->phys_addr == RTE_BAD_PHYS_ADDR)
> return RTE_BAD_PHYS_ADDR;
> - return elem->ms->phys_addr + ((uintptr_t)addr - (uintptr_t)elem-
> >ms->addr);
> +
> + if (rte_eal_iova_mode() == RTE_IOVA_VA)
> + paddr = (uintptr_t)addr;
> + else
> + paddr = elem->ms->phys_addr +
> + ((uintptr_t)addr - (uintptr_t)elem->ms->addr);
> + return paddr;
> }
Hi Santosh,
I think there's a RTE_PTR_DIFF macro for stuff like this, but otherwise
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
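For illustration, the suggestion could look roughly like the sketch below. It is a standalone approximation, not the actual patch: the macro body mirrors the shape of RTE_PTR_DIFF from rte_common.h, while sketch_seg, sketch_phys_addr_t, sketch_virt2phy() and the iova_is_va argument are made-up stand-ins for the memseg, phys_addr_t, rte_malloc_virt2phy() and the rte_eal_iova_mode() check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same shape as RTE_PTR_DIFF; redefined here so the sketch builds on its own. */
#define SKETCH_PTR_DIFF(ptr1, ptr2) ((uintptr_t)(ptr1) - (uintptr_t)(ptr2))

typedef uint64_t sketch_phys_addr_t;

/* Hypothetical stand-in for the two memseg fields the patch reads. */
struct sketch_seg {
	void *addr;                   /* start virtual address of the segment */
	sketch_phys_addr_t phys_addr; /* start physical address of the segment */
};

/*
 * Translate addr to an IO address.
 * iova_is_va != 0 models rte_eal_iova_mode() == RTE_IOVA_VA.
 */
sketch_phys_addr_t
sketch_virt2phy(const struct sketch_seg *ms, const void *addr, int iova_is_va)
{
	assert(ms != NULL && addr != NULL);

	if (iova_is_va)
		return (uintptr_t)addr; /* IOVA equals the virtual address */

	/* PA mode: segment base plus the offset of addr inside the segment. */
	return ms->phys_addr + SKETCH_PTR_DIFF(addr, ms->addr);
}
```

The macro only hides the two uintptr_t casts, so behaviour should be unchanged; it mainly keeps the offset computation on one readable line.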
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v7 0/9] Infrastructure to detect iova mapping on the bus
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 0/9] Infrastructure to detect iova mapping on the bus Santosh Shukla
` (8 preceding siblings ...)
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 9/9] eal/rte_malloc: " Santosh Shukla
@ 2017-09-05 12:28 ` Hemant Agrawal
2017-09-05 12:30 ` Hemant Agrawal
2017-09-18 10:42 ` [dpdk-dev] [PATCH v8 " Santosh Shukla
10 siblings, 1 reply; 248+ messages in thread
From: Hemant Agrawal @ 2017-09-05 12:28 UTC (permalink / raw)
To: Santosh Shukla, dev
Cc: thomas, jerin.jacob, olivier.matz, maxime.coquelin,
sergio.gonzalez.monroy, bruce.richardson, shreyansh.jain,
gaetan.rivet, anatoly.burakov, stephen, aconole
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
On 8/31/2017 8:56 AM, Santosh Shukla wrote:
> v7:
> No major changes; minor changes in detail:
> - patch squashing (Aaron's suggestion)
> - added run_once for device_parse() and bus_scan() in eal init
> (Aaron's suggestion)
> - Moved rte_eal_device_parse() up in eal initialization order.
> - Patches rebased on top of version: 17.11-rc0
> For v6 info refer [11].
>
> v6:
> Sending v5 series rebased on top of version: 17.11-rc0.
>
> v5:
> Introduce the RTE_PCI_DRV_IOVA_AS_VA flag for autodetection of iova=va mapping.
> If a PCI driver needs the IOVA as VA scheme, it can set the flag in its
> PCI driver registration.
>
> Algorithm to select IOVA as VA for PCI bus case:
> 0. If no device bound then return with RTE_IOVA_DC mapping mode,
> else goto 1).
> 1. Look for device attached to vfio kdrv and has .drv_flag set
> to RTE_PCI_DRV_IOVA_AS_VA.
> 2. Look for any device attached to UIO class of driver.
> 3. Check for vfio-noiommu mode enabled.
>
> If 2) and 3) are false and 1) is true, select
> RTE_IOVA_VA as the mapping scheme. Otherwise use the default
> mapping scheme (RTE_IOVA_PA).
>
> That way, the bus can truly autodetect the iova mapping mode for
> a device or a set of devices.
>
> v6 --> v7:
> - Patches squashed per v6.
> - Added run_once in eal per v6.
> - Moved rte_eal_device_parse() up in eal init order.
>
> v5 --> v6:
> - Added api info in eal's version.map (release DPDK_v17.11).
>
> v4 --> v5:
> - Change DPDK_17.08 to DPDK_17.11 in _version.map.
> - Reworded bus api description (suggested by Hemant).
> - Added reviewed-by from Maxime in v5.
> - Added acked-by from Hemant for pci and bus patches.
>
> v3 --> v4:
> - Re-introduced RTE_IOVA_DC mode (Suggested by Hemant [5]).
> - Renamed flag to RTE_PCI_DRV_IOVA_AS_VA (Suggested by Maxime).
> - Reworded WARNING message(suggested by Maxime[7]).
> - Created a separate patch for rte_pci_get_iommu_class (suggested by Maxime[]).
> - Added VFIO_PRESENT ifdef build fix.
>
> v2 --> v3:
> - Removed rte_mempool_virt2phy (suggested by Olivier [4])
>
> v1 --> v2:
> - Removed override eal option i.e. (--iova-mode=<>) Because we have means to
> truly autodetect the iova mode.
> - Introduced RTE_PCI_DRV_NEED_IOVA_VA drv_flag (Suggested by Maxime [3]).
> - Using NEED_IOVA_VA drv_flag in autodetection logic.
> - Removed Linux version check macro in vfio code, As per Maxime feedback.
> - Moved rte_pci_match API from local to global.
>
> Patch Summary:
> 1) 1st: declare rte_pci_match api in pci header. Required for autodetection in
> follow-up patches.
> 2) 2nd - 3rd - 4th: autodetection mapping infrastructure for Linux/bsdapp.
> 3) 5th: iova mode helper API.
> 4) 6th: Infra to detect iova mode.
> 5) 7th: make vfio mapping iova aware.
> 6) 8th - 9th : Check for IOVA_VA mode in below APIs
> - rte_mem_virt2phy
> - rte_malloc_virt2phy
>
> Test History:
> - Tested for x86/XL710 40G NIC card for both modes (iova_va/pa).
> - Tested for arm64/thunderx vNIC Integrated NIC for both modes
> - Tested for arm64/Octeontx integrated NICs for only
> Iova_va mode(It supports only one mode.)
> - Ran standalone tests like mempool_autotest, mbuf_autotest.
> - Verified for Doxygen.
>
> Work History:
> For v1, Refer [1].
> For v2, Refer [2].
> For v3, Refer [9].
> For v4, refer [10].
> for v6, refer [11].
>
> Checkpatch result:
> * Debug message - WARNING: line over 80 characters
>
> Thanks.,
> [1] https://www.mail-archive.com/dev@dpdk.org/msg67438.html
> [2] https://www.mail-archive.com/dev@dpdk.org/msg70674.html
> [3] https://www.mail-archive.com/dev@dpdk.org/msg70279.html
> [4] https://www.mail-archive.com/dev@dpdk.org/msg70692.html
> [5] http://dpdk.org/ml/archives/dev/2017-July/071282.html
> [6] http://dpdk.org/ml/archives/dev/2017-July/070951.html
> [7] http://dpdk.org/ml/archives/dev/2017-July/070941.html
> [8] http://dpdk.org/ml/archives/dev/2017-July/070952.html
> [9] http://dpdk.org/ml/archives/dev/2017-July/070918.html
> [10] http://dpdk.org/ml/archives/dev/2017-July/071754.html
> [11] http://dpdk.org/ml/archives/dev/2017-August/072871.html
>
>
> Santosh Shukla (9):
> eal/pci: export match function
> eal/pci: get iommu class
> linuxapp/eal_pci: get iommu class
> bus: get iommu class
> eal: introduce iova mode helper api
> eal: auto detect iova mode
> linuxapp/eal_vfio: honor iova mode before mapping
> linuxapp/eal_memory: honor iova mode in virt2phy
> eal/rte_malloc: honor iova mode in virt2phy
>
> lib/librte_eal/bsdapp/eal/eal.c | 33 ++++++---
> lib/librte_eal/bsdapp/eal/eal_pci.c | 10 +++
> lib/librte_eal/bsdapp/eal/rte_eal_version.map | 10 +++
> lib/librte_eal/common/eal_common_bus.c | 23 ++++++
> lib/librte_eal/common/eal_common_pci.c | 11 +--
> lib/librte_eal/common/include/rte_bus.h | 35 +++++++++
> lib/librte_eal/common/include/rte_eal.h | 12 ++++
> lib/librte_eal/common/include/rte_pci.h | 28 ++++++++
> lib/librte_eal/common/rte_malloc.c | 9 ++-
> lib/librte_eal/linuxapp/eal/eal.c | 33 ++++++---
> lib/librte_eal/linuxapp/eal/eal_memory.c | 3 +
> lib/librte_eal/linuxapp/eal/eal_pci.c | 95 +++++++++++++++++++++++++
> lib/librte_eal/linuxapp/eal/eal_vfio.c | 29 +++++++-
> lib/librte_eal/linuxapp/eal/eal_vfio.h | 4 ++
> lib/librte_eal/linuxapp/eal/rte_eal_version.map | 10 +++
> 15 files changed, 311 insertions(+), 34 deletions(-)
>
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v7 0/9] Infrastructure to detect iova mapping on the bus
2017-09-05 12:28 ` [dpdk-dev] [PATCH v7 0/9] Infrastructure to detect iova mapping on the bus Hemant Agrawal
@ 2017-09-05 12:30 ` Hemant Agrawal
0 siblings, 0 replies; 248+ messages in thread
From: Hemant Agrawal @ 2017-09-05 12:30 UTC (permalink / raw)
To: Santosh Shukla, dev
Cc: thomas, jerin.jacob, olivier.matz, maxime.coquelin,
sergio.gonzalez.monroy, bruce.richardson, shreyansh.jain,
gaetan.rivet, anatoly.burakov, stephen, aconole
Please note that this series breaks the DPAA2 bus.
The following patch series (from Shreyansh) is required to get the DPAA2 bus
working with this patch series:
http://dpdk.org/dev/patchwork/patch/27950/
On 9/5/2017 5:58 PM, Hemant Agrawal wrote:
> Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v8 0/9] Infrastructure to detect iova mapping on the bus
2017-08-31 3:26 ` [dpdk-dev] [PATCH v7 0/9] Infrastructure to detect iova mapping on the bus Santosh Shukla
` (9 preceding siblings ...)
2017-09-05 12:28 ` [dpdk-dev] [PATCH v7 0/9] Infrastructure to detect iova mapping on the bus Hemant Agrawal
@ 2017-09-18 10:42 ` Santosh Shukla
2017-09-18 10:42 ` [dpdk-dev] [PATCH v8 1/9] eal/pci: export match function Santosh Shukla
` (9 more replies)
10 siblings, 10 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-09-18 10:42 UTC (permalink / raw)
To: dev
Cc: olivier.matz, thomas, jerin.jacob, hemant.agrawal, aconole,
stephen, anatoly.burakov, gaetan.rivet, shreyansh.jain,
bruce.richardson, sergio.gonzalez.monroy, maxime.coquelin,
Santosh Shukla
v8:
Includes minor review changes per v7 review comment from Anatoly.
Patches rebased on Tip commit:3d2e0448eb.
v7:
No major changes; minor changes in detail:
- patch squashing (Aaron's suggestion)
- added run_once for device_parse() and bus_scan() in eal init
(Aaron's suggestion)
- Moved rte_eal_device_parse() up in eal initialization order.
- Patches rebased on top of version: 17.11-rc0
For v6 info refer [11].
v6:
Sending v5 series rebased on top of version: 17.11-rc0.
v5:
Introduce the RTE_PCI_DRV_IOVA_AS_VA flag for autodetection of iova=va
mapping.
If a PCI driver needs the IOVA as VA scheme, it can set the flag in its
PCI driver registration.
Algorithm to select IOVA as VA for PCI bus case:
0. If no device bound then return with RTE_IOVA_DC mapping mode,
else goto 1).
1. Look for device attached to vfio kdrv and has .drv_flag set
to RTE_PCI_DRV_IOVA_AS_VA.
2. Look for any device attached to UIO class of driver.
3. Check for vfio-noiommu mode enabled.
If 2) and 3) are false and 1) is true, select
RTE_IOVA_VA as the mapping scheme. Otherwise use the default
mapping scheme (RTE_IOVA_PA).
That way, the bus can truly autodetect the iova mapping mode for
a device or a set of devices.
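The selection steps above can be sketched as a small decision function. This is only an illustration under stated assumptions: the sketch_iova_mode values mirror enum rte_iova_mode from patch 2/9, and struct sketch_pci_bus_state with its four fields is a hypothetical summary of what the PCI bus scan would have found, not a structure from the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Values mirror enum rte_iova_mode as introduced in patch 2/9. */
enum sketch_iova_mode {
	SKETCH_IOVA_DC = 0,        /* don't care */
	SKETCH_IOVA_PA = (1 << 0),
	SKETCH_IOVA_VA = (1 << 1)
};

/* Hypothetical summary of the bus scan; field names are assumptions. */
struct sketch_pci_bus_state {
	bool any_device_bound;     /* step 0: is any device bound at all? */
	bool vfio_dev_wants_va;    /* step 1: vfio-bound device whose driver
	                            * sets RTE_PCI_DRV_IOVA_AS_VA */
	bool any_uio_device;       /* step 2: any device bound to a UIO driver */
	bool vfio_noiommu_enabled; /* step 3: vfio-noiommu mode is on */
};

enum sketch_iova_mode
sketch_select_iova_mode(const struct sketch_pci_bus_state *s)
{
	assert(s != NULL);

	if (!s->any_device_bound)
		return SKETCH_IOVA_DC;

	/* VA only if 1) holds and neither 2) nor 3) does. */
	if (s->vfio_dev_wants_va && !s->any_uio_device &&
	    !s->vfio_noiommu_enabled)
		return SKETCH_IOVA_VA;

	return SKETCH_IOVA_PA; /* default mapping scheme */
}
```

A single UIO-bound device or an enabled vfio-noiommu mode is enough to fall back to PA, since in those cases no IOMMU is assumed to translate a VA on behalf of the device.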
Change History:
v7 --> v8:
- Replace 0 / 1 with true/false boolean values (Suggested by Anatoly).
v6 --> v7:
- Patches squashed per v6.
- Added run_once in eal per v6.
- Moved rte_eal_device_parse() up in eal init order.
v5 --> v6:
- Added api info in eal's version.map (release DPDK_v17.11).
v4 --> v5:
- Change DPDK_17.08 to DPDK_17.11 in _version.map.
- Reworded bus api description (suggested by Hemant).
- Added reviewed-by from Maxime in v5.
- Added acked-by from Hemant for pci and bus patches.
v3 --> v4:
- Re-introduced RTE_IOVA_DC mode (Suggested by Hemant [5]).
- Renamed flag to RTE_PCI_DRV_IOVA_AS_VA (Suggested by Maxime).
- Reworded WARNING message(suggested by Maxime[7]).
- Created a separate patch for rte_pci_get_iommu_class (suggested by
Maxime[]).
- Added VFIO_PRESENT ifdef build fix.
v2 --> v3:
- Removed rte_mempool_virt2phy (suggested by Olivier [4])
v1 --> v2:
- Removed the override eal option (--iova-mode=<>), because we have means to
truly autodetect the iova mode.
- Introduced RTE_PCI_DRV_NEED_IOVA_VA drv_flag (Suggested by Maxime [3]).
- Using NEED_IOVA_VA drv_flag in autodetection logic.
- Removed Linux version check macro in vfio code, As per Maxime feedback.
- Moved rte_pci_match API from local to global.
Patch Summary:
1) 1st: declare rte_pci_match api in pci header. Required for
autodetection in follow-up patches.
2) 2nd - 3rd - 4th: autodetection mapping infrastructure for
Linux/bsdapp.
3) 5th: iova mode helper API.
4) 6th: Infra to detect iova mode.
5) 7th: make vfio mapping iova aware.
6) 8th - 9th : Check for IOVA_VA mode in below APIs
- rte_mem_virt2phy
- rte_malloc_virt2phy
Test History:
- Tested for x86/XL710 40G NIC card for both modes (iova_va/pa).
- Tested for arm64/thunderx vNIC Integrated NIC for both modes
- Tested for arm64/Octeontx integrated NICs for only
Iova_va mode(It supports only one mode.)
- Ran standalone tests like mempool_autotest, mbuf_autotest.
- Verified for Doxygen.
Work History:
For v1, Refer [1].
For v2, Refer [2].
For v3, Refer [9].
For v4, refer [10].
for v6, refer [11].
Checkpatch result:
* None
Thanks.,
[1] https://www.mail-archive.com/dev@dpdk.org/msg67438.html
[2] https://www.mail-archive.com/dev@dpdk.org/msg70674.html
[3] https://www.mail-archive.com/dev@dpdk.org/msg70279.html
[4] https://www.mail-archive.com/dev@dpdk.org/msg70692.html
[5] http://dpdk.org/ml/archives/dev/2017-July/071282.html
[6] http://dpdk.org/ml/archives/dev/2017-July/070951.html
[7] http://dpdk.org/ml/archives/dev/2017-July/070941.html
[8] http://dpdk.org/ml/archives/dev/2017-July/070952.html
[9] http://dpdk.org/ml/archives/dev/2017-July/070918.html
[10] http://dpdk.org/ml/archives/dev/2017-July/071754.html
[11] http://dpdk.org/ml/archives/dev/2017-August/072871.html
Santosh Shukla (9):
eal/pci: export match function
eal/pci: get iommu class
linuxapp/eal_pci: get iommu class
bus: get iommu class
eal: introduce iova mode helper api
eal: auto detect iova mode
linuxapp/eal_vfio: honor iova mode before mapping
linuxapp/eal_memory: honor iova mode in virt2phy
eal/rte_malloc: honor iova mode in virt2phy
lib/librte_eal/bsdapp/eal/eal.c | 33 ++++++---
lib/librte_eal/bsdapp/eal/eal_pci.c | 10 +++
lib/librte_eal/bsdapp/eal/rte_eal_version.map | 10 +++
lib/librte_eal/common/eal_common_bus.c | 23 ++++++
lib/librte_eal/common/eal_common_pci.c | 11 +--
lib/librte_eal/common/include/rte_bus.h | 35 +++++++++
lib/librte_eal/common/include/rte_eal.h | 12 ++++
lib/librte_eal/common/include/rte_pci.h | 28 ++++++++
lib/librte_eal/common/rte_malloc.c | 9 ++-
lib/librte_eal/linuxapp/eal/eal.c | 33 ++++++---
lib/librte_eal/linuxapp/eal/eal_memory.c | 3 +
lib/librte_eal/linuxapp/eal/eal_pci.c | 96 +++++++++++++++++++++++++
lib/librte_eal/linuxapp/eal/eal_vfio.c | 29 +++++++-
lib/librte_eal/linuxapp/eal/eal_vfio.h | 4 ++
lib/librte_eal/linuxapp/eal/rte_eal_version.map | 10 +++
15 files changed, 312 insertions(+), 34 deletions(-)
--
2.14.1
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v8 1/9] eal/pci: export match function
2017-09-18 10:42 ` [dpdk-dev] [PATCH v8 " Santosh Shukla
@ 2017-09-18 10:42 ` Santosh Shukla
2017-09-18 10:42 ` [dpdk-dev] [PATCH v8 2/9] eal/pci: get iommu class Santosh Shukla
` (8 subsequent siblings)
9 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-09-18 10:42 UTC (permalink / raw)
To: dev
Cc: olivier.matz, thomas, jerin.jacob, hemant.agrawal, aconole,
stephen, anatoly.burakov, gaetan.rivet, shreyansh.jain,
bruce.richardson, sergio.gonzalez.monroy, maxime.coquelin,
Santosh Shukla
Export the rte_pci_match() function, as it is needed in the follow-up patch.
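As a rough picture of what ID-table matching means here, a simplified sketch follows. It is not the function being exported: struct sketch_pci_id, SKETCH_ANY_ID and sketch_pci_match() are hypothetical names, only vendor and device IDs are compared, and the zero vendor_id table terminator is an assumption of this sketch. The 1/0 return convention follows the doc comment moved into rte_pci.h by this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ANY_ID 0xffff /* wildcard entry, assumption for this sketch */

/* Hypothetical, reduced ID record: vendor and device only. */
struct sketch_pci_id {
	uint16_t vendor_id;
	uint16_t device_id;
};

/*
 * Walk the driver's ID table (terminated by vendor_id == 0 in this sketch)
 * and report whether any entry matches the device.
 * Returns 1 for a successful match, 0 otherwise.
 */
int
sketch_pci_match(const struct sketch_pci_id *id_table,
		 const struct sketch_pci_id *dev)
{
	const struct sketch_pci_id *id;

	assert(id_table != NULL && dev != NULL);

	for (id = id_table; id->vendor_id != 0; id++) {
		if (id->vendor_id != dev->vendor_id &&
		    id->vendor_id != SKETCH_ANY_ID)
			continue;
		if (id->device_id != dev->device_id &&
		    id->device_id != SKETCH_ANY_ID)
			continue;
		return 1;
	}
	return 0;
}
```

With the function exported, code outside eal_common_pci.c can check whether a given driver claims a given device, which is what the iova autodetection needs before it looks at that driver's flags.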
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
lib/librte_eal/bsdapp/eal/rte_eal_version.map | 7 +++++++
lib/librte_eal/common/eal_common_pci.c | 10 +---------
lib/librte_eal/common/include/rte_pci.h | 15 +++++++++++++++
lib/librte_eal/linuxapp/eal/rte_eal_version.map | 7 +++++++
4 files changed, 30 insertions(+), 9 deletions(-)
diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index 47a09ea7f..cfbf8fbd0 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -238,3 +238,10 @@ EXPERIMENTAL {
rte_service_start_with_defaults;
} DPDK_17.08;
+
+DPDK_17.11 {
+ global:
+
+ rte_pci_match;
+
+} DPDK_17.08;
diff --git a/lib/librte_eal/common/eal_common_pci.c b/lib/librte_eal/common/eal_common_pci.c
index 52fd38cdd..3b7d0a0ee 100644
--- a/lib/librte_eal/common/eal_common_pci.c
+++ b/lib/librte_eal/common/eal_common_pci.c
@@ -150,16 +150,8 @@ pci_unmap_resource(void *requested_addr, size_t size)
/*
* Match the PCI Driver and Device using the ID Table
- *
- * @param pci_drv
- * PCI driver from which ID table would be extracted
- * @param pci_dev
- * PCI device to match against the driver
- * @return
- * 1 for successful match
- * 0 for unsuccessful match
*/
-static int
+int
rte_pci_match(const struct rte_pci_driver *pci_drv,
const struct rte_pci_device *pci_dev)
{
diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index 8b123391c..eab84c7a4 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -366,6 +366,21 @@ int rte_pci_scan(void);
int
rte_pci_probe(void);
+/*
+ * Match the PCI Driver and Device using the ID Table
+ *
+ * @param pci_drv
+ * PCI driver from which ID table would be extracted
+ * @param pci_dev
+ * PCI device to match against the driver
+ * @return
+ * 1 for successful match
+ * 0 for unsuccessful match
+ */
+int
+rte_pci_match(const struct rte_pci_driver *pci_drv,
+ const struct rte_pci_device *pci_dev);
+
/**
* Map the PCI device resources in user space virtual memory address
*
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index 8c08b8d1e..287cc75cd 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -243,3 +243,10 @@ EXPERIMENTAL {
rte_service_start_with_defaults;
} DPDK_17.08;
+
+DPDK_17.11 {
+ global:
+
+ rte_pci_match;
+
+} DPDK_17.08;
--
2.14.1
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v8 2/9] eal/pci: get iommu class
2017-09-18 10:42 ` [dpdk-dev] [PATCH v8 " Santosh Shukla
2017-09-18 10:42 ` [dpdk-dev] [PATCH v8 1/9] eal/pci: export match function Santosh Shukla
@ 2017-09-18 10:42 ` Santosh Shukla
2017-09-19 16:37 ` Burakov, Anatoly
2017-09-18 10:42 ` [dpdk-dev] [PATCH v8 3/9] linuxapp/eal_pci: " Santosh Shukla
` (7 subsequent siblings)
9 siblings, 1 reply; 248+ messages in thread
From: Santosh Shukla @ 2017-09-18 10:42 UTC (permalink / raw)
To: dev
Cc: olivier.matz, thomas, jerin.jacob, hemant.agrawal, aconole,
stephen, anatoly.burakov, gaetan.rivet, shreyansh.jain,
bruce.richardson, sergio.gonzalez.monroy, maxime.coquelin,
Santosh Shukla
Introduce the rte_pci_get_iommu_class API, which gets the iommu class
of the PCI devices on the bus and returns the preferred iova mapping mode
for the PCI bus.
The patch also adds a rte_pci_get_iommu_class definition for bsdapp;
in the bsdapp case the api returns the default iova mode.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
v6 --> v7:
- squashed v6 series patch [02/12] & [03/12] (Aaron comment).
lib/librte_eal/bsdapp/eal/eal_pci.c | 10 ++++++++++
lib/librte_eal/bsdapp/eal/rte_eal_version.map | 1 +
lib/librte_eal/common/include/rte_bus.h | 10 ++++++++++
lib/librte_eal/common/include/rte_pci.h | 11 +++++++++++
4 files changed, 32 insertions(+)
diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c b/lib/librte_eal/bsdapp/eal/eal_pci.c
index 04eacdcc7..e2c252320 100644
--- a/lib/librte_eal/bsdapp/eal/eal_pci.c
+++ b/lib/librte_eal/bsdapp/eal/eal_pci.c
@@ -403,6 +403,16 @@ rte_pci_scan(void)
return -1;
}
+/*
+ * Get iommu class of pci devices on the bus.
+ */
+enum rte_iova_mode
+rte_pci_get_iommu_class(void)
+{
+ /* Supports only RTE_KDRV_NIC_UIO */
+ return RTE_IOVA_PA;
+}
+
int
pci_update_device(const struct rte_pci_addr *addr)
{
diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index cfbf8fbd0..c6ffd9399 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -243,5 +243,6 @@ DPDK_17.11 {
global:
rte_pci_match;
+ rte_pci_get_iommu_class;
} DPDK_17.08;
diff --git a/lib/librte_eal/common/include/rte_bus.h b/lib/librte_eal/common/include/rte_bus.h
index c79368d3c..9e40687e5 100644
--- a/lib/librte_eal/common/include/rte_bus.h
+++ b/lib/librte_eal/common/include/rte_bus.h
@@ -55,6 +55,16 @@ extern "C" {
/** Double linked list of buses */
TAILQ_HEAD(rte_bus_list, rte_bus);
+
+/**
+ * IOVA mapping mode.
+ */
+enum rte_iova_mode {
+ RTE_IOVA_DC = 0, /* Don't care mode */
+ RTE_IOVA_PA = (1 << 0),
+ RTE_IOVA_VA = (1 << 1)
+};
+
/**
* Bus specific scan for devices attached on the bus.
* For each bus object, the scan would be responsible for finding devices and
diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index eab84c7a4..0e36de093 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -381,6 +381,17 @@ int
rte_pci_match(const struct rte_pci_driver *pci_drv,
const struct rte_pci_device *pci_dev);
+
+/**
+ * Get iommu class of PCI devices on the bus.
+ * And return their preferred iova mapping mode.
+ *
+ * @return
+ * - enum rte_iova_mode.
+ */
+enum rte_iova_mode
+rte_pci_get_iommu_class(void);
+
/**
* Map the PCI device resources in user space virtual memory address
*
--
2.14.1
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v8 2/9] eal/pci: get iommu class
2017-09-18 10:42 ` [dpdk-dev] [PATCH v8 2/9] eal/pci: get iommu class Santosh Shukla
@ 2017-09-19 16:37 ` Burakov, Anatoly
2017-09-19 17:29 ` santosh
0 siblings, 1 reply; 248+ messages in thread
From: Burakov, Anatoly @ 2017-09-19 16:37 UTC (permalink / raw)
To: Santosh Shukla, dev
On 18-Sep-17 11:42 AM, Santosh Shukla wrote:
> Introducing rte_pci_get_iommu_class API which helps to get iommu class
> of PCI device on the bus and returns preferred iova mapping mode for
> PCI bus.
>
> Patch also add rte_pci_get_iommu_class definition for bsdapp,
> in bsdapp case - api returns default iova mode.
>
> Signed-off-by: Santosh Shukla <santosh.shukla at caviumnetworks.com>
> Signed-off-by: Jerin Jacob <jerin.jacob at caviumnetworks.com>
> Reviewed-by: Maxime Coquelin <maxime.coquelin at redhat.com>
> ---
Hi Santosh,
You have probably missed my comment on previous version of this patch,
but for commit history reasons I really think you should add a linuxapp
stub in this commit as well as a FreeBSD stub, even though you are
adding a linuxapp function in the next commit. Any linuxapp application
using that function will fail to compile with this commit, despite this
API being already present and declared as public.
--
Thanks,
Anatoly
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v8 2/9] eal/pci: get iommu class
2017-09-19 16:37 ` Burakov, Anatoly
@ 2017-09-19 17:29 ` santosh
2017-09-20 9:09 ` Burakov, Anatoly
0 siblings, 1 reply; 248+ messages in thread
From: santosh @ 2017-09-19 17:29 UTC (permalink / raw)
To: Burakov, Anatoly, dev
Hi Anatoly,
On Tuesday 19 September 2017 10:07 PM, Burakov, Anatoly wrote:
> On 18-Sep-17 11:42 AM, Santosh Shukla wrote:
>> Introducing rte_pci_get_iommu_class API which helps to get iommu class
>> of PCI device on the bus and returns preferred iova mapping mode for
>> PCI bus.
>>
>> Patch also add rte_pci_get_iommu_class definition for bsdapp,
>> in bsdapp case - api returns default iova mode.
>>
>> Signed-off-by: Santosh Shukla <santosh.shukla at caviumnetworks.com>
>> Signed-off-by: Jerin Jacob <jerin.jacob at caviumnetworks.com>
>> Reviewed-by: Maxime Coquelin <maxime.coquelin at redhat.com>
>> ---
>
> Hi Santosh,
>
> You have probably missed my comment on previous version of this patch, but for commit history reasons i really think you should add a linuxapp stub in this commit as well as a FreeBSD stub, even though you are adding a linuxapp function in the next commit. Any linuxapp application using that function will fail to compile with this commit, despite this API being already present and declared as public.
>
First, apologies for not following up on your note.
I prefer to keep less context in each patch, and [03/9] already
carries the _IOVA_AS_VA flag plus the whole autodetection algorithm
(squashed per Aaron's suggestion).
If I now squash [2/9] into [3/9], it would be too much for a future
reader to digest (imo). It's a kind of trade-off.
On any linuxapp application breaking with this commit:
this series exposes an eal api for applications to use and identify the iova mode.
If you are still not convinced by my explanation, I'll spin v9 and squash
[02/09] and [03/09] in it.
Thanks.
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v8 2/9] eal/pci: get iommu class
2017-09-19 17:29 ` santosh
@ 2017-09-20 9:09 ` Burakov, Anatoly
2017-09-20 10:24 ` santosh
0 siblings, 1 reply; 248+ messages in thread
From: Burakov, Anatoly @ 2017-09-20 9:09 UTC (permalink / raw)
To: santosh, dev
Hi Santosh,
On 19-Sep-17 6:29 PM, santosh wrote:
> Hi Anatoly,
>
>
> On Tuesday 19 September 2017 10:07 PM, Burakov, Anatoly wrote:
>> On 18-Sep-17 11:42 AM, Santosh Shukla wrote:
>>> Introducing rte_pci_get_iommu_class API which helps to get iommu class
>>> of PCI device on the bus and returns preferred iova mapping mode for
>>> PCI bus.
>>>
>>> Patch also add rte_pci_get_iommu_class definition for bsdapp,
>>> in bsdapp case - api returns default iova mode.
>>>
>>> Signed-off-by: Santosh Shukla <santosh.shukla at caviumnetworks.com>
>>> Signed-off-by: Jerin Jacob <jerin.jacob at caviumnetworks.com>
>>> Reviewed-by: Maxime Coquelin <maxime.coquelin at redhat.com>
>>> ---
>>
>> Hi Santosh,
>>
>> You have probably missed my comment on previous version of this patch, but for commit history reasons i really think you should add a linuxapp stub in this commit as well as a FreeBSD stub, even though you are adding a linuxapp function in the next commit. Any linuxapp application using that function will fail to compile with this commit, despite this API being already present and declared as public.
>>
> First, apologies for not following up on your note:
>
> I prefer to keep less context in each patch and
> for [03/9], its already has _IOVA_AS_VA flag + whole autodetection
> algo inside (squashed per Aron suggestion).
>
> Now if I squash [2/9] into [3/9], then would be too much info
> for future reader to digest for (imo). Its a kind of trade-off.
>
> On any linuxapp appl breaking with this commit:
> This series exposes eal api for application to use and identify iova mode.
>
> If you still feel not convinced with my explanation then I'll spin v9 and squash
> [02/09], [03/09] in v9.
No, I don't mean squashing these two patches into one. I mean, provide a
stub like for FreeBSD, and then edit it to be a proper implementation in
the next commit.
I.e. in this commit, add a stub that just returns 0, like for FreeBSD.
Next commit, instead of starting from scratch, start from this stub.
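A minimal sketch of such a linuxapp stub is shown below, assuming it simply mirrors the bsdapp one from this patch. The sketch_ names are stand-ins so the fragment builds on its own; whether the stub should return the PA mode as bsdapp does, or the numeric 0 (the don't-care mode), is left open here.

```c
/* Values mirror enum rte_iova_mode from this patch. */
enum sketch_iova_mode {
	SKETCH_IOVA_DC = 0,        /* don't care */
	SKETCH_IOVA_PA = (1 << 0),
	SKETCH_IOVA_VA = (1 << 1)
};

/*
 * This commit: stub only, so that the public declaration always has a
 * definition behind it on linuxapp.
 * Next commit: the body is replaced by the real autodetection logic.
 */
enum sketch_iova_mode
sketch_pci_get_iommu_class(void)
{
	return SKETCH_IOVA_PA; /* same value the bsdapp stub returns */
}
```

This keeps every commit in the series buildable for both platforms, and the next patch then shows up as a change to an existing function rather than a new one.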
Thanks,
Anatoly
--
Thanks,
Anatoly
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v8 2/9] eal/pci: get iommu class
2017-09-20 9:09 ` Burakov, Anatoly
@ 2017-09-20 10:24 ` santosh
0 siblings, 0 replies; 248+ messages in thread
From: santosh @ 2017-09-20 10:24 UTC (permalink / raw)
To: Burakov, Anatoly, dev
Hi Anatoly,
On Wednesday 20 September 2017 02:39 PM, Burakov, Anatoly wrote:
> Hi Santosh,
>
> On 19-Sep-17 6:29 PM, santosh wrote:
>> Hi Anatoly,
>>
>>
>> On Tuesday 19 September 2017 10:07 PM, Burakov, Anatoly wrote:
>>> On 18-Sep-17 11:42 AM, Santosh Shukla wrote:
>>>> Introducing rte_pci_get_iommu_class API which helps to get iommu class
>>>> of PCI device on the bus and returns preferred iova mapping mode for
>>>> PCI bus.
>>>>
>>>> Patch also add rte_pci_get_iommu_class definition for bsdapp,
>>>> in bsdapp case - api returns default iova mode.
>>>>
>>>> Signed-off-by: Santosh Shukla <santosh.shukla at caviumnetworks.com>
>>>> Signed-off-by: Jerin Jacob <jerin.jacob at caviumnetworks.com>
>>>> Reviewed-by: Maxime Coquelin <maxime.coquelin at redhat.com>
>>>> ---
>>>
>>> Hi Santosh,
>>>
>>> You have probably missed my comment on previous version of this patch, but for commit history reasons i really think you should add a linuxapp stub in this commit as well as a FreeBSD stub, even though you are adding a linuxapp function in the next commit. Any linuxapp application using that function will fail to compile with this commit, despite this API being already present and declared as public.
>>>
>> First, apologies for not following up on your note:
>>
>> I prefer to keep less context in each patch, and
>> [03/9] already has the _IOVA_AS_VA flag + the whole autodetection
>> algo inside (squashed per Aaron's suggestion).
>> Now if I squash [2/9] into [3/9], it would be too much info
>> for a future reader to digest (imo). It's a kind of trade-off.
>>
>> On any linuxapp appl breaking with this commit:
>> This series exposes eal api for application to use and identify iova mode.
>>
>> If you still feel not convinced with my explanation then I'll spin v9 and squash
>> [02/09], [03/09] in v9.
>
> No, i don't mean squashing these two patches into one. I mean, provide a stub like for FreeBSD, and then edit it to be a proper implementation in the next commit.
>
> I.e. in this commit, add a stub that just returns 0, like for FreeBSD. Next commit, instead of starting from scratch, start from this stub.
>
+1, Sending v9.
Thanks.
> Thanks,
> Anatoly
>
>>
>> Thanks.
>>
>>
>>
>
>
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v8 3/9] linuxapp/eal_pci: get iommu class
2017-09-18 10:42 ` [dpdk-dev] [PATCH v8 " Santosh Shukla
2017-09-18 10:42 ` [dpdk-dev] [PATCH v8 1/9] eal/pci: export match function Santosh Shukla
2017-09-18 10:42 ` [dpdk-dev] [PATCH v8 2/9] eal/pci: get iommu class Santosh Shukla
@ 2017-09-18 10:42 ` Santosh Shukla
2017-09-18 10:42 ` [dpdk-dev] [PATCH v8 4/9] bus: " Santosh Shukla
` (6 subsequent siblings)
9 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-09-18 10:42 UTC (permalink / raw)
To: dev
Cc: olivier.matz, thomas, jerin.jacob, hemant.agrawal, aconole,
stephen, anatoly.burakov, gaetan.rivet, shreyansh.jain,
bruce.richardson, sergio.gonzalez.monroy, maxime.coquelin,
Santosh Shukla
Get the iommu class of the PCI devices on the bus and return the preferred
iova mapping mode for that bus.
Patch also introduces the RTE_PCI_DRV_IOVA_AS_VA drv flag.
The flag is used when a driver needs to operate in iova=va mode.
Algorithm for iova scheme selection for PCI bus:
0. If no device bound then return with RTE_IOVA_DC mapping mode,
else goto 1).
1. Look for device attached to vfio kdrv and has .drv_flag set
to RTE_PCI_DRV_IOVA_AS_VA.
2. Look for any device attached to UIO class of driver.
3. Check for vfio-noiommu mode enabled.
If 2) & 3) are false and 1) is true then select
mapping scheme as RTE_IOVA_VA. Otherwise use the default
mapping scheme (RTE_IOVA_PA).
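The selection steps above can be sketched as a pure decision function. This is only an illustration, not the code from the patch: the enum `iova_mode_sketch` and the function `pci_select_iova_mode_sketch()` are hypothetical names, and the four boolean inputs are assumed to have been probed by the caller.

```c
#include <stdbool.h>

/* Hypothetical stand-in for enum rte_iova_mode; the values are assumed. */
enum iova_mode_sketch {
	SKETCH_IOVA_DC = 0,	/* don't care */
	SKETCH_IOVA_PA = 1,	/* iova as physical address (default) */
	SKETCH_IOVA_VA = 2	/* iova as virtual address */
};

/*
 * Mirrors steps 0..3 of the commit message.
 * any_bound:         at least one device is bound to a kernel driver
 * vfio_dev_wants_va: a vfio-bound device matches a driver that sets
 *                    RTE_PCI_DRV_IOVA_AS_VA
 * any_uio_bound:     at least one device is bound to a UIO driver
 * noiommu_enabled:   vfio-noiommu mode is enabled
 */
static enum iova_mode_sketch
pci_select_iova_mode_sketch(bool any_bound, bool vfio_dev_wants_va,
			    bool any_uio_bound, bool noiommu_enabled)
{
	if (!any_bound)
		return SKETCH_IOVA_DC;	/* step 0: nothing bound */

	if (vfio_dev_wants_va && !any_uio_bound && !noiommu_enabled)
		return SKETCH_IOVA_VA;	/* 1) true, 2) and 3) false */

	return SKETCH_IOVA_PA;		/* every other combination */
}
```

In the patch itself the inputs come from pci_device_is_bound(), pci_device_has_iova_va(), pci_device_bound_uio() and vfio_noiommu_is_enabled(); the sketch only isolates the decision so each combination can be checked by hand.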
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
v7 --> v8:
- Replaced 0/1 with false/true boolean value (Suggested by Anatoly)
v6 --> v7:
- squashed v6 series patch no [01/12] & [05/12]..
i.e.. moved RTE_PCI_DRV_IOVA_AS_VA flag into this patch. (Aaron comment).
lib/librte_eal/common/include/rte_pci.h | 2 +
lib/librte_eal/linuxapp/eal/eal_pci.c | 96 +++++++++++++++++++++++++
lib/librte_eal/linuxapp/eal/eal_vfio.c | 19 +++++
lib/librte_eal/linuxapp/eal/eal_vfio.h | 4 ++
lib/librte_eal/linuxapp/eal/rte_eal_version.map | 1 +
5 files changed, 122 insertions(+)
diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index 0e36de093..a67d77f22 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -202,6 +202,8 @@ struct rte_pci_bus {
#define RTE_PCI_DRV_INTR_RMV 0x0010
/** Device driver needs to keep mapped resources if unsupported dev detected */
#define RTE_PCI_DRV_KEEP_MAPPED_RES 0x0020
+/** Device driver supports iova as va */
+#define RTE_PCI_DRV_IOVA_AS_VA 0X0040
/**
* A structure describing a PCI mapping.
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c b/lib/librte_eal/linuxapp/eal/eal_pci.c
index 8951ce742..2971f1d4f 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -45,6 +45,7 @@
#include "eal_filesystem.h"
#include "eal_private.h"
#include "eal_pci_init.h"
+#include "eal_vfio.h"
/**
* @file
@@ -487,6 +488,101 @@ rte_pci_scan(void)
return -1;
}
+/*
+ * Is pci device bound to any kdrv
+ */
+static inline int
+pci_device_is_bound(void)
+{
+ struct rte_pci_device *dev = NULL;
+ int ret = 0;
+
+ FOREACH_DEVICE_ON_PCIBUS(dev) {
+ if (dev->kdrv == RTE_KDRV_UNKNOWN ||
+ dev->kdrv == RTE_KDRV_NONE) {
+ continue;
+ } else {
+ ret = 1;
+ break;
+ }
+ }
+ return ret;
+}
+
+/*
+ * Any one of the device bound to uio
+ */
+static inline int
+pci_device_bound_uio(void)
+{
+ struct rte_pci_device *dev = NULL;
+
+ FOREACH_DEVICE_ON_PCIBUS(dev) {
+ if (dev->kdrv == RTE_KDRV_IGB_UIO ||
+ dev->kdrv == RTE_KDRV_UIO_GENERIC) {
+ return 1;
+ }
+ }
+ return 0;
+}
+
+/*
+ * Any one of the device has iova as va
+ */
+static inline int
+pci_device_has_iova_va(void)
+{
+ struct rte_pci_device *dev = NULL;
+ struct rte_pci_driver *drv = NULL;
+
+ FOREACH_DRIVER_ON_PCIBUS(drv) {
+ if (drv && drv->drv_flags & RTE_PCI_DRV_IOVA_AS_VA) {
+ FOREACH_DEVICE_ON_PCIBUS(dev) {
+ if (dev->kdrv == RTE_KDRV_VFIO &&
+ rte_pci_match(drv, dev))
+ return 1;
+ }
+ }
+ }
+ return 0;
+}
+
+/*
+ * Get iommu class of PCI devices on the bus.
+ */
+enum rte_iova_mode
+rte_pci_get_iommu_class(void)
+{
+ bool is_bound;
+ bool is_vfio_noiommu_enabled = true;
+ bool has_iova_va;
+ bool is_bound_uio;
+
+ is_bound = pci_device_is_bound();
+ if (!is_bound)
+ return RTE_IOVA_DC;
+
+ has_iova_va = pci_device_has_iova_va();
+ is_bound_uio = pci_device_bound_uio();
+#ifdef VFIO_PRESENT
+ is_vfio_noiommu_enabled = vfio_noiommu_is_enabled() == true ?
+ true : false;
+#endif
+
+ if (has_iova_va && !is_bound_uio && !is_vfio_noiommu_enabled)
+ return RTE_IOVA_VA;
+
+ if (has_iova_va) {
+ RTE_LOG(WARNING, EAL, "Some devices want iova as va but pa will be used because.. ");
+ if (is_vfio_noiommu_enabled)
+ RTE_LOG(WARNING, EAL, "vfio-noiommu mode configured\n");
+ if (is_bound_uio)
+ RTE_LOG(WARNING, EAL, "few device bound to UIO\n");
+ }
+
+ return RTE_IOVA_PA;
+}
+
/* Read PCI config space. */
int rte_pci_read_config(const struct rte_pci_device *device,
void *buf, size_t len, off_t offset)
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index 946df7e31..c8a97b7e7 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -816,4 +816,23 @@ vfio_noiommu_dma_map(int __rte_unused vfio_container_fd)
return 0;
}
+int
+vfio_noiommu_is_enabled(void)
+{
+ int fd, ret, cnt;
+ char c;
+
+ ret = -1;
+ fd = open(VFIO_NOIOMMU_MODE, O_RDONLY);
+ if (fd < 0)
+ return -1;
+
+ cnt = read(fd, &c, 1);
+ if (cnt == 1 && c == 'Y')
+ ret = 1;
+
+ close(fd);
+ return ret;
+}
+
#endif
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.h b/lib/librte_eal/linuxapp/eal/eal_vfio.h
index 5ff63e5d7..26ea8e119 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.h
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.h
@@ -150,6 +150,8 @@ struct vfio_config {
#define VFIO_NOIOMMU_GROUP_FMT "/dev/vfio/noiommu-%u"
#define VFIO_GET_REGION_ADDR(x) ((uint64_t) x << 40ULL)
#define VFIO_GET_REGION_IDX(x) (x >> 40)
+#define VFIO_NOIOMMU_MODE \
+ "/sys/module/vfio/parameters/enable_unsafe_noiommu_mode"
/* DMA mapping function prototype.
* Takes VFIO container fd as a parameter.
@@ -210,6 +212,8 @@ int pci_vfio_is_enabled(void);
int vfio_mp_sync_setup(void);
+int vfio_noiommu_is_enabled(void);
+
#define SOCKET_REQ_CONTAINER 0x100
#define SOCKET_REQ_GROUP 0x200
#define SOCKET_CLR_GROUP 0x300
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index 287cc75cd..a8c8ea4f4 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -248,5 +248,6 @@ DPDK_17.11 {
global:
rte_pci_match;
+ rte_pci_get_iommu_class;
} DPDK_17.08;
--
2.14.1
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v8 4/9] bus: get iommu class
2017-09-18 10:42 ` [dpdk-dev] [PATCH v8 " Santosh Shukla
` (2 preceding siblings ...)
2017-09-18 10:42 ` [dpdk-dev] [PATCH v8 3/9] linuxapp/eal_pci: " Santosh Shukla
@ 2017-09-18 10:42 ` Santosh Shukla
2017-09-18 10:42 ` [dpdk-dev] [PATCH v8 5/9] eal: introduce iova mode helper api Santosh Shukla
` (5 subsequent siblings)
9 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-09-18 10:42 UTC (permalink / raw)
To: dev
Cc: olivier.matz, thomas, jerin.jacob, hemant.agrawal, aconole,
stephen, anatoly.burakov, gaetan.rivet, shreyansh.jain,
bruce.richardson, sergio.gonzalez.monroy, maxime.coquelin,
Santosh Shukla
API(rte_bus_get_iommu_class) helps to automatically detect and select
appropriate iova mapping scheme for iommu capable device on that bus.
Algorithm for iova scheme selection for bus:
0. Iterate through bus_list.
1. Collect each bus iova mode value and update into 'mode' var.
2. Mode selection scheme is:
if mode == 0 then iova mode is _pa,
if mode == 1 then iova mode is _pa,
if mode == 2 then iova mode is _va,
if mode == 3 then iova mode is _pa.
So mode != 2 will be default iova mode (_pa).
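A minimal sketch of the combination rule above, assuming the encoding implied by the mode table (0 = don't care, 1 = _pa, 2 = _va, so 3 means buses disagree). The names `bus_iova_sketch` and `combine_bus_modes_sketch()` are hypothetical and exist only for this illustration.

```c
/* Assumed encoding, chosen to match the mode table in the commit message. */
enum bus_iova_sketch {
	BUS_IOVA_DC = 0,	/* bus has no opinion */
	BUS_IOVA_PA = 1 << 0,	/* bus needs iova as pa */
	BUS_IOVA_VA = 1 << 1	/* bus prefers iova as va */
};

/*
 * OR together the per-bus answers, then fall back to PA unless the
 * result is exactly VA (i.e. no bus asked for PA and at least one
 * asked for VA).
 */
static enum bus_iova_sketch
combine_bus_modes_sketch(const enum bus_iova_sketch *per_bus, int nb_bus)
{
	int mode = BUS_IOVA_DC;
	int i;

	for (i = 0; i < nb_bus; i++)
		mode |= per_bus[i];

	if (mode != BUS_IOVA_VA)
		mode = BUS_IOVA_PA;	/* use default iova mode */

	return (enum bus_iova_sketch)mode;
}
```

With this encoding a single bus that reports _pa is enough to force the whole system to _pa, while buses that report don't care never influence the result.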
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
lib/librte_eal/bsdapp/eal/rte_eal_version.map | 1 +
lib/librte_eal/common/eal_common_bus.c | 23 +++++++++++++++++++++++
lib/librte_eal/common/eal_common_pci.c | 1 +
lib/librte_eal/common/include/rte_bus.h | 25 +++++++++++++++++++++++++
lib/librte_eal/linuxapp/eal/rte_eal_version.map | 1 +
5 files changed, 51 insertions(+)
diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index c6ffd9399..3466eaf20 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -244,5 +244,6 @@ DPDK_17.11 {
rte_pci_match;
rte_pci_get_iommu_class;
+ rte_bus_get_iommu_class;
} DPDK_17.08;
diff --git a/lib/librte_eal/common/eal_common_bus.c b/lib/librte_eal/common/eal_common_bus.c
index 08bec2d93..a30a8982e 100644
--- a/lib/librte_eal/common/eal_common_bus.c
+++ b/lib/librte_eal/common/eal_common_bus.c
@@ -222,3 +222,26 @@ rte_bus_find_by_device_name(const char *str)
c[0] = '\0';
return rte_bus_find(NULL, bus_can_parse, name);
}
+
+
+/*
+ * Get iommu class of devices on the bus.
+ */
+enum rte_iova_mode
+rte_bus_get_iommu_class(void)
+{
+ int mode = RTE_IOVA_DC;
+ struct rte_bus *bus;
+
+ TAILQ_FOREACH(bus, &rte_bus_list, next) {
+
+ if (bus->get_iommu_class)
+ mode |= bus->get_iommu_class();
+ }
+
+ if (mode != RTE_IOVA_VA) {
+ /* Use default IOVA mode */
+ mode = RTE_IOVA_PA;
+ }
+ return mode;
+}
diff --git a/lib/librte_eal/common/eal_common_pci.c b/lib/librte_eal/common/eal_common_pci.c
index 3b7d0a0ee..0f0e4b93b 100644
--- a/lib/librte_eal/common/eal_common_pci.c
+++ b/lib/librte_eal/common/eal_common_pci.c
@@ -564,6 +564,7 @@ struct rte_pci_bus rte_pci_bus = {
.plug = pci_plug,
.unplug = pci_unplug,
.parse = pci_parse,
+ .get_iommu_class = rte_pci_get_iommu_class,
},
.device_list = TAILQ_HEAD_INITIALIZER(rte_pci_bus.device_list),
.driver_list = TAILQ_HEAD_INITIALIZER(rte_pci_bus.driver_list),
diff --git a/lib/librte_eal/common/include/rte_bus.h b/lib/librte_eal/common/include/rte_bus.h
index 9e40687e5..70a291a4d 100644
--- a/lib/librte_eal/common/include/rte_bus.h
+++ b/lib/librte_eal/common/include/rte_bus.h
@@ -178,6 +178,20 @@ struct rte_bus_conf {
enum rte_bus_scan_mode scan_mode; /**< Scan policy. */
};
+
+/**
+ * Get the common iommu class of all the devices on the bus. The bus may
+ * check that those devices are attached to an iommu driver.
+ * If no devices are attached to the bus, the bus may return the don't care
+ * (_DC) value.
+ * Otherwise, the bus will return the appropriate _pa or _va iova mode.
+ *
+ * @return
+ * enum rte_iova_mode value.
+ */
+typedef enum rte_iova_mode (*rte_bus_get_iommu_class_t)(void);
+
+
/**
* A structure describing a generic bus.
*/
@@ -191,6 +205,7 @@ struct rte_bus {
rte_bus_unplug_t unplug; /**< Remove single device from driver */
rte_bus_parse_t parse; /**< Parse a device name */
struct rte_bus_conf conf; /**< Bus configuration */
+ rte_bus_get_iommu_class_t get_iommu_class; /**< Get iommu class */
};
/**
@@ -290,6 +305,16 @@ struct rte_bus *rte_bus_find_by_device(const struct rte_device *dev);
*/
struct rte_bus *rte_bus_find_by_name(const char *busname);
+
+/**
+ * Get the common iommu class of devices bound on to buses available in the
+ * system. The default mode is PA.
+ *
+ * @return
+ * enum rte_iova_mode value.
+ */
+enum rte_iova_mode rte_bus_get_iommu_class(void);
+
/**
* Helper for Bus registration.
* The constructor has higher priority than PMD constructors.
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index a8c8ea4f4..9115aa3e9 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -249,5 +249,6 @@ DPDK_17.11 {
rte_pci_match;
rte_pci_get_iommu_class;
+ rte_bus_get_iommu_class;
} DPDK_17.08;
--
2.14.1
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v8 5/9] eal: introduce iova mode helper api
2017-09-18 10:42 ` [dpdk-dev] [PATCH v8 " Santosh Shukla
` (3 preceding siblings ...)
2017-09-18 10:42 ` [dpdk-dev] [PATCH v8 4/9] bus: " Santosh Shukla
@ 2017-09-18 10:42 ` Santosh Shukla
2017-09-18 10:42 ` [dpdk-dev] [PATCH v8 6/9] eal: auto detect iova mode Santosh Shukla
` (4 subsequent siblings)
9 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-09-18 10:42 UTC (permalink / raw)
To: dev
Cc: olivier.matz, thomas, jerin.jacob, hemant.agrawal, aconole,
stephen, anatoly.burakov, gaetan.rivet, shreyansh.jain,
bruce.richardson, sergio.gonzalez.monroy, maxime.coquelin,
Santosh Shukla
Introduce the rte_eal_iova_mode() helper API. This API is
used by non-EAL libraries to detect the iova mode.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
lib/librte_eal/bsdapp/eal/eal.c | 6 ++++++
lib/librte_eal/bsdapp/eal/rte_eal_version.map | 1 +
lib/librte_eal/common/include/rte_eal.h | 12 ++++++++++++
lib/librte_eal/linuxapp/eal/eal.c | 6 ++++++
lib/librte_eal/linuxapp/eal/rte_eal_version.map | 1 +
5 files changed, 26 insertions(+)
diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 5fa598842..07e72203f 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -119,6 +119,12 @@ rte_eal_get_configuration(void)
return &rte_config;
}
+enum rte_iova_mode
+rte_eal_iova_mode(void)
+{
+ return rte_eal_get_configuration()->iova_mode;
+}
+
/* parse a sysfs (or other) file containing one integer value */
int
eal_parse_sysfs_value(const char *filename, unsigned long *val)
diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index 3466eaf20..6bed74dff 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -245,5 +245,6 @@ DPDK_17.11 {
rte_pci_match;
rte_pci_get_iommu_class;
rte_bus_get_iommu_class;
+ rte_eal_iova_mode;
} DPDK_17.08;
diff --git a/lib/librte_eal/common/include/rte_eal.h b/lib/librte_eal/common/include/rte_eal.h
index 0e7363d77..932dc1a96 100644
--- a/lib/librte_eal/common/include/rte_eal.h
+++ b/lib/librte_eal/common/include/rte_eal.h
@@ -45,6 +45,7 @@
#include <rte_per_lcore.h>
#include <rte_config.h>
+#include <rte_bus.h>
#ifdef __cplusplus
extern "C" {
@@ -87,6 +88,9 @@ struct rte_config {
/** Primary or secondary configuration */
enum rte_proc_type_t process_type;
+ /** PA or VA mapping mode */
+ enum rte_iova_mode iova_mode;
+
/**
* Pointer to memory configuration, which may be shared across multiple
* DPDK instances
@@ -287,6 +291,14 @@ static inline int rte_gettid(void)
return RTE_PER_LCORE(_thread_id);
}
+/**
+ * Get the iova mode
+ *
+ * @return
+ * enum rte_iova_mode value.
+ */
+enum rte_iova_mode rte_eal_iova_mode(void);
+
#define RTE_INIT(func) \
static void __attribute__((constructor, used)) func(void)
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 48f12f44c..febbafdb3 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -128,6 +128,12 @@ rte_eal_get_configuration(void)
return &rte_config;
}
+enum rte_iova_mode
+rte_eal_iova_mode(void)
+{
+ return rte_eal_get_configuration()->iova_mode;
+}
+
/* parse a sysfs (or other) file containing one integer value */
int
eal_parse_sysfs_value(const char *filename, unsigned long *val)
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index 9115aa3e9..8e49bf5fa 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -250,5 +250,6 @@ DPDK_17.11 {
rte_pci_match;
rte_pci_get_iommu_class;
rte_bus_get_iommu_class;
+ rte_eal_iova_mode;
} DPDK_17.08;
--
2.14.1
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v8 6/9] eal: auto detect iova mode
2017-09-18 10:42 ` [dpdk-dev] [PATCH v8 " Santosh Shukla
` (4 preceding siblings ...)
2017-09-18 10:42 ` [dpdk-dev] [PATCH v8 5/9] eal: introduce iova mode helper api Santosh Shukla
@ 2017-09-18 10:42 ` Santosh Shukla
2017-09-18 10:42 ` [dpdk-dev] [PATCH v8 7/9] linuxapp/eal_vfio: honor iova mode before mapping Santosh Shukla
` (3 subsequent siblings)
9 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-09-18 10:42 UTC (permalink / raw)
To: dev
Cc: olivier.matz, thomas, jerin.jacob, hemant.agrawal, aconole,
stephen, anatoly.burakov, gaetan.rivet, shreyansh.jain,
bruce.richardson, sergio.gonzalez.monroy, maxime.coquelin,
Santosh Shukla
For auto detection purpose:
* Below calls moved up in the eal initialization order:
- eal_option_device_parse
- rte_bus_scan
Based on the result of rte_bus_get_iommu_class - select iova
mapping mode.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
v6 --> v7:
- Moved eal_option_device_parse() up in the order of eal init.
- Added run_once. (aaron suggestion).
- squashed v6 series patch no. [08/12] & [09/12] into one patch (Aaron
comment)
lib/librte_eal/bsdapp/eal/eal.c | 27 ++++++++++++++++-----------
lib/librte_eal/linuxapp/eal/eal.c | 27 ++++++++++++++++-----------
2 files changed, 32 insertions(+), 22 deletions(-)
diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 07e72203f..f003f4c04 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -541,6 +541,22 @@ rte_eal_init(int argc, char **argv)
return -1;
}
+ if (eal_option_device_parse()) {
+ rte_errno = ENODEV;
+ rte_atomic32_clear(&run_once);
+ return -1;
+ }
+
+ if (rte_bus_scan()) {
+ rte_eal_init_alert("Cannot scan the buses for devices\n");
+ rte_errno = ENODEV;
+ rte_atomic32_clear(&run_once);
+ return -1;
+ }
+
+ /* autodetect the iova mapping mode (default is iova_pa) */
+ rte_eal_get_configuration()->iova_mode = rte_bus_get_iommu_class();
+
if (internal_config.no_hugetlbfs == 0 &&
internal_config.process_type != RTE_PROC_SECONDARY &&
eal_hugepage_info_init() < 0) {
@@ -620,17 +636,6 @@ rte_eal_init(int argc, char **argv)
rte_config.master_lcore, thread_id, cpuset,
ret == 0 ? "" : "...");
- if (eal_option_device_parse()) {
- rte_errno = ENODEV;
- return -1;
- }
-
- if (rte_bus_scan()) {
- rte_eal_init_alert("Cannot scan the buses for devices\n");
- rte_errno = ENODEV;
- return -1;
- }
-
RTE_LCORE_FOREACH_SLAVE(i) {
/*
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index febbafdb3..f4901ffb6 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -798,6 +798,22 @@ rte_eal_init(int argc, char **argv)
return -1;
}
+ if (eal_option_device_parse()) {
+ rte_errno = ENODEV;
+ rte_atomic32_clear(&run_once);
+ return -1;
+ }
+
+ if (rte_bus_scan()) {
+ rte_eal_init_alert("Cannot scan the buses for devices\n");
+ rte_errno = ENODEV;
+ rte_atomic32_clear(&run_once);
+ return -1;
+ }
+
+ /* autodetect the iova mapping mode (default is iova_pa) */
+ rte_eal_get_configuration()->iova_mode = rte_bus_get_iommu_class();
+
if (internal_config.no_hugetlbfs == 0 &&
internal_config.process_type != RTE_PROC_SECONDARY &&
internal_config.xen_dom0_support == 0 &&
@@ -895,17 +911,6 @@ rte_eal_init(int argc, char **argv)
return -1;
}
- if (eal_option_device_parse()) {
- rte_errno = ENODEV;
- return -1;
- }
-
- if (rte_bus_scan()) {
- rte_eal_init_alert("Cannot scan the buses for devices\n");
- rte_errno = ENODEV;
- return -1;
- }
-
RTE_LCORE_FOREACH_SLAVE(i) {
/*
--
2.14.1
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v8 7/9] linuxapp/eal_vfio: honor iova mode before mapping
2017-09-18 10:42 ` [dpdk-dev] [PATCH v8 " Santosh Shukla
` (5 preceding siblings ...)
2017-09-18 10:42 ` [dpdk-dev] [PATCH v8 6/9] eal: auto detect iova mode Santosh Shukla
@ 2017-09-18 10:42 ` Santosh Shukla
2017-09-18 10:42 ` [dpdk-dev] [PATCH v8 8/9] linuxapp/eal_memory: honor iova mode in virt2phy Santosh Shukla
` (2 subsequent siblings)
9 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-09-18 10:42 UTC (permalink / raw)
To: dev
Cc: olivier.matz, thomas, jerin.jacob, hemant.agrawal, aconole,
stephen, anatoly.burakov, gaetan.rivet, shreyansh.jain,
bruce.richardson, sergio.gonzalez.monroy, maxime.coquelin,
Santosh Shukla
Check iova mode and accordingly map iova to pa or va.
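As a rough sketch of what "map iova to pa or va" means for one memory segment: the struct `seg_sketch`, the enum `map_iova_sketch` and the helper `seg_dma_iova_sketch()` below are hypothetical and only model the choice; the real code fills a struct vfio_iommu_type1_dma_map and passes it to the VFIO_IOMMU_MAP_DMA ioctl.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two concrete iova modes. */
enum map_iova_sketch { MAP_IOVA_PA = 1, MAP_IOVA_VA = 2 };

/* Hypothetical, trimmed-down view of one memory segment. */
struct seg_sketch {
	uint64_t virt_addr;	/* process virtual address of the segment */
	uint64_t phys_addr;	/* physical address of the segment */
	uint64_t len;		/* segment length in bytes */
};

/*
 * Pick the address the device will use for DMA to this segment:
 * the virtual address in iova=va mode, the physical address otherwise.
 */
static uint64_t
seg_dma_iova_sketch(const struct seg_sketch *seg, enum map_iova_sketch mode)
{
	return mode == MAP_IOVA_VA ? seg->virt_addr : seg->phys_addr;
}
```

In iova=va mode the IOMMU is programmed so that the device sees the same addresses as the application, which is what removes the per-packet address conversion.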
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
lib/librte_eal/linuxapp/eal/eal_vfio.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index c8a97b7e7..b32cd09a2 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -706,7 +706,10 @@ vfio_type1_dma_map(int vfio_container_fd)
dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
dma_map.vaddr = ms[i].addr_64;
dma_map.size = ms[i].len;
- dma_map.iova = ms[i].phys_addr;
+ if (rte_eal_iova_mode() == RTE_IOVA_VA)
+ dma_map.iova = dma_map.vaddr;
+ else
+ dma_map.iova = ms[i].phys_addr;
dma_map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
ret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
@@ -792,7 +795,10 @@ vfio_spapr_dma_map(int vfio_container_fd)
dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
dma_map.vaddr = ms[i].addr_64;
dma_map.size = ms[i].len;
- dma_map.iova = ms[i].phys_addr;
+ if (rte_eal_iova_mode() == RTE_IOVA_VA)
+ dma_map.iova = dma_map.vaddr;
+ else
+ dma_map.iova = ms[i].phys_addr;
dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
VFIO_DMA_MAP_FLAG_WRITE;
--
2.14.1
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v8 8/9] linuxapp/eal_memory: honor iova mode in virt2phy
2017-09-18 10:42 ` [dpdk-dev] [PATCH v8 " Santosh Shukla
` (6 preceding siblings ...)
2017-09-18 10:42 ` [dpdk-dev] [PATCH v8 7/9] linuxapp/eal_vfio: honor iova mode before mapping Santosh Shukla
@ 2017-09-18 10:42 ` Santosh Shukla
2017-09-18 10:42 ` [dpdk-dev] [PATCH v8 9/9] eal/rte_malloc: " Santosh Shukla
2017-09-20 11:23 ` [dpdk-dev] [PATCH v9 0/9] Infrastructure to detect iova mapping on the bus Santosh Shukla
9 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-09-18 10:42 UTC (permalink / raw)
To: dev
Cc: olivier.matz, thomas, jerin.jacob, hemant.agrawal, aconole,
stephen, anatoly.burakov, gaetan.rivet, shreyansh.jain,
bruce.richardson, sergio.gonzalez.monroy, maxime.coquelin,
Santosh Shukla
Check iova mode and accordingly return phy addr.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
lib/librte_eal/linuxapp/eal/eal_memory.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 52791282f..2d9d7c2dc 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -139,6 +139,9 @@ rte_mem_virt2phy(const void *virtaddr)
int page_size;
off_t offset;
+ if (rte_eal_iova_mode() == RTE_IOVA_VA)
+ return (uintptr_t)virtaddr;
+
/* when using dom0, /proc/self/pagemap always returns 0, check in
* dpdk memory by browsing the memsegs */
if (rte_xen_dom0_supported()) {
--
2.14.1
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v8 9/9] eal/rte_malloc: honor iova mode in virt2phy
2017-09-18 10:42 ` [dpdk-dev] [PATCH v8 " Santosh Shukla
` (7 preceding siblings ...)
2017-09-18 10:42 ` [dpdk-dev] [PATCH v8 8/9] linuxapp/eal_memory: honor iova mode in virt2phy Santosh Shukla
@ 2017-09-18 10:42 ` Santosh Shukla
2017-09-20 11:23 ` [dpdk-dev] [PATCH v9 0/9] Infrastructure to detect iova mapping on the bus Santosh Shukla
9 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-09-18 10:42 UTC (permalink / raw)
To: dev
Cc: olivier.matz, thomas, jerin.jacob, hemant.agrawal, aconole,
stephen, anatoly.burakov, gaetan.rivet, shreyansh.jain,
bruce.richardson, sergio.gonzalez.monroy, maxime.coquelin,
Santosh Shukla
Check iova mode and accordingly return phy addr.
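The lookup can be sketched as follows, under the assumption that a heap element belongs to a segment with a known base virtual address and base physical address. `v2p_seg_sketch`, `v2p_mode_sketch` and `virt2iova_sketch()` are hypothetical names for this illustration, not EAL symbols.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two concrete iova modes. */
enum v2p_mode_sketch { V2P_IOVA_PA = 1, V2P_IOVA_VA = 2 };

/* Hypothetical segment descriptor: base VA and base PA of the segment. */
struct v2p_seg_sketch {
	void *addr;		/* start virtual address */
	uint64_t phys_addr;	/* start physical address */
};

/*
 * In iova=va mode the address handed to the device is the virtual
 * address itself. Otherwise it is the segment's physical base plus the
 * offset of addr inside the segment.
 */
static uint64_t
virt2iova_sketch(const struct v2p_seg_sketch *seg, const void *addr,
		 enum v2p_mode_sketch mode)
{
	if (mode == V2P_IOVA_VA)
		return (uint64_t)(uintptr_t)addr;

	return seg->phys_addr + ((uintptr_t)addr - (uintptr_t)seg->addr);
}
```

The PA branch is the same base-plus-offset arithmetic the function used before this patch; the VA branch is the new short-circuit.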
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
lib/librte_eal/common/rte_malloc.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index 5c0627bf4..d65c05a4d 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -251,10 +251,17 @@ rte_malloc_set_limit(__rte_unused const char *type,
phys_addr_t
rte_malloc_virt2phy(const void *addr)
{
+ phys_addr_t paddr;
const struct malloc_elem *elem = malloc_elem_from_data(addr);
if (elem == NULL)
return RTE_BAD_PHYS_ADDR;
if (elem->ms->phys_addr == RTE_BAD_PHYS_ADDR)
return RTE_BAD_PHYS_ADDR;
- return elem->ms->phys_addr + ((uintptr_t)addr - (uintptr_t)elem->ms->addr);
+
+ if (rte_eal_iova_mode() == RTE_IOVA_VA)
+ paddr = (uintptr_t)addr;
+ else
+ paddr = elem->ms->phys_addr +
+ ((uintptr_t)addr - (uintptr_t)elem->ms->addr);
+ return paddr;
}
--
2.14.1
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v9 0/9] Infrastructure to detect iova mapping on the bus
2017-09-18 10:42 ` [dpdk-dev] [PATCH v8 " Santosh Shukla
` (8 preceding siblings ...)
2017-09-18 10:42 ` [dpdk-dev] [PATCH v8 9/9] eal/rte_malloc: " Santosh Shukla
@ 2017-09-20 11:23 ` Santosh Shukla
2017-09-20 11:23 ` [dpdk-dev] [PATCH v9 1/9] eal/pci: export match function Santosh Shukla
` (10 more replies)
9 siblings, 11 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-09-20 11:23 UTC (permalink / raw)
To: dev
Cc: olivier.matz, thomas, jerin.jacob, hemant.agrawal, aconole,
stephen, anatoly.burakov, gaetan.rivet, shreyansh.jain,
bruce.richardson, sergio.gonzalez.monroy, maxime.coquelin,
Santosh Shukla
v9:
- Added Tested-By: to series.
- Includes minor changes related to linuxapp api stub in [02/09]
(Suggested by Anatoly)
- Series rebased on tip commit : aee62e90
v8:
Includes minor review changes per v7 review comment from Anatoly.
Patches rebased on Tip commit:3d2e0448eb.
v7:
Includes no major change, minor change detailing:
- patch squashing (Aaron suggestion)
- added run_once for device_parse() and bus_scan() in eal init
(Aaron suggestion)
- Moved eal_option_device_parse() up in eal initialization order.
- Patches rebased on top of version: 17.11-rc0
For v6 info refer [11].
v6:
Sending v5 series rebased on top of version: 17.11-rc0.
v5:
Introducing RTE_PCI_DRV_IOVA_AS_VA flag for autodetection of iova va
mapping.
If a PCI driver requires the IOVA as VA scheme, the driver can add the
flag in the PCI driver registration function.
Algorithm to select IOVA as VA for PCI bus case:
0. If no device bound then return with RTE_IOVA_DC mapping mode,
else goto 1).
1. Look for device attached to vfio kdrv and has .drv_flag set
to RTE_PCI_DRV_IOVA_AS_VA.
2. Look for any device attached to UIO class of driver.
3. Check for vfio-noiommu mode enabled.
If 2) & 3) are false and 1) is true then select
mapping scheme as RTE_IOVA_VA. Otherwise use the default
mapping scheme (RTE_IOVA_PA).
That way, the bus can truly autodetect the iova mapping mode for
a device or a set of devices.
Change History:
v8 --> v9:
- Added Tested-by: signature of Hemant.
- Added linuxapp stub api definition in [02/09] (Suggested by Anatoly)
v7 --> v8:
- Replace 0 / 1 with true/false boolean values (Suggested by Anatoly).
v6 --> v7:
- Patches squashed per v6.
- Added run_once in eal per v6.
- Moved eal_option_device_parse() up in eal init order.
v5 --> v6:
- Added api info in eal's version.map (release DPDK_v17.11).
v4 --> v5:
- Change DPDK_17.08 to DPDK_17.11 in _version.map.
- Reworded bus api description (suggested by Hemant).
- Added reviewed-by from Maxime in v5.
- Added acked-by from Hemant for pci and bus patches.
v3 --> v4:
- Re-introduced RTE_IOVA_DC mode (Suggested by Hemant [5]).
- Renamed flag to RTE_PCI_DRV_IOVA_AS_VA (Suggested by Maxime).
- Reworded WARNING message(suggested by Maxime[7]).
- Created a separate patch for rte_pci_get_iommu_class (suggested by
Maxime[]).
- Added VFIO_PRESENT ifdef build fix.
v2 --> v3:
- Removed rte_mempool_virt2phy (suggested by Olivier [4])
v1 --> v2:
- Removed override eal option i.e. (--iova-mode=<>) because we have the
means to truly autodetect the iova mode.
- Introduced RTE_PCI_DRV_NEED_IOVA_VA drv_flag (Suggested by Maxime [3]).
- Using NEED_IOVA_VA drv_flag in autodetection logic.
- Removed Linux version check macro in vfio code, As per Maxime feedback.
- Moved rte_pci_match API from local to global.
Patch Summary:
1) 1st: declare rte_pci_match api in pci header. Required for
autodetection in follow up patches.
2) 2nd - 3rd - 4th : autodetection mapping infrastructure for
Linux/bsdapp.
3) 5th: iova mode helper API.
4) 6th: Infra to detect iova mode.
5) 7th: make vfio mapping iova aware.
6) 8th - 9th : Check for IOVA_VA mode in below APIs
- rte_mem_virt2phy
- rte_malloc_virt2phy
Test History:
- Tested for x86/XL710 40G NIC card for both modes (iova_va/pa).
- Tested for arm64/thunderx vNIC Integrated NIC for both modes
- Tested for arm64/Octeontx integrated NICs for only
iova_va mode (it supports only one mode).
- Ran standalone tests like mempool_autotest, mbuf_autotest.
- Verified for Doxygen.
Work History:
For v1, refer [1].
For v2, refer [2].
For v3, refer [9].
For v4, refer [10].
For v6, refer [11].
Checkpatch result:
* None
Thanks,
[1] https://www.mail-archive.com/dev@dpdk.org/msg67438.html
[2] https://www.mail-archive.com/dev@dpdk.org/msg70674.html
[3] https://www.mail-archive.com/dev@dpdk.org/msg70279.html
[4] https://www.mail-archive.com/dev@dpdk.org/msg70692.html
[5] http://dpdk.org/ml/archives/dev/2017-July/071282.html
[6] http://dpdk.org/ml/archives/dev/2017-July/070951.html
[7] http://dpdk.org/ml/archives/dev/2017-July/070941.html
[8] http://dpdk.org/ml/archives/dev/2017-July/070952.html
[9] http://dpdk.org/ml/archives/dev/2017-July/070918.html
[10] http://dpdk.org/ml/archives/dev/2017-July/071754.html
[11] http://dpdk.org/ml/archives/dev/2017-August/072871.html
Santosh Shukla (9):
eal/pci: export match function
eal/pci: get iommu class
linuxapp/eal_pci: get iommu class
bus: get iommu class
eal: introduce helper API for iova mode
eal: auto detect iova mode
linuxapp/eal_vfio: honor iova mode before mapping
linuxapp/eal_memory: honor iova mode in virt2phy
eal/rte_malloc: honor iova mode in virt2phy
lib/librte_eal/bsdapp/eal/eal.c | 33 ++++++---
lib/librte_eal/bsdapp/eal/eal_pci.c | 10 +++
lib/librte_eal/bsdapp/eal/rte_eal_version.map | 10 +++
lib/librte_eal/common/eal_common_bus.c | 23 ++++++
lib/librte_eal/common/eal_common_pci.c | 11 +--
lib/librte_eal/common/include/rte_bus.h | 35 +++++++++
lib/librte_eal/common/include/rte_eal.h | 12 ++++
lib/librte_eal/common/include/rte_pci.h | 28 ++++++++
lib/librte_eal/common/rte_malloc.c | 9 ++-
lib/librte_eal/linuxapp/eal/eal.c | 33 ++++++---
lib/librte_eal/linuxapp/eal/eal_memory.c | 3 +
lib/librte_eal/linuxapp/eal/eal_pci.c | 96 +++++++++++++++++++++++++
lib/librte_eal/linuxapp/eal/eal_vfio.c | 29 +++++++-
lib/librte_eal/linuxapp/eal/eal_vfio.h | 4 ++
lib/librte_eal/linuxapp/eal/rte_eal_version.map | 10 +++
15 files changed, 312 insertions(+), 34 deletions(-)
--
2.14.1
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v9 1/9] eal/pci: export match function
2017-09-20 11:23 ` [dpdk-dev] [PATCH v9 0/9] Infrastructure to detect iova mapping on the bus Santosh Shukla
@ 2017-09-20 11:23 ` Santosh Shukla
2017-09-20 11:23 ` [dpdk-dev] [PATCH v9 2/9] eal/pci: get iommu class Santosh Shukla
` (9 subsequent siblings)
10 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-09-20 11:23 UTC (permalink / raw)
To: dev
Cc: olivier.matz, thomas, jerin.jacob, hemant.agrawal, aconole,
stephen, anatoly.burakov, gaetan.rivet, shreyansh.jain,
bruce.richardson, sergio.gonzalez.monroy, maxime.coquelin,
Santosh Shukla
Export the rte_pci_match() function, as it is needed in the follow-up patch.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
lib/librte_eal/bsdapp/eal/rte_eal_version.map | 7 +++++++
lib/librte_eal/common/eal_common_pci.c | 10 +---------
lib/librte_eal/common/include/rte_pci.h | 15 +++++++++++++++
lib/librte_eal/linuxapp/eal/rte_eal_version.map | 7 +++++++
4 files changed, 30 insertions(+), 9 deletions(-)
diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index 47a09ea7f..cfbf8fbd0 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -238,3 +238,10 @@ EXPERIMENTAL {
rte_service_start_with_defaults;
} DPDK_17.08;
+
+DPDK_17.11 {
+ global:
+
+ rte_pci_match;
+
+} DPDK_17.08;
diff --git a/lib/librte_eal/common/eal_common_pci.c b/lib/librte_eal/common/eal_common_pci.c
index 52fd38cdd..3b7d0a0ee 100644
--- a/lib/librte_eal/common/eal_common_pci.c
+++ b/lib/librte_eal/common/eal_common_pci.c
@@ -150,16 +150,8 @@ pci_unmap_resource(void *requested_addr, size_t size)
/*
* Match the PCI Driver and Device using the ID Table
- *
- * @param pci_drv
- * PCI driver from which ID table would be extracted
- * @param pci_dev
- * PCI device to match against the driver
- * @return
- * 1 for successful match
- * 0 for unsuccessful match
*/
-static int
+int
rte_pci_match(const struct rte_pci_driver *pci_drv,
const struct rte_pci_device *pci_dev)
{
diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index 8b123391c..eab84c7a4 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -366,6 +366,21 @@ int rte_pci_scan(void);
int
rte_pci_probe(void);
+/*
+ * Match the PCI Driver and Device using the ID Table
+ *
+ * @param pci_drv
+ * PCI driver from which ID table would be extracted
+ * @param pci_dev
+ * PCI device to match against the driver
+ * @return
+ * 1 for successful match
+ * 0 for unsuccessful match
+ */
+int
+rte_pci_match(const struct rte_pci_driver *pci_drv,
+ const struct rte_pci_device *pci_dev);
+
/**
* Map the PCI device resources in user space virtual memory address
*
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index 8c08b8d1e..287cc75cd 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -243,3 +243,10 @@ EXPERIMENTAL {
rte_service_start_with_defaults;
} DPDK_17.08;
+
+DPDK_17.11 {
+ global:
+
+ rte_pci_match;
+
+} DPDK_17.08;
--
2.14.1
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v9 2/9] eal/pci: get iommu class
2017-09-20 11:23 ` [dpdk-dev] [PATCH v9 0/9] Infrastructure to detect iova mapping on the bus Santosh Shukla
2017-09-20 11:23 ` [dpdk-dev] [PATCH v9 1/9] eal/pci: export match function Santosh Shukla
@ 2017-09-20 11:23 ` Santosh Shukla
2017-09-20 11:39 ` Burakov, Anatoly
2017-10-05 23:58 ` Thomas Monjalon
2017-09-20 11:23 ` [dpdk-dev] [PATCH v9 3/9] linuxapp/eal_pci: " Santosh Shukla
` (8 subsequent siblings)
10 siblings, 2 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-09-20 11:23 UTC (permalink / raw)
To: dev
Cc: olivier.matz, thomas, jerin.jacob, hemant.agrawal, aconole,
stephen, anatoly.burakov, gaetan.rivet, shreyansh.jain,
bruce.richardson, sergio.gonzalez.monroy, maxime.coquelin,
Santosh Shukla
Introduce the rte_pci_get_iommu_class API, which gets the iommu class
of PCI devices on the bus and returns the preferred iova mapping mode for
the PCI bus.
The patch also adds rte_pci_get_iommu_class definitions for:
- bsdapp: the api returns the default iova mode.
- linuxapp: a stub implementation; a follow-up patch has the complete
implementation.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
v8 --> v9:
- Added linuxapp iova stub definition (Suggested by Anatoly)
v6 --> v7:
- squashed v6 series patch [02/12] & [03/12] (Aaron comment).
lib/librte_eal/bsdapp/eal/eal_pci.c | 10 ++++++++++
lib/librte_eal/bsdapp/eal/rte_eal_version.map | 1 +
lib/librte_eal/common/include/rte_bus.h | 10 ++++++++++
lib/librte_eal/common/include/rte_pci.h | 11 +++++++++++
lib/librte_eal/linuxapp/eal/eal_pci.c | 9 +++++++++
lib/librte_eal/linuxapp/eal/rte_eal_version.map | 1 +
6 files changed, 42 insertions(+)
diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c b/lib/librte_eal/bsdapp/eal/eal_pci.c
index 04eacdcc7..e2c252320 100644
--- a/lib/librte_eal/bsdapp/eal/eal_pci.c
+++ b/lib/librte_eal/bsdapp/eal/eal_pci.c
@@ -403,6 +403,16 @@ rte_pci_scan(void)
return -1;
}
+/*
+ * Get iommu class of pci devices on the bus.
+ */
+enum rte_iova_mode
+rte_pci_get_iommu_class(void)
+{
+ /* Supports only RTE_KDRV_NIC_UIO */
+ return RTE_IOVA_PA;
+}
+
int
pci_update_device(const struct rte_pci_addr *addr)
{
diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index cfbf8fbd0..c6ffd9399 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -243,5 +243,6 @@ DPDK_17.11 {
global:
rte_pci_match;
+ rte_pci_get_iommu_class;
} DPDK_17.08;
diff --git a/lib/librte_eal/common/include/rte_bus.h b/lib/librte_eal/common/include/rte_bus.h
index c79368d3c..9e40687e5 100644
--- a/lib/librte_eal/common/include/rte_bus.h
+++ b/lib/librte_eal/common/include/rte_bus.h
@@ -55,6 +55,16 @@ extern "C" {
/** Double linked list of buses */
TAILQ_HEAD(rte_bus_list, rte_bus);
+
+/**
+ * IOVA mapping mode.
+ */
+enum rte_iova_mode {
+ RTE_IOVA_DC = 0, /* Don't care mode */
+ RTE_IOVA_PA = (1 << 0),
+ RTE_IOVA_VA = (1 << 1)
+};
+
/**
* Bus specific scan for devices attached on the bus.
* For each bus object, the scan would be responsible for finding devices and
diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index eab84c7a4..0e36de093 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -381,6 +381,17 @@ int
rte_pci_match(const struct rte_pci_driver *pci_drv,
const struct rte_pci_device *pci_dev);
+
+/**
+ * Get iommu class of PCI devices on the bus.
+ * And return their preferred iova mapping mode.
+ *
+ * @return
+ * - enum rte_iova_mode.
+ */
+enum rte_iova_mode
+rte_pci_get_iommu_class(void);
+
/**
* Map the PCI device resources in user space virtual memory address
*
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c b/lib/librte_eal/linuxapp/eal/eal_pci.c
index 8951ce742..26f2be822 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -487,6 +487,15 @@ rte_pci_scan(void)
return -1;
}
+/*
+ * Get iommu class of pci devices on the bus.
+ */
+enum rte_iova_mode
+rte_pci_get_iommu_class(void)
+{
+ return RTE_IOVA_PA;
+}
+
/* Read PCI config space. */
int rte_pci_read_config(const struct rte_pci_device *device,
void *buf, size_t len, off_t offset)
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index 287cc75cd..a8c8ea4f4 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -248,5 +248,6 @@ DPDK_17.11 {
global:
rte_pci_match;
+ rte_pci_get_iommu_class;
} DPDK_17.08;
--
2.14.1
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v9 2/9] eal/pci: get iommu class
2017-09-20 11:23 ` [dpdk-dev] [PATCH v9 2/9] eal/pci: get iommu class Santosh Shukla
@ 2017-09-20 11:39 ` Burakov, Anatoly
2017-10-05 23:58 ` Thomas Monjalon
1 sibling, 0 replies; 248+ messages in thread
From: Burakov, Anatoly @ 2017-09-20 11:39 UTC (permalink / raw)
To: Santosh Shukla, dev
Cc: olivier.matz, thomas, jerin.jacob, hemant.agrawal, aconole,
stephen, gaetan.rivet, shreyansh.jain, bruce.richardson,
sergio.gonzalez.monroy, maxime.coquelin
On 20-Sep-17 12:23 PM, Santosh Shukla wrote:
> Introduce the rte_pci_get_iommu_class API, which gets the iommu class
> of PCI devices on the bus and returns the preferred iova mapping mode for
> the PCI bus.
>
> The patch also adds rte_pci_get_iommu_class definitions for:
> - bsdapp: the api returns the default iova mode.
> - linuxapp: a stub implementation; a follow-up patch has the complete
> implementation.
>
> Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
> ---
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
--
Thanks,
Anatoly
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v9 2/9] eal/pci: get iommu class
2017-09-20 11:23 ` [dpdk-dev] [PATCH v9 2/9] eal/pci: get iommu class Santosh Shukla
2017-09-20 11:39 ` Burakov, Anatoly
@ 2017-10-05 23:58 ` Thomas Monjalon
2017-10-06 3:04 ` santosh
1 sibling, 1 reply; 248+ messages in thread
From: Thomas Monjalon @ 2017-10-05 23:58 UTC (permalink / raw)
To: Santosh Shukla
Cc: dev, olivier.matz, jerin.jacob, hemant.agrawal, aconole, stephen,
anatoly.burakov, gaetan.rivet, shreyansh.jain, bruce.richardson,
sergio.gonzalez.monroy, maxime.coquelin
This patch is introducing a new abstraction.
It is important to explain it for future readers of this code.
20/09/2017 13:23, Santosh Shukla:
> +/**
> + * IOVA mapping mode.
> + */
Please explain what IOVA means and what the purpose of
distinguishing the different modes is.
> +enum rte_iova_mode {
> + RTE_IOVA_DC = 0, /* Don't care mode */
> + RTE_IOVA_PA = (1 << 0),
> + RTE_IOVA_VA = (1 << 1)
> +};
You should explain each value of the enum.
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v9 2/9] eal/pci: get iommu class
2017-10-05 23:58 ` Thomas Monjalon
@ 2017-10-06 3:04 ` santosh
2017-10-06 7:24 ` Thomas Monjalon
0 siblings, 1 reply; 248+ messages in thread
From: santosh @ 2017-10-06 3:04 UTC (permalink / raw)
To: Thomas Monjalon
Cc: dev, olivier.matz, jerin.jacob, hemant.agrawal, aconole, stephen,
anatoly.burakov, gaetan.rivet, shreyansh.jain, bruce.richardson,
sergio.gonzalez.monroy, maxime.coquelin
Thomas,
Your comment is both annoying and infuriating.
The patch has been there for more than 4 months; you had enough time to comment
and understand the topic. Both thorough review and testing have happened.
NOTE: You have already delayed this series by one release and
I'm guessing that you intend to push it out by one more. If you had such
mundane questions, then why not ask before? It makes me think that you are
wasting both my time and effort.
On Friday 06 October 2017 05:28 AM, Thomas Monjalon wrote:
> This patch is introducing a new abstraction.
> It is important to explain it for future readers of this code.
If you don't know what iova is, how to program iova, or the
purpose of iova, then you should read up and build your know-how first.
Yes, it is introducing a new abstraction, because dpdk since
ancient days has done only one programming mode, aka iova=pa.
Note: You were still using iova mode as _pa (and didn't care to ask yourself about IOVA!),
which is one of the iova modes too!
However, an IOMMU can also generate _va addresses, called iova=_va mode,
which is also a correct/viable/applicable/okay-ish programming mode
for iommu capable HW, like dma for example (note again: AGNOSTIC behavior of iommu).
Now, why dpdk needs to understand the IOVA programming philosophy:
Though DPDK was _silently_ using iova as pa mode, a need
has arisen to make the mapping mode explicit, and for that we need an
abstraction since none existed.
Reason:
Over the last few years, ONA participants like Cavium and nxp
added ARM arch support in dpdk and included drivers for their HW,
and their HW has use-cases (for example external mempool) such that
programming that HW in iova as va mode would save cycles in the fast path
(this part we explained a thousand times in the series and it was understood by the reviewers),
thus it's vital to introduce iova infra in dpdk.
The same is applicable to intel HW blocks. It works for intel too!
> 20/09/2017 13:23, Santosh Shukla:
>> +/**
>> + * IOVA mapping mode.
>> + */
> Please explain what IOVA means and what the purpose of
> distinguishing the different modes is.
>
IOVA mapping mode is the device (aka iommu) programming mode by which
the HW (iommu) will generate _pa or _va addresses accordingly.
>> +enum rte_iova_mode {
>> + RTE_IOVA_DC = 0, /* Don't care mode */
>> + RTE_IOVA_PA = (1 << 0),
>> + RTE_IOVA_VA = (1 << 1)
>> +};
> You should explain each value of the enum.
Isn't the naming choice for each member of the enum self-explanatory?
I don't find the logic in your question anymore. Are you asking about side comments?
If not then, AFAIU, your question is basically about what _pa and _va are? If so, then
the reader should have a little know-how before they intend to do fast-path programming.
The author can't write the whole IOMMU spec for the reader's sake. Those are minute and mundane details
in case any user wants to program a device in _pa or _va. I'm at a loss with your question;
I don't see the logic and it is frustrating to me. You had enough time for all this
if you had really cared. We have series for external PMD and drivers waiting
for the iova infra; I see your move as nothing but blocking ONA series progress.
Don't you trust the reviewers in case you have a hard time understanding the topic? That
makes me ask: are you willing to accept this feature or not? If not, then
I'm wasting my energy on it.
Thanks.
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v9 2/9] eal/pci: get iommu class
2017-10-06 3:04 ` santosh
@ 2017-10-06 7:24 ` Thomas Monjalon
2017-10-06 9:13 ` santosh
0 siblings, 1 reply; 248+ messages in thread
From: Thomas Monjalon @ 2017-10-06 7:24 UTC (permalink / raw)
To: santosh
Cc: dev, olivier.matz, jerin.jacob, hemant.agrawal, aconole, stephen,
anatoly.burakov, gaetan.rivet, shreyansh.jain, bruce.richardson,
sergio.gonzalez.monroy, maxime.coquelin
06/10/2017 05:04, santosh:
> Thomas,
>
> Your comment is both annoying and infuriating.
> The patch has been there for more than 4 months; you had enough time to comment
> and understand the topic. Both thorough review and testing have happened.
>
> NOTE: You have already delayed this series by one release and
> I'm guessing that you intend to push it out by one more. If you had such
> mundane questions, then why not ask before? It makes me think that you are
> wasting both my time and effort.
You misunderstand me.
My intent is to push this patch.
A lot of people have reviewed it during this cycle.
I was just looking for wording details in order to help people
when they see this abstraction in the code base.
> On Friday 06 October 2017 05:28 AM, Thomas Monjalon wrote:
>
> > This patch is introducing a new abstraction.
> > It is important to explain it for future readers of this code.
>
> If you don't know what iova is, how to program iova, or the
> purpose of iova, then you should read up and build your know-how first.
>
> Yes, it is introducing a new abstraction, because dpdk since
> ancient days has done only one programming mode, aka iova=pa.
>
> Note: You were still using iova mode as _pa (and didn't care to ask yourself about IOVA!),
> which is one of the iova modes too!
>
> However, an IOMMU can also generate _va addresses, called iova=_va mode,
> which is also a correct/viable/applicable/okay-ish programming mode
> for iommu capable HW, like dma for example (note again: AGNOSTIC behavior of iommu).
>
> Now, why dpdk needs to understand the IOVA programming philosophy:
>
> Though DPDK was _silently_ using iova as pa mode, a need
> has arisen to make the mapping mode explicit, and for that we need an
> abstraction since none existed.
>
> Reason:
> Over the last few years, ONA participants like Cavium and nxp
> added ARM arch support in dpdk and included drivers for their HW,
> and their HW has use-cases (for example external mempool) such that
> programming that HW in iova as va mode would save cycles in the fast path
> (this part we explained a thousand times in the series and it was understood by the reviewers),
> thus it's vital to introduce iova infra in dpdk.
>
> The same is applicable to intel HW blocks. It works for intel too!
I know all of that!
I was just thinking that you could add more explanations somewhere
in the code or the doc.
> > 20/09/2017 13:23, Santosh Shukla:
> >> +/**
> >> + * IOVA mapping mode.
> >> + */
> > Please explain what IOVA means and what the purpose of
> > distinguishing the different modes is.
> >
> IOVA mapping mode is the device (aka iommu) programming mode by which
> the HW (iommu) will generate _pa or _va addresses accordingly.
This doxygen block would be the right place to explain how the
IOVA mode will impact the rest of DPDK.
> >> +enum rte_iova_mode {
> >> + RTE_IOVA_DC = 0, /* Don't care mode */
> >> + RTE_IOVA_PA = (1 << 0),
> >> + RTE_IOVA_VA = (1 << 1)
> >> +};
> > You should explain each value of the enum.
>
> Isn't the naming choice for each member of the enum self-explanatory?
> I don't find the logic in your question anymore. Are you asking about side comments?
> If not then, AFAIU, your question is basically about what _pa and _va are? If so, then
> the reader should have a little know-how before they intend to do fast-path programming.
> The author can't write the whole IOMMU spec for the reader's sake. Those are minute and mundane details
> in case any user wants to program a device in _pa or _va. I'm at a loss with your question;
> I don't see the logic and it is frustrating to me. You had enough time for all this
> if you had really cared. We have series for external PMD and drivers waiting
> for the iova infra; I see your move as nothing but blocking ONA series progress.
> Don't you trust the reviewers in case you have a hard time understanding the topic? That
> makes me ask: are you willing to accept this feature or not? If not, then
> I'm wasting my energy on it.
Santosh, I'm sorry if you don't understand that I was just asking for
a bit more doc.
You could just add something like
/* DMA using physical address */
/* DMA using virtual address */
Anyway, if you don't want to add any explanation, it won't prevent
pushing this patch.
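As a minimal sketch of what such per-value comments might look like, assuming the enum from this patch stays otherwise unchanged: the comment wording below is illustrative only (the RTE_IOVA_DC description is inferred from the selection algorithm in patch 3/9) and is not the text that was eventually posted.

```c
/*
 * IOVA mapping mode (illustrative comments only, not the final wording).
 *
 * IOVA is the I/O virtual address a device uses for DMA.
 */
enum rte_iova_mode {
	RTE_IOVA_DC = 0,        /* Don't care: no preference, e.g. no device bound */
	RTE_IOVA_PA = (1 << 0), /* DMA using physical address */
	RTE_IOVA_VA = (1 << 1)  /* DMA using virtual address */
};
```

The values themselves are taken from the patch as posted; only the comments are new.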
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v9 2/9] eal/pci: get iommu class
2017-10-06 7:24 ` Thomas Monjalon
@ 2017-10-06 9:13 ` santosh
0 siblings, 0 replies; 248+ messages in thread
From: santosh @ 2017-10-06 9:13 UTC (permalink / raw)
To: Thomas Monjalon
Cc: dev, olivier.matz, jerin.jacob, hemant.agrawal, aconole, stephen,
anatoly.burakov, gaetan.rivet, shreyansh.jain, bruce.richardson,
sergio.gonzalez.monroy, maxime.coquelin
On Friday 06 October 2017 12:54 PM, Thomas Monjalon wrote:
> 06/10/2017 05:04, santosh:
>> Thomas,
>>
>> Your comment is both annoying and infuriating.
>> The patch has been there for more than 4 months; you had enough time to comment
>> and understand the topic. Both thorough review and testing have happened.
>>
>> NOTE: You have already delayed this series by one release and
>> I'm guessing that you intend to push it out by one more. If you had such
>> mundane questions, then why not ask before? It makes me think that you are
>> wasting both my time and effort.
> You misunderstand me.
> My intent is to push this patch.
> A lot of people have reviewed it during this cycle.
> I was just looking for wording details in order to help people
> when they see this abstraction in the code base.
>
>> On Friday 06 October 2017 05:28 AM, Thomas Monjalon wrote:
>>
>>> This patch is introducing a new abstraction.
>>> It is important to explain it for future readers of this code.
>> If you don't know what iova is, how to program iova, or the
>> purpose of iova, then you should read up and build your know-how first.
>>
>> Yes, it is introducing a new abstraction, because dpdk since
>> ancient days has done only one programming mode, aka iova=pa.
>>
>> Note: You were still using iova mode as _pa (and didn't care to ask yourself about IOVA!),
>> which is one of the iova modes too!
>>
>> However, an IOMMU can also generate _va addresses, called iova=_va mode,
>> which is also a correct/viable/applicable/okay-ish programming mode
>> for iommu capable HW, like dma for example (note again: AGNOSTIC behavior of iommu).
>>
>> Now, why dpdk needs to understand the IOVA programming philosophy:
>>
>> Though DPDK was _silently_ using iova as pa mode, a need
>> has arisen to make the mapping mode explicit, and for that we need an
>> abstraction since none existed.
>>
>> Reason:
>> Over the last few years, ONA participants like Cavium and nxp
>> added ARM arch support in dpdk and included drivers for their HW,
>> and their HW has use-cases (for example external mempool) such that
>> programming that HW in iova as va mode would save cycles in the fast path
>> (this part we explained a thousand times in the series and it was understood by the reviewers),
>> thus it's vital to introduce iova infra in dpdk.
>>
>> The same is applicable to intel HW blocks. It works for intel too!
> I know all of that!
> I was just thinking that you could add more explanations somewhere
> in the code or the doc.
>
>>> 20/09/2017 13:23, Santosh Shukla:
>>>> +/**
>>>> + * IOVA mapping mode.
>>>> + */
>>> Please explain what IOVA means and what the purpose of
>>> distinguishing the different modes is.
>>>
>> IOVA mapping mode is the device (aka iommu) programming mode by which
>> the HW (iommu) will generate _pa or _va addresses accordingly.
sending v10 with doc changes.
> This doxygen block would be the right place to explain how the
> IOVA mode will impact the rest of DPDK.
>
>>>> +enum rte_iova_mode {
>>>> + RTE_IOVA_DC = 0, /* Don't care mode */
>>>> + RTE_IOVA_PA = (1 << 0),
>>>> + RTE_IOVA_VA = (1 << 1)
>>>> +};
>>> You should explain each value of the enum.
>> Isn't the naming choice for each member of the enum self-explanatory?
>> I don't find the logic in your question anymore. Are you asking about side comments?
>> If not then, AFAIU, your question is basically about what _pa and _va are? If so, then
>> the reader should have a little know-how before they intend to do fast-path programming.
>> The author can't write the whole IOMMU spec for the reader's sake. Those are minute and mundane details
>> in case any user wants to program a device in _pa or _va. I'm at a loss with your question;
>> I don't see the logic and it is frustrating to me. You had enough time for all this
>> if you had really cared. We have series for external PMD and drivers waiting
>> for the iova infra; I see your move as nothing but blocking ONA series progress.
>> Don't you trust the reviewers in case you have a hard time understanding the topic? That
>> makes me ask: are you willing to accept this feature or not? If not, then
>> I'm wasting my energy on it.
> Santosh, I'm sorry if you don't understand that I was just asking for
> a bit more doc.
> You could just add something like
> /* DMA using physical address */
> /* DMA using virtual address */
in v10.
Thanks.
> Anyway, if you don't want to add any explanation, it won't prevent
> pushing this patch.
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v9 3/9] linuxapp/eal_pci: get iommu class
2017-09-20 11:23 ` [dpdk-dev] [PATCH v9 0/9] Infrastructure to detect iova mapping on the bus Santosh Shukla
2017-09-20 11:23 ` [dpdk-dev] [PATCH v9 1/9] eal/pci: export match function Santosh Shukla
2017-09-20 11:23 ` [dpdk-dev] [PATCH v9 2/9] eal/pci: get iommu class Santosh Shukla
@ 2017-09-20 11:23 ` Santosh Shukla
2017-10-06 0:17 ` Thomas Monjalon
2017-09-20 11:23 ` [dpdk-dev] [PATCH v9 4/9] bus: " Santosh Shukla
` (7 subsequent siblings)
10 siblings, 1 reply; 248+ messages in thread
From: Santosh Shukla @ 2017-09-20 11:23 UTC (permalink / raw)
To: dev
Cc: olivier.matz, thomas, jerin.jacob, hemant.agrawal, aconole,
stephen, anatoly.burakov, gaetan.rivet, shreyansh.jain,
bruce.richardson, sergio.gonzalez.monroy, maxime.coquelin,
Santosh Shukla
Get the iommu class of PCI devices on the bus and return the preferred iova
mapping mode for that bus.
The patch also introduces the RTE_PCI_DRV_IOVA_AS_VA drv flag.
The flag is used when a driver needs to operate in iova=va mode.
Algorithm for iova scheme selection for the PCI bus:
0. If no device is bound, return the RTE_IOVA_DC mapping mode,
else go to 1).
1. Look for a device attached to the vfio kdrv whose driver has .drv_flags set
to RTE_PCI_DRV_IOVA_AS_VA.
2. Look for any device attached to a UIO class of driver.
3. Check whether vfio-noiommu mode is enabled.
If 2) & 3) are false and 1) is true, then select
RTE_IOVA_VA as the mapping scheme. Otherwise use the default
mapping scheme (RTE_IOVA_PA).
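The selection steps above can be sketched as a small pure function. This is only an illustration under stated assumptions: `select_pci_iova_mode` and the local `iova_mode` enum are hypothetical names standing in for rte_pci_get_iommu_class() and enum rte_iova_mode, and the four boolean inputs stand in for the bus scans that steps 0 to 3 perform.

```c
#include <stdbool.h>

/* Hypothetical stand-in for enum rte_iova_mode from this series. */
enum iova_mode {
	IOVA_DC = 0,        /* no device bound: don't care */
	IOVA_PA = (1 << 0), /* default: iova as physical address */
	IOVA_VA = (1 << 1)  /* iova as virtual address */
};

/*
 * Hypothetical helper mirroring steps 0-3 of the commit message.
 *   any_bound         - step 0: at least one device is bound to a kdrv
 *   vfio_dev_wants_va - step 1: a vfio-bound device's driver sets IOVA_AS_VA
 *   any_bound_uio     - step 2: at least one device is bound to a UIO driver
 *   noiommu_enabled   - step 3: vfio-noiommu mode is enabled
 */
static enum iova_mode
select_pci_iova_mode(bool any_bound, bool vfio_dev_wants_va,
		     bool any_bound_uio, bool noiommu_enabled)
{
	if (!any_bound)
		return IOVA_DC;

	/* VA only if step 1 holds and neither step 2 nor step 3 does. */
	if (vfio_dev_wants_va && !any_bound_uio && !noiommu_enabled)
		return IOVA_VA;

	return IOVA_PA;
}
```

Keeping the decision separate from the bus scans makes each combination of inputs easy to check in isolation; a single UIO-bound device, for instance, is enough to force the PA result.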
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
v7 --> v8:
- Replaced 0/1 with false/true boolean value (Suggested by Anatoly)
v6 --> v7:
- squashed v6 series patches [01/12] & [05/12],
i.e. moved the RTE_PCI_DRV_IOVA_AS_VA flag into this patch (Aaron's comment).
lib/librte_eal/common/include/rte_pci.h | 2 +
lib/librte_eal/linuxapp/eal/eal_pci.c | 89 ++++++++++++++++++++++++++++++++-
lib/librte_eal/linuxapp/eal/eal_vfio.c | 19 +++++++
lib/librte_eal/linuxapp/eal/eal_vfio.h | 4 ++
4 files changed, 113 insertions(+), 1 deletion(-)
diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index 0e36de093..a67d77f22 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -202,6 +202,8 @@ struct rte_pci_bus {
#define RTE_PCI_DRV_INTR_RMV 0x0010
/** Device driver needs to keep mapped resources if unsupported dev detected */
#define RTE_PCI_DRV_KEEP_MAPPED_RES 0x0020
+/** Device driver supports iova as va */
+#define RTE_PCI_DRV_IOVA_AS_VA 0X0040
/**
* A structure describing a PCI mapping.
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c b/lib/librte_eal/linuxapp/eal/eal_pci.c
index 26f2be822..2971f1d4f 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -45,6 +45,7 @@
#include "eal_filesystem.h"
#include "eal_private.h"
#include "eal_pci_init.h"
+#include "eal_vfio.h"
/**
* @file
@@ -488,11 +489,97 @@ rte_pci_scan(void)
}
/*
- * Get iommu class of pci devices on the bus.
+ * Is pci device bound to any kdrv
+ */
+static inline int
+pci_device_is_bound(void)
+{
+ struct rte_pci_device *dev = NULL;
+ int ret = 0;
+
+ FOREACH_DEVICE_ON_PCIBUS(dev) {
+ if (dev->kdrv == RTE_KDRV_UNKNOWN ||
+ dev->kdrv == RTE_KDRV_NONE) {
+ continue;
+ } else {
+ ret = 1;
+ break;
+ }
+ }
+ return ret;
+}
+
+/*
+ * Any one of the device bound to uio
+ */
+static inline int
+pci_device_bound_uio(void)
+{
+ struct rte_pci_device *dev = NULL;
+
+ FOREACH_DEVICE_ON_PCIBUS(dev) {
+ if (dev->kdrv == RTE_KDRV_IGB_UIO ||
+ dev->kdrv == RTE_KDRV_UIO_GENERIC) {
+ return 1;
+ }
+ }
+ return 0;
+}
+
+/*
+ * Any one of the device has iova as va
+ */
+static inline int
+pci_device_has_iova_va(void)
+{
+ struct rte_pci_device *dev = NULL;
+ struct rte_pci_driver *drv = NULL;
+
+ FOREACH_DRIVER_ON_PCIBUS(drv) {
+ if (drv && drv->drv_flags & RTE_PCI_DRV_IOVA_AS_VA) {
+ FOREACH_DEVICE_ON_PCIBUS(dev) {
+ if (dev->kdrv == RTE_KDRV_VFIO &&
+ rte_pci_match(drv, dev))
+ return 1;
+ }
+ }
+ }
+ return 0;
+}
+
+/*
+ * Get iommu class of PCI devices on the bus.
*/
enum rte_iova_mode
rte_pci_get_iommu_class(void)
{
+ bool is_bound;
+ bool is_vfio_noiommu_enabled = true;
+ bool has_iova_va;
+ bool is_bound_uio;
+
+ is_bound = pci_device_is_bound();
+ if (!is_bound)
+ return RTE_IOVA_DC;
+
+ has_iova_va = pci_device_has_iova_va();
+ is_bound_uio = pci_device_bound_uio();
+#ifdef VFIO_PRESENT
+ is_vfio_noiommu_enabled = vfio_noiommu_is_enabled() == true ?
+ true : false;
+#endif
+
+ if (has_iova_va && !is_bound_uio && !is_vfio_noiommu_enabled)
+ return RTE_IOVA_VA;
+
+ if (has_iova_va) {
+ RTE_LOG(WARNING, EAL, "Some devices want iova as va but pa will be used because.. ");
+ if (is_vfio_noiommu_enabled)
+ RTE_LOG(WARNING, EAL, "vfio-noiommu mode configured\n");
+ if (is_bound_uio)
+ RTE_LOG(WARNING, EAL, "few device bound to UIO\n");
+ }
+
return RTE_IOVA_PA;
}
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index 946df7e31..c8a97b7e7 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -816,4 +816,23 @@ vfio_noiommu_dma_map(int __rte_unused vfio_container_fd)
return 0;
}
+int
+vfio_noiommu_is_enabled(void)
+{
+ int fd, ret, cnt __rte_unused;
+ char c;
+
+ ret = -1;
+ fd = open(VFIO_NOIOMMU_MODE, O_RDONLY);
+ if (fd < 0)
+ return -1;
+
+ cnt = read(fd, &c, 1);
+ if (c == 'Y')
+ ret = 1;
+
+ close(fd);
+ return ret;
+}
+
#endif
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.h b/lib/librte_eal/linuxapp/eal/eal_vfio.h
index 5ff63e5d7..26ea8e119 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.h
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.h
@@ -150,6 +150,8 @@ struct vfio_config {
#define VFIO_NOIOMMU_GROUP_FMT "/dev/vfio/noiommu-%u"
#define VFIO_GET_REGION_ADDR(x) ((uint64_t) x << 40ULL)
#define VFIO_GET_REGION_IDX(x) (x >> 40)
+#define VFIO_NOIOMMU_MODE \
+ "/sys/module/vfio/parameters/enable_unsafe_noiommu_mode"
/* DMA mapping function prototype.
* Takes VFIO container fd as a parameter.
@@ -210,6 +212,8 @@ int pci_vfio_is_enabled(void);
int vfio_mp_sync_setup(void);
+int vfio_noiommu_is_enabled(void);
+
#define SOCKET_REQ_CONTAINER 0x100
#define SOCKET_REQ_GROUP 0x200
#define SOCKET_CLR_GROUP 0x300
--
2.14.1
* Re: [dpdk-dev] [PATCH v9 3/9] linuxapp/eal_pci: get iommu class
2017-09-20 11:23 ` [dpdk-dev] [PATCH v9 3/9] linuxapp/eal_pci: " Santosh Shukla
@ 2017-10-06 0:17 ` Thomas Monjalon
2017-10-06 3:22 ` santosh
0 siblings, 1 reply; 248+ messages in thread
From: Thomas Monjalon @ 2017-10-06 0:17 UTC (permalink / raw)
To: Santosh Shukla
Cc: dev, olivier.matz, jerin.jacob, hemant.agrawal, aconole, stephen,
anatoly.burakov, gaetan.rivet, shreyansh.jain, bruce.richardson,
sergio.gonzalez.monroy, maxime.coquelin
20/09/2017 13:23, Santosh Shukla:
> +/** Device driver supports iova as va */
> +#define RTE_PCI_DRV_IOVA_AS_VA 0X0040
This flag name is surprising and the comment does not help.
For the comment:
"Device driver supports I/O virtual addressing" ?
For the flag:
RTE_PCI_DRV_IOVA ?
[...]
> /*
> - * Get iommu class of pci devices on the bus.
This line has been added in previous patch.
Please fix it earlier.
[...]
> +/*
> + * Any one of the device has iova as va
> + */
> +static inline int
> +pci_device_has_iova_va(void)
The name of this function does not suggest that it scans
every device.
> +{
> + struct rte_pci_device *dev = NULL;
> + struct rte_pci_driver *drv = NULL;
> +
> + FOREACH_DRIVER_ON_PCIBUS(drv) {
> + if (drv && drv->drv_flags & RTE_PCI_DRV_IOVA_AS_VA) {
> + FOREACH_DEVICE_ON_PCIBUS(dev) {
> + if (dev->kdrv == RTE_KDRV_VFIO &&
> + rte_pci_match(drv, dev))
> + return 1;
> + }
Is this the reason for exporting the match function?
(note: match() is a bus driver function, so it should not be exported)
Just because you get every device without driver filtering?
There should be a better solution.
Please try to compare drv with dev->driver.
* Re: [dpdk-dev] [PATCH v9 3/9] linuxapp/eal_pci: get iommu class
2017-10-06 0:17 ` Thomas Monjalon
@ 2017-10-06 3:22 ` santosh
2017-10-06 7:56 ` Thomas Monjalon
0 siblings, 1 reply; 248+ messages in thread
From: santosh @ 2017-10-06 3:22 UTC (permalink / raw)
To: Thomas Monjalon
Cc: dev, olivier.matz, jerin.jacob, hemant.agrawal, aconole, stephen,
anatoly.burakov, gaetan.rivet, shreyansh.jain, bruce.richardson,
sergio.gonzalez.monroy, maxime.coquelin
On Friday 06 October 2017 05:47 AM, Thomas Monjalon wrote:
> 20/09/2017 13:23, Santosh Shukla:
>> +/** Device driver supports iova as va */
>> +#define RTE_PCI_DRV_IOVA_AS_VA 0X0040
> This flag name is surprising and the comment does not help.
> For the comment:
> "Device driver supports I/O virtual addressing" ?
> For the flag:
> RTE_PCI_DRV_IOVA ?
Read [1].
The v9 series evolved as the result of a thorough review process.
The name was not kept like this for fun: it is there for a reason, and its
purpose is to say explicitly that the driver needs "iova as va" mode. The
comment above it says the same.
Aaron suggested removing [1] and squashing it into this patch, and I did so.
Your proposal is incorrect; the flag should say IOVA_AS_VA explicitly.
Please follow the work history. Sorry, but again I can't find your comment
logical.
[1] http://dpdk.org/dev/patchwork/patch/27000/
> [...]
>> /*
>> - * Get iommu class of pci devices on the bus.
> This line has been added in previous patch.
> Please fix it earlier.
What to fix? Please be more explicit; I can't understand your comment.
> [...]
>> +/*
>> + * Any one of the device has iova as va
>> + */
>> +static inline int
>> +pci_device_has_iova_va(void)
> The name of this function does not suggest that it scans
> every device.
It's not scanning; it searches for a kdrv match. You misunderstood.
I disagree.
>> +{
>> + struct rte_pci_device *dev = NULL;
>> + struct rte_pci_driver *drv = NULL;
>> +
>> + FOREACH_DRIVER_ON_PCIBUS(drv) {
>> + if (drv && drv->drv_flags & RTE_PCI_DRV_IOVA_AS_VA) {
>> + FOREACH_DEVICE_ON_PCIBUS(dev) {
>> + if (dev->kdrv == RTE_KDRV_VFIO &&
>> + rte_pci_match(drv, dev))
>> + return 1;
>> + }
> Is this the reason for exporting the match function?
> (note: match() is a bus driver function, so it should not be exported)
> Just because you get every device without driver filtering?
I disagree. It is bus function abstraction code w.r.t. the iommu class of the
device, in case you missed reading the source code, and the implementation is
correct. It needs rte_pci_match() to be exported. Or else write code and show
your snippet as an illustration. I doubt that you really understood this
whole topic and its design theme.
Thanks.
> There should be a better solution.
> Please try to compare drv with dev->driver.
>
* Re: [dpdk-dev] [PATCH v9 3/9] linuxapp/eal_pci: get iommu class
2017-10-06 3:22 ` santosh
@ 2017-10-06 7:56 ` Thomas Monjalon
0 siblings, 0 replies; 248+ messages in thread
From: Thomas Monjalon @ 2017-10-06 7:56 UTC (permalink / raw)
To: santosh
Cc: dev, olivier.matz, jerin.jacob, hemant.agrawal, aconole, stephen,
anatoly.burakov, gaetan.rivet, shreyansh.jain, bruce.richardson,
sergio.gonzalez.monroy, maxime.coquelin
06/10/2017 05:22, santosh:
>
> On Friday 06 October 2017 05:47 AM, Thomas Monjalon wrote:
> > 20/09/2017 13:23, Santosh Shukla:
> >> +/** Device driver supports iova as va */
> >> +#define RTE_PCI_DRV_IOVA_AS_VA 0X0040
> > This flag name is surprising and the comment does not help.
> > For the comment:
> > "Device driver supports I/O virtual addressing" ?
> > For the flag:
> > RTE_PCI_DRV_IOVA ?
>
> Read [1].
>
> The v9 series evolved as the result of a thorough review process.
> The name was not kept like this for fun: it is there for a reason, and its
> purpose is to say explicitly that the driver needs "iova as va" mode. The
> comment above it says the same.
>
> Aaron suggested removing [1] and squashing it into this patch, and I did so.
>
> Your proposal is incorrect; the flag should say IOVA_AS_VA explicitly.
> Please follow the work history. Sorry, but again I can't find your comment
> logical.
Yes my proposal is not good.
> [1] http://dpdk.org/dev/patchwork/patch/27000/
>
> > [...]
> >> /*
> >> - * Get iommu class of pci devices on the bus.
> > This line has been added in previous patch.
> > Please fix it earlier.
>
> What to fix? Please be more explicit; I can't understand your comment.
You make this change:
- * Get iommu class of pci devices on the bus.
+ * Get iommu class of PCI devices on the bus.
It is better to squash this uppercase change into the
previous commit, where you introduce this comment.
> > [...]
> >> +/*
> >> + * Any one of the device has iova as va
> >> + */
> >> +static inline int
> >> +pci_device_has_iova_va(void)
> > The name of this function does not suggest that it scans
> > every device.
>
> It's not scanning; it searches for a kdrv match. You misunderstood.
> I disagree.
Yes, my wording was unclear.
By "scanning", I mean iterating over lists.
About the function name, it could be:
pci_one_device_has_iova_va
It better shows that the function checks every device.
> >> +{
> >> + struct rte_pci_device *dev = NULL;
> >> + struct rte_pci_driver *drv = NULL;
> >> +
> >> + FOREACH_DRIVER_ON_PCIBUS(drv) {
> >> + if (drv && drv->drv_flags & RTE_PCI_DRV_IOVA_AS_VA) {
> >> + FOREACH_DEVICE_ON_PCIBUS(dev) {
> >> + if (dev->kdrv == RTE_KDRV_VFIO &&
> >> + rte_pci_match(drv, dev))
> >> + return 1;
> >> + }
> > Is this the reason for exporting the match function?
> > (note: match() is a bus driver function, so it should not be exported)
> > Just because you get every device without driver filtering?
>
> I disagree. It is bus function abstraction code w.r.t. the iommu class of the
> device, in case you missed reading the source code, and the implementation is
> correct. It needs rte_pci_match() to be exported. Or else write code and show
> your snippet as an illustration. I doubt that you really understood this
> whole topic and its design theme.
OK, let's imagine I don't understand the whole topic.
> > There should be a better solution.
> > Please try to compare drv with dev->driver.
You could have answered that dev->driver is filled on probing
and you are doing the check before probing.
I don't want to continue this discussion.
We will rework which functions are exported when moving the PCI driver
out of EAL.
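To illustrate the point about probing, here is a minimal sketch with
hypothetical, simplified structures (pci_id, pci_driver, pci_device,
bound_to and id_table_match are stand-ins for illustration, not the real
EAL definitions). Before probing, dev->driver is still NULL, so a pointer
comparison cannot succeed and only an ID table match can pair a device
with a driver:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the PCI driver/device structs. */
struct pci_id {
	uint16_t vendor;
	uint16_t device;
};

struct pci_driver {
	const struct pci_id *id_table;	/* IDs the driver claims to support */
	size_t n_ids;
};

struct pci_device {
	struct pci_id id;
	const struct pci_driver *driver;	/* NULL until probing binds one */
};

/* Works only after probing, once dev->driver has been filled in. */
static bool
bound_to(const struct pci_driver *drv, const struct pci_device *dev)
{
	return dev->driver == drv;
}

/* Works before probing: compare the device ID with the driver's table. */
static bool
id_table_match(const struct pci_driver *drv, const struct pci_device *dev)
{
	size_t i;

	assert(drv != NULL && dev != NULL);
	for (i = 0; i < drv->n_ids; i++) {
		if (drv->id_table[i].vendor == dev->id.vendor &&
		    drv->id_table[i].device == dev->id.device)
			return true;
	}
	return false;
}
```

With a device whose driver pointer is still NULL, bound_to() is false
while id_table_match() can already be true.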
* [dpdk-dev] [PATCH v9 4/9] bus: get iommu class
2017-09-20 11:23 ` [dpdk-dev] [PATCH v9 0/9] Infrastructure to detect iova mapping on the bus Santosh Shukla
` (2 preceding siblings ...)
2017-09-20 11:23 ` [dpdk-dev] [PATCH v9 3/9] linuxapp/eal_pci: " Santosh Shukla
@ 2017-09-20 11:23 ` Santosh Shukla
2017-09-20 11:23 ` [dpdk-dev] [PATCH v9 5/9] eal: introduce helper API for iova mode Santosh Shukla
` (6 subsequent siblings)
10 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-09-20 11:23 UTC (permalink / raw)
To: dev
Cc: olivier.matz, thomas, jerin.jacob, hemant.agrawal, aconole,
stephen, anatoly.burakov, gaetan.rivet, shreyansh.jain,
bruce.richardson, sergio.gonzalez.monroy, maxime.coquelin,
Santosh Shukla
The rte_bus_get_iommu_class() API automatically detects and selects the
appropriate iova mapping scheme for the iommu-capable devices on the buses.
Algorithm for iova scheme selection:
0. Iterate through bus_list.
1. Collect each bus's iova mode value and OR it into the 'mode' variable.
2. Mode selection scheme is:
if mode == 0 then iova mode is _pa,
if mode == 1 then iova mode is _pa,
if mode == 2 then iova mode is _va,
if mode == 3 then iova mode is _pa.
So any mode != 2 selects the default iova mode (_pa).
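The selection table can be sketched in isolation as follows. This is an
illustration only: the enum values are assumed to be the bit flags implied
by the 0/1/2/3 table (DC = 0, PA = 1, VA = 2), and combine_bus_modes() is a
hypothetical helper that takes the per-bus answers as an array instead of
walking the bus list.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values, mirroring the 0/1/2/3 table in the message. */
enum iova_mode {
	IOVA_DC = 0,		/* don't care: no bound devices on the bus */
	IOVA_PA = 1 << 0,	/* iova as physical address */
	IOVA_VA = 1 << 1	/* iova as virtual address */
};

/* OR the per-bus answers together; only a pure VA result selects VA. */
static enum iova_mode
combine_bus_modes(const enum iova_mode *bus_modes, size_t n_buses)
{
	int mode = IOVA_DC;
	size_t i;

	assert(bus_modes != NULL || n_buses == 0);
	for (i = 0; i < n_buses; i++)
		mode |= bus_modes[i];

	/* 0 (all DC), 1 (PA) and 3 (PA and VA both requested) fall back to PA. */
	return mode == IOVA_VA ? IOVA_VA : IOVA_PA;
}
```

A single bus reporting PA is therefore enough to force PA for the whole
system, while buses reporting DC do not influence the result.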
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
lib/librte_eal/bsdapp/eal/rte_eal_version.map | 1 +
lib/librte_eal/common/eal_common_bus.c | 23 +++++++++++++++++++++++
lib/librte_eal/common/eal_common_pci.c | 1 +
lib/librte_eal/common/include/rte_bus.h | 25 +++++++++++++++++++++++++
lib/librte_eal/linuxapp/eal/rte_eal_version.map | 1 +
5 files changed, 51 insertions(+)
diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index c6ffd9399..3466eaf20 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -244,5 +244,6 @@ DPDK_17.11 {
rte_pci_match;
rte_pci_get_iommu_class;
+ rte_bus_get_iommu_class;
} DPDK_17.08;
diff --git a/lib/librte_eal/common/eal_common_bus.c b/lib/librte_eal/common/eal_common_bus.c
index 08bec2d93..a30a8982e 100644
--- a/lib/librte_eal/common/eal_common_bus.c
+++ b/lib/librte_eal/common/eal_common_bus.c
@@ -222,3 +222,26 @@ rte_bus_find_by_device_name(const char *str)
c[0] = '\0';
return rte_bus_find(NULL, bus_can_parse, name);
}
+
+
+/*
+ * Get iommu class of devices on the bus.
+ */
+enum rte_iova_mode
+rte_bus_get_iommu_class(void)
+{
+ int mode = RTE_IOVA_DC;
+ struct rte_bus *bus;
+
+ TAILQ_FOREACH(bus, &rte_bus_list, next) {
+
+ if (bus->get_iommu_class)
+ mode |= bus->get_iommu_class();
+ }
+
+ if (mode != RTE_IOVA_VA) {
+ /* Use default IOVA mode */
+ mode = RTE_IOVA_PA;
+ }
+ return mode;
+}
diff --git a/lib/librte_eal/common/eal_common_pci.c b/lib/librte_eal/common/eal_common_pci.c
index 3b7d0a0ee..0f0e4b93b 100644
--- a/lib/librte_eal/common/eal_common_pci.c
+++ b/lib/librte_eal/common/eal_common_pci.c
@@ -564,6 +564,7 @@ struct rte_pci_bus rte_pci_bus = {
.plug = pci_plug,
.unplug = pci_unplug,
.parse = pci_parse,
+ .get_iommu_class = rte_pci_get_iommu_class,
},
.device_list = TAILQ_HEAD_INITIALIZER(rte_pci_bus.device_list),
.driver_list = TAILQ_HEAD_INITIALIZER(rte_pci_bus.driver_list),
diff --git a/lib/librte_eal/common/include/rte_bus.h b/lib/librte_eal/common/include/rte_bus.h
index 9e40687e5..70a291a4d 100644
--- a/lib/librte_eal/common/include/rte_bus.h
+++ b/lib/librte_eal/common/include/rte_bus.h
@@ -178,6 +178,20 @@ struct rte_bus_conf {
enum rte_bus_scan_mode scan_mode; /**< Scan policy. */
};
+
+/**
+ * Get common iommu class of all the devices on the bus. The bus may
+ * check that those devices are attached to iommu driver.
+ * If no devices are attached to the bus, the bus may return the don't care
+ * (_DC) value.
+ * Otherwise, the bus will return the appropriate _pa or _va iova mode.
+ *
+ * @return
+ * enum rte_iova_mode value.
+ */
+typedef enum rte_iova_mode (*rte_bus_get_iommu_class_t)(void);
+
+
/**
* A structure describing a generic bus.
*/
@@ -191,6 +205,7 @@ struct rte_bus {
rte_bus_unplug_t unplug; /**< Remove single device from driver */
rte_bus_parse_t parse; /**< Parse a device name */
struct rte_bus_conf conf; /**< Bus configuration */
+ rte_bus_get_iommu_class_t get_iommu_class; /**< Get iommu class */
};
/**
@@ -290,6 +305,16 @@ struct rte_bus *rte_bus_find_by_device(const struct rte_device *dev);
*/
struct rte_bus *rte_bus_find_by_name(const char *busname);
+
+/**
+ * Get the common iommu class of devices bound on to buses available in the
+ * system. The default mode is PA.
+ *
+ * @return
+ * enum rte_iova_mode value.
+ */
+enum rte_iova_mode rte_bus_get_iommu_class(void);
+
/**
* Helper for Bus registration.
* The constructor has higher priority than PMD constructors.
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index a8c8ea4f4..9115aa3e9 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -249,5 +249,6 @@ DPDK_17.11 {
rte_pci_match;
rte_pci_get_iommu_class;
+ rte_bus_get_iommu_class;
} DPDK_17.08;
--
2.14.1
* [dpdk-dev] [PATCH v9 5/9] eal: introduce helper API for iova mode
2017-09-20 11:23 ` [dpdk-dev] [PATCH v9 0/9] Infrastructure to detect iova mapping on the bus Santosh Shukla
` (3 preceding siblings ...)
2017-09-20 11:23 ` [dpdk-dev] [PATCH v9 4/9] bus: " Santosh Shukla
@ 2017-09-20 11:23 ` Santosh Shukla
2017-09-20 11:23 ` [dpdk-dev] [PATCH v9 6/9] eal: auto detect " Santosh Shukla
` (5 subsequent siblings)
10 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-09-20 11:23 UTC (permalink / raw)
To: dev
Cc: olivier.matz, thomas, jerin.jacob, hemant.agrawal, aconole,
stephen, anatoly.burakov, gaetan.rivet, shreyansh.jain,
bruce.richardson, sergio.gonzalez.monroy, maxime.coquelin,
Santosh Shukla
Introduce the rte_eal_iova_mode() helper API. This API is used by
non-EAL libraries to detect the iova mode.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
lib/librte_eal/bsdapp/eal/eal.c | 6 ++++++
lib/librte_eal/bsdapp/eal/rte_eal_version.map | 1 +
lib/librte_eal/common/include/rte_eal.h | 12 ++++++++++++
lib/librte_eal/linuxapp/eal/eal.c | 6 ++++++
lib/librte_eal/linuxapp/eal/rte_eal_version.map | 1 +
5 files changed, 26 insertions(+)
diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 5fa598842..07e72203f 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -119,6 +119,12 @@ rte_eal_get_configuration(void)
return &rte_config;
}
+enum rte_iova_mode
+rte_eal_iova_mode(void)
+{
+ return rte_eal_get_configuration()->iova_mode;
+}
+
/* parse a sysfs (or other) file containing one integer value */
int
eal_parse_sysfs_value(const char *filename, unsigned long *val)
diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index 3466eaf20..6bed74dff 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -245,5 +245,6 @@ DPDK_17.11 {
rte_pci_match;
rte_pci_get_iommu_class;
rte_bus_get_iommu_class;
+ rte_eal_iova_mode;
} DPDK_17.08;
diff --git a/lib/librte_eal/common/include/rte_eal.h b/lib/librte_eal/common/include/rte_eal.h
index 0e7363d77..932dc1a96 100644
--- a/lib/librte_eal/common/include/rte_eal.h
+++ b/lib/librte_eal/common/include/rte_eal.h
@@ -45,6 +45,7 @@
#include <rte_per_lcore.h>
#include <rte_config.h>
+#include <rte_bus.h>
#ifdef __cplusplus
extern "C" {
@@ -87,6 +88,9 @@ struct rte_config {
/** Primary or secondary configuration */
enum rte_proc_type_t process_type;
+ /** PA or VA mapping mode */
+ enum rte_iova_mode iova_mode;
+
/**
* Pointer to memory configuration, which may be shared across multiple
* DPDK instances
@@ -287,6 +291,14 @@ static inline int rte_gettid(void)
return RTE_PER_LCORE(_thread_id);
}
+/**
+ * Get the iova mode
+ *
+ * @return
+ * enum rte_iova_mode value.
+ */
+enum rte_iova_mode rte_eal_iova_mode(void);
+
#define RTE_INIT(func) \
static void __attribute__((constructor, used)) func(void)
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 48f12f44c..febbafdb3 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -128,6 +128,12 @@ rte_eal_get_configuration(void)
return &rte_config;
}
+enum rte_iova_mode
+rte_eal_iova_mode(void)
+{
+ return rte_eal_get_configuration()->iova_mode;
+}
+
/* parse a sysfs (or other) file containing one integer value */
int
eal_parse_sysfs_value(const char *filename, unsigned long *val)
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index 9115aa3e9..8e49bf5fa 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -250,5 +250,6 @@ DPDK_17.11 {
rte_pci_match;
rte_pci_get_iommu_class;
rte_bus_get_iommu_class;
+ rte_eal_iova_mode;
} DPDK_17.08;
--
2.14.1
* [dpdk-dev] [PATCH v9 6/9] eal: auto detect iova mode
2017-09-20 11:23 ` [dpdk-dev] [PATCH v9 0/9] Infrastructure to detect iova mapping on the bus Santosh Shukla
` (4 preceding siblings ...)
2017-09-20 11:23 ` [dpdk-dev] [PATCH v9 5/9] eal: introduce helper API for iova mode Santosh Shukla
@ 2017-09-20 11:23 ` Santosh Shukla
2017-10-06 0:19 ` Thomas Monjalon
2017-09-20 11:23 ` [dpdk-dev] [PATCH v9 7/9] linuxapp/eal_vfio: honor iova mode before mapping Santosh Shukla
` (4 subsequent siblings)
10 siblings, 1 reply; 248+ messages in thread
From: Santosh Shukla @ 2017-09-20 11:23 UTC (permalink / raw)
To: dev
Cc: olivier.matz, thomas, jerin.jacob, hemant.agrawal, aconole,
stephen, anatoly.burakov, gaetan.rivet, shreyansh.jain,
bruce.richardson, sergio.gonzalez.monroy, maxime.coquelin,
Santosh Shukla
For auto detection purpose:
* Below calls moved up in the eal initialization order:
- eal_option_device_parse
- rte_bus_scan
Based on the result of rte_bus_get_iommu_class - select iova
mapping mode.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
v6 --> v7:
- Moved eal_option_device_parse() up in the order of eal init.
- Added run_once. (Aaron's suggestion).
- Squashed v6 series patch no. [08/12] & [09/12] into one patch (Aaron's
comment)
lib/librte_eal/bsdapp/eal/eal.c | 27 ++++++++++++++++-----------
lib/librte_eal/linuxapp/eal/eal.c | 27 ++++++++++++++++-----------
2 files changed, 32 insertions(+), 22 deletions(-)
diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 07e72203f..f003f4c04 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -541,6 +541,22 @@ rte_eal_init(int argc, char **argv)
return -1;
}
+ if (eal_option_device_parse()) {
+ rte_errno = ENODEV;
+ rte_atomic32_clear(&run_once);
+ return -1;
+ }
+
+ if (rte_bus_scan()) {
+ rte_eal_init_alert("Cannot scan the buses for devices\n");
+ rte_errno = ENODEV;
+ rte_atomic32_clear(&run_once);
+ return -1;
+ }
+
+ /* autodetect the iova mapping mode (default is iova_pa) */
+ rte_eal_get_configuration()->iova_mode = rte_bus_get_iommu_class();
+
if (internal_config.no_hugetlbfs == 0 &&
internal_config.process_type != RTE_PROC_SECONDARY &&
eal_hugepage_info_init() < 0) {
@@ -620,17 +636,6 @@ rte_eal_init(int argc, char **argv)
rte_config.master_lcore, thread_id, cpuset,
ret == 0 ? "" : "...");
- if (eal_option_device_parse()) {
- rte_errno = ENODEV;
- return -1;
- }
-
- if (rte_bus_scan()) {
- rte_eal_init_alert("Cannot scan the buses for devices\n");
- rte_errno = ENODEV;
- return -1;
- }
-
RTE_LCORE_FOREACH_SLAVE(i) {
/*
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index febbafdb3..f4901ffb6 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -798,6 +798,22 @@ rte_eal_init(int argc, char **argv)
return -1;
}
+ if (eal_option_device_parse()) {
+ rte_errno = ENODEV;
+ rte_atomic32_clear(&run_once);
+ return -1;
+ }
+
+ if (rte_bus_scan()) {
+ rte_eal_init_alert("Cannot scan the buses for devices\n");
+ rte_errno = ENODEV;
+ rte_atomic32_clear(&run_once);
+ return -1;
+ }
+
+ /* autodetect the iova mapping mode (default is iova_pa) */
+ rte_eal_get_configuration()->iova_mode = rte_bus_get_iommu_class();
+
if (internal_config.no_hugetlbfs == 0 &&
internal_config.process_type != RTE_PROC_SECONDARY &&
internal_config.xen_dom0_support == 0 &&
@@ -895,17 +911,6 @@ rte_eal_init(int argc, char **argv)
return -1;
}
- if (eal_option_device_parse()) {
- rte_errno = ENODEV;
- return -1;
- }
-
- if (rte_bus_scan()) {
- rte_eal_init_alert("Cannot scan the buses for devices\n");
- rte_errno = ENODEV;
- return -1;
- }
-
RTE_LCORE_FOREACH_SLAVE(i) {
/*
--
2.14.1
* Re: [dpdk-dev] [PATCH v9 6/9] eal: auto detect iova mode
2017-09-20 11:23 ` [dpdk-dev] [PATCH v9 6/9] eal: auto detect " Santosh Shukla
@ 2017-10-06 0:19 ` Thomas Monjalon
2017-10-06 3:25 ` santosh
0 siblings, 1 reply; 248+ messages in thread
From: Thomas Monjalon @ 2017-10-06 0:19 UTC (permalink / raw)
To: Santosh Shukla
Cc: dev, olivier.matz, jerin.jacob, hemant.agrawal, aconole, stephen,
anatoly.burakov, gaetan.rivet, shreyansh.jain, bruce.richardson,
sergio.gonzalez.monroy, maxime.coquelin
20/09/2017 13:23, Santosh Shukla:
> For auto detection purpose:
> * Below calls moved up in the eal initialization order:
> - eal_option_device_parse
> - rte_bus_scan
>
> Based on the result of rte_bus_get_iommu_class - select iova
> mapping mode.
It does not explain why you need to move things up.
* Re: [dpdk-dev] [PATCH v9 6/9] eal: auto detect iova mode
2017-10-06 0:19 ` Thomas Monjalon
@ 2017-10-06 3:25 ` santosh
2017-10-06 8:11 ` Thomas Monjalon
0 siblings, 1 reply; 248+ messages in thread
From: santosh @ 2017-10-06 3:25 UTC (permalink / raw)
To: Thomas Monjalon
Cc: dev, olivier.matz, jerin.jacob, hemant.agrawal, aconole, stephen,
anatoly.burakov, gaetan.rivet, shreyansh.jain, bruce.richardson,
sergio.gonzalez.monroy, maxime.coquelin
On Friday 06 October 2017 05:49 AM, Thomas Monjalon wrote:
> 20/09/2017 13:23, Santosh Shukla:
>> For auto detection purpose:
>> * Below calls moved up in the eal initialization order:
>> - eal_option_device_parse
>> - rte_bus_scan
>>
>> Based on the result of rte_bus_get_iommu_class - select iova
>> mapping mode.
> It does not explain why you need to move things up.
For that, one should first understand the eal_init sequence and the
dependency between _option_device_parse and rte_bus_scan().
After that, bus_scan is needed so that the _get_iommu_class() API knows
whether kdrv is igb/uio/vfio etc. That's why. Refer to the work history.
Again, the v9 series did not happen for fun. I disagree with your comment.
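A minimal sketch of that ordering dependency, using hypothetical stand-ins
(bus_scan(), bus_get_iommu_class() and the device table below are
simplified illustrations, not the EAL code, and the vfio-noiommu check is
left out): the class query can only judge devices the scan has already
discovered, so calling it before the scan yields "don't care".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum kdrv { KDRV_UNKNOWN = 0, KDRV_UIO, KDRV_VFIO };
enum iova_mode { IOVA_DC = 0, IOVA_PA = 1, IOVA_VA = 2 };

struct device {
	enum kdrv kdrv;		/* kernel driver, known only after the scan */
	bool wants_va;		/* its PMD sets the IOVA_AS_VA flag */
};

#define MAX_DEVICES 4
static struct device devices[MAX_DEVICES];
static size_t n_devices;	/* stays 0 until bus_scan() has run */

/* Stand-in for the bus scan: discovers devices and their kernel driver. */
static void
bus_scan(void)
{
	devices[0] = (struct device){ KDRV_VFIO, true };
	n_devices = 1;
}

/* Stand-in for the class query: it only sees what the scan found. */
static enum iova_mode
bus_get_iommu_class(void)
{
	bool has_va = false, bound_uio = false;
	size_t i;

	assert(n_devices <= MAX_DEVICES);
	if (n_devices == 0)
		return IOVA_DC;
	for (i = 0; i < n_devices; i++) {
		if (devices[i].kdrv == KDRV_UIO)
			bound_uio = true;
		if (devices[i].kdrv == KDRV_VFIO && devices[i].wants_va)
			has_va = true;
	}
	return (has_va && !bound_uio) ? IOVA_VA : IOVA_PA;
}
```

Querying before bus_scan() returns IOVA_DC here; after the scan, the same
call returns IOVA_VA for the single VFIO-bound device.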
* Re: [dpdk-dev] [PATCH v9 6/9] eal: auto detect iova mode
2017-10-06 3:25 ` santosh
@ 2017-10-06 8:11 ` Thomas Monjalon
2017-10-06 9:11 ` santosh
0 siblings, 1 reply; 248+ messages in thread
From: Thomas Monjalon @ 2017-10-06 8:11 UTC (permalink / raw)
To: santosh
Cc: dev, olivier.matz, jerin.jacob, hemant.agrawal, aconole, stephen,
anatoly.burakov, gaetan.rivet, shreyansh.jain, bruce.richardson,
sergio.gonzalez.monroy, maxime.coquelin
06/10/2017 05:25, santosh:
>
> On Friday 06 October 2017 05:49 AM, Thomas Monjalon wrote:
> > 20/09/2017 13:23, Santosh Shukla:
> >> For auto detection purpose:
> >> * Below calls moved up in the eal initialization order:
> >> - eal_option_device_parse
> >> - rte_bus_scan
> >>
> >> Based on the result of rte_bus_get_iommu_class - select iova
> >> mapping mode.
> > It does not explain why you need to move things up.
>
> For that, one should first understand the eal_init sequence and the
> dependency between _option_device_parse and rte_bus_scan().
>
> After that, bus_scan is needed so that the _get_iommu_class() API knows
> whether kdrv is igb/uio/vfio etc. That's why. Refer to the work history.
> Again, the v9 series did not happen for fun. I disagree with your comment.
These are the basics of writing a commit log.
You have to explain why things are done.
You move things because of dependencies without explaining them.
And I'm pretty sure this move will cause big trouble.
For instance, have you tried shared library mode?
One more comment, you are considering only devices scanned at initialization.
What happens when a new device is plugged in?
I can push it as is, given there are some Reviewed-by and Tested-by.
I am trying to spare you a revert of this patch when someone discovers
some major bugs.
But I wonder whether it is worth it, given how you welcome it.
* Re: [dpdk-dev] [PATCH v9 6/9] eal: auto detect iova mode
2017-10-06 8:11 ` Thomas Monjalon
@ 2017-10-06 9:11 ` santosh
0 siblings, 0 replies; 248+ messages in thread
From: santosh @ 2017-10-06 9:11 UTC (permalink / raw)
To: Thomas Monjalon
Cc: dev, olivier.matz, jerin.jacob, hemant.agrawal, aconole, stephen,
anatoly.burakov, gaetan.rivet, shreyansh.jain, bruce.richardson,
sergio.gonzalez.monroy, maxime.coquelin
On Friday 06 October 2017 01:41 PM, Thomas Monjalon wrote:
> 06/10/2017 05:25, santosh:
>> On Friday 06 October 2017 05:49 AM, Thomas Monjalon wrote:
>>> 20/09/2017 13:23, Santosh Shukla:
>>>> For auto detection purpose:
>>>> * Below calls moved up in the eal initialization order:
>>>> - eal_option_device_parse
>>>> - rte_bus_scan
>>>>
>>>> Based on the result of rte_bus_get_iommu_class - select iova
>>>> mapping mode.
>>> It does not explain why you need to move things up.
>> For that, one should first understand the eal_init sequence and the
>> dependency between _option_device_parse and rte_bus_scan().
>>
>> After that, bus_scan is needed so that the _get_iommu_class() API knows
>> whether kdrv is igb/uio/vfio etc. That's why. Refer to the work history.
>> Again, the v9 series did not happen for fun. I disagree with your comment.
> These are the basics of writing a commit log.
> You have to explain why things are done.
> You move things because of dependencies without explaining them.
Agreed. But a reader who goes through patches 0..5 would by then understand
the "auto detection purpose" reasoning.
Anyway, I'll add more context to the patch summary in v10; sending it now.
> And I'm pretty sure this move will cause big trouble.
> For instance, have you tried shared library mode?
It builds, and testpmd works.
> One more comment, you are considering only devices scanned at initialization.
> What happens when a new device is plugged in?
It should work.
In vfio mode: if the PMD for that device sets the IOVA_AS_VA flag, then the
newly bound device will have the iova=_va mapping mode.
Otherwise iova=_pa.
Thanks.
> I can push it as is, given there are some Reviewed-by and Tested-by.
> I am trying to spare you a revert of this patch when someone discovers
> some major bugs.
> But I wonder whether it is worth it, given how you welcome it.
>
* [dpdk-dev] [PATCH v9 7/9] linuxapp/eal_vfio: honor iova mode before mapping
2017-09-20 11:23 ` [dpdk-dev] [PATCH v9 0/9] Infrastructure to detect iova mapping on the bus Santosh Shukla
` (5 preceding siblings ...)
2017-09-20 11:23 ` [dpdk-dev] [PATCH v9 6/9] eal: auto detect " Santosh Shukla
@ 2017-09-20 11:23 ` Santosh Shukla
2017-09-20 11:23 ` [dpdk-dev] [PATCH v9 8/9] linuxapp/eal_memory: honor iova mode in virt2phy Santosh Shukla
` (3 subsequent siblings)
10 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-09-20 11:23 UTC (permalink / raw)
To: dev
Cc: olivier.matz, thomas, jerin.jacob, hemant.agrawal, aconole,
stephen, anatoly.burakov, gaetan.rivet, shreyansh.jain,
bruce.richardson, sergio.gonzalez.monroy, maxime.coquelin,
Santosh Shukla
Check iova mode and accordingly map iova to pa or va.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
lib/librte_eal/linuxapp/eal/eal_vfio.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index c8a97b7e7..b32cd09a2 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -706,7 +706,10 @@ vfio_type1_dma_map(int vfio_container_fd)
dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
dma_map.vaddr = ms[i].addr_64;
dma_map.size = ms[i].len;
- dma_map.iova = ms[i].phys_addr;
+ if (rte_eal_iova_mode() == RTE_IOVA_VA)
+ dma_map.iova = dma_map.vaddr;
+ else
+ dma_map.iova = ms[i].phys_addr;
dma_map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
ret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
@@ -792,7 +795,10 @@ vfio_spapr_dma_map(int vfio_container_fd)
dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
dma_map.vaddr = ms[i].addr_64;
dma_map.size = ms[i].len;
- dma_map.iova = ms[i].phys_addr;
+ if (rte_eal_iova_mode() == RTE_IOVA_VA)
+ dma_map.iova = dma_map.vaddr;
+ else
+ dma_map.iova = ms[i].phys_addr;
dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
VFIO_DMA_MAP_FLAG_WRITE;
--
2.14.1
* [dpdk-dev] [PATCH v9 8/9] linuxapp/eal_memory: honor iova mode in virt2phy
2017-09-20 11:23 ` [dpdk-dev] [PATCH v9 0/9] Infrastructure to detect iova mapping on the bus Santosh Shukla
` (6 preceding siblings ...)
2017-09-20 11:23 ` [dpdk-dev] [PATCH v9 7/9] linuxapp/eal_vfio: honor iova mode before mapping Santosh Shukla
@ 2017-09-20 11:23 ` Santosh Shukla
2017-09-20 11:23 ` [dpdk-dev] [PATCH v9 9/9] eal/rte_malloc: " Santosh Shukla
` (2 subsequent siblings)
10 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-09-20 11:23 UTC (permalink / raw)
To: dev
Cc: olivier.matz, thomas, jerin.jacob, hemant.agrawal, aconole,
stephen, anatoly.burakov, gaetan.rivet, shreyansh.jain,
bruce.richardson, sergio.gonzalez.monroy, maxime.coquelin,
Santosh Shukla
Check iova mode and accordingly return phy addr.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
lib/librte_eal/linuxapp/eal/eal_memory.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 52791282f..2d9d7c2dc 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -139,6 +139,9 @@ rte_mem_virt2phy(const void *virtaddr)
int page_size;
off_t offset;
+ if (rte_eal_iova_mode() == RTE_IOVA_VA)
+ return (uintptr_t)virtaddr;
+
/* when using dom0, /proc/self/pagemap always returns 0, check in
* dpdk memory by browsing the memsegs */
if (rte_xen_dom0_supported()) {
--
2.14.1
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v9 9/9] eal/rte_malloc: honor iova mode in virt2phy
2017-09-20 11:23 ` [dpdk-dev] [PATCH v9 0/9] Infrastructure to detect iova mapping on the bus Santosh Shukla
` (7 preceding siblings ...)
2017-09-20 11:23 ` [dpdk-dev] [PATCH v9 8/9] linuxapp/eal_memory: honor iova mode in virt2phy Santosh Shukla
@ 2017-09-20 11:23 ` Santosh Shukla
2017-09-26 4:02 ` [dpdk-dev] [PATCH v9 0/9] Infrastructure to detect iova mapping on the bus santosh
2017-10-06 11:03 ` [dpdk-dev] [PATCH v10 " Santosh Shukla
10 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-09-20 11:23 UTC (permalink / raw)
To: dev
Cc: olivier.matz, thomas, jerin.jacob, hemant.agrawal, aconole,
stephen, anatoly.burakov, gaetan.rivet, shreyansh.jain,
bruce.richardson, sergio.gonzalez.monroy, maxime.coquelin,
Santosh Shukla
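Check the iova mode and return the address accordingly: in RTE_IOVA_VA
mode rte_malloc_virt2phy() returns the virtual address itself, otherwise
the physical address.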
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
lib/librte_eal/common/rte_malloc.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index 5c0627bf4..d65c05a4d 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -251,10 +251,17 @@ rte_malloc_set_limit(__rte_unused const char *type,
phys_addr_t
rte_malloc_virt2phy(const void *addr)
{
+ phys_addr_t paddr;
const struct malloc_elem *elem = malloc_elem_from_data(addr);
if (elem == NULL)
return RTE_BAD_PHYS_ADDR;
if (elem->ms->phys_addr == RTE_BAD_PHYS_ADDR)
return RTE_BAD_PHYS_ADDR;
- return elem->ms->phys_addr + ((uintptr_t)addr - (uintptr_t)elem->ms->addr);
+
+ if (rte_eal_iova_mode() == RTE_IOVA_VA)
+ paddr = (uintptr_t)addr;
+ else
+ paddr = elem->ms->phys_addr +
+ ((uintptr_t)addr - (uintptr_t)elem->ms->addr);
+ return paddr;
}
--
2.14.1
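
The address arithmetic in the two virt2phy patches can be sketched as a standalone function. This is only a model under stated assumptions: `virt2iova_model()`, `enum iova_mode_model` and `BAD_ADDR_MODEL` are hypothetical names standing in for rte_malloc_virt2phy(), enum rte_iova_mode and RTE_BAD_PHYS_ADDR, and the segment is described by two plain parameters instead of a memseg.

```c
#include <stdint.h>

#define BAD_ADDR_MODEL ((uint64_t)-1) /* plays the role of RTE_BAD_PHYS_ADDR */

/* Hypothetical mirror of enum rte_iova_mode. */
enum iova_mode_model { IOVA_DC = 0, IOVA_PA = 1 << 0, IOVA_VA = 1 << 1 };

/*
 * Translate addr, which lies in a segment starting at virtual address
 * seg_va and physical address seg_pa, into the address a device should use.
 */
static uint64_t
virt2iova_model(enum iova_mode_model mode, uintptr_t addr,
		uintptr_t seg_va, uint64_t seg_pa)
{
	/* As in the patch, a segment without a valid physical address fails
	 * before the mode is looked at. */
	if (seg_pa == BAD_ADDR_MODEL)
		return BAD_ADDR_MODEL;
	if (mode == IOVA_VA)
		return (uint64_t)addr; /* iova == va: no translation */
	/* iova == pa: segment base plus offset inside the segment */
	return seg_pa + (uint64_t)(addr - seg_va);
}
```

For example, with a segment at virtual 0x1000 and physical 0x80000, address 0x1040 stays 0x1040 in VA mode and becomes 0x80040 in PA mode.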
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v9 0/9] Infrastructure to detect iova mapping on the bus
2017-09-20 11:23 ` [dpdk-dev] [PATCH v9 0/9] Infrastructure to detect iova mapping on the bus Santosh Shukla
` (8 preceding siblings ...)
2017-09-20 11:23 ` [dpdk-dev] [PATCH v9 9/9] eal/rte_malloc: " Santosh Shukla
@ 2017-09-26 4:02 ` santosh
2017-10-06 11:03 ` [dpdk-dev] [PATCH v10 " Santosh Shukla
10 siblings, 0 replies; 248+ messages in thread
From: santosh @ 2017-09-26 4:02 UTC (permalink / raw)
To: dev
Cc: olivier.matz, thomas, jerin.jacob, hemant.agrawal, aconole,
stephen, anatoly.burakov, gaetan.rivet, shreyansh.jain,
bruce.richardson, sergio.gonzalez.monroy, maxime.coquelin
Hi Thomas,
On Wednesday 20 September 2017 12:23 PM, Santosh Shukla wrote:
> v9:
> - Added Tested-By: to series.
> - Includes minor changes related to linuxapp api stub in [02/09]
> (Suggested by Anatoly)
> - Series rebased on tip commit : aee62e90
IMO the series is ready to merge. Note that the octeontx PMD needs this
series plus the other mempool series, and we need them in the -rc1
release. Could you please plan to merge this series in -rc1?
Thanks.
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v10 0/9] Infrastructure to detect iova mapping on the bus
2017-09-20 11:23 ` [dpdk-dev] [PATCH v9 0/9] Infrastructure to detect iova mapping on the bus Santosh Shukla
` (9 preceding siblings ...)
2017-09-26 4:02 ` [dpdk-dev] [PATCH v9 0/9] Infrastructure to detect iova mapping on the bus santosh
@ 2017-10-06 11:03 ` Santosh Shukla
2017-10-06 11:03 ` [dpdk-dev] [PATCH v10 1/9] eal/pci: export match function Santosh Shukla
` (9 more replies)
10 siblings, 10 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-10-06 11:03 UTC (permalink / raw)
To: olivier.matz, dev
Cc: thomas, jerin.jacob, hemant.agrawal, aconole, stephen,
anatoly.burakov, gaetan.rivet, shreyansh.jain, bruce.richardson,
sergio.gonzalez.monroy, maxime.coquelin, Santosh Shukla
v10:
- Added doxygen specific comment for iova mapping mode in patch [2/09]
(Suggested by Olivier)
- Added pci_one_ prefix to pci_device_has_iova_va and other APIs in patch [3/9]
(Suggested by Olivier)
- Added more verbose description in patch summary for patch [6/09]
(Suggested by Olivier)
v9:
- Added Tested-By: to series.
- Includes minor changes related to linuxapp api stub in [02/09]
(Suggested by Anatoly)
- Series rebased on tip commit : aee62e90
v8:
Includes minor review changes per v7 review comment from Anatoly.
Patches rebased on Tip commit:3d2e0448eb.
v7:
Includes no major changes; minor changes in detail:
- patch squashing (Aaron's suggestion)
- added run_once for device_parse() and bus_scan() in eal init
(Aaron's suggestion)
- Moved rte_eal_device_parse() up in eal initialization order.
- Patches rebased on top of version: 17.11-rc0
For v6 info refer [11].
v6:
Sending v5 series rebased on top of version: 17.11-rc0.
v5:
Introducing RTE_PCI_DRV_IOVA_AS_VA flag for autodetection of iova va
mapping.
If a PCI driver requires the IOVA as VA scheme, the driver can set the
flag in its PCI driver registration.
Algorithm to select IOVA as VA for PCI bus case:
0. If no device bound then return with RTE_IOVA_DC mapping mode,
else goto 1).
1. Look for device attached to vfio kdrv and has .drv_flag set
to RTE_PCI_DRV_IOVA_AS_VA.
2. Look for any device attached to UIO class of driver.
3. Check for vfio-noiommu mode enabled.
If 2) & 3) are false and 1) is true, then select
mapping scheme as RTE_IOVA_VA. Otherwise use default
mapping scheme (RTE_IOVA_PA).
That way, the bus can truly autodetect the iova mapping mode for
a device or a set of devices.
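
The selection steps above can be sketched as a pure function. This is a minimal model, not the EAL implementation: `struct pci_scan_summary`, `enum iova_mode_model` and `select_pci_iova_mode()` are hypothetical names, and the four booleans are assumed to have been computed already by walking the devices and drivers on the bus.

```c
#include <stdbool.h>

/* Hypothetical mirror of enum rte_iova_mode. */
enum iova_mode_model { IOVA_DC = 0, IOVA_PA = 1 << 0, IOVA_VA = 1 << 1 };

/* Hypothetical summary of what a scan of the PCI bus found. */
struct pci_scan_summary {
	bool any_bound;          /* step 0: some device is bound to a kernel driver */
	bool vfio_wants_iova_va; /* step 1: a vfio-bound device matches a driver
				  * that sets RTE_PCI_DRV_IOVA_AS_VA */
	bool any_uio;            /* step 2: some device is bound to a UIO driver */
	bool vfio_noiommu;       /* step 3: vfio-noiommu mode is enabled */
};

static enum iova_mode_model
select_pci_iova_mode(const struct pci_scan_summary *s)
{
	if (!s->any_bound)
		return IOVA_DC; /* nothing bound: the bus has no preference */
	if (s->vfio_wants_iova_va && !s->any_uio && !s->vfio_noiommu)
		return IOVA_VA; /* 1) true, 2) and 3) false */
	return IOVA_PA;         /* default mapping scheme */
}
```

Note that a single UIO-bound device, or vfio-noiommu mode, is enough to push the whole bus back to RTE_IOVA_PA, because neither setup puts an IOMMU between the device and memory.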
Change History:
v9 --> v10:
- Refer top description.
v8 --> v9:
- Added Tested-by: signature of Hemant.
- Added linuxapp stub api definition in [02/09] (Suggested by Anatoly)
v7 --> v8:
- Replace 0 / 1 with true/false boolean values (Suggested by Anatoly).
v6 --> v7:
- Patches squashed per v6.
- Added run_once in eal per v6.
- Moved rte_eal_device_parse() up in eal init order.
v5 --> v6:
- Added api info in eal's version.map (release DPDK_v17.11).
v4 --> v5:
- Change DPDK_17.08 to DPDK_17.11 in _version.map.
- Reworded bus api description (suggested by Hemant).
- Added reviewed-by from Maxime in v5.
- Added acked-by from Hemant for pci and bus patches.
v3 --> v4:
- Re-introduced RTE_IOVA_DC mode (Suggested by Hemant [5]).
- Renamed flag to RTE_PCI_DRV_IOVA_AS_VA (Suggested by Maxime).
- Reworded WARNING message(suggested by Maxime[7]).
- Created a separate patch for rte_pci_get_iommu_class (suggested by
Maxime[]).
- Added VFIO_PRESENT ifdef build fix.
v2 --> v3:
- Removed rte_mempool_virt2phy (suggested by Olivier [4])
v1 --> v2:
- Removed override eal option i.e. (--iova-mode=<>) because we have
means to truly autodetect the iova mode.
- Introduced RTE_PCI_DRV_NEED_IOVA_VA drv_flag (Suggested by Maxime [3]).
- Using NEED_IOVA_VA drv_flag in autodetection logic.
- Removed Linux version check macro in vfio code, As per Maxime feedback.
- Moved rte_pci_match API from local to global.
Patch Summary:
1) 1st: declare rte_pci_match api in pci header. Required for
autodetection in follow-up patches.
2) 2nd - 3rd - 4th : autodetection mapping infrastructure for
Linux/bsdapp.
3) 5th: iova mode helper API.
4) 6th: Infra to detect iova mode.
5) 7th: make vfio mapping iova aware.
6) 8th - 9th : Check for IOVA_VA mode in below APIs
- rte_mem_virt2phy
- rte_malloc_virt2phy
Test History:
- Tested for x86/XL710 40G NIC card for both modes (iova_va/pa).
- Tested for arm64/thunderx vNIC Integrated NIC for both modes
- Tested for arm64/Octeontx integrated NICs in iova_va mode only
(it supports only that mode).
- Ran standalone tests like mempool_autotest, mbuf_autotest.
- Verified for Doxygen.
Work History:
For v1, Refer [1].
For v2, Refer [2].
For v3, Refer [9].
For v4, refer [10].
for v6, refer [11].
Checkpatch result:
* None
Thanks,
[1] https://www.mail-archive.com/dev@dpdk.org/msg67438.html
[2] https://www.mail-archive.com/dev@dpdk.org/msg70674.html
[3] https://www.mail-archive.com/dev@dpdk.org/msg70279.html
[4] https://www.mail-archive.com/dev@dpdk.org/msg70692.html
[5] http://dpdk.org/ml/archives/dev/2017-July/071282.html
[6] http://dpdk.org/ml/archives/dev/2017-July/070951.html
[7] http://dpdk.org/ml/archives/dev/2017-July/070941.html
[8] http://dpdk.org/ml/archives/dev/2017-July/070952.html
[9] http://dpdk.org/ml/archives/dev/2017-July/070918.html
[10] http://dpdk.org/ml/archives/dev/2017-July/071754.html
[11] http://dpdk.org/ml/archives/dev/2017-August/072871.html
Santosh Shukla (9):
eal/pci: export match function
eal/pci: get iommu class
linuxapp/eal_pci: get iommu class
bus: get iommu class
eal: introduce helper API for iova mode
eal: auto detect iova mode
linuxapp/eal_vfio: honor iova mode before mapping
linuxapp/eal_memory: honor iova mode in virt2phy
eal/rte_malloc: honor iova mode in virt2phy
lib/librte_eal/bsdapp/eal/eal.c | 33 ++++++---
lib/librte_eal/bsdapp/eal/eal_pci.c | 10 +++
lib/librte_eal/bsdapp/eal/rte_eal_version.map | 10 +++
lib/librte_eal/common/eal_common_bus.c | 23 ++++++
lib/librte_eal/common/eal_common_pci.c | 11 +--
lib/librte_eal/common/include/rte_bus.h | 40 +++++++++++
lib/librte_eal/common/include/rte_eal.h | 12 ++++
lib/librte_eal/common/include/rte_pci.h | 28 ++++++++
lib/librte_eal/common/rte_malloc.c | 9 ++-
lib/librte_eal/linuxapp/eal/eal.c | 33 ++++++---
lib/librte_eal/linuxapp/eal/eal_memory.c | 3 +
lib/librte_eal/linuxapp/eal/eal_pci.c | 96 +++++++++++++++++++++++++
lib/librte_eal/linuxapp/eal/eal_vfio.c | 29 +++++++-
lib/librte_eal/linuxapp/eal/eal_vfio.h | 4 ++
lib/librte_eal/linuxapp/eal/rte_eal_version.map | 10 +++
15 files changed, 317 insertions(+), 34 deletions(-)
--
2.14.1
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v10 1/9] eal/pci: export match function
2017-10-06 11:03 ` [dpdk-dev] [PATCH v10 " Santosh Shukla
@ 2017-10-06 11:03 ` Santosh Shukla
2017-10-06 11:03 ` [dpdk-dev] [PATCH v10 2/9] eal/pci: get iommu class Santosh Shukla
` (8 subsequent siblings)
9 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-10-06 11:03 UTC (permalink / raw)
To: olivier.matz, dev
Cc: thomas, jerin.jacob, hemant.agrawal, aconole, stephen,
anatoly.burakov, gaetan.rivet, shreyansh.jain, bruce.richardson,
sergio.gonzalez.monroy, maxime.coquelin, Santosh Shukla
Export the rte_pci_match() function, as it is needed in the follow-up patch.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
lib/librte_eal/bsdapp/eal/rte_eal_version.map | 7 +++++++
lib/librte_eal/common/eal_common_pci.c | 10 +---------
lib/librte_eal/common/include/rte_pci.h | 15 +++++++++++++++
lib/librte_eal/linuxapp/eal/rte_eal_version.map | 7 +++++++
4 files changed, 30 insertions(+), 9 deletions(-)
diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index 47a09ea7f..cfbf8fbd0 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -238,3 +238,10 @@ EXPERIMENTAL {
rte_service_start_with_defaults;
} DPDK_17.08;
+
+DPDK_17.11 {
+ global:
+
+ rte_pci_match;
+
+} DPDK_17.08;
diff --git a/lib/librte_eal/common/eal_common_pci.c b/lib/librte_eal/common/eal_common_pci.c
index 52fd38cdd..3b7d0a0ee 100644
--- a/lib/librte_eal/common/eal_common_pci.c
+++ b/lib/librte_eal/common/eal_common_pci.c
@@ -150,16 +150,8 @@ pci_unmap_resource(void *requested_addr, size_t size)
/*
* Match the PCI Driver and Device using the ID Table
- *
- * @param pci_drv
- * PCI driver from which ID table would be extracted
- * @param pci_dev
- * PCI device to match against the driver
- * @return
- * 1 for successful match
- * 0 for unsuccessful match
*/
-static int
+int
rte_pci_match(const struct rte_pci_driver *pci_drv,
const struct rte_pci_device *pci_dev)
{
diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index 8b123391c..eab84c7a4 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -366,6 +366,21 @@ int rte_pci_scan(void);
int
rte_pci_probe(void);
+/*
+ * Match the PCI Driver and Device using the ID Table
+ *
+ * @param pci_drv
+ * PCI driver from which ID table would be extracted
+ * @param pci_dev
+ * PCI device to match against the driver
+ * @return
+ * 1 for successful match
+ * 0 for unsuccessful match
+ */
+int
+rte_pci_match(const struct rte_pci_driver *pci_drv,
+ const struct rte_pci_device *pci_dev);
+
/**
* Map the PCI device resources in user space virtual memory address
*
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index 8c08b8d1e..287cc75cd 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -243,3 +243,10 @@ EXPERIMENTAL {
rte_service_start_with_defaults;
} DPDK_17.08;
+
+DPDK_17.11 {
+ global:
+
+ rte_pci_match;
+
+} DPDK_17.08;
--
2.14.1
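
For readers who have not looked at the function being exported, the idea of matching a device against a driver's ID table can be sketched as follows. This is a simplified model under stated assumptions, not the body of rte_pci_match(): `struct pci_id_model`, `ANY_ID_MODEL` and `pci_match_model()` are hypothetical, only vendor and device IDs are compared, and the table is assumed to end with an entry whose vendor_id is 0.

```c
#include <stddef.h>
#include <stdint.h>

#define ANY_ID_MODEL 0xffff /* hypothetical wildcard value */

/* Hypothetical, reduced PCI ID: vendor and device only. */
struct pci_id_model {
	uint16_t vendor_id;
	uint16_t device_id;
};

/*
 * Return 1 if dev matches some entry of table, 0 otherwise.
 * The table is terminated by an entry whose vendor_id is 0.
 */
static int
pci_match_model(const struct pci_id_model *table,
		const struct pci_id_model *dev)
{
	const struct pci_id_model *id;

	if (table == NULL || dev == NULL)
		return 0;
	for (id = table; id->vendor_id != 0; id++) {
		if (id->vendor_id != dev->vendor_id &&
		    id->vendor_id != ANY_ID_MODEL)
			continue; /* vendor differs and is not a wildcard */
		if (id->device_id != dev->device_id &&
		    id->device_id != ANY_ID_MODEL)
			continue; /* device differs and is not a wildcard */
		return 1;
	}
	return 0;
}
```

The return convention (1 for a match, 0 otherwise) follows the comment that this patch moves into rte_pci.h.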
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v10 2/9] eal/pci: get iommu class
2017-10-06 11:03 ` [dpdk-dev] [PATCH v10 " Santosh Shukla
2017-10-06 11:03 ` [dpdk-dev] [PATCH v10 1/9] eal/pci: export match function Santosh Shukla
@ 2017-10-06 11:03 ` Santosh Shukla
2017-10-06 11:03 ` [dpdk-dev] [PATCH v10 3/9] linuxapp/eal_pci: " Santosh Shukla
` (7 subsequent siblings)
9 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-10-06 11:03 UTC (permalink / raw)
To: olivier.matz, dev
Cc: thomas, jerin.jacob, hemant.agrawal, aconole, stephen,
anatoly.burakov, gaetan.rivet, shreyansh.jain, bruce.richardson,
sergio.gonzalez.monroy, maxime.coquelin, Santosh Shukla
Introducing rte_pci_get_iommu_class API which helps to get iommu class
of PCI device on the bus and returns preferred iova mapping mode for
PCI bus.
Patch also adds rte_pci_get_iommu_class definition for:
- bsdapp: api returns default iova mode.
- linuxapp: stub implementation; a follow-up patch has the complete
implementation.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
lib/librte_eal/bsdapp/eal/eal_pci.c | 10 ++++++++++
lib/librte_eal/bsdapp/eal/rte_eal_version.map | 1 +
lib/librte_eal/common/include/rte_bus.h | 15 +++++++++++++++
lib/librte_eal/common/include/rte_pci.h | 11 +++++++++++
lib/librte_eal/linuxapp/eal/eal_pci.c | 9 +++++++++
lib/librte_eal/linuxapp/eal/rte_eal_version.map | 1 +
6 files changed, 47 insertions(+)
diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c b/lib/librte_eal/bsdapp/eal/eal_pci.c
index 04eacdcc7..e2c252320 100644
--- a/lib/librte_eal/bsdapp/eal/eal_pci.c
+++ b/lib/librte_eal/bsdapp/eal/eal_pci.c
@@ -403,6 +403,16 @@ rte_pci_scan(void)
return -1;
}
+/*
+ * Get iommu class of pci devices on the bus.
+ */
+enum rte_iova_mode
+rte_pci_get_iommu_class(void)
+{
+ /* Supports only RTE_KDRV_NIC_UIO */
+ return RTE_IOVA_PA;
+}
+
int
pci_update_device(const struct rte_pci_addr *addr)
{
diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index cfbf8fbd0..c6ffd9399 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -243,5 +243,6 @@ DPDK_17.11 {
global:
rte_pci_match;
+ rte_pci_get_iommu_class;
} DPDK_17.08;
diff --git a/lib/librte_eal/common/include/rte_bus.h b/lib/librte_eal/common/include/rte_bus.h
index 8f8b09954..e59c21659 100644
--- a/lib/librte_eal/common/include/rte_bus.h
+++ b/lib/librte_eal/common/include/rte_bus.h
@@ -55,6 +55,21 @@ extern "C" {
/** Double linked list of buses */
TAILQ_HEAD(rte_bus_list, rte_bus);
+
+/**
+ * IOVA mapping mode.
+ *
+ * IOVA mapping mode is iommu programming mode of a device.
+ * That device(for example: iommu backed dma device) based
+ * on rte_iova_mode will generate physical or virtual address.
+ *
+ */
+enum rte_iova_mode {
+ RTE_IOVA_DC = 0, /* Don't care mode */
+ RTE_IOVA_PA = (1 << 0), /* DMA using physical address */
+ RTE_IOVA_VA = (1 << 1) /* DMA using virtual address */
+};
+
/**
* Bus specific scan for devices attached on the bus.
* For each bus object, the scan would be responsible for finding devices and
diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index eab84c7a4..0e36de093 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -381,6 +381,17 @@ int
rte_pci_match(const struct rte_pci_driver *pci_drv,
const struct rte_pci_device *pci_dev);
+
+/**
+ * Get iommu class of PCI devices on the bus.
+ * And return their preferred iova mapping mode.
+ *
+ * @return
+ * - enum rte_iova_mode.
+ */
+enum rte_iova_mode
+rte_pci_get_iommu_class(void);
+
/**
* Map the PCI device resources in user space virtual memory address
*
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c b/lib/librte_eal/linuxapp/eal/eal_pci.c
index 8951ce742..26f2be822 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -487,6 +487,15 @@ rte_pci_scan(void)
return -1;
}
+/*
+ * Get iommu class of pci devices on the bus.
+ */
+enum rte_iova_mode
+rte_pci_get_iommu_class(void)
+{
+ return RTE_IOVA_PA;
+}
+
/* Read PCI config space. */
int rte_pci_read_config(const struct rte_pci_device *device,
void *buf, size_t len, off_t offset)
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index 287cc75cd..a8c8ea4f4 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -248,5 +248,6 @@ DPDK_17.11 {
global:
rte_pci_match;
+ rte_pci_get_iommu_class;
} DPDK_17.08;
--
2.14.1
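
The enum values above are distinct bits, so the preferences of several buses can be OR-ed together. The sketch below shows one way a caller might do that; it is an assumption for illustration, not something this patch defines, and `enum iova_mode_model` and `combine_bus_modes()` are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical mirror of enum rte_iova_mode. */
enum iova_mode_model { IOVA_DC = 0, IOVA_PA = 1 << 0, IOVA_VA = 1 << 1 };

/*
 * Combine the modes reported by n buses into a single mode.
 * RTE_IOVA_DC contributes no bits, so it never changes the result.
 */
static enum iova_mode_model
combine_bus_modes(const enum iova_mode_model *modes, size_t n)
{
	int acc = IOVA_DC;
	size_t i;

	for (i = 0; i < n; i++)
		acc |= modes[i];
	/* Only an unambiguous VA preference wins; anything else,
	 * including "no preference at all", falls back to PA. */
	return acc == IOVA_VA ? IOVA_VA : IOVA_PA;
}
```

With this rule, one bus asking for PA is enough to override any number of buses asking for VA, which is the conservative outcome.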
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v10 3/9] linuxapp/eal_pci: get iommu class
2017-10-06 11:03 ` [dpdk-dev] [PATCH v10 " Santosh Shukla
2017-10-06 11:03 ` [dpdk-dev] [PATCH v10 1/9] eal/pci: export match function Santosh Shukla
2017-10-06 11:03 ` [dpdk-dev] [PATCH v10 2/9] eal/pci: get iommu class Santosh Shukla
@ 2017-10-06 11:03 ` Santosh Shukla
2017-10-11 1:47 ` Tan, Jianfeng
2017-10-06 11:03 ` [dpdk-dev] [PATCH v10 4/9] bus: " Santosh Shukla
` (6 subsequent siblings)
9 siblings, 1 reply; 248+ messages in thread
From: Santosh Shukla @ 2017-10-06 11:03 UTC (permalink / raw)
To: olivier.matz, dev
Cc: thomas, jerin.jacob, hemant.agrawal, aconole, stephen,
anatoly.burakov, gaetan.rivet, shreyansh.jain, bruce.richardson,
sergio.gonzalez.monroy, maxime.coquelin, Santosh Shukla
Get the iommu class of the PCI devices on the bus and return the
preferred iova mapping mode for that bus.
Patch also introduces RTE_PCI_DRV_IOVA_AS_VA drv flag.
Flag used when driver needs to operate in iova=va mode.
Algorithm for iova scheme selection for PCI bus:
0. If no device bound then return with RTE_IOVA_DC mapping mode,
else goto 1).
1. Look for device attached to vfio kdrv and has .drv_flag set
to RTE_PCI_DRV_IOVA_AS_VA.
2. Look for any device attached to UIO class of driver.
3. Check for vfio-noiommu mode enabled.
If 2) & 3) are false and 1) is true, then select
mapping scheme as RTE_IOVA_VA. Otherwise use default
mapping scheme (RTE_IOVA_PA).
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
lib/librte_eal/common/include/rte_pci.h | 2 +
lib/librte_eal/linuxapp/eal/eal_pci.c | 89 ++++++++++++++++++++++++++++++++-
lib/librte_eal/linuxapp/eal/eal_vfio.c | 19 +++++++
lib/librte_eal/linuxapp/eal/eal_vfio.h | 4 ++
4 files changed, 113 insertions(+), 1 deletion(-)
diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index 0e36de093..a67d77f22 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -202,6 +202,8 @@ struct rte_pci_bus {
#define RTE_PCI_DRV_INTR_RMV 0x0010
/** Device driver needs to keep mapped resources if unsupported dev detected */
#define RTE_PCI_DRV_KEEP_MAPPED_RES 0x0020
+/** Device driver supports iova as va */
+#define RTE_PCI_DRV_IOVA_AS_VA 0X0040
/**
* A structure describing a PCI mapping.
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c b/lib/librte_eal/linuxapp/eal/eal_pci.c
index 26f2be822..b4dbf953a 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -45,6 +45,7 @@
#include "eal_filesystem.h"
#include "eal_private.h"
#include "eal_pci_init.h"
+#include "eal_vfio.h"
/**
* @file
@@ -488,11 +489,97 @@ rte_pci_scan(void)
}
/*
- * Get iommu class of pci devices on the bus.
+ * Is pci device bound to any kdrv
+ */
+static inline int
+pci_one_device_is_bound(void)
+{
+ struct rte_pci_device *dev = NULL;
+ int ret = 0;
+
+ FOREACH_DEVICE_ON_PCIBUS(dev) {
+ if (dev->kdrv == RTE_KDRV_UNKNOWN ||
+ dev->kdrv == RTE_KDRV_NONE) {
+ continue;
+ } else {
+ ret = 1;
+ break;
+ }
+ }
+ return ret;
+}
+
+/*
+ * Any one of the device bound to uio
+ */
+static inline int
+pci_one_device_bound_uio(void)
+{
+ struct rte_pci_device *dev = NULL;
+
+ FOREACH_DEVICE_ON_PCIBUS(dev) {
+ if (dev->kdrv == RTE_KDRV_IGB_UIO ||
+ dev->kdrv == RTE_KDRV_UIO_GENERIC) {
+ return 1;
+ }
+ }
+ return 0;
+}
+
+/*
+ * Any one of the device has iova as va
+ */
+static inline int
+pci_one_device_has_iova_va(void)
+{
+ struct rte_pci_device *dev = NULL;
+ struct rte_pci_driver *drv = NULL;
+
+ FOREACH_DRIVER_ON_PCIBUS(drv) {
+ if (drv && drv->drv_flags & RTE_PCI_DRV_IOVA_AS_VA) {
+ FOREACH_DEVICE_ON_PCIBUS(dev) {
+ if (dev->kdrv == RTE_KDRV_VFIO &&
+ rte_pci_match(drv, dev))
+ return 1;
+ }
+ }
+ }
+ return 0;
+}
+
+/*
+ * Get iommu class of PCI devices on the bus.
*/
enum rte_iova_mode
rte_pci_get_iommu_class(void)
{
+ bool is_bound;
+ bool is_vfio_noiommu_enabled = true;
+ bool has_iova_va;
+ bool is_bound_uio;
+
+ is_bound = pci_one_device_is_bound();
+ if (!is_bound)
+ return RTE_IOVA_DC;
+
+ has_iova_va = pci_one_device_has_iova_va();
+ is_bound_uio = pci_one_device_bound_uio();
+#ifdef VFIO_PRESENT
+ is_vfio_noiommu_enabled = vfio_noiommu_is_enabled() == true ?
+ true : false;
+#endif
+
+ if (has_iova_va && !is_bound_uio && !is_vfio_noiommu_enabled)
+ return RTE_IOVA_VA;
+
+ if (has_iova_va) {
+ RTE_LOG(WARNING, EAL, "Some devices want iova as va but pa will be used because.. ");
+ if (is_vfio_noiommu_enabled)
+ RTE_LOG(WARNING, EAL, "vfio-noiommu mode configured\n");
+ if (is_bound_uio)
+ RTE_LOG(WARNING, EAL, "few device bound to UIO\n");
+ }
+
return RTE_IOVA_PA;
}
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index 946df7e31..c8a97b7e7 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -816,4 +816,23 @@ vfio_noiommu_dma_map(int __rte_unused vfio_container_fd)
return 0;
}
+int
+vfio_noiommu_is_enabled(void)
+{
+ int fd, ret, cnt __rte_unused;
+ char c;
+
+ ret = -1;
+ fd = open(VFIO_NOIOMMU_MODE, O_RDONLY);
+ if (fd < 0)
+ return -1;
+
+ cnt = read(fd, &c, 1);
+ if (c == 'Y')
+ ret = 1;
+
+ close(fd);
+ return ret;
+}
+
#endif
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.h b/lib/librte_eal/linuxapp/eal/eal_vfio.h
index 5ff63e5d7..26ea8e119 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.h
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.h
@@ -150,6 +150,8 @@ struct vfio_config {
#define VFIO_NOIOMMU_GROUP_FMT "/dev/vfio/noiommu-%u"
#define VFIO_GET_REGION_ADDR(x) ((uint64_t) x << 40ULL)
#define VFIO_GET_REGION_IDX(x) (x >> 40)
+#define VFIO_NOIOMMU_MODE \
+ "/sys/module/vfio/parameters/enable_unsafe_noiommu_mode"
/* DMA mapping function prototype.
* Takes VFIO container fd as a parameter.
@@ -210,6 +212,8 @@ int pci_vfio_is_enabled(void);
int vfio_mp_sync_setup(void);
+int vfio_noiommu_is_enabled(void);
+
#define SOCKET_REQ_CONTAINER 0x100
#define SOCKET_REQ_GROUP 0x200
#define SOCKET_CLR_GROUP 0x300
--
2.14.1
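
The sysfs check added by this patch reads a single character from a module parameter file. A more defensive variant of that read is sketched below. It is not the patch's code: `read_yes_no_param()` is a hypothetical helper, it uses stdio instead of open()/read(), and unlike vfio_noiommu_is_enabled() it distinguishes "not 'Y'" (0) from "could not read" (-1).

```c
#include <stdio.h>

/*
 * Return 1 if the first character of the file is 'Y', 0 if it is any
 * other character, and -1 if the file cannot be opened or is empty.
 */
static int
read_yes_no_param(const char *path)
{
	FILE *f = fopen(path, "r");
	int c;

	if (f == NULL)
		return -1;
	c = fgetc(f); /* EOF covers both an empty file and a read error */
	fclose(f);
	if (c == EOF)
		return -1;
	return c == 'Y' ? 1 : 0;
}
```

Called with the path defined in the patch as VFIO_NOIOMMU_MODE ("/sys/module/vfio/parameters/enable_unsafe_noiommu_mode"), a return value of 1 would correspond to vfio-noiommu mode being enabled. Checking the result of the read also avoids comparing a character that was never written.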
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v10 3/9] linuxapp/eal_pci: get iommu class
2017-10-06 11:03 ` [dpdk-dev] [PATCH v10 3/9] linuxapp/eal_pci: " Santosh Shukla
@ 2017-10-11 1:47 ` Tan, Jianfeng
2017-10-11 4:43 ` santosh
0 siblings, 1 reply; 248+ messages in thread
From: Tan, Jianfeng @ 2017-10-11 1:47 UTC (permalink / raw)
To: Santosh Shukla, olivier.matz, dev
Cc: thomas, jerin.jacob, hemant.agrawal, aconole, stephen,
anatoly.burakov, gaetan.rivet, shreyansh.jain, bruce.richardson,
sergio.gonzalez.monroy, maxime.coquelin
Hi,
Nice patch series. But I still have a small question about below flag.
On 10/6/2017 7:03 PM, Santosh Shukla wrote:
> Get iommu class of PCI device on the bus and returns preferred iova
> mapping mode for that bus.
>
> Patch also introduces RTE_PCI_DRV_IOVA_AS_VA drv flag.
> Flag used when driver needs to operate in iova=va mode.
>
Does this flag indicate a must to use VA as IOVA, or a nice-to-have one?
In detail, above commit log says, "needs to operate in iova=va mode",
but the comment in the patch indicates this flag means "driver supports
IOVA as VA".
If it's the latter case, I would suppose all drivers support using VA
as IOVA, if the NICs are bound to vfio-pci (iommu mode). Please correct
me if I'm wrong.
Thanks,
Jianfeng
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v10 3/9] linuxapp/eal_pci: get iommu class
2017-10-11 1:47 ` Tan, Jianfeng
@ 2017-10-11 4:43 ` santosh
2017-10-11 5:31 ` Tan, Jianfeng
0 siblings, 1 reply; 248+ messages in thread
From: santosh @ 2017-10-11 4:43 UTC (permalink / raw)
To: Tan, Jianfeng, olivier.matz, dev
Cc: thomas, jerin.jacob, hemant.agrawal, aconole, stephen,
anatoly.burakov, gaetan.rivet, shreyansh.jain, bruce.richardson,
sergio.gonzalez.monroy, maxime.coquelin
On Wednesday 11 October 2017 07:17 AM, Tan, Jianfeng wrote:
> Hi,
>
> Nice patch series. But I still have a small question about below flag.
>
>
> On 10/6/2017 7:03 PM, Santosh Shukla wrote:
>> Get iommu class of PCI device on the bus and returns preferred iova
>> mapping mode for that bus.
>>
>> Patch also introduces RTE_PCI_DRV_IOVA_AS_VA drv flag.
>> Flag used when driver needs to operate in iova=va mode.
>>
> Does this flag indicate a must to use VA as IOVA, or a nice-to-have one? In detail, above commit log says, "needs to operate in iova=va mode", but the comment in the patch indicates this flag means "driver supports IOVA as VA".
>
> If it's the latter case, I would suppose all drivers support to use VA as IOVA, if the NICs are binded to vfio-pci (iommu mode). Please correct me if I'm wrong.
>
- Any iommu backed pmd could choose to use this flag.
- The reason we need it is performance for our external mempool pmd: it avoids the
phy2virt translation on each mbuf and thus saves cycles.
> Thanks,
> Jianfeng
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v10 3/9] linuxapp/eal_pci: get iommu class
2017-10-11 4:43 ` santosh
@ 2017-10-11 5:31 ` Tan, Jianfeng
2017-10-11 5:37 ` santosh
0 siblings, 1 reply; 248+ messages in thread
From: Tan, Jianfeng @ 2017-10-11 5:31 UTC (permalink / raw)
To: santosh, olivier.matz, dev
Cc: thomas, jerin.jacob, hemant.agrawal, aconole, stephen,
anatoly.burakov, gaetan.rivet, shreyansh.jain, bruce.richardson,
sergio.gonzalez.monroy, maxime.coquelin
On 10/11/2017 12:43 PM, santosh wrote:
> On Wednesday 11 October 2017 07:17 AM, Tan, Jianfeng wrote:
>> Hi,
>>
>> Nice patch series. But I still have a small question about below flag.
>>
>>
>> On 10/6/2017 7:03 PM, Santosh Shukla wrote:
>>> Get iommu class of PCI device on the bus and returns preferred iova
>>> mapping mode for that bus.
>>>
>>> Patch also introduces RTE_PCI_DRV_IOVA_AS_VA drv flag.
>>> Flag used when driver needs to operate in iova=va mode.
>>>
>> Does this flag indicate a must to use VA as IOVA, or a nice-to-have one? In detail, above commit log says, "needs to operate in iova=va mode", but the comment in the patch indicates this flag means "driver supports IOVA as VA".
>>
>> If it's the latter case, I would suppose all drivers support to use VA as IOVA, if the NICs are binded to vfio-pci (iommu mode). Please correct me if I'm wrong.
>>
> - Any iommu backed pmd could choose to use this flag.
But if this can be assumed for all PMDs, why do we go to the trouble of
introducing this flag?
> - Reasoning for need was performance for our external mempool pmd: avoid phy2virt translation on
> mbuf thus save cycles.
>
Agreed, and it's also for running DPDK without root privilege.
Thanks,
Jianfeng
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v10 3/9] linuxapp/eal_pci: get iommu class
2017-10-11 5:31 ` Tan, Jianfeng
@ 2017-10-11 5:37 ` santosh
2017-10-11 7:04 ` Tan, Jianfeng
0 siblings, 1 reply; 248+ messages in thread
From: santosh @ 2017-10-11 5:37 UTC (permalink / raw)
To: Tan, Jianfeng, olivier.matz, dev
Cc: thomas, jerin.jacob, hemant.agrawal, aconole, stephen,
anatoly.burakov, gaetan.rivet, shreyansh.jain, bruce.richardson,
sergio.gonzalez.monroy, maxime.coquelin
On Wednesday 11 October 2017 11:01 AM, Tan, Jianfeng wrote:
>
>
> On 10/11/2017 12:43 PM, santosh wrote:
>> On Wednesday 11 October 2017 07:17 AM, Tan, Jianfeng wrote:
>>> Hi,
>>>
>>> Nice patch series. But I still have a small question about below flag.
>>>
>>>
>>> On 10/6/2017 7:03 PM, Santosh Shukla wrote:
>>>> Get iommu class of PCI device on the bus and returns preferred iova
>>>> mapping mode for that bus.
>>>>
>>>> Patch also introduces RTE_PCI_DRV_IOVA_AS_VA drv flag.
>>>> Flag used when driver needs to operate in iova=va mode.
>>>>
>>> Does this flag indicate a must to use VA as IOVA, or a nice-to-have one? In detail, above commit log says, "needs to operate in iova=va mode", but the comment in the patch indicates this flag means "driver supports IOVA as VA".
>>>
>>> If it's the latter case, I would suppose all drivers support to use VA as IOVA, if the NICs are binded to vfio-pci (iommu mode). Please correct me if I'm wrong.
>>>
>> - Any iommu backed pmd could choose to use this flag.
>
> But if this is characterized by assumption for all PMDs, why do we trouble to introduce this flag.
>
To hint to the bus layer that _this_ driver chooses iova=va mapping; the default is iova=pa.
Thanks.
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v10 3/9] linuxapp/eal_pci: get iommu class
2017-10-11 5:37 ` santosh
@ 2017-10-11 7:04 ` Tan, Jianfeng
2017-10-11 7:10 ` santosh
0 siblings, 1 reply; 248+ messages in thread
From: Tan, Jianfeng @ 2017-10-11 7:04 UTC (permalink / raw)
To: santosh, olivier.matz, dev
Cc: thomas, jerin.jacob, hemant.agrawal, aconole, stephen, Burakov,
Anatoly, gaetan.rivet, shreyansh.jain, Richardson, Bruce,
Gonzalez Monroy, Sergio, maxime.coquelin
> -----Original Message-----
> From: santosh [mailto:santosh.shukla@caviumnetworks.com]
> Sent: Wednesday, October 11, 2017 1:38 PM
> To: Tan, Jianfeng; olivier.matz@6wind.com; dev@dpdk.org
> Cc: thomas@monjalon.net; jerin.jacob@caviumnetworks.com;
> hemant.agrawal@nxp.com; aconole@redhat.com;
> stephen@networkplumber.org; Burakov, Anatoly; gaetan.rivet@6wind.com;
> shreyansh.jain@nxp.com; Richardson, Bruce; Gonzalez Monroy, Sergio;
> maxime.coquelin@redhat.com
> Subject: Re: [dpdk-dev] [PATCH v10 3/9] linuxapp/eal_pci: get iommu class
>
>
> On Wednesday 11 October 2017 11:01 AM, Tan, Jianfeng wrote:
> >
> >
> > On 10/11/2017 12:43 PM, santosh wrote:
> >> On Wednesday 11 October 2017 07:17 AM, Tan, Jianfeng wrote:
> >>> Hi,
> >>>
> >>> Nice patch series. But I still have a small question about below flag.
> >>>
> >>>
> >>> On 10/6/2017 7:03 PM, Santosh Shukla wrote:
> >>>> Get iommu class of PCI device on the bus and returns preferred iova
> >>>> mapping mode for that bus.
> >>>>
> >>>> Patch also introduces RTE_PCI_DRV_IOVA_AS_VA drv flag.
> >>>> Flag used when driver needs to operate in iova=va mode.
> >>>>
> >>> Does this flag indicate a must to use VA as IOVA, or a nice-to-have one?
> In detail, above commit log says, "needs to operate in iova=va mode", but
> the comment in the patch indicates this flag means "driver supports IOVA as
> VA".
> >>>
> >>> If it's the latter case, I would suppose all drivers support to use VA as
> IOVA, if the NICs are binded to vfio-pci (iommu mode). Please correct me if
> I'm wrong.
> >>>
> >> - Any iommu backed pmd could choose to use this flag.
> >
> > But if this is characterized by assumption for all PMDs, why do we trouble
> to introduce this flag.
> >
> to hint bus layer about iova=va mapping choice for _this_ driver and default
> is iova=pa.
>
So it sounds like, if this flag is set by some PMD, we must use iova=va.
Then how about we enable iova=va only if all PCI devices are bound to vfio-pci (iommu mode)?
Thanks,
Jianfeng
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v10 3/9] linuxapp/eal_pci: get iommu class
2017-10-11 7:04 ` Tan, Jianfeng
@ 2017-10-11 7:10 ` santosh
2017-10-11 8:31 ` Tan, Jianfeng
0 siblings, 1 reply; 248+ messages in thread
From: santosh @ 2017-10-11 7:10 UTC (permalink / raw)
To: Tan, Jianfeng, olivier.matz, dev
Cc: thomas, jerin.jacob, hemant.agrawal, aconole, stephen, Burakov,
Anatoly, gaetan.rivet, shreyansh.jain, Richardson, Bruce,
Gonzalez Monroy, Sergio, maxime.coquelin
On Wednesday 11 October 2017 12:34 PM, Tan, Jianfeng wrote:
>
>> -----Original Message-----
>> From: santosh [mailto:santosh.shukla@caviumnetworks.com]
>> Sent: Wednesday, October 11, 2017 1:38 PM
>> To: Tan, Jianfeng; olivier.matz@6wind.com; dev@dpdk.org
>> Cc: thomas@monjalon.net; jerin.jacob@caviumnetworks.com;
>> hemant.agrawal@nxp.com; aconole@redhat.com;
>> stephen@networkplumber.org; Burakov, Anatoly; gaetan.rivet@6wind.com;
>> shreyansh.jain@nxp.com; Richardson, Bruce; Gonzalez Monroy, Sergio;
>> maxime.coquelin@redhat.com
>> Subject: Re: [dpdk-dev] [PATCH v10 3/9] linuxapp/eal_pci: get iommu class
>>
>>
>> On Wednesday 11 October 2017 11:01 AM, Tan, Jianfeng wrote:
>>>
>>> On 10/11/2017 12:43 PM, santosh wrote:
>>>> On Wednesday 11 October 2017 07:17 AM, Tan, Jianfeng wrote:
>>>>> Hi,
>>>>>
>>>>> Nice patch series. But I still have a small question about below flag.
>>>>>
>>>>>
>>>>> On 10/6/2017 7:03 PM, Santosh Shukla wrote:
>>>>>> Get iommu class of PCI device on the bus and returns preferred iova
>>>>>> mapping mode for that bus.
>>>>>>
>>>>>> Patch also introduces RTE_PCI_DRV_IOVA_AS_VA drv flag.
>>>>>> Flag used when driver needs to operate in iova=va mode.
>>>>>>
>>>>> Does this flag indicate a must to use VA as IOVA, or a nice-to-have one?
>> In detail, above commit log says, "needs to operate in iova=va mode", but
>> the comment in the patch indicates this flag means "driver supports IOVA as
>> VA".
>>>>> If it's the latter case, I would suppose all drivers support to use VA as
>> IOVA, if the NICs are binded to vfio-pci (iommu mode). Please correct me if
>> I'm wrong.
>>>> - Any iommu backed pmd could choose to use this flag.
>>> But if this is characterized by assumption for all PMDs, why do we trouble
>> to introduce this flag.
>> to hint bus layer about iova=va mapping choice for _this_ driver and default
>> is iova=pa.
>>
> So that sounds if this flag is set by some PMD, we must use iova=va.
>
> Then how about we enable this, iova=va, if only all PCI devices are binded to vfio-pci (iommu mode)?
Right, I proposed the same (in v2, I guess): the bus autodetects the iova mode and, if it sees
all devices bound to vfio-pci, autoselects iova=va. In the v3 series discussion (I guess) it was
concluded that it is better to send a hint from the driver. Refer to the work history. The bus
still does the said auto-detection, though.
Thanks.
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v10 3/9] linuxapp/eal_pci: get iommu class
2017-10-11 7:10 ` santosh
@ 2017-10-11 8:31 ` Tan, Jianfeng
2017-10-11 8:51 ` santosh
0 siblings, 1 reply; 248+ messages in thread
From: Tan, Jianfeng @ 2017-10-11 8:31 UTC (permalink / raw)
To: santosh, olivier.matz, dev
Cc: thomas, jerin.jacob, hemant.agrawal, aconole, stephen, Burakov,
Anatoly, gaetan.rivet, shreyansh.jain, Richardson, Bruce,
Gonzalez Monroy, Sergio, maxime.coquelin
> > Then how about we enable this, iova=va, if only all PCI devices are binded
> to vfio-pci (iommu mode)?
>
> Right, same I proposed (I guess) in v2 such that iova bus autodetecting in
> case see all device bound
> to vfio-pci then autoselect iova=va, in v3 series (I guess) discussion: it was
> concluded that
> better to send hint from driver. Refer work history, though iova bus still does
> said
> auto-detection.
Sorry, I missed that. I tend to think that almost all PMDs for physical devices should add this flag, then.
Thanks,
Jianfeng
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v10 3/9] linuxapp/eal_pci: get iommu class
2017-10-11 8:31 ` Tan, Jianfeng
@ 2017-10-11 8:51 ` santosh
0 siblings, 0 replies; 248+ messages in thread
From: santosh @ 2017-10-11 8:51 UTC (permalink / raw)
To: Tan, Jianfeng, olivier.matz, dev
Cc: thomas, jerin.jacob, hemant.agrawal, aconole, stephen, Burakov,
Anatoly, gaetan.rivet, shreyansh.jain, Richardson, Bruce,
Gonzalez Monroy, Sergio, maxime.coquelin
On Wednesday 11 October 2017 02:01 PM, Tan, Jianfeng wrote:
>>> Then how about we enable this, iova=va, if only all PCI devices are binded
>> to vfio-pci (iommu mode)?
>>
>> Right, same I proposed (I guess) in v2 such that iova bus autodetecting in
>> case see all device bound
>> to vfio-pci then autoselect iova=va, in v3 series (I guess) discussion: it was
>> concluded that
>> better to send hint from driver. Refer work history, though iova bus still does
>> said
>> auto-detection.
> Sorry I missed that. I tend to think that almost all PMDs for physical devices shall add this flag then.
IMO +1, but the decision is up to the PMD owner.
Thanks.
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v10 4/9] bus: get iommu class
2017-10-06 11:03 ` [dpdk-dev] [PATCH v10 " Santosh Shukla
` (2 preceding siblings ...)
2017-10-06 11:03 ` [dpdk-dev] [PATCH v10 3/9] linuxapp/eal_pci: " Santosh Shukla
@ 2017-10-06 11:03 ` Santosh Shukla
2017-10-06 11:03 ` [dpdk-dev] [PATCH v10 5/9] eal: introduce helper API for iova mode Santosh Shukla
` (5 subsequent siblings)
9 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-10-06 11:03 UTC (permalink / raw)
To: olivier.matz, dev
Cc: thomas, jerin.jacob, hemant.agrawal, aconole, stephen,
anatoly.burakov, gaetan.rivet, shreyansh.jain, bruce.richardson,
sergio.gonzalez.monroy, maxime.coquelin, Santosh Shukla
The rte_bus_get_iommu_class() API automatically detects and selects the
appropriate iova mapping scheme for the iommu-capable devices on the buses.
Algorithm for iova scheme selection:
0. Iterate through bus_list.
1. Collect each bus's iova mode value and OR it into the 'mode' var.
2. Mode selection scheme is:
if mode == 0 then iova mode is _pa,
if mode == 1 then iova mode is _pa,
if mode == 2 then iova mode is _va,
if mode == 3 then iova mode is _pa.
So any mode != 2 results in the default iova mode (_pa).
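The selection rule above can be sketched as a small standalone function. This is only an illustration, not the patch itself: the enum `iova_mode` and the function `combine_bus_modes()` are stand-in names, and the numeric values (0 = don't care, 1 = _pa, 2 = _va) are taken from the mode table in this commit message.

```c
#include <assert.h>

/* Stand-in for the series' enum rte_iova_mode; values follow the table above. */
enum iova_mode {
	IOVA_DC = 0, /* don't care: the bus has no opinion */
	IOVA_PA = 1, /* the bus wants iova = physical address */
	IOVA_VA = 2  /* the bus wants iova = virtual address */
};

/* Hypothetical helper: OR the per-bus answers together, then pick the mode. */
static enum iova_mode
combine_bus_modes(const enum iova_mode *bus_modes, int nb_buses)
{
	int mode = IOVA_DC;
	int i;

	assert(nb_buses >= 0);
	for (i = 0; i < nb_buses; i++)
		mode |= bus_modes[i];

	/* Only mode == 2 selects _va; 0, 1 and 3 fall back to the default _pa. */
	return (mode == IOVA_VA) ? IOVA_VA : IOVA_PA;
}
```

With this rule a single bus that answers _pa is enough to force _pa for the whole system, because the OR-ed value becomes 3.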
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
lib/librte_eal/bsdapp/eal/rte_eal_version.map | 1 +
lib/librte_eal/common/eal_common_bus.c | 23 +++++++++++++++++++++++
lib/librte_eal/common/eal_common_pci.c | 1 +
lib/librte_eal/common/include/rte_bus.h | 25 +++++++++++++++++++++++++
lib/librte_eal/linuxapp/eal/rte_eal_version.map | 1 +
5 files changed, 51 insertions(+)
diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index c6ffd9399..3466eaf20 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -244,5 +244,6 @@ DPDK_17.11 {
rte_pci_match;
rte_pci_get_iommu_class;
+ rte_bus_get_iommu_class;
} DPDK_17.08;
diff --git a/lib/librte_eal/common/eal_common_bus.c b/lib/librte_eal/common/eal_common_bus.c
index 08bec2d93..a30a8982e 100644
--- a/lib/librte_eal/common/eal_common_bus.c
+++ b/lib/librte_eal/common/eal_common_bus.c
@@ -222,3 +222,26 @@ rte_bus_find_by_device_name(const char *str)
c[0] = '\0';
return rte_bus_find(NULL, bus_can_parse, name);
}
+
+
+/*
+ * Get iommu class of devices on the bus.
+ */
+enum rte_iova_mode
+rte_bus_get_iommu_class(void)
+{
+ int mode = RTE_IOVA_DC;
+ struct rte_bus *bus;
+
+ TAILQ_FOREACH(bus, &rte_bus_list, next) {
+
+ if (bus->get_iommu_class)
+ mode |= bus->get_iommu_class();
+ }
+
+ if (mode != RTE_IOVA_VA) {
+ /* Use default IOVA mode */
+ mode = RTE_IOVA_PA;
+ }
+ return mode;
+}
diff --git a/lib/librte_eal/common/eal_common_pci.c b/lib/librte_eal/common/eal_common_pci.c
index 3b7d0a0ee..0f0e4b93b 100644
--- a/lib/librte_eal/common/eal_common_pci.c
+++ b/lib/librte_eal/common/eal_common_pci.c
@@ -564,6 +564,7 @@ struct rte_pci_bus rte_pci_bus = {
.plug = pci_plug,
.unplug = pci_unplug,
.parse = pci_parse,
+ .get_iommu_class = rte_pci_get_iommu_class,
},
.device_list = TAILQ_HEAD_INITIALIZER(rte_pci_bus.device_list),
.driver_list = TAILQ_HEAD_INITIALIZER(rte_pci_bus.driver_list),
diff --git a/lib/librte_eal/common/include/rte_bus.h b/lib/librte_eal/common/include/rte_bus.h
index e59c21659..3a5891595 100644
--- a/lib/librte_eal/common/include/rte_bus.h
+++ b/lib/librte_eal/common/include/rte_bus.h
@@ -183,6 +183,20 @@ struct rte_bus_conf {
enum rte_bus_scan_mode scan_mode; /**< Scan policy. */
};
+
+/**
+ * Get the common iommu class of all the devices on the bus. The bus may
+ * check that those devices are attached to an iommu driver.
+ * If no devices are attached to the bus, the bus may return the don't care
+ * (_DC) value.
+ * Otherwise, the bus will return the appropriate _pa or _va iova mode.
+ *
+ * @return
+ * enum rte_iova_mode value.
+ */
+typedef enum rte_iova_mode (*rte_bus_get_iommu_class_t)(void);
+
+
/**
* A structure describing a generic bus.
*/
@@ -196,6 +210,7 @@ struct rte_bus {
rte_bus_unplug_t unplug; /**< Remove single device from driver */
rte_bus_parse_t parse; /**< Parse a device name */
struct rte_bus_conf conf; /**< Bus configuration */
+ rte_bus_get_iommu_class_t get_iommu_class; /**< Get iommu class */
};
/**
@@ -295,6 +310,16 @@ struct rte_bus *rte_bus_find_by_device(const struct rte_device *dev);
*/
struct rte_bus *rte_bus_find_by_name(const char *busname);
+
+/**
+ * Get the common iommu class of devices bound on to buses available in the
+ * system. The default mode is PA.
+ *
+ * @return
+ * enum rte_iova_mode value.
+ */
+enum rte_iova_mode rte_bus_get_iommu_class(void);
+
/**
* Helper for Bus registration.
* The constructor has higher priority than PMD constructors.
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index a8c8ea4f4..9115aa3e9 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -249,5 +249,6 @@ DPDK_17.11 {
rte_pci_match;
rte_pci_get_iommu_class;
+ rte_bus_get_iommu_class;
} DPDK_17.08;
--
2.14.1
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v10 5/9] eal: introduce helper API for iova mode
2017-10-06 11:03 ` [dpdk-dev] [PATCH v10 " Santosh Shukla
` (3 preceding siblings ...)
2017-10-06 11:03 ` [dpdk-dev] [PATCH v10 4/9] bus: " Santosh Shukla
@ 2017-10-06 11:03 ` Santosh Shukla
2017-10-06 11:03 ` [dpdk-dev] [PATCH v10 6/9] eal: auto detect " Santosh Shukla
` (4 subsequent siblings)
9 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-10-06 11:03 UTC (permalink / raw)
To: olivier.matz, dev
Cc: thomas, jerin.jacob, hemant.agrawal, aconole, stephen,
anatoly.burakov, gaetan.rivet, shreyansh.jain, bruce.richardson,
sergio.gonzalez.monroy, maxime.coquelin, Santosh Shukla
Introduce the rte_eal_iova_mode() helper API. Libraries outside the EAL
use this API to detect the iova mode.
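As a rough sketch of how a library outside the EAL might consume such a helper, the code below computes the address to hand to a device for a buffer. Everything here is hypothetical and self-contained: `eal_iova_mode()` stands in for rte_eal_iova_mode(), and `struct mem_seg` and `buf_iova()` are invented for illustration only.

```c
#include <assert.h>
#include <stdint.h>

enum iova_mode { IOVA_DC = 0, IOVA_PA = 1, IOVA_VA = 2 };

/* Hypothetical stand-in for rte_eal_iova_mode(); a real library would call the EAL. */
static enum iova_mode current_mode = IOVA_PA;

static enum iova_mode
eal_iova_mode(void)
{
	return current_mode;
}

/* Hypothetical memory segment: one virtually and physically contiguous area. */
struct mem_seg {
	uintptr_t virt_base;
	uint64_t phys_base;
	uint64_t len;
};

/* Address to program into the device for a buffer located inside seg. */
static uint64_t
buf_iova(const struct mem_seg *seg, uintptr_t virt)
{
	assert(virt >= seg->virt_base && virt - seg->virt_base < seg->len);

	if (eal_iova_mode() == IOVA_VA)
		return (uint64_t)virt; /* iova = va: no translation needed */

	/* iova = pa: segment physical base plus the offset inside the segment */
	return seg->phys_base + (virt - seg->virt_base);
}
```

The point of the helper is that the caller does not need to know how the mode was chosen; it only branches on the value it gets back.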
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
lib/librte_eal/bsdapp/eal/eal.c | 6 ++++++
lib/librte_eal/bsdapp/eal/rte_eal_version.map | 1 +
lib/librte_eal/common/include/rte_eal.h | 12 ++++++++++++
lib/librte_eal/linuxapp/eal/eal.c | 6 ++++++
lib/librte_eal/linuxapp/eal/rte_eal_version.map | 1 +
5 files changed, 26 insertions(+)
diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 5fa598842..07e72203f 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -119,6 +119,12 @@ rte_eal_get_configuration(void)
return &rte_config;
}
+enum rte_iova_mode
+rte_eal_iova_mode(void)
+{
+ return rte_eal_get_configuration()->iova_mode;
+}
+
/* parse a sysfs (or other) file containing one integer value */
int
eal_parse_sysfs_value(const char *filename, unsigned long *val)
diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index 3466eaf20..6bed74dff 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -245,5 +245,6 @@ DPDK_17.11 {
rte_pci_match;
rte_pci_get_iommu_class;
rte_bus_get_iommu_class;
+ rte_eal_iova_mode;
} DPDK_17.08;
diff --git a/lib/librte_eal/common/include/rte_eal.h b/lib/librte_eal/common/include/rte_eal.h
index 559d2308e..436094d24 100644
--- a/lib/librte_eal/common/include/rte_eal.h
+++ b/lib/librte_eal/common/include/rte_eal.h
@@ -45,6 +45,7 @@
#include <rte_per_lcore.h>
#include <rte_config.h>
+#include <rte_bus.h>
#ifdef __cplusplus
extern "C" {
@@ -87,6 +88,9 @@ struct rte_config {
/** Primary or secondary configuration */
enum rte_proc_type_t process_type;
+ /** PA or VA mapping mode */
+ enum rte_iova_mode iova_mode;
+
/**
* Pointer to memory configuration, which may be shared across multiple
* DPDK instances
@@ -287,6 +291,14 @@ static inline int rte_gettid(void)
return RTE_PER_LCORE(_thread_id);
}
+/**
+ * Get the iova mode
+ *
+ * @return
+ * enum rte_iova_mode value.
+ */
+enum rte_iova_mode rte_eal_iova_mode(void);
+
/**
* Run function before main() with low priority.
*
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 48f12f44c..febbafdb3 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -128,6 +128,12 @@ rte_eal_get_configuration(void)
return &rte_config;
}
+enum rte_iova_mode
+rte_eal_iova_mode(void)
+{
+ return rte_eal_get_configuration()->iova_mode;
+}
+
/* parse a sysfs (or other) file containing one integer value */
int
eal_parse_sysfs_value(const char *filename, unsigned long *val)
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index 9115aa3e9..8e49bf5fa 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -250,5 +250,6 @@ DPDK_17.11 {
rte_pci_match;
rte_pci_get_iommu_class;
rte_bus_get_iommu_class;
+ rte_eal_iova_mode;
} DPDK_17.08;
--
2.14.1
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v10 6/9] eal: auto detect iova mode
2017-10-06 11:03 ` [dpdk-dev] [PATCH v10 " Santosh Shukla
` (4 preceding siblings ...)
2017-10-06 11:03 ` [dpdk-dev] [PATCH v10 5/9] eal: introduce helper API for iova mode Santosh Shukla
@ 2017-10-06 11:03 ` Santosh Shukla
2017-10-13 8:48 ` Maxime Coquelin
2017-10-06 11:03 ` [dpdk-dev] [PATCH v10 7/9] linuxapp/eal_vfio: honor iova mode before mapping Santosh Shukla
` (3 subsequent siblings)
9 siblings, 1 reply; 248+ messages in thread
From: Santosh Shukla @ 2017-10-06 11:03 UTC (permalink / raw)
To: olivier.matz, dev
Cc: thomas, jerin.jacob, hemant.agrawal, aconole, stephen,
anatoly.burakov, gaetan.rivet, shreyansh.jain, bruce.richardson,
sergio.gonzalez.monroy, maxime.coquelin, Santosh Shukla
iova autodetection depends on the rte_bus_scan() result. After the bus scan,
the device_list is up to date and each device in that list has its '.kdrv'
state set. That kdrv state is used to detect the iova mapping mode for that
device.
eal_option_device_parse() has a dependency on rte_bus_scan(), so the calls
below are moved up in the eal initialization order:
- eal_option_device_parse
- rte_bus_scan
The iova mapping mode is then selected based on the result of
rte_bus_get_iommu_class().
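To make the dependency on the scan result concrete, here is a deliberately simplified sketch of a per-bus check driven by the kernel-driver state of each scanned device. It is an assumption-laden illustration: `enum kdrv`, `bus_iommu_class()` and the "all devices bound to vfio" rule are simplifications based on the review discussion in this thread, and the sketch ignores driver flags entirely.

```c
#include <assert.h>
#include <stddef.h>

enum iova_mode { IOVA_DC = 0, IOVA_PA = 1, IOVA_VA = 2 };

/* Hypothetical kernel-driver states; a real bus scan would fill these in. */
enum kdrv { KDRV_NONE, KDRV_UIO, KDRV_VFIO };

/* Hypothetical per-bus callback working on the list produced by the scan. */
static enum iova_mode
bus_iommu_class(const enum kdrv *devs, size_t nb_devs)
{
	size_t i;

	assert(devs != NULL || nb_devs == 0);

	if (nb_devs == 0)
		return IOVA_DC; /* nothing was scanned on this bus: no opinion */

	for (i = 0; i < nb_devs; i++)
		if (devs[i] != KDRV_VFIO)
			return IOVA_PA; /* one device is not behind the iommu driver */

	return IOVA_VA;
}
```

If the device list is empty because the scan has not run yet, the bus can only answer "don't care", which is why the scan has to happen before the mode is selected.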
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
lib/librte_eal/bsdapp/eal/eal.c | 27 ++++++++++++++++-----------
lib/librte_eal/linuxapp/eal/eal.c | 27 ++++++++++++++++-----------
2 files changed, 32 insertions(+), 22 deletions(-)
diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 07e72203f..f003f4c04 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -541,6 +541,22 @@ rte_eal_init(int argc, char **argv)
return -1;
}
+ if (eal_option_device_parse()) {
+ rte_errno = ENODEV;
+ rte_atomic32_clear(&run_once);
+ return -1;
+ }
+
+ if (rte_bus_scan()) {
+ rte_eal_init_alert("Cannot scan the buses for devices\n");
+ rte_errno = ENODEV;
+ rte_atomic32_clear(&run_once);
+ return -1;
+ }
+
+ /* autodetect the iova mapping mode (default is iova_pa) */
+ rte_eal_get_configuration()->iova_mode = rte_bus_get_iommu_class();
+
if (internal_config.no_hugetlbfs == 0 &&
internal_config.process_type != RTE_PROC_SECONDARY &&
eal_hugepage_info_init() < 0) {
@@ -620,17 +636,6 @@ rte_eal_init(int argc, char **argv)
rte_config.master_lcore, thread_id, cpuset,
ret == 0 ? "" : "...");
- if (eal_option_device_parse()) {
- rte_errno = ENODEV;
- return -1;
- }
-
- if (rte_bus_scan()) {
- rte_eal_init_alert("Cannot scan the buses for devices\n");
- rte_errno = ENODEV;
- return -1;
- }
-
RTE_LCORE_FOREACH_SLAVE(i) {
/*
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index febbafdb3..f4901ffb6 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -798,6 +798,22 @@ rte_eal_init(int argc, char **argv)
return -1;
}
+ if (eal_option_device_parse()) {
+ rte_errno = ENODEV;
+ rte_atomic32_clear(&run_once);
+ return -1;
+ }
+
+ if (rte_bus_scan()) {
+ rte_eal_init_alert("Cannot scan the buses for devices\n");
+ rte_errno = ENODEV;
+ rte_atomic32_clear(&run_once);
+ return -1;
+ }
+
+ /* autodetect the iova mapping mode (default is iova_pa) */
+ rte_eal_get_configuration()->iova_mode = rte_bus_get_iommu_class();
+
if (internal_config.no_hugetlbfs == 0 &&
internal_config.process_type != RTE_PROC_SECONDARY &&
internal_config.xen_dom0_support == 0 &&
@@ -895,17 +911,6 @@ rte_eal_init(int argc, char **argv)
return -1;
}
- if (eal_option_device_parse()) {
- rte_errno = ENODEV;
- return -1;
- }
-
- if (rte_bus_scan()) {
- rte_eal_init_alert("Cannot scan the buses for devices\n");
- rte_errno = ENODEV;
- return -1;
- }
-
RTE_LCORE_FOREACH_SLAVE(i) {
/*
--
2.14.1
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v10 6/9] eal: auto detect iova mode
2017-10-06 11:03 ` [dpdk-dev] [PATCH v10 6/9] eal: auto detect " Santosh Shukla
@ 2017-10-13 8:48 ` Maxime Coquelin
2017-10-13 9:58 ` Thomas Monjalon
0 siblings, 1 reply; 248+ messages in thread
From: Maxime Coquelin @ 2017-10-13 8:48 UTC (permalink / raw)
To: Santosh Shukla, olivier.matz, dev, thomas
Cc: jerin.jacob, hemant.agrawal, aconole, stephen, anatoly.burakov,
gaetan.rivet, shreyansh.jain, bruce.richardson,
sergio.gonzalez.monroy
Hi Santosh,
On 10/06/2017 01:03 PM, Santosh Shukla wrote:
> iova autodetection depends on rte_bus_scan result. Result of bus scan will
> have updated device_list and each device in that list has its '.kdev' state
> updated. That kdrv state used to detect iova mapping mode for that device.
>
> _device_parse() has dependency on rt_bus_scan so,
> Below calls moved up in the eal initialization order:
> - eal_option_device_parse
> - rte_bus_scan
>
> And based on the result of rte_bus_scan_iommu_class - select iova
> mapping mode.
>
> Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
> Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
> ---
> lib/librte_eal/bsdapp/eal/eal.c | 27 ++++++++++++++++-----------
> lib/librte_eal/linuxapp/eal/eal.c | 27 ++++++++++++++++-----------
> 2 files changed, 32 insertions(+), 22 deletions(-)
We noticed a regression on current master that prevents using the Vhost
PMD with CONFIG_RTE_BUILD_SHARED_LIB=y:
# ./install/bin/testpmd --file-prefix=src -l 0,2 -n 4 --vdev
'net_vhost0,iface=/tmp/vhost-user2' -d ./install/lib/librte_pmd_vhost.so
-- --portmask=1 --disable-hw-vlan -i --rxq=1 --txq=1 --nb-cores=1
--eth-peer=0,52:54:00:11:22:12
EAL: Detected 4 lcore(s)
ERROR: failed to parse device "net_vhost0"
EAL: Unable to parse device 'net_vhost0,iface=/tmp/vhost-user2'
PANIC in main():
Cannot init EAL
5: [./install/bin/testpmd(_start+0x2a) [0x41e91a]]
4: [/lib64/libc.so.6(__libc_start_main+0xea) [0x7f551882550a]]
3: [./install/bin/testpmd(main+0x68e) [0x41e77e]]
2:
[/home/max/projects/src/mainline/dpdk/x86_64-native-linuxapp-gcc/lib/librte_eal.so.5.1(__rte_panic+0xba)
[0x7f551982c05a]]
1:
[/home/max/projects/src/mainline/dpdk/x86_64-native-linuxapp-gcc/lib/librte_eal.so.5.1(rte_dump_stack+0x1b)
[0x7f551983645b]]
Aborted (core dumped)
Git bisect seems to point to this patch:
$ git bisect log
git bisect start
# bad: [5518fc95427891e8bcf72f461cdaa38604226442] mempool/dpaa2: improve
error handling
git bisect bad 5518fc95427891e8bcf72f461cdaa38604226442
# good: [02657b4adcb8af773e26ec061b01cd7abdd3f0b6] version: 17.08.0
git bisect good 02657b4adcb8af773e26ec061b01cd7abdd3f0b6
# good: [4fa5e0bbc5730887a4a15b915bb15deb5ef1f607] net/dpaa: support
hashed RSS
git bisect good 4fa5e0bbc5730887a4a15b915bb15deb5ef1f607
# bad: [381acec2b1bd838c4a494b82c692db35573554da] eventdev: ease
single-link queue config requirements
git bisect bad 381acec2b1bd838c4a494b82c692db35573554da
# bad: [f1810113590373b157ebba555d6b51f38c8ca10f] config: enable igb_uio
on arm64
git bisect bad f1810113590373b157ebba555d6b51f38c8ca10f
# good: [69293c7762a0dbb3c28f5e93be00aaa49b52cb48] bus/fslmc: remove
unused funcs and align names in QBMAN
git bisect good 69293c7762a0dbb3c28f5e93be00aaa49b52cb48
# good: [f8244c6399d9fae6afab6770ae367aef38742ea5] ethdev: increase port
id range
git bisect good f8244c6399d9fae6afab6770ae367aef38742ea5
# bad: [680f6c12600f5d341c5968a1daeef7c5a055451b] mem: honor IOVA mode
in virt2phy
git bisect bad 680f6c12600f5d341c5968a1daeef7c5a055451b
# good: [a4f0a2dbe5abc2cadf0300fb4d5767b66254035d] pci: get IOMMU class
git bisect good a4f0a2dbe5abc2cadf0300fb4d5767b66254035d
# good: [93878cf0255e9dc21322ed99ad535adc048fa44f] eal: introduce helper
API for IOVA mode
git bisect good 93878cf0255e9dc21322ed99ad535adc048fa44f
# bad: [e85a919286d2543500bc384df206740845e85362] vfio: honor IOVA mode
before mapping
git bisect bad e85a919286d2543500bc384df206740845e85362
# bad: [cf408c22476c9f866deacac634dd17591e07a5c5] eal: auto detect IOVA mode
git bisect bad cf408c22476c9f866deacac634dd17591e07a5c5
# first bad commit: [cf408c22476c9f866deacac634dd17591e07a5c5] eal: auto
detect IOVA mode
These are the build commands I used to run the bisection:
sed -i 's/CONFIG_RTE_BUILD_SHARED_LIB=n/CONFIG_RTE_BUILD_SHARED_LIB=y/g'
config/common_base
make -j4 install T=x86_64-native-linuxapp-gcc DESTDIR=install
EXTRA_CFLAGS='-g'
sed -i 's/CONFIG_RTE_BUILD_SHARED_LIB=y/CONFIG_RTE_BUILD_SHARED_LIB=n/g'
config/common_base
Regards,
Maxime
^ permalink raw reply [flat|nested] 248+ messages in thread
* Re: [dpdk-dev] [PATCH v10 6/9] eal: auto detect iova mode
2017-10-13 8:48 ` Maxime Coquelin
@ 2017-10-13 9:58 ` Thomas Monjalon
0 siblings, 0 replies; 248+ messages in thread
From: Thomas Monjalon @ 2017-10-13 9:58 UTC (permalink / raw)
To: Maxime Coquelin, Santosh Shukla
Cc: dev, olivier.matz, jerin.jacob, hemant.agrawal, aconole, stephen,
anatoly.burakov, gaetan.rivet, shreyansh.jain, bruce.richardson,
sergio.gonzalez.monroy
13/10/2017 10:48, Maxime Coquelin:
> Hi Santosh,
>
> On 10/06/2017 01:03 PM, Santosh Shukla wrote:
> > iova autodetection depends on rte_bus_scan result. Result of bus scan will
> > have updated device_list and each device in that list has its '.kdev' state
> > updated. That kdrv state used to detect iova mapping mode for that device.
> >
> > _device_parse() has dependency on rt_bus_scan so,
> > Below calls moved up in the eal initialization order:
> > - eal_option_device_parse
> > - rte_bus_scan
> >
> > And based on the result of rte_bus_scan_iommu_class - select iova
> > mapping mode.
> >
> > Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
> > Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> > Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> > Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
> > Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
> > ---
> > lib/librte_eal/bsdapp/eal/eal.c | 27 ++++++++++++++++-----------
> > lib/librte_eal/linuxapp/eal/eal.c | 27 ++++++++++++++++-----------
> > 2 files changed, 32 insertions(+), 22 deletions(-)
>
> We noticed a regression on current master, which prevents to use Vhost
> PMD with CONFIG_RTE_BUILD_SHARED_LIB=y:
It was my guess during review:
http://dpdk.org/ml/archives/dev/2017-October/077863.html
I really don't understand how it can work,
because the bus scan is moved before shared libraries (plugins)
are loaded.
It will be even worse when the PCI and vdev buses become
shared libraries.
Is it a chicken/egg issue?
If we cannot find a good solution, we may have to revert for RC1.
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v10 7/9] linuxapp/eal_vfio: honor iova mode before mapping
2017-10-06 11:03 ` [dpdk-dev] [PATCH v10 " Santosh Shukla
` (5 preceding siblings ...)
2017-10-06 11:03 ` [dpdk-dev] [PATCH v10 6/9] eal: auto detect " Santosh Shukla
@ 2017-10-06 11:03 ` Santosh Shukla
2017-10-06 11:03 ` [dpdk-dev] [PATCH v10 8/9] linuxapp/eal_memory: honor iova mode in virt2phy Santosh Shukla
` (2 subsequent siblings)
9 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-10-06 11:03 UTC (permalink / raw)
To: olivier.matz, dev
Cc: thomas, jerin.jacob, hemant.agrawal, aconole, stephen,
anatoly.burakov, gaetan.rivet, shreyansh.jain, bruce.richardson,
sergio.gonzalez.monroy, maxime.coquelin, Santosh Shukla
Check the iova mode and map iova to pa or va accordingly.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
lib/librte_eal/linuxapp/eal/eal_vfio.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index c8a97b7e7..b32cd09a2 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -706,7 +706,10 @@ vfio_type1_dma_map(int vfio_container_fd)
dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
dma_map.vaddr = ms[i].addr_64;
dma_map.size = ms[i].len;
- dma_map.iova = ms[i].phys_addr;
+ if (rte_eal_iova_mode() == RTE_IOVA_VA)
+ dma_map.iova = dma_map.vaddr;
+ else
+ dma_map.iova = ms[i].phys_addr;
dma_map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
ret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
@@ -792,7 +795,10 @@ vfio_spapr_dma_map(int vfio_container_fd)
dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
dma_map.vaddr = ms[i].addr_64;
dma_map.size = ms[i].len;
- dma_map.iova = ms[i].phys_addr;
+ if (rte_eal_iova_mode() == RTE_IOVA_VA)
+ dma_map.iova = dma_map.vaddr;
+ else
+ dma_map.iova = ms[i].phys_addr;
dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
VFIO_DMA_MAP_FLAG_WRITE;
--
2.14.1
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v10 8/9] linuxapp/eal_memory: honor iova mode in virt2phy
2017-10-06 11:03 ` [dpdk-dev] [PATCH v10 " Santosh Shukla
` (6 preceding siblings ...)
2017-10-06 11:03 ` [dpdk-dev] [PATCH v10 7/9] linuxapp/eal_vfio: honor iova mode before mapping Santosh Shukla
@ 2017-10-06 11:03 ` Santosh Shukla
2017-10-06 11:03 ` [dpdk-dev] [PATCH v10 9/9] eal/rte_malloc: " Santosh Shukla
2017-10-06 18:40 ` [dpdk-dev] [PATCH v10 0/9] Infrastructure to detect iova mapping on the bus Thomas Monjalon
9 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-10-06 11:03 UTC (permalink / raw)
To: olivier.matz, dev
Cc: thomas, jerin.jacob, hemant.agrawal, aconole, stephen,
anatoly.burakov, gaetan.rivet, shreyansh.jain, bruce.richardson,
sergio.gonzalez.monroy, maxime.coquelin, Santosh Shukla
Check the iova mode and return the phys addr accordingly.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
lib/librte_eal/linuxapp/eal/eal_memory.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 52791282f..2d9d7c2dc 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -139,6 +139,9 @@ rte_mem_virt2phy(const void *virtaddr)
int page_size;
off_t offset;
+ if (rte_eal_iova_mode() == RTE_IOVA_VA)
+ return (uintptr_t)virtaddr;
+
/* when using dom0, /proc/self/pagemap always returns 0, check in
* dpdk memory by browsing the memsegs */
if (rte_xen_dom0_supported()) {
--
2.14.1
^ permalink raw reply [flat|nested] 248+ messages in thread
* [dpdk-dev] [PATCH v10 9/9] eal/rte_malloc: honor iova mode in virt2phy
2017-10-06 11:03 ` [dpdk-dev] [PATCH v10 " Santosh Shukla
` (7 preceding siblings ...)
2017-10-06 11:03 ` [dpdk-dev] [PATCH v10 8/9] linuxapp/eal_memory: honor iova mode in virt2phy Santosh Shukla
@ 2017-10-06 11:03 ` Santosh Shukla
2017-10-06 18:40 ` [dpdk-dev] [PATCH v10 0/9] Infrastructure to detect iova mapping on the bus Thomas Monjalon
9 siblings, 0 replies; 248+ messages in thread
From: Santosh Shukla @ 2017-10-06 11:03 UTC (permalink / raw)
To: olivier.matz, dev
Cc: thomas, jerin.jacob, hemant.agrawal, aconole, stephen,
anatoly.burakov, gaetan.rivet, shreyansh.jain, bruce.richardson,
sergio.gonzalez.monroy, maxime.coquelin, Santosh Shukla
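Check the iova mode and return the address accordingly: in RTE_IOVA_VA
mode, rte_malloc_virt2phy() returns the virtual address itself instead
of deriving it from the memseg physical address.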
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
lib/librte_eal/common/rte_malloc.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index 5c0627bf4..d65c05a4d 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -251,10 +251,17 @@ rte_malloc_set_limit(__rte_unused const char *type,
phys_addr_t
rte_malloc_virt2phy(const void *addr)
{
+ phys_addr_t paddr;
const struct malloc_elem *elem = malloc_elem_from_data(addr);
if (elem == NULL)
return RTE_BAD_PHYS_ADDR;
if (elem->ms->phys_addr == RTE_BAD_PHYS_ADDR)
return RTE_BAD_PHYS_ADDR;
- return elem->ms->phys_addr + ((uintptr_t)addr - (uintptr_t)elem->ms->addr);
+
+ if (rte_eal_iova_mode() == RTE_IOVA_VA)
+ paddr = (uintptr_t)addr;
+ else
+ paddr = elem->ms->phys_addr +
+ ((uintptr_t)addr - (uintptr_t)elem->ms->addr);
+ return paddr;
}
--
2.14.1
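For readers following the logic of the hunk above, here is a minimal
standalone sketch of the same decision. It is not the DPDK code: the
sketch_* names, the enum and the struct are hypothetical stand-ins for
rte_eal_iova_mode(), struct rte_memseg and RTE_BAD_PHYS_ADDR, and the
mode is passed as a parameter rather than read from EAL state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT the real DPDK definitions. */
typedef uint64_t sketch_phys_addr_t;
#define SKETCH_BAD_PHYS_ADDR ((sketch_phys_addr_t)-1)

enum sketch_iova_mode { SKETCH_IOVA_PA, SKETCH_IOVA_VA };

struct sketch_memseg {
	sketch_phys_addr_t phys_addr; /* physical address of segment start */
	void *addr;                   /* virtual address of segment start */
};

/*
 * Translate a virtual address that lies inside segment ms.
 * The validity checks come first, in the same order as in the patch;
 * only then does the iova mode decide how the address is produced.
 */
static sketch_phys_addr_t
sketch_virt2phy(enum sketch_iova_mode mode, const struct sketch_memseg *ms,
		const void *addr)
{
	if (ms == NULL)
		return SKETCH_BAD_PHYS_ADDR;
	if (ms->phys_addr == SKETCH_BAD_PHYS_ADDR)
		return SKETCH_BAD_PHYS_ADDR;

	if (mode == SKETCH_IOVA_VA)
		/* iova == va: hand the virtual address to the device as-is */
		return (uintptr_t)addr;

	/* iova == pa: segment base plus the offset within the segment */
	return ms->phys_addr + ((uintptr_t)addr - (uintptr_t)ms->addr);
}

/* Small self-check showing both modes on a fake segment. */
static void
sketch_selfcheck(void)
{
	static char buf[128];
	struct sketch_memseg ms = { .phys_addr = 0x100000, .addr = buf };

	assert(sketch_virt2phy(SKETCH_IOVA_PA, &ms, buf + 0x40) == 0x100040);
	assert(sketch_virt2phy(SKETCH_IOVA_VA, &ms, buf + 0x40) ==
	       (sketch_phys_addr_t)(uintptr_t)(buf + 0x40));
}
```

In VA mode the translation reduces to a cast, which is the cost saving
the series aims for; the segment lookup is still needed to reject
addresses that do not belong to a valid segment.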
* Re: [dpdk-dev] [PATCH v10 0/9] Infrastructure to detect iova mapping on the bus
2017-10-06 11:03 ` [dpdk-dev] [PATCH v10 " Santosh Shukla
` (8 preceding siblings ...)
2017-10-06 11:03 ` [dpdk-dev] [PATCH v10 9/9] eal/rte_malloc: " Santosh Shukla
@ 2017-10-06 18:40 ` Thomas Monjalon
9 siblings, 0 replies; 248+ messages in thread
From: Thomas Monjalon @ 2017-10-06 18:40 UTC (permalink / raw)
To: Santosh Shukla
Cc: dev, olivier.matz, jerin.jacob, hemant.agrawal, aconole, stephen,
anatoly.burakov, gaetan.rivet, shreyansh.jain, bruce.richardson,
sergio.gonzalez.monroy, maxime.coquelin
> Santosh Shukla (9):
> eal/pci: export match function
> eal/pci: get iommu class
> linuxapp/eal_pci: get iommu class
> bus: get iommu class
> eal: introduce helper API for iova mode
> eal: auto detect iova mode
> linuxapp/eal_vfio: honor iova mode before mapping
> linuxapp/eal_memory: honor iova mode in virt2phy
> eal/rte_malloc: honor iova mode in virt2phy
Applied with a few uppercase changes in comments, thanks.