DPDK patches and discussions
From: <jerinj@marvell.com>
To: <dev@dpdk.org>, Anatoly Burakov <anatoly.burakov@intel.com>,
	John McNamara <john.mcnamara@intel.com>,
	Marko Kovacevic <marko.kovacevic@intel.com>,
	"Igor Russkikh" <igor.russkikh@aquantia.com>,
	Pavel Belous <pavel.belous@aquantia.com>,
	Ajit Khaparde <ajit.khaparde@broadcom.com>,
	Somnath Kotur <somnath.kotur@broadcom.com>,
	Wenzhuo Lu <wenzhuo.lu@intel.com>,
	John Daley <johndale@cisco.com>,
	Hyong Youb Kim <hyonkim@cisco.com>,
	Qi Zhang <qi.z.zhang@intel.com>,
	Xiao Wang <xiao.w.wang@intel.com>,
	Beilei Xing <beilei.xing@intel.com>,
	Jingjing Wu <jingjing.wu@intel.com>,
	Qiming Yang <qiming.yang@intel.com>,
	"Konstantin Ananyev" <konstantin.ananyev@intel.com>,
	Matan Azrad <matan@mellanox.com>,
	Shahaf Shuler <shahafs@mellanox.com>,
	Yongseok Koh <yskoh@mellanox.com>,
	Viacheslav Ovsiienko <viacheslavo@mellanox.com>,
	Alejandro Lucero <alejandro.lucero@netronome.com>,
	Jerin Jacob <jerinj@marvell.com>,
	"Nithin Dabilpuram" <ndabilpuram@marvell.com>,
	Kiran Kumar K <kirankumark@marvell.com>,
	Rasesh Mody <rmody@marvell.com>,
	Shahed Shaikh <shshaikh@marvell.com>,
	Bruce Richardson <bruce.richardson@intel.com>
Cc: <thomas@monjalon.net>, <david.marchand@redhat.com>
Subject: [dpdk-dev] [PATCH v2 2/4] eal: fix IOVA mode selection as VA for pci drivers
Date: Tue, 16 Jul 2019 19:16:07 +0530
Message-ID: <20190716134609.40930-3-jerinj@marvell.com>
In-Reply-To: <20190716134609.40930-1-jerinj@marvell.com>

From: David Marchand <david.marchand@redhat.com>

The commit referenced in the Fixes tag below broke the use of
RTE_PCI_DRV_IOVA_AS_VA. This flag was intended to mean "driver only
supports VA", but most net drivers understood it as "driver supports
both PA and VA" and used it to let DPDK processes run as non-root (such
processes do not have access to physical addresses on recent kernels).

The check on physical addresses actually closed the gap for those
drivers. We don't need to mark them with RTE_PCI_DRV_IOVA_AS_VA and this
flag can retain its intended meaning.
Explicitly document its meaning.

We can now check that a driver's IOVA mode requirement is fulfilled
before trying to probe a device.

Finally, document the heuristic used to select the IOVA mode and hope
that we won't break it again.

Fixes: 703458e19c16 ("bus/pci: consider only usable devices for IOVA mode")

Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Jerin Jacob <jerinj@marvell.com>
Tested-by: Jerin Jacob <jerinj@marvell.com>
---
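The heuristic this patch documents, and the probe-time check it adds, can
be sketched in standalone C. This is a simplified illustration, not the
code of the patch: the iova_mode enum and the functions preferred_mode(),
final_mode() and device_can_probe() are hypothetical stand-ins, and mapping
DC to PA when physical addresses are available (VA otherwise) is an
assumption about how the EAL resolves the "don't care" case.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical local stand-in for DPDK's enum rte_iova_mode. */
enum iova_mode { IOVA_DC = 0, IOVA_PA = 1, IOVA_VA = 2 };

/*
 * Step 1: merge the per-bus answers into one preferred mode.
 * Buses answering DC are neutral; a PA/VA disagreement yields DC.
 */
enum iova_mode
preferred_mode(const enum iova_mode *buses, size_t n)
{
	bool want_pa = false;
	bool want_va = false;
	size_t i;

	for (i = 0; i < n; i++) {
		if (buses[i] == IOVA_PA)
			want_pa = true;
		else if (buses[i] == IOVA_VA)
			want_va = true;
	}
	if (want_va && !want_pa)
		return IOVA_VA;
	if (want_pa && !want_va)
		return IOVA_PA;
	return IOVA_DC;
}

/*
 * Step 2: reconcile the preferred mode with physical address availability.
 * Returns 0 and stores the final mode in *out, or -1 when PA is required
 * but physical addresses are not accessible (init should fail early).
 * Assumption: DC resolves to PA when available, VA otherwise.
 */
int
final_mode(enum iova_mode preferred, bool phys_addrs_available,
	   enum iova_mode *out)
{
	if (preferred == IOVA_PA && !phys_addrs_available)
		return -1;
	if (preferred == IOVA_DC)
		preferred = phys_addrs_available ? IOVA_PA : IOVA_VA;
	*out = preferred;
	return 0;
}

/* Probe-time check: a device is usable if it has no requirement (DC)
 * or if its requirement matches the mode the EAL settled on. */
bool
device_can_probe(enum iova_mode dev_mode, enum iova_mode eal_mode)
{
	return dev_mode == IOVA_DC || dev_mode == eal_mode;
}
```

For example, buses reporting {PA, VA} yield a preferred mode of DC. Without
access to physical addresses the final mode then becomes VA under the
assumption above, and device_can_probe() rejects a device that requires PA.
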
 .../prog_guide/env_abstraction_layer.rst      | 31 +++++++++++++++++++
 drivers/bus/pci/linux/pci.c                   | 16 ++++------
 drivers/bus/pci/pci_common.c                  | 30 ++++++++++++++----
 drivers/bus/pci/rte_bus_pci.h                 |  4 +--
 drivers/net/atlantic/atl_ethdev.c             |  3 +-
 drivers/net/bnxt/bnxt_ethdev.c                |  3 +-
 drivers/net/e1000/em_ethdev.c                 |  3 +-
 drivers/net/e1000/igb_ethdev.c                |  5 ++-
 drivers/net/enic/enic_ethdev.c                |  3 +-
 drivers/net/fm10k/fm10k_ethdev.c              |  3 +-
 drivers/net/i40e/i40e_ethdev.c                |  3 +-
 drivers/net/i40e/i40e_ethdev_vf.c             |  2 +-
 drivers/net/iavf/iavf_ethdev.c                |  3 +-
 drivers/net/ice/ice_ethdev.c                  |  3 +-
 drivers/net/ixgbe/ixgbe_ethdev.c              |  5 ++-
 drivers/net/mlx4/mlx4.c                       |  3 +-
 drivers/net/mlx5/mlx5.c                       |  2 +-
 drivers/net/nfp/nfp_net.c                     |  6 ++--
 drivers/net/octeontx2/otx2_ethdev.c           |  5 ---
 drivers/net/qede/qede_ethdev.c                |  6 ++--
 drivers/raw/ioat/ioat_rawdev.c                |  3 +-
 lib/librte_eal/common/eal_common_bus.c        | 30 ++++++++++++++++--
 22 files changed, 110 insertions(+), 62 deletions(-)

diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst b/doc/guides/prog_guide/env_abstraction_layer.rst
index f15bcd976..77307e3a6 100644
--- a/doc/guides/prog_guide/env_abstraction_layer.rst
+++ b/doc/guides/prog_guide/env_abstraction_layer.rst
@@ -419,6 +419,37 @@ Misc Functions
 
 Locks and atomic operations are per-architecture (i686 and x86_64).
 
+IOVA Mode Detection
+~~~~~~~~~~~~~~~~~~~
+
+IOVA Mode is selected by considering what the current usable Devices on the
+system require and/or support.
+
+Below is the 2-step heuristic for this choice.
+
+For the first step, EAL asks each bus its requirement in terms of IOVA mode
+and decides on a preferred IOVA mode.
+
+- if all buses report RTE_IOVA_PA, then the preferred IOVA mode is RTE_IOVA_PA,
+- if all buses report RTE_IOVA_VA, then the preferred IOVA mode is RTE_IOVA_VA,
+- if all buses report RTE_IOVA_DC (no bus expressed a preference), then the
+  preferred mode is RTE_IOVA_DC,
+- if the buses disagree (at least one wants RTE_IOVA_PA and at least one wants
+  RTE_IOVA_VA), then the preferred IOVA mode is RTE_IOVA_DC (see below with the
+  check on Physical Addresses availability).
+
+The second step checks whether the preferred mode is compatible with Physical
+Addresses availability, since those are only available to the root user in
+recent kernels.
+
+- if the preferred mode is RTE_IOVA_PA but there is no access to Physical
+  Addresses, then EAL init will fail early, since later probing of the devices
+  would fail anyway,
+- if the preferred mode is RTE_IOVA_DC then based on the Physical Addresses
+  availability, the preferred mode is adjusted to RTE_IOVA_PA or RTE_IOVA_VA.
+  If the buses disagreed on the IOVA Mode at the first step, some of the
+  buses won't work because of this decision.
+
 IOVA Mode Configuration
 ~~~~~~~~~~~~~~~~~~~~~~~
 
diff --git a/drivers/bus/pci/linux/pci.c b/drivers/bus/pci/linux/pci.c
index b12f10af5..1a2f99b32 100644
--- a/drivers/bus/pci/linux/pci.c
+++ b/drivers/bus/pci/linux/pci.c
@@ -578,12 +578,10 @@ pci_device_iova_mode(const struct rte_pci_driver *pdrv,
 			else
 				is_vfio_noiommu_enabled = 0;
 		}
-		if ((pdrv->drv_flags & RTE_PCI_DRV_IOVA_AS_VA) == 0) {
+		if (is_vfio_noiommu_enabled != 0)
 			iova_mode = RTE_IOVA_PA;
-		} else if (is_vfio_noiommu_enabled != 0) {
-			RTE_LOG(DEBUG, EAL, "Forcing to 'PA', vfio-noiommu mode configured\n");
-			iova_mode = RTE_IOVA_PA;
-		}
+		else if ((pdrv->drv_flags & RTE_PCI_DRV_IOVA_AS_VA) != 0)
+			iova_mode = RTE_IOVA_VA;
 #endif
 		break;
 	}
@@ -594,8 +592,8 @@ pci_device_iova_mode(const struct rte_pci_driver *pdrv,
 		break;
 
 	default:
-		RTE_LOG(DEBUG, EAL, "Unsupported kernel driver? Defaulting to IOVA as 'PA'\n");
-		iova_mode = RTE_IOVA_PA;
+		if ((pdrv->drv_flags & RTE_PCI_DRV_IOVA_AS_VA) != 0)
+			iova_mode = RTE_IOVA_VA;
 		break;
 	}
 
@@ -607,10 +605,8 @@ pci_device_iova_mode(const struct rte_pci_driver *pdrv,
 		if (iommu_no_va == -1)
 			iommu_no_va = pci_one_device_iommu_support_va(pdev)
 					? 0 : 1;
-		if (iommu_no_va != 0) {
-			RTE_LOG(DEBUG, EAL, "Forcing to 'PA', IOMMU does not support IOVA as 'VA'\n");
+		if (iommu_no_va != 0)
 			iova_mode = RTE_IOVA_PA;
-		}
 	}
 	return iova_mode;
 }
diff --git a/drivers/bus/pci/pci_common.c b/drivers/bus/pci/pci_common.c
index d2af472ef..ed55b07f3 100644
--- a/drivers/bus/pci/pci_common.c
+++ b/drivers/bus/pci/pci_common.c
@@ -169,8 +169,22 @@ rte_pci_probe_one_driver(struct rte_pci_driver *dr,
 	 * This needs to be before rte_pci_map_device(), as it enables to use
 	 * driver flags for adjusting configuration.
 	 */
-	if (!already_probed)
+	if (!already_probed) {
+		enum rte_iova_mode dev_iova_mode;
+		enum rte_iova_mode iova_mode;
+
+		dev_iova_mode = pci_device_iova_mode(dr, dev);
+		iova_mode = rte_eal_iova_mode();
+		if (dev_iova_mode != RTE_IOVA_DC &&
+		    dev_iova_mode != iova_mode) {
+			RTE_LOG(ERR, EAL, "  Expecting '%s' IOVA mode but current mode is '%s', not initializing\n",
+				dev_iova_mode == RTE_IOVA_PA ? "PA" : "VA",
+				iova_mode == RTE_IOVA_PA ? "PA" : "VA");
+			return -EINVAL;
+		}
+
 		dev->driver = dr;
+	}
 
 	if (!already_probed && (dr->drv_flags & RTE_PCI_DRV_NEED_MAPPING)) {
 		/* map resources for devices that use igb_uio */
@@ -629,12 +643,16 @@ rte_pci_get_iommu_class(void)
 				devices_want_va = true;
 		}
 	}
-	if (devices_want_pa) {
-		iova_mode = RTE_IOVA_PA;
-		if (devices_want_va)
-			RTE_LOG(WARNING, EAL, "Some devices want 'VA' but forcing 'PA' because other devices want it\n");
-	} else if (devices_want_va) {
+	if (devices_want_va && !devices_want_pa) {
 		iova_mode = RTE_IOVA_VA;
+	} else if (devices_want_pa && !devices_want_va) {
+		iova_mode = RTE_IOVA_PA;
+	} else {
+		iova_mode = RTE_IOVA_DC;
+		if (devices_want_va) {
+			RTE_LOG(WARNING, EAL, "Some devices want 'VA' but forcing 'DC' because other devices want 'PA'.\n");
+			RTE_LOG(WARNING, EAL, "Depending on the final decision by the EAL, part of your devices won't initialise.\n");
+		}
 	}
 	return iova_mode;
 }
diff --git a/drivers/bus/pci/rte_bus_pci.h b/drivers/bus/pci/rte_bus_pci.h
index 06e004cd3..0f2177564 100644
--- a/drivers/bus/pci/rte_bus_pci.h
+++ b/drivers/bus/pci/rte_bus_pci.h
@@ -187,8 +187,8 @@ struct rte_pci_bus {
 #define RTE_PCI_DRV_INTR_RMV 0x0010
 /** Device driver needs to keep mapped resources if unsupported dev detected */
 #define RTE_PCI_DRV_KEEP_MAPPED_RES 0x0020
-/** Device driver supports IOVA as VA */
-#define RTE_PCI_DRV_IOVA_AS_VA 0X0040
+/** Device driver only supports IOVA as VA and cannot work with IOVA as PA */
+#define RTE_PCI_DRV_IOVA_AS_VA 0x0040
 
 /**
  * Map the PCI device resources in user space virtual memory address
diff --git a/drivers/net/atlantic/atl_ethdev.c b/drivers/net/atlantic/atl_ethdev.c
index fdc0a7f2d..fa89ae755 100644
--- a/drivers/net/atlantic/atl_ethdev.c
+++ b/drivers/net/atlantic/atl_ethdev.c
@@ -157,8 +157,7 @@ static const struct rte_pci_id pci_id_atl_map[] = {
 
 static struct rte_pci_driver rte_atl_pmd = {
 	.id_table = pci_id_atl_map,
-	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
-		     RTE_PCI_DRV_IOVA_AS_VA,
+	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
 	.probe = eth_atl_pci_probe,
 	.remove = eth_atl_pci_remove,
 };
diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index 8fc510351..9306d5655 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -4028,8 +4028,7 @@ static int bnxt_pci_remove(struct rte_pci_device *pci_dev)
 
 static struct rte_pci_driver bnxt_rte_pmd = {
 	.id_table = bnxt_pci_id_map,
-	.drv_flags = RTE_PCI_DRV_NEED_MAPPING |
-		RTE_PCI_DRV_INTR_LSC | RTE_PCI_DRV_IOVA_AS_VA,
+	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
 	.probe = bnxt_pci_probe,
 	.remove = bnxt_pci_remove,
 };
diff --git a/drivers/net/e1000/em_ethdev.c b/drivers/net/e1000/em_ethdev.c
index dc886613a..0c859e52b 100644
--- a/drivers/net/e1000/em_ethdev.c
+++ b/drivers/net/e1000/em_ethdev.c
@@ -352,8 +352,7 @@ static int eth_em_pci_remove(struct rte_pci_device *pci_dev)
 
 static struct rte_pci_driver rte_em_pmd = {
 	.id_table = pci_id_em_map,
-	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
-		     RTE_PCI_DRV_IOVA_AS_VA,
+	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
 	.probe = eth_em_pci_probe,
 	.remove = eth_em_pci_remove,
 };
diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/e1000/igb_ethdev.c
index 3ee28cfbc..e784eeb73 100644
--- a/drivers/net/e1000/igb_ethdev.c
+++ b/drivers/net/e1000/igb_ethdev.c
@@ -1116,8 +1116,7 @@ static int eth_igb_pci_remove(struct rte_pci_device *pci_dev)
 
 static struct rte_pci_driver rte_igb_pmd = {
 	.id_table = pci_id_igb_map,
-	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
-		     RTE_PCI_DRV_IOVA_AS_VA,
+	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
 	.probe = eth_igb_pci_probe,
 	.remove = eth_igb_pci_remove,
 };
@@ -1140,7 +1139,7 @@ static int eth_igbvf_pci_remove(struct rte_pci_device *pci_dev)
  */
 static struct rte_pci_driver rte_igbvf_pmd = {
 	.id_table = pci_id_igbvf_map,
-	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_IOVA_AS_VA,
+	.drv_flags = RTE_PCI_DRV_NEED_MAPPING,
 	.probe = eth_igbvf_pci_probe,
 	.remove = eth_igbvf_pci_remove,
 };
diff --git a/drivers/net/enic/enic_ethdev.c b/drivers/net/enic/enic_ethdev.c
index 5cfbd31a2..e9c6f83ce 100644
--- a/drivers/net/enic/enic_ethdev.c
+++ b/drivers/net/enic/enic_ethdev.c
@@ -1247,8 +1247,7 @@ static int eth_enic_pci_remove(struct rte_pci_device *pci_dev)
 
 static struct rte_pci_driver rte_enic_pmd = {
 	.id_table = pci_id_enic_map,
-	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
-		     RTE_PCI_DRV_IOVA_AS_VA,
+	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
 	.probe = eth_enic_pci_probe,
 	.remove = eth_enic_pci_remove,
 };
diff --git a/drivers/net/fm10k/fm10k_ethdev.c b/drivers/net/fm10k/fm10k_ethdev.c
index a1e3836cb..2d3c47763 100644
--- a/drivers/net/fm10k/fm10k_ethdev.c
+++ b/drivers/net/fm10k/fm10k_ethdev.c
@@ -3268,8 +3268,7 @@ static const struct rte_pci_id pci_id_fm10k_map[] = {
 
 static struct rte_pci_driver rte_pmd_fm10k = {
 	.id_table = pci_id_fm10k_map,
-	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
-		     RTE_PCI_DRV_IOVA_AS_VA,
+	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
 	.probe = eth_fm10k_pci_probe,
 	.remove = eth_fm10k_pci_remove,
 };
diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 2b9fc4572..dd46d4d9d 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -696,8 +696,7 @@ static int eth_i40e_pci_remove(struct rte_pci_device *pci_dev)
 
 static struct rte_pci_driver rte_i40e_pmd = {
 	.id_table = pci_id_i40e_map,
-	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
-		     RTE_PCI_DRV_IOVA_AS_VA,
+	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
 	.probe = eth_i40e_pci_probe,
 	.remove = eth_i40e_pci_remove,
 };
diff --git a/drivers/net/i40e/i40e_ethdev_vf.c b/drivers/net/i40e/i40e_ethdev_vf.c
index 5be32b069..3ff2f6097 100644
--- a/drivers/net/i40e/i40e_ethdev_vf.c
+++ b/drivers/net/i40e/i40e_ethdev_vf.c
@@ -1557,7 +1557,7 @@ static int eth_i40evf_pci_remove(struct rte_pci_device *pci_dev)
  */
 static struct rte_pci_driver rte_i40evf_pmd = {
 	.id_table = pci_id_i40evf_map,
-	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_IOVA_AS_VA,
+	.drv_flags = RTE_PCI_DRV_NEED_MAPPING,
 	.probe = eth_i40evf_pci_probe,
 	.remove = eth_i40evf_pci_remove,
 };
diff --git a/drivers/net/iavf/iavf_ethdev.c b/drivers/net/iavf/iavf_ethdev.c
index 53dc05c78..a97cd76fd 100644
--- a/drivers/net/iavf/iavf_ethdev.c
+++ b/drivers/net/iavf/iavf_ethdev.c
@@ -1402,8 +1402,7 @@ static int eth_iavf_pci_remove(struct rte_pci_device *pci_dev)
 /* Adaptive virtual function driver struct */
 static struct rte_pci_driver rte_iavf_pmd = {
 	.id_table = pci_id_iavf_map,
-	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
-		     RTE_PCI_DRV_IOVA_AS_VA,
+	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
 	.probe = eth_iavf_pci_probe,
 	.remove = eth_iavf_pci_remove,
 };
diff --git a/drivers/net/ice/ice_ethdev.c b/drivers/net/ice/ice_ethdev.c
index 9ce730cd4..f05b48c01 100644
--- a/drivers/net/ice/ice_ethdev.c
+++ b/drivers/net/ice/ice_ethdev.c
@@ -3737,8 +3737,7 @@ ice_pci_remove(struct rte_pci_device *pci_dev)
 
 static struct rte_pci_driver rte_ice_pmd = {
 	.id_table = pci_id_ice_map,
-	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
-		     RTE_PCI_DRV_IOVA_AS_VA,
+	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
 	.probe = ice_pci_probe,
 	.remove = ice_pci_remove,
 };
diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index 22c5b2c5c..4a6e5c32e 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -1869,8 +1869,7 @@ static int eth_ixgbe_pci_remove(struct rte_pci_device *pci_dev)
 
 static struct rte_pci_driver rte_ixgbe_pmd = {
 	.id_table = pci_id_ixgbe_map,
-	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
-		     RTE_PCI_DRV_IOVA_AS_VA,
+	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
 	.probe = eth_ixgbe_pci_probe,
 	.remove = eth_ixgbe_pci_remove,
 };
@@ -1892,7 +1891,7 @@ static int eth_ixgbevf_pci_remove(struct rte_pci_device *pci_dev)
  */
 static struct rte_pci_driver rte_ixgbevf_pmd = {
 	.id_table = pci_id_ixgbevf_map,
-	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_IOVA_AS_VA,
+	.drv_flags = RTE_PCI_DRV_NEED_MAPPING,
 	.probe = eth_ixgbevf_pci_probe,
 	.remove = eth_ixgbevf_pci_remove,
 };
diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index 2e169b088..d6e5753bf 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -1142,8 +1142,7 @@ static struct rte_pci_driver mlx4_driver = {
 	},
 	.id_table = mlx4_pci_id_map,
 	.probe = mlx4_pci_probe,
-	.drv_flags = RTE_PCI_DRV_INTR_LSC | RTE_PCI_DRV_INTR_RMV |
-		     RTE_PCI_DRV_IOVA_AS_VA,
+	.drv_flags = RTE_PCI_DRV_INTR_LSC | RTE_PCI_DRV_INTR_RMV,
 };
 
 #ifdef RTE_IBVERBS_LINK_DLOPEN
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index d93f92db5..0f05853f9 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -2087,7 +2087,7 @@ static struct rte_pci_driver mlx5_driver = {
 	.dma_map = mlx5_dma_map,
 	.dma_unmap = mlx5_dma_unmap,
 	.drv_flags = RTE_PCI_DRV_INTR_LSC | RTE_PCI_DRV_INTR_RMV |
-		     RTE_PCI_DRV_PROBE_AGAIN | RTE_PCI_DRV_IOVA_AS_VA,
+		     RTE_PCI_DRV_PROBE_AGAIN,
 };
 
 #ifdef RTE_IBVERBS_LINK_DLOPEN
diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index 1a7aa17ee..f5d33efcf 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -3760,16 +3760,14 @@ static int eth_nfp_pci_remove(struct rte_pci_device *pci_dev)
 
 static struct rte_pci_driver rte_nfp_net_pf_pmd = {
 	.id_table = pci_id_nfp_pf_net_map,
-	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
-		     RTE_PCI_DRV_IOVA_AS_VA,
+	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
 	.probe = nfp_pf_pci_probe,
 	.remove = eth_nfp_pci_remove,
 };
 
 static struct rte_pci_driver rte_nfp_net_vf_pmd = {
 	.id_table = pci_id_nfp_vf_net_map,
-	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
-		     RTE_PCI_DRV_IOVA_AS_VA,
+	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
 	.probe = eth_nfp_pci_probe,
 	.remove = eth_nfp_pci_remove,
 };
diff --git a/drivers/net/octeontx2/otx2_ethdev.c b/drivers/net/octeontx2/otx2_ethdev.c
index fcb1869d5..5ec55511b 100644
--- a/drivers/net/octeontx2/otx2_ethdev.c
+++ b/drivers/net/octeontx2/otx2_ethdev.c
@@ -1188,11 +1188,6 @@ otx2_nix_configure(struct rte_eth_dev *eth_dev)
 		goto fail;
 	}
 
-	if (rte_eal_iova_mode() != RTE_IOVA_VA) {
-		otx2_err("iova mode should be va");
-		goto fail;
-	}
-
 	if (conf->link_speeds & ETH_LINK_SPEED_FIXED) {
 		otx2_err("Setting link speed/duplex not supported");
 		goto fail;
diff --git a/drivers/net/qede/qede_ethdev.c b/drivers/net/qede/qede_ethdev.c
index 82363e6eb..0b3046a8a 100644
--- a/drivers/net/qede/qede_ethdev.c
+++ b/drivers/net/qede/qede_ethdev.c
@@ -2737,8 +2737,7 @@ static int qedevf_eth_dev_pci_remove(struct rte_pci_device *pci_dev)
 
 static struct rte_pci_driver rte_qedevf_pmd = {
 	.id_table = pci_id_qedevf_map,
-	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
-		     RTE_PCI_DRV_IOVA_AS_VA,
+	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
 	.probe = qedevf_eth_dev_pci_probe,
 	.remove = qedevf_eth_dev_pci_remove,
 };
@@ -2757,8 +2756,7 @@ static int qede_eth_dev_pci_remove(struct rte_pci_device *pci_dev)
 
 static struct rte_pci_driver rte_qede_pmd = {
 	.id_table = pci_id_qede_map,
-	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
-		     RTE_PCI_DRV_IOVA_AS_VA,
+	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
 	.probe = qede_eth_dev_pci_probe,
 	.remove = qede_eth_dev_pci_remove,
 };
diff --git a/drivers/raw/ioat/ioat_rawdev.c b/drivers/raw/ioat/ioat_rawdev.c
index d509b6606..7270ad7aa 100644
--- a/drivers/raw/ioat/ioat_rawdev.c
+++ b/drivers/raw/ioat/ioat_rawdev.c
@@ -338,8 +338,7 @@ static const struct rte_pci_id pci_id_ioat_map[] = {
 
 static struct rte_pci_driver ioat_pmd_drv = {
 	.id_table = pci_id_ioat_map,
-	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
-		     RTE_PCI_DRV_IOVA_AS_VA,
+	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
 	.probe = ioat_rawdev_probe,
 	.remove = ioat_rawdev_remove,
 };
diff --git a/lib/librte_eal/common/eal_common_bus.c b/lib/librte_eal/common/eal_common_bus.c
index 77f1be1b4..bf0da6e4f 100644
--- a/lib/librte_eal/common/eal_common_bus.c
+++ b/lib/librte_eal/common/eal_common_bus.c
@@ -228,13 +228,37 @@ rte_bus_find_by_device_name(const char *str)
 enum rte_iova_mode
 rte_bus_get_iommu_class(void)
 {
-	int mode = RTE_IOVA_DC;
+	enum rte_iova_mode mode = RTE_IOVA_DC;
+	bool buses_want_va = false;
+	bool buses_want_pa = false;
 	struct rte_bus *bus;
 
 	TAILQ_FOREACH(bus, &rte_bus_list, next) {
+		enum rte_iova_mode bus_iova_mode;
 
-		if (bus->get_iommu_class)
-			mode |= bus->get_iommu_class();
+		if (bus->get_iommu_class == NULL)
+			continue;
+
+		bus_iova_mode = bus->get_iommu_class();
+		RTE_LOG(DEBUG, EAL, "Bus %s wants IOVA as '%s'\n",
+			bus->name,
+			bus_iova_mode == RTE_IOVA_DC ? "DC" :
+			(bus_iova_mode == RTE_IOVA_PA ? "PA" : "VA"));
+		if (bus_iova_mode == RTE_IOVA_PA)
+			buses_want_pa = true;
+		else if (bus_iova_mode == RTE_IOVA_VA)
+			buses_want_va = true;
+	}
+	if (buses_want_va && !buses_want_pa) {
+		mode = RTE_IOVA_VA;
+	} else if (buses_want_pa && !buses_want_va) {
+		mode = RTE_IOVA_PA;
+	} else {
+		mode = RTE_IOVA_DC;
+		if (buses_want_va) {
+			RTE_LOG(WARNING, EAL, "Some buses want 'VA' but forcing 'DC' because other buses want 'PA'.\n");
+			RTE_LOG(WARNING, EAL, "Depending on the final decision by the EAL, part of your buses won't initialise.\n");
+		}
 	}
 
 	return mode;
-- 
2.22.0

