DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev] [PATCH 0/6] VMBUS and driver infrastructure
@ 2017-01-02 23:08 Stephen Hemminger
  2017-01-02 23:08 ` [dpdk-dev] [PATCH 1/6] ethdev: increase length ethernet device internal name Stephen Hemminger
                   ` (5 more replies)
  0 siblings, 6 replies; 8+ messages in thread
From: Stephen Hemminger @ 2017-01-02 23:08 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

These are the patches to allow building drivers on VMBUS with
DPDK. The Hyper-V PMD (later patch set) builds on this.

Most of the infrastructure for the bus and device model is already
done, the one missing piece was that the eth_driver structure still
assumed all devices were PCI.


Stephen Hemminger (6):
  ethdev: increase length ethernet device internal name
  i40e: don't refer to eth_dev->pci_dev
  vmxnet3: don't refer to eth_dev->pci_dev
  cxgbe: don't refer to eth_dev->pci_dev
  ethdev: break ethernet driver and pci_driver connection
  eal: VMBUS infrastructure

 app/test/virtual_pmd.c                      |  22 +-
 doc/guides/rel_notes/deprecation.rst        |   3 +
 drivers/net/bnx2x/bnx2x_ethdev.c            |  16 +-
 drivers/net/bnxt/bnxt_ethdev.c              |  22 +-
 drivers/net/cxgbe/cxgbe_ethdev.c            |   8 +-
 drivers/net/cxgbe/sge.c                     |   9 +-
 drivers/net/e1000/em_ethdev.c               |  10 +-
 drivers/net/e1000/igb_ethdev.c              |  20 +-
 drivers/net/ena/ena_ethdev.c                |   8 +-
 drivers/net/enic/enic_ethdev.c              |   8 +-
 drivers/net/fm10k/fm10k_ethdev.c            |  10 +-
 drivers/net/i40e/i40e_ethdev.c              |  10 +-
 drivers/net/i40e/i40e_ethdev_vf.c           |  10 +-
 drivers/net/i40e/i40e_fdir.c                |   3 +-
 drivers/net/ixgbe/ixgbe_ethdev.c            |  20 +-
 drivers/net/mlx4/mlx4.c                     |   8 +-
 drivers/net/mlx5/mlx5.c                     |   8 +-
 drivers/net/nfp/nfp_net.c                   |   8 +-
 drivers/net/qede/qede_ethdev.c              |  42 +-
 drivers/net/szedata2/rte_eth_szedata2.c     |  10 +-
 drivers/net/thunderx/nicvf_ethdev.c         |   8 +-
 drivers/net/virtio/virtio_ethdev.c          |  10 +-
 drivers/net/vmxnet3/vmxnet3_ethdev.c        |  10 +-
 drivers/net/vmxnet3/vmxnet3_rxtx.c          |   5 +-
 lib/librte_eal/common/Makefile              |   2 +-
 lib/librte_eal/common/eal_common_devargs.c  |   7 +
 lib/librte_eal/common/eal_common_options.c  |  38 ++
 lib/librte_eal/common/eal_internal_cfg.h    |   3 +-
 lib/librte_eal/common/eal_options.h         |   6 +
 lib/librte_eal/common/eal_private.h         |   5 +
 lib/librte_eal/common/include/rte_devargs.h |   8 +
 lib/librte_eal/common/include/rte_vmbus.h   | 249 ++++++++
 lib/librte_eal/linuxapp/eal/Makefile        |   6 +
 lib/librte_eal/linuxapp/eal/eal.c           |  11 +
 lib/librte_eal/linuxapp/eal/eal_vmbus.c     | 907 ++++++++++++++++++++++++++++
 lib/librte_ether/rte_ethdev.c               |  99 ++-
 lib/librte_ether/rte_ethdev.h               |  49 +-
 mk/rte.app.mk                               |   1 +
 38 files changed, 1542 insertions(+), 137 deletions(-)
 create mode 100644 lib/librte_eal/common/include/rte_vmbus.h
 create mode 100644 lib/librte_eal/linuxapp/eal/eal_vmbus.c

-- 
2.11.0

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [dpdk-dev] [PATCH 1/6] ethdev: increase length ethernet device internal name
  2017-01-02 23:08 [dpdk-dev] [PATCH 0/6] VMBUS and driver infrastructure Stephen Hemminger
@ 2017-01-02 23:08 ` Stephen Hemminger
  2017-01-02 23:08 ` [dpdk-dev] [PATCH 2/6] i40e: don't refer to eth_dev->pci_dev Stephen Hemminger
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: Stephen Hemminger @ 2017-01-02 23:08 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

Allow sufficicent space for UUID in string form (36+1) which
is necessary with VMBUS. 

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
---
 doc/guides/rel_notes/deprecation.rst | 3 +++
 lib/librte_ether/rte_ethdev.h        | 6 +++++-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 1438c777..69669e44 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -58,6 +58,9 @@ Deprecation Notices
   ``port`` field, may be moved or removed as part of this mbuf work. A
   ``timestamp`` will also be added.
 
+* ethdev: for 17.02 the size of internal device name will be increased
+  to 40 characters to allow for storing UUID.
+
 * The mbuf flags PKT_RX_VLAN_PKT and PKT_RX_QINQ_PKT are deprecated and
   are respectively replaced by PKT_RX_VLAN_STRIPPED and
   PKT_RX_QINQ_STRIPPED, that are better described. The old flags and
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index fb517544..09dd744d 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1649,7 +1649,11 @@ struct rte_eth_dev_sriov {
 };
 #define RTE_ETH_DEV_SRIOV(dev)         ((dev)->data->sriov)
 
-#define RTE_ETH_NAME_MAX_LEN (32)
+/*
+ * Internal identifier length
+ * Sufficiently large to allow for UUID or PCI address
+ */
+#define RTE_ETH_NAME_MAX_LEN 40
 
 /**
  * @internal
-- 
2.11.0

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [dpdk-dev] [PATCH 2/6] i40e: don't refer to eth_dev->pci_dev
  2017-01-02 23:08 [dpdk-dev] [PATCH 0/6] VMBUS and driver infrastructure Stephen Hemminger
  2017-01-02 23:08 ` [dpdk-dev] [PATCH 1/6] ethdev: increase length ethernet device internal name Stephen Hemminger
@ 2017-01-02 23:08 ` Stephen Hemminger
  2017-01-06  1:50   ` Wu, Jingjing
  2017-01-02 23:08 ` [dpdk-dev] [PATCH 3/6] vmxnet3: " Stephen Hemminger
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 8+ messages in thread
From: Stephen Hemminger @ 2017-01-02 23:08 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

Later patches remove pci_dev from the ethernet device structure.
Fix the i40e code to just use it's own name when forming zone name.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
---
 drivers/net/i40e/i40e_fdir.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/i40e/i40e_fdir.c b/drivers/net/i40e/i40e_fdir.c
index 335bf15c..68a2523c 100644
--- a/drivers/net/i40e/i40e_fdir.c
+++ b/drivers/net/i40e/i40e_fdir.c
@@ -250,8 +250,7 @@ i40e_fdir_setup(struct i40e_pf *pf)
 	}
 
 	/* reserve memory for the fdir programming packet */
-	snprintf(z_name, sizeof(z_name), "%s_%s_%d",
-			eth_dev->driver->pci_drv.driver.name,
+	snprintf(z_name, sizeof(z_name), "i40e_%s_%d",
 			I40E_FDIR_MZ_NAME,
 			eth_dev->data->port_id);
 	mz = i40e_memzone_reserve(z_name, I40E_FDIR_PKT_LEN, SOCKET_ID_ANY);
-- 
2.11.0

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [dpdk-dev] [PATCH 3/6] vmxnet3: don't refer to eth_dev->pci_dev
  2017-01-02 23:08 [dpdk-dev] [PATCH 0/6] VMBUS and driver infrastructure Stephen Hemminger
  2017-01-02 23:08 ` [dpdk-dev] [PATCH 1/6] ethdev: increase length ethernet device internal name Stephen Hemminger
  2017-01-02 23:08 ` [dpdk-dev] [PATCH 2/6] i40e: don't refer to eth_dev->pci_dev Stephen Hemminger
@ 2017-01-02 23:08 ` Stephen Hemminger
  2017-01-02 23:08 ` [dpdk-dev] [PATCH 4/6] cxgbe: " Stephen Hemminger
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: Stephen Hemminger @ 2017-01-02 23:08 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

Fix the vmxnet3 code to just use it's own name when forming zone name.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
---
 drivers/net/vmxnet3/vmxnet3_rxtx.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/net/vmxnet3/vmxnet3_rxtx.c b/drivers/net/vmxnet3/vmxnet3_rxtx.c
index b1091688..772cdf18 100644
--- a/drivers/net/vmxnet3/vmxnet3_rxtx.c
+++ b/drivers/net/vmxnet3/vmxnet3_rxtx.c
@@ -793,9 +793,8 @@ ring_dma_zone_reserve(struct rte_eth_dev *dev, const char *ring_name,
 	char z_name[RTE_MEMZONE_NAMESIZE];
 	const struct rte_memzone *mz;
 
-	snprintf(z_name, sizeof(z_name), "%s_%s_%d_%d",
-		 dev->driver->pci_drv.driver.name, ring_name,
-		 dev->data->port_id, queue_id);
+	snprintf(z_name, sizeof(z_name), "vmxnet3_%s_%d_%d",
+		 ring_name, dev->data->port_id, queue_id);
 
 	mz = rte_memzone_lookup(z_name);
 	if (mz)
-- 
2.11.0

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [dpdk-dev] [PATCH 4/6] cxgbe: don't refer to eth_dev->pci_dev
  2017-01-02 23:08 [dpdk-dev] [PATCH 0/6] VMBUS and driver infrastructure Stephen Hemminger
                   ` (2 preceding siblings ...)
  2017-01-02 23:08 ` [dpdk-dev] [PATCH 3/6] vmxnet3: " Stephen Hemminger
@ 2017-01-02 23:08 ` Stephen Hemminger
  2017-01-02 23:08 ` [dpdk-dev] [PATCH 5/6] ethdev: break ethernet driver and pci_driver connection Stephen Hemminger
  2017-01-02 23:08 ` [dpdk-dev] [PATCH 6/6] eal: VMBUS infrastructure Stephen Hemminger
  5 siblings, 0 replies; 8+ messages in thread
From: Stephen Hemminger @ 2017-01-02 23:08 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

Later patches remove pci_dev from the ethernet device structure.
Fix the cxgbe code to just use it's own name when forming zone name.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
---
 drivers/net/cxgbe/sge.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/net/cxgbe/sge.c b/drivers/net/cxgbe/sge.c
index 736f08ce..e935dc42 100644
--- a/drivers/net/cxgbe/sge.c
+++ b/drivers/net/cxgbe/sge.c
@@ -1644,8 +1644,7 @@ int t4_sge_alloc_rxq(struct adapter *adap, struct sge_rspq *iq, bool fwevtq,
 	/* Size needs to be multiple of 16, including status entry. */
 	iq->size = cxgbe_roundup(iq->size, 16);
 
-	snprintf(z_name, sizeof(z_name), "%s_%s_%d_%d",
-		 eth_dev->driver->pci_drv.driver.name,
+	snprintf(z_name, sizeof(z_name), "cxgbe_%s_%d_%d",
 		 fwevtq ? "fwq_ring" : "rx_ring",
 		 eth_dev->data->port_id, queue_id);
 	snprintf(z_name_sw, sizeof(z_name_sw), "%s_sw_ring", z_name);
@@ -1697,8 +1696,7 @@ int t4_sge_alloc_rxq(struct adapter *adap, struct sge_rspq *iq, bool fwevtq,
 			fl->size = s->fl_starve_thres - 1 + 2 * 8;
 		fl->size = cxgbe_roundup(fl->size, 8);
 
-		snprintf(z_name, sizeof(z_name), "%s_%s_%d_%d",
-			 eth_dev->driver->pci_drv.driver.name,
+		snprintf(z_name, sizeof(z_name), "cxgbe_%s_%d_%d",
 			 fwevtq ? "fwq_ring" : "fl_ring",
 			 eth_dev->data->port_id, queue_id);
 		snprintf(z_name_sw, sizeof(z_name_sw), "%s_sw_ring", z_name);
@@ -1893,8 +1891,7 @@ int t4_sge_alloc_eth_txq(struct adapter *adap, struct sge_eth_txq *txq,
 	/* Add status entries */
 	nentries = txq->q.size + s->stat_len / sizeof(struct tx_desc);
 
-	snprintf(z_name, sizeof(z_name), "%s_%s_%d_%d",
-		 eth_dev->driver->pci_drv.driver.name, "tx_ring",
+	snprintf(z_name, sizeof(z_name), "cxgbe_tx_ring_%d_%d",
 		 eth_dev->data->port_id, queue_id);
 	snprintf(z_name_sw, sizeof(z_name_sw), "%s_sw_ring", z_name);
 
-- 
2.11.0

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [dpdk-dev] [PATCH 5/6] ethdev: break ethernet driver and pci_driver connection
  2017-01-02 23:08 [dpdk-dev] [PATCH 0/6] VMBUS and driver infrastructure Stephen Hemminger
                   ` (3 preceding siblings ...)
  2017-01-02 23:08 ` [dpdk-dev] [PATCH 4/6] cxgbe: " Stephen Hemminger
@ 2017-01-02 23:08 ` Stephen Hemminger
  2017-01-02 23:08 ` [dpdk-dev] [PATCH 6/6] eal: VMBUS infrastructure Stephen Hemminger
  5 siblings, 0 replies; 8+ messages in thread
From: Stephen Hemminger @ 2017-01-02 23:08 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

There are multiple buses and device types now. Therefore it no longer
makes sense that PCI driver information is part of the Ethernet driver
structure.

This patch removes pci_driver from eth_driver and introduces a
new combined structure for use in all existing PMD's. The rationale
is that although all existing PCI drivers are Ethernet drivers,
it make sense that future projects may want to support PCI devices
that are not Ethernet.

It also removes the requirement that driver is first element in
PCI driver structure.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
---
 app/test/virtual_pmd.c                  | 22 ++++++++---------
 drivers/net/bnx2x/bnx2x_ethdev.c        | 16 ++++++++-----
 drivers/net/bnxt/bnxt_ethdev.c          | 22 +++++++++--------
 drivers/net/cxgbe/cxgbe_ethdev.c        |  8 ++++---
 drivers/net/e1000/em_ethdev.c           | 10 ++++----
 drivers/net/e1000/igb_ethdev.c          | 20 +++++++++-------
 drivers/net/ena/ena_ethdev.c            |  8 ++++---
 drivers/net/enic/enic_ethdev.c          |  8 ++++---
 drivers/net/fm10k/fm10k_ethdev.c        | 10 ++++----
 drivers/net/i40e/i40e_ethdev.c          | 10 ++++----
 drivers/net/i40e/i40e_ethdev_vf.c       | 10 ++++----
 drivers/net/ixgbe/ixgbe_ethdev.c        | 20 +++++++++-------
 drivers/net/mlx4/mlx4.c                 |  8 ++++---
 drivers/net/mlx5/mlx5.c                 |  8 ++++---
 drivers/net/nfp/nfp_net.c               |  8 ++++---
 drivers/net/qede/qede_ethdev.c          | 42 +++++++++++++++++----------------
 drivers/net/szedata2/rte_eth_szedata2.c | 10 ++++----
 drivers/net/thunderx/nicvf_ethdev.c     |  8 ++++---
 drivers/net/virtio/virtio_ethdev.c      | 10 ++++----
 drivers/net/vmxnet3/vmxnet3_ethdev.c    | 10 ++++----
 lib/librte_ether/rte_ethdev.c           |  9 +++----
 lib/librte_ether/rte_ethdev.h           | 18 +++++++++-----
 22 files changed, 172 insertions(+), 123 deletions(-)

diff --git a/app/test/virtual_pmd.c b/app/test/virtual_pmd.c
index 6e4dcd8f..e7f56527 100644
--- a/app/test/virtual_pmd.c
+++ b/app/test/virtual_pmd.c
@@ -533,7 +533,7 @@ virtual_ethdev_create(const char *name, struct ether_addr *mac_addr,
 	struct rte_pci_device *pci_dev = NULL;
 	struct rte_eth_dev *eth_dev = NULL;
 	struct eth_driver *eth_drv = NULL;
-	struct rte_pci_driver *pci_drv = NULL;
+	struct rte_pci_eth_driver *pci_eth_drv = NULL;
 	struct rte_pci_id *id_table = NULL;
 	struct virtual_ethdev_private *dev_private = NULL;
 	char name_buf[RTE_RING_NAMESIZE];
@@ -554,8 +554,8 @@ virtual_ethdev_create(const char *name, struct ether_addr *mac_addr,
 	if (eth_drv == NULL)
 		goto err;
 
-	pci_drv = rte_zmalloc_socket(name, sizeof(*pci_drv), 0, socket_id);
-	if (pci_drv == NULL)
+	pci_eth_drv = rte_zmalloc_socket(name, sizeof(*pci_eth_drv), 0, socket_id);
+	if (pci_eth_drv == NULL)
 		goto err;
 
 	id_table = rte_zmalloc_socket(name, sizeof(*id_table), 0, socket_id);
@@ -585,17 +585,15 @@ virtual_ethdev_create(const char *name, struct ether_addr *mac_addr,
 		goto err;
 
 	pci_dev->device.numa_node = socket_id;
-	pci_drv->driver.name = virtual_ethdev_driver_name;
-	pci_drv->id_table = id_table;
+	pci_eth_drv->pci_drv.driver.name = virtual_ethdev_driver_name;
+	pci_eth_drv->pci_drv.id_table = id_table;
 
 	if (isr_support)
-		pci_drv->drv_flags |= RTE_PCI_DRV_INTR_LSC;
+		pci_eth_drv->pci_drv.drv_flags |= RTE_PCI_DRV_INTR_LSC;
 	else
-		pci_drv->drv_flags &= ~RTE_PCI_DRV_INTR_LSC;
+		pci_eth_drv->pci_drv.drv_flags &= ~RTE_PCI_DRV_INTR_LSC;
 
-
-	eth_drv->pci_drv = (struct rte_pci_driver)(*pci_drv);
-	eth_dev->driver = eth_drv;
+	eth_dev->driver = &pci_eth_drv->eth_drv;
 
 	eth_dev->data->nb_rx_queues = (uint16_t)1;
 	eth_dev->data->nb_tx_queues = (uint16_t)1;
@@ -622,7 +620,7 @@ virtual_ethdev_create(const char *name, struct ether_addr *mac_addr,
 	dev_private->dev_ops = virtual_ethdev_default_dev_ops;
 	eth_dev->dev_ops = &dev_private->dev_ops;
 
-	pci_dev->device.driver = &eth_drv->pci_drv.driver;
+	pci_dev->device.driver = &pci_eth_drv->pci_drv.driver;
 	eth_dev->device = &pci_dev->device;
 
 	eth_dev->rx_pkt_burst = virtual_ethdev_rx_burst_success;
@@ -632,7 +630,7 @@ virtual_ethdev_create(const char *name, struct ether_addr *mac_addr,
 
 err:
 	rte_free(pci_dev);
-	rte_free(pci_drv);
+	rte_free(pci_eth_drv);
 	rte_free(eth_drv);
 	rte_free(id_table);
 	rte_free(dev_private);
diff --git a/drivers/net/bnx2x/bnx2x_ethdev.c b/drivers/net/bnx2x/bnx2x_ethdev.c
index 2735fd0f..296855dd 100644
--- a/drivers/net/bnx2x/bnx2x_ethdev.c
+++ b/drivers/net/bnx2x/bnx2x_ethdev.c
@@ -617,29 +617,33 @@ eth_bnx2xvf_dev_init(struct rte_eth_dev *eth_dev)
 	return bnx2x_common_dev_init(eth_dev, 1);
 }
 
-static struct eth_driver rte_bnx2x_pmd = {
+static struct rte_pci_eth_driver rte_bnx2x_pmd = {
 	.pci_drv = {
 		.id_table = pci_id_bnx2x_map,
 		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
 		.probe = rte_eth_dev_pci_probe,
 		.remove = rte_eth_dev_pci_remove,
 	},
-	.eth_dev_init = eth_bnx2x_dev_init,
-	.dev_private_size = sizeof(struct bnx2x_softc),
+	eth_drv = {
+		.eth_dev_init = eth_bnx2x_dev_init,
+		.dev_private_size = sizeof(struct bnx2x_softc),
+	},
 };
 
 /*
  * virtual function driver struct
  */
-static struct eth_driver rte_bnx2xvf_pmd = {
+static struct rte_pci_eth_driver rte_bnx2xvf_pmd = {
 	.pci_drv = {
 		.id_table = pci_id_bnx2xvf_map,
 		.drv_flags = RTE_PCI_DRV_NEED_MAPPING,
 		.probe = rte_eth_dev_pci_probe,
 		.remove = rte_eth_dev_pci_remove,
 	},
-	.eth_dev_init = eth_bnx2xvf_dev_init,
-	.dev_private_size = sizeof(struct bnx2x_softc),
+	eth_drv = {
+		.eth_dev_init = eth_bnx2xvf_dev_init,
+		.dev_private_size = sizeof(struct bnx2x_softc),
+	},
 };
 
 RTE_PMD_REGISTER_PCI(net_bnx2x, rte_bnx2x_pmd.pci_drv);
diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index 7518b6b7..9017825b 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -1164,17 +1164,19 @@ bnxt_dev_uninit(struct rte_eth_dev *eth_dev) {
 	return rc;
 }
 
-static struct eth_driver bnxt_rte_pmd = {
+static struct rte_pci_eth_driver bnxt_rte_pmd = {
 	.pci_drv = {
-		    .id_table = bnxt_pci_id_map,
-		    .drv_flags = RTE_PCI_DRV_NEED_MAPPING |
-			    RTE_PCI_DRV_DETACHABLE | RTE_PCI_DRV_INTR_LSC,
-		    .probe = rte_eth_dev_pci_probe,
-		    .remove = rte_eth_dev_pci_remove
-		    },
-	.eth_dev_init = bnxt_dev_init,
-	.eth_dev_uninit = bnxt_dev_uninit,
-	.dev_private_size = sizeof(struct bnxt),
+		.id_table = bnxt_pci_id_map,
+		.drv_flags = RTE_PCI_DRV_NEED_MAPPING |
+			     RTE_PCI_DRV_DETACHABLE | RTE_PCI_DRV_INTR_LSC,
+		.probe = rte_eth_dev_pci_probe,
+		.remove = rte_eth_dev_pci_remove
+	},
+	.eth_drv = {
+		.eth_dev_init = bnxt_dev_init,
+		.eth_dev_uninit = bnxt_dev_uninit,
+		.dev_private_size = sizeof(struct bnxt),
+	},
 };
 
 RTE_PMD_REGISTER_PCI(net_bnxt, bnxt_rte_pmd.pci_drv);
diff --git a/drivers/net/cxgbe/cxgbe_ethdev.c b/drivers/net/cxgbe/cxgbe_ethdev.c
index 64345e37..ccf93904 100644
--- a/drivers/net/cxgbe/cxgbe_ethdev.c
+++ b/drivers/net/cxgbe/cxgbe_ethdev.c
@@ -1039,15 +1039,17 @@ static int eth_cxgbe_dev_init(struct rte_eth_dev *eth_dev)
 	return err;
 }
 
-static struct eth_driver rte_cxgbe_pmd = {
+static struct rte_pci_eth_driver rte_cxgbe_pmd = {
 	.pci_drv = {
 		.id_table = cxgb4_pci_tbl,
 		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
 		.probe = rte_eth_dev_pci_probe,
 		.remove = rte_eth_dev_pci_remove,
 	},
-	.eth_dev_init = eth_cxgbe_dev_init,
-	.dev_private_size = sizeof(struct port_info),
+	.eth_drv = {
+		.eth_dev_init = eth_cxgbe_dev_init,
+		.dev_private_size = sizeof(struct port_info),
+	},
 };
 
 RTE_PMD_REGISTER_PCI(net_cxgbe, rte_cxgbe_pmd.pci_drv);
diff --git a/drivers/net/e1000/em_ethdev.c b/drivers/net/e1000/em_ethdev.c
index 436acbb5..f5bb0764 100644
--- a/drivers/net/e1000/em_ethdev.c
+++ b/drivers/net/e1000/em_ethdev.c
@@ -388,7 +388,7 @@ eth_em_dev_uninit(struct rte_eth_dev *eth_dev)
 	return 0;
 }
 
-static struct eth_driver rte_em_pmd = {
+static struct rte_pci_eth_driver rte_em_pmd = {
 	.pci_drv = {
 		.id_table = pci_id_em_map,
 		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
@@ -396,9 +396,11 @@ static struct eth_driver rte_em_pmd = {
 		.probe = rte_eth_dev_pci_probe,
 		.remove = rte_eth_dev_pci_remove,
 	},
-	.eth_dev_init = eth_em_dev_init,
-	.eth_dev_uninit = eth_em_dev_uninit,
-	.dev_private_size = sizeof(struct e1000_adapter),
+	.eth_drv = {
+		.eth_dev_init = eth_em_dev_init,
+		.eth_dev_uninit = eth_em_dev_uninit,
+		.dev_private_size = sizeof(struct e1000_adapter),
+	},
 };
 
 static int
diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/e1000/igb_ethdev.c
index 4a154479..a8e769c1 100644
--- a/drivers/net/e1000/igb_ethdev.c
+++ b/drivers/net/e1000/igb_ethdev.c
@@ -1078,7 +1078,7 @@ eth_igbvf_dev_uninit(struct rte_eth_dev *eth_dev)
 	return 0;
 }
 
-static struct eth_driver rte_igb_pmd = {
+static struct rte_pci_eth_driver rte_igb_pmd = {
 	.pci_drv = {
 		.id_table = pci_id_igb_map,
 		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
@@ -1086,24 +1086,28 @@ static struct eth_driver rte_igb_pmd = {
 		.probe = rte_eth_dev_pci_probe,
 		.remove = rte_eth_dev_pci_remove,
 	},
-	.eth_dev_init = eth_igb_dev_init,
-	.eth_dev_uninit = eth_igb_dev_uninit,
-	.dev_private_size = sizeof(struct e1000_adapter),
+	.eth_drv = {
+		.eth_dev_init = eth_igb_dev_init,
+		.eth_dev_uninit = eth_igb_dev_uninit,
+		.dev_private_size = sizeof(struct e1000_adapter),
+	},
 };
 
 /*
  * virtual function driver struct
  */
-static struct eth_driver rte_igbvf_pmd = {
+static struct rte_pci_eth_driver rte_igbvf_pmd = {
 	.pci_drv = {
 		.id_table = pci_id_igbvf_map,
 		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_DETACHABLE,
 		.probe = rte_eth_dev_pci_probe,
 		.remove = rte_eth_dev_pci_remove,
 	},
-	.eth_dev_init = eth_igbvf_dev_init,
-	.eth_dev_uninit = eth_igbvf_dev_uninit,
-	.dev_private_size = sizeof(struct e1000_adapter),
+	.eth_drv = {
+		.eth_dev_init = eth_igbvf_dev_init,
+		.eth_dev_uninit = eth_igbvf_dev_uninit,
+		.dev_private_size = sizeof(struct e1000_adapter),
+	},
 };
 
 static void
diff --git a/drivers/net/ena/ena_ethdev.c b/drivers/net/ena/ena_ethdev.c
index dcee8ed0..89f6bd6d 100644
--- a/drivers/net/ena/ena_ethdev.c
+++ b/drivers/net/ena/ena_ethdev.c
@@ -1705,15 +1705,17 @@ static uint16_t eth_ena_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
 	return sent_idx;
 }
 
-static struct eth_driver rte_ena_pmd = {
+static struct rte_pci_eth_driver rte_ena_pmd = {
 	.pci_drv = {
 		.id_table = pci_id_ena_map,
 		.drv_flags = RTE_PCI_DRV_NEED_MAPPING,
 		.probe = rte_eth_dev_pci_probe,
 		.remove = rte_eth_dev_pci_remove,
 	},
-	.eth_dev_init = eth_ena_dev_init,
-	.dev_private_size = sizeof(struct ena_adapter),
+	.eth_drv = {
+		.eth_dev_init = eth_ena_dev_init,
+		.dev_private_size = sizeof(struct ena_adapter),
+	},
 };
 
 RTE_PMD_REGISTER_PCI(net_ena, rte_ena_pmd.pci_drv);
diff --git a/drivers/net/enic/enic_ethdev.c b/drivers/net/enic/enic_ethdev.c
index e5ceb98e..b47975d1 100644
--- a/drivers/net/enic/enic_ethdev.c
+++ b/drivers/net/enic/enic_ethdev.c
@@ -633,15 +633,17 @@ static int eth_enicpmd_dev_init(struct rte_eth_dev *eth_dev)
 	return enic_probe(enic);
 }
 
-static struct eth_driver rte_enic_pmd = {
+static struct rte_pci_eth_driver rte_enic_pmd = {
 	.pci_drv = {
 		.id_table = pci_id_enic_map,
 		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
 		.probe = rte_eth_dev_pci_probe,
 		.remove = rte_eth_dev_pci_remove,
 	},
-	.eth_dev_init = eth_enicpmd_dev_init,
-	.dev_private_size = sizeof(struct enic),
+	.eth_drv = {
+		.eth_dev_init = eth_enicpmd_dev_init,
+		.dev_private_size = sizeof(struct enic),
+	},
 };
 
 RTE_PMD_REGISTER_PCI(net_enic, rte_enic_pmd.pci_drv);
diff --git a/drivers/net/fm10k/fm10k_ethdev.c b/drivers/net/fm10k/fm10k_ethdev.c
index b8257e46..cbd27cbc 100644
--- a/drivers/net/fm10k/fm10k_ethdev.c
+++ b/drivers/net/fm10k/fm10k_ethdev.c
@@ -3069,7 +3069,7 @@ static const struct rte_pci_id pci_id_fm10k_map[] = {
 	{ .vendor_id = 0, /* sentinel */ },
 };
 
-static struct eth_driver rte_pmd_fm10k = {
+static struct rte_pci_eth_driver rte_pmd_fm10k = {
 	.pci_drv = {
 		.id_table = pci_id_fm10k_map,
 		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
@@ -3077,9 +3077,11 @@ static struct eth_driver rte_pmd_fm10k = {
 		.probe = rte_eth_dev_pci_probe,
 		.remove = rte_eth_dev_pci_remove,
 	},
-	.eth_dev_init = eth_fm10k_dev_init,
-	.eth_dev_uninit = eth_fm10k_dev_uninit,
-	.dev_private_size = sizeof(struct fm10k_adapter),
+	.eth_drv = {
+		.eth_dev_init = eth_fm10k_dev_init,
+		.eth_dev_uninit = eth_fm10k_dev_uninit,
+		.dev_private_size = sizeof(struct fm10k_adapter),
+	},
 };
 
 RTE_PMD_REGISTER_PCI(net_fm10k, rte_pmd_fm10k.pci_drv);
diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 8f63044b..efb641b7 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -668,7 +668,7 @@ static const struct rte_i40e_xstats_name_off rte_i40e_txq_prio_strings[] = {
 #define I40E_NB_TXQ_PRIO_XSTATS (sizeof(rte_i40e_txq_prio_strings) / \
 		sizeof(rte_i40e_txq_prio_strings[0]))
 
-static struct eth_driver rte_i40e_pmd = {
+static struct rte_pci_eth_driver rte_i40e_pmd = {
 	.pci_drv = {
 		.id_table = pci_id_i40e_map,
 		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
@@ -676,9 +676,11 @@ static struct eth_driver rte_i40e_pmd = {
 		.probe = rte_eth_dev_pci_probe,
 		.remove = rte_eth_dev_pci_remove,
 	},
-	.eth_dev_init = eth_i40e_dev_init,
-	.eth_dev_uninit = eth_i40e_dev_uninit,
-	.dev_private_size = sizeof(struct i40e_adapter),
+	.eth_drv = {
+		.eth_dev_init = eth_i40e_dev_init,
+		.eth_dev_uninit = eth_i40e_dev_uninit,
+		.dev_private_size = sizeof(struct i40e_adapter),
+	},
 };
 
 static inline int
diff --git a/drivers/net/i40e/i40e_ethdev_vf.c b/drivers/net/i40e/i40e_ethdev_vf.c
index 0dc0af52..6dbcc88c 100644
--- a/drivers/net/i40e/i40e_ethdev_vf.c
+++ b/drivers/net/i40e/i40e_ethdev_vf.c
@@ -1526,16 +1526,18 @@ i40evf_dev_uninit(struct rte_eth_dev *eth_dev)
 /*
  * virtual function driver struct
  */
-static struct eth_driver rte_i40evf_pmd = {
+static struct rte_pci_eth_driver rte_i40evf_pmd = {
 	.pci_drv = {
 		.id_table = pci_id_i40evf_map,
 		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_DETACHABLE,
 		.probe = rte_eth_dev_pci_probe,
 		.remove = rte_eth_dev_pci_remove,
 	},
-	.eth_dev_init = i40evf_dev_init,
-	.eth_dev_uninit = i40evf_dev_uninit,
-	.dev_private_size = sizeof(struct i40e_adapter),
+	.eth_drv = {
+		.eth_dev_init = i40evf_dev_init,
+		.eth_dev_uninit = i40evf_dev_uninit,
+		.dev_private_size = sizeof(struct i40e_adapter),
+	},
 };
 
 RTE_PMD_REGISTER_PCI(net_i40e_vf, rte_i40evf_pmd.pci_drv);
diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index ec2edade..dbad48dc 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -1560,7 +1560,7 @@ eth_ixgbevf_dev_uninit(struct rte_eth_dev *eth_dev)
 	return 0;
 }
 
-static struct eth_driver rte_ixgbe_pmd = {
+static struct rte_pci_eth_driver rte_ixgbe_pmd = {
 	.pci_drv = {
 		.id_table = pci_id_ixgbe_map,
 		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
@@ -1568,24 +1568,28 @@ static struct eth_driver rte_ixgbe_pmd = {
 		.probe = rte_eth_dev_pci_probe,
 		.remove = rte_eth_dev_pci_remove,
 	},
-	.eth_dev_init = eth_ixgbe_dev_init,
-	.eth_dev_uninit = eth_ixgbe_dev_uninit,
-	.dev_private_size = sizeof(struct ixgbe_adapter),
+	.eth_drv = {
+		.eth_dev_init = eth_ixgbe_dev_init,
+		.eth_dev_uninit = eth_ixgbe_dev_uninit,
+		.dev_private_size = sizeof(struct ixgbe_adapter),
+	},
 };
 
 /*
  * virtual function driver struct
  */
-static struct eth_driver rte_ixgbevf_pmd = {
+static struct rte_pci_eth_driver rte_ixgbevf_pmd = {
 	.pci_drv = {
 		.id_table = pci_id_ixgbevf_map,
 		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_DETACHABLE,
 		.probe = rte_eth_dev_pci_probe,
 		.remove = rte_eth_dev_pci_remove,
 	},
-	.eth_dev_init = eth_ixgbevf_dev_init,
-	.eth_dev_uninit = eth_ixgbevf_dev_uninit,
-	.dev_private_size = sizeof(struct ixgbe_adapter),
+	.eth_drv = {
+		.eth_dev_init = eth_ixgbevf_dev_init,
+		.eth_dev_uninit = eth_ixgbevf_dev_uninit,
+		.dev_private_size = sizeof(struct ixgbe_adapter),
+	},
 };
 
 static int
diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index eb06f56a..7b184019 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -5524,7 +5524,7 @@ priv_dev_interrupt_handler_install(struct priv *priv, struct rte_eth_dev *dev)
 	}
 }
 
-static struct eth_driver mlx4_driver;
+static struct rte_pci_eth_driver mlx4_driver;
 
 /**
  * DPDK callback to register a PCI device.
@@ -5903,7 +5903,7 @@ static const struct rte_pci_id mlx4_pci_id_map[] = {
 	}
 };
 
-static struct eth_driver mlx4_driver = {
+static struct rte_pci_eth_driver mlx4_driver = {
 	.pci_drv = {
 		.driver = {
 			.name = MLX4_DRIVER_NAME
@@ -5912,7 +5912,9 @@ static struct eth_driver mlx4_driver = {
 		.probe = mlx4_pci_probe,
 		.drv_flags = RTE_PCI_DRV_INTR_LSC,
 	},
-	.dev_private_size = sizeof(struct priv)
+	.eth_drv = {
+		.dev_private_size = sizeof(struct priv),
+	},
 };
 
 /**
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index b97b6d16..efc0430c 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -338,7 +338,7 @@ mlx5_args(struct priv *priv, struct rte_devargs *devargs)
 	return 0;
 }
 
-static struct eth_driver mlx5_driver;
+static struct rte_pci_eth_driver mlx5_driver;
 
 /**
  * DPDK callback to register a PCI device.
@@ -723,7 +723,7 @@ static const struct rte_pci_id mlx5_pci_id_map[] = {
 	}
 };
 
-static struct eth_driver mlx5_driver = {
+static struct rte_pci_eth_driver mlx5_driver = {
 	.pci_drv = {
 		.driver = {
 			.name = MLX5_DRIVER_NAME
@@ -732,7 +732,9 @@ static struct eth_driver mlx5_driver = {
 		.probe = mlx5_pci_probe,
 		.drv_flags = RTE_PCI_DRV_INTR_LSC,
 	},
-	.dev_private_size = sizeof(struct priv)
+	.eth_drv = {
+		.dev_private_size = sizeof(struct priv),
+	},
 };
 
 /**
diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index e85315f1..eb4006f9 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -2472,7 +2472,7 @@ static struct rte_pci_id pci_id_nfp_net_map[] = {
 	},
 };
 
-static struct eth_driver rte_nfp_net_pmd = {
+static struct rte_pci_eth_driver rte_nfp_net_pmd = {
 	.pci_drv = {
 		.id_table = pci_id_nfp_net_map,
 		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
@@ -2480,8 +2480,10 @@ static struct eth_driver rte_nfp_net_pmd = {
 		.probe = rte_eth_dev_pci_probe,
 		.remove = rte_eth_dev_pci_remove,
 	},
-	.eth_dev_init = nfp_net_init,
-	.dev_private_size = sizeof(struct nfp_net_adapter),
+	.eth_drv = {
+		.eth_dev_init = nfp_net_init,
+		.dev_private_size = sizeof(struct nfp_net_adapter),
+	},
 };
 
 RTE_PMD_REGISTER_PCI(net_nfp, rte_nfp_net_pmd.pci_drv);
diff --git a/drivers/net/qede/qede_ethdev.c b/drivers/net/qede/qede_ethdev.c
index e91e627c..beb10157 100644
--- a/drivers/net/qede/qede_ethdev.c
+++ b/drivers/net/qede/qede_ethdev.c
@@ -1641,30 +1641,32 @@ static struct rte_pci_id pci_id_qede_map[] = {
 	{.vendor_id = 0,}
 };
 
-static struct eth_driver rte_qedevf_pmd = {
+static struct rte_pci_eth_driver rte_qedevf_pmd = {
 	.pci_drv = {
-		    .id_table = pci_id_qedevf_map,
-		    .drv_flags =
-		    RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
-		    .probe = rte_eth_dev_pci_probe,
-		    .remove = rte_eth_dev_pci_remove,
-		   },
-	.eth_dev_init = qedevf_eth_dev_init,
-	.eth_dev_uninit = qedevf_eth_dev_uninit,
-	.dev_private_size = sizeof(struct qede_dev),
+		.id_table = pci_id_qedevf_map,
+		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
+		.probe = rte_eth_dev_pci_probe,
+		.remove = rte_eth_dev_pci_remove,
+	},
+	.eth_drv = {
+		.eth_dev_init = qedevf_eth_dev_init,
+		.eth_dev_uninit = qedevf_eth_dev_uninit,
+		.dev_private_size = sizeof(struct qede_dev),
+	},
 };
 
-static struct eth_driver rte_qede_pmd = {
+static struct rte_pci_eth_driver rte_qede_pmd = {
 	.pci_drv = {
-		    .id_table = pci_id_qede_map,
-		    .drv_flags =
-		    RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
-		    .probe = rte_eth_dev_pci_probe,
-		    .remove = rte_eth_dev_pci_remove,
-		   },
-	.eth_dev_init = qede_eth_dev_init,
-	.eth_dev_uninit = qede_eth_dev_uninit,
-	.dev_private_size = sizeof(struct qede_dev),
+		.id_table = pci_id_qede_map,
+		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
+		.probe = rte_eth_dev_pci_probe,
+		.remove = rte_eth_dev_pci_remove,
+	},
+	.eth_drv = {
+		.eth_dev_init = qede_eth_dev_init,
+		.eth_dev_uninit = qede_eth_dev_uninit,
+		.dev_private_size = sizeof(struct qede_dev),
+	},
 };
 
 RTE_PMD_REGISTER_PCI(net_qede, rte_qede_pmd.pci_drv);
diff --git a/drivers/net/szedata2/rte_eth_szedata2.c b/drivers/net/szedata2/rte_eth_szedata2.c
index fe7a6b3b..b9054671 100644
--- a/drivers/net/szedata2/rte_eth_szedata2.c
+++ b/drivers/net/szedata2/rte_eth_szedata2.c
@@ -1587,15 +1587,17 @@ static const struct rte_pci_id rte_szedata2_pci_id_table[] = {
 	}
 };
 
-static struct eth_driver szedata2_eth_driver = {
+static struct rte_pci_eth_driver szedata2_eth_driver = {
 	.pci_drv = {
 		.id_table = rte_szedata2_pci_id_table,
 		.probe = rte_eth_dev_pci_probe,
 		.remove = rte_eth_dev_pci_remove,
 	},
-	.eth_dev_init     = rte_szedata2_eth_dev_init,
-	.eth_dev_uninit   = rte_szedata2_eth_dev_uninit,
-	.dev_private_size = sizeof(struct pmd_internals),
+	.eth_drv = {
+		.eth_dev_init     = rte_szedata2_eth_dev_init,
+		.eth_dev_uninit   = rte_szedata2_eth_dev_uninit,
+		.dev_private_size = sizeof(struct pmd_internals),
+	},
 };
 
 RTE_PMD_REGISTER_PCI(RTE_SZEDATA2_DRIVER_NAME, szedata2_eth_driver.pci_drv);
diff --git a/drivers/net/thunderx/nicvf_ethdev.c b/drivers/net/thunderx/nicvf_ethdev.c
index 10603197..f13fad90 100644
--- a/drivers/net/thunderx/nicvf_ethdev.c
+++ b/drivers/net/thunderx/nicvf_ethdev.c
@@ -2111,15 +2111,17 @@ static const struct rte_pci_id pci_id_nicvf_map[] = {
 	},
 };
 
-static struct eth_driver rte_nicvf_pmd = {
+static struct rte_pci_eth_driver rte_nicvf_pmd = {
 	.pci_drv = {
 		.id_table = pci_id_nicvf_map,
 		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
 		.probe = rte_eth_dev_pci_probe,
 		.remove = rte_eth_dev_pci_remove,
 	},
-	.eth_dev_init = nicvf_eth_dev_init,
-	.dev_private_size = sizeof(struct nicvf),
+	.eth_drv = {
+		.eth_dev_init = nicvf_eth_dev_init,
+		.dev_private_size = sizeof(struct nicvf),
+	},
 };
 
 RTE_PMD_REGISTER_PCI(net_thunderx, rte_nicvf_pmd.pci_drv);
diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index b11bee63..dc68dbb9 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -1375,7 +1375,7 @@ eth_virtio_dev_uninit(struct rte_eth_dev *eth_dev)
 	return 0;
 }
 
-static struct eth_driver rte_virtio_pmd = {
+static struct rte_pci_eth_driver rte_virtio_pmd = {
 	.pci_drv = {
 		.driver = {
 			.name = "net_virtio",
@@ -1385,9 +1385,11 @@ static struct eth_driver rte_virtio_pmd = {
 		.probe = rte_eth_dev_pci_probe,
 		.remove = rte_eth_dev_pci_remove,
 	},
-	.eth_dev_init = eth_virtio_dev_init,
-	.eth_dev_uninit = eth_virtio_dev_uninit,
-	.dev_private_size = sizeof(struct virtio_hw),
+	.eth_drv = {
+		.eth_dev_init = eth_virtio_dev_init,
+		.eth_dev_uninit = eth_virtio_dev_uninit,
+		.dev_private_size = sizeof(struct virtio_hw),
+	},
 };
 
 RTE_INIT(rte_virtio_pmd_init);
diff --git a/drivers/net/vmxnet3/vmxnet3_ethdev.c b/drivers/net/vmxnet3/vmxnet3_ethdev.c
index 9c4d93c1..422461e9 100644
--- a/drivers/net/vmxnet3/vmxnet3_ethdev.c
+++ b/drivers/net/vmxnet3/vmxnet3_ethdev.c
@@ -333,16 +333,18 @@ eth_vmxnet3_dev_uninit(struct rte_eth_dev *eth_dev)
 	return 0;
 }
 
-static struct eth_driver rte_vmxnet3_pmd = {
+static struct rte_pci_eth_driver rte_vmxnet3_pmd = {
 	.pci_drv = {
 		.id_table = pci_id_vmxnet3_map,
 		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_DETACHABLE,
 		.probe = rte_eth_dev_pci_probe,
 		.remove = rte_eth_dev_pci_remove,
 	},
-	.eth_dev_init = eth_vmxnet3_dev_init,
-	.eth_dev_uninit = eth_vmxnet3_dev_uninit,
-	.dev_private_size = sizeof(struct vmxnet3_hw),
+	.eth_drv = {
+		.eth_dev_init = eth_vmxnet3_dev_init,
+		.eth_dev_uninit = eth_vmxnet3_dev_uninit,
+		.dev_private_size = sizeof(struct vmxnet3_hw),
+	},
 };
 
 static int
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 280f0db1..acc0e556 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -239,13 +239,14 @@ int
 rte_eth_dev_pci_probe(struct rte_pci_driver *pci_drv,
 		      struct rte_pci_device *pci_dev)
 {
-	struct eth_driver    *eth_drv;
+	const struct rte_pci_eth_driver *pci_eth_drv;
+	const struct eth_driver *eth_drv;
 	struct rte_eth_dev *eth_dev;
 	char ethdev_name[RTE_ETH_NAME_MAX_LEN];
-
 	int diag;
 
-	eth_drv = (struct eth_driver *)pci_drv;
+	pci_eth_drv = container_of(pci_drv, struct rte_pci_eth_driver, pci_drv);
+	eth_drv = &pci_eth_drv->eth_drv;
 
 	rte_eal_pci_device_name(&pci_dev->addr, ethdev_name,
 			sizeof(ethdev_name));
@@ -263,7 +264,7 @@ rte_eth_dev_pci_probe(struct rte_pci_driver *pci_drv,
 	}
 	eth_dev->device = &pci_dev->device;
 	eth_dev->intr_handle = &pci_dev->intr_handle;
-	eth_dev->driver = eth_drv;
+	eth_dev->driver = &pci_eth_drv->eth_drv;
 
 	/* Invoke PMD device initialization function */
 	diag = (*eth_drv->eth_dev_init)(eth_dev);
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 09dd744d..d310541a 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1851,25 +1851,31 @@ typedef int (*eth_dev_uninit_t)(struct rte_eth_dev *eth_dev);
  * @internal
  * The structure associated with a PMD Ethernet driver.
  *
- * Each Ethernet driver acts as a PCI driver and is represented by a generic
+ * Each Ethernet driver acts is represented by a generic
  * *eth_driver* structure that holds:
  *
- * - An *rte_pci_driver* structure (which must be the first field).
+ * - The *eth_dev_init* function invoked for each matching device.
  *
- * - The *eth_dev_init* function invoked for each matching PCI device.
- *
- * - The *eth_dev_uninit* function invoked for each matching PCI device.
+ * - The *eth_dev_uninit* function invoked for each matching device.
  *
  * - The size of the private data to allocate for each matching device.
  */
 struct eth_driver {
-	struct rte_pci_driver pci_drv;    /**< The PMD is also a PCI driver. */
 	eth_dev_init_t eth_dev_init;      /**< Device init function. */
 	eth_dev_uninit_t eth_dev_uninit;  /**< Device uninit function. */
 	unsigned int dev_private_size;    /**< Size of device private data. */
 };
 
 /**
+ * @internal
+ * The structure associated with a PMD PCI Ethernet driver.
+ */
+struct rte_pci_eth_driver {
+	struct rte_pci_driver	pci_drv;	/**< Underlying PCI driver. */
+	struct eth_driver	eth_drv;	/**< Ethernet driver. */
+};
+
+/**
  * Convert a numerical speed in Mbps to a bitmap flag that can be used in
  * the bitmap link_speeds of the struct rte_eth_conf
  *
-- 
2.11.0

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [dpdk-dev] [PATCH 6/6] eal: VMBUS infrastructure
  2017-01-02 23:08 [dpdk-dev] [PATCH 0/6] VMBUS and driver infrastructure Stephen Hemminger
                   ` (4 preceding siblings ...)
  2017-01-02 23:08 ` [dpdk-dev] [PATCH 5/6] ethdev: break ethernet driver and pci_driver connection Stephen Hemminger
@ 2017-01-02 23:08 ` Stephen Hemminger
  5 siblings, 0 replies; 8+ messages in thread
From: Stephen Hemminger @ 2017-01-02 23:08 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

Add support for VMBUS on Hyper-V/Azure. VMBUS is similar to PCI
but has different addressing and internal API's.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
---
 lib/librte_eal/common/Makefile              |   2 +-
 lib/librte_eal/common/eal_common_devargs.c  |   7 +
 lib/librte_eal/common/eal_common_options.c  |  38 ++
 lib/librte_eal/common/eal_internal_cfg.h    |   3 +-
 lib/librte_eal/common/eal_options.h         |   6 +
 lib/librte_eal/common/eal_private.h         |   5 +
 lib/librte_eal/common/include/rte_devargs.h |   8 +
 lib/librte_eal/common/include/rte_vmbus.h   | 249 ++++++++
 lib/librte_eal/linuxapp/eal/Makefile        |   6 +
 lib/librte_eal/linuxapp/eal/eal.c           |  11 +
 lib/librte_eal/linuxapp/eal/eal_vmbus.c     | 907 ++++++++++++++++++++++++++++
 lib/librte_ether/rte_ethdev.c               |  90 +++
 lib/librte_ether/rte_ethdev.h               |  25 +
 mk/rte.app.mk                               |   1 +
 14 files changed, 1356 insertions(+), 2 deletions(-)
 create mode 100644 lib/librte_eal/common/include/rte_vmbus.h
 create mode 100644 lib/librte_eal/linuxapp/eal/eal_vmbus.c

diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile
index a92c984b..9254bae4 100644
--- a/lib/librte_eal/common/Makefile
+++ b/lib/librte_eal/common/Makefile
@@ -33,7 +33,7 @@ include $(RTE_SDK)/mk/rte.vars.mk
 
 INC := rte_branch_prediction.h rte_common.h
 INC += rte_debug.h rte_eal.h rte_errno.h rte_launch.h rte_lcore.h
-INC += rte_log.h rte_memory.h rte_memzone.h rte_pci.h
+INC += rte_log.h rte_memory.h rte_memzone.h rte_pci.h rte_vmbus.h
 INC += rte_per_lcore.h rte_random.h
 INC += rte_tailq.h rte_interrupts.h rte_alarm.h
 INC += rte_string_fns.h rte_version.h
diff --git a/lib/librte_eal/common/eal_common_devargs.c b/lib/librte_eal/common/eal_common_devargs.c
index e403717b..934ca840 100644
--- a/lib/librte_eal/common/eal_common_devargs.c
+++ b/lib/librte_eal/common/eal_common_devargs.c
@@ -113,6 +113,13 @@ rte_eal_devargs_add(enum rte_devtype devtype, const char *devargs_str)
 			goto fail;
 
 		break;
+	case RTE_DEVTYPE_WHITELISTED_VMBUS:
+	case RTE_DEVTYPE_BLACKLISTED_VMBUS:
+#ifdef RTE_LIBRTE_HV_PMD
+		if (uuid_parse(buf, devargs->uuid) == 0)
+			break;
+#endif
+		goto fail;
 	}
 
 	free(buf);
diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
index 611e581c..89dc9de7 100644
--- a/lib/librte_eal/common/eal_common_options.c
+++ b/lib/librte_eal/common/eal_common_options.c
@@ -95,6 +95,11 @@ eal_long_options[] = {
 	{OPT_VFIO_INTR,         1, NULL, OPT_VFIO_INTR_NUM        },
 	{OPT_VMWARE_TSC_MAP,    0, NULL, OPT_VMWARE_TSC_MAP_NUM   },
 	{OPT_XEN_DOM0,          0, NULL, OPT_XEN_DOM0_NUM         },
+#ifdef RTE_LIBRTE_HV_PMD
+	{OPT_NO_VMBUS,          0, NULL, OPT_NO_VMBUS_NUM         },
+	{OPT_VMBUS_BLACKLIST,   1, NULL, OPT_VMBUS_BLACKLIST_NUM  },
+	{OPT_VMBUS_WHITELIST,   1, NULL, OPT_VMBUS_WHITELIST_NUM  },
+#endif
 	{0,                     0, NULL, 0                        }
 };
 
@@ -858,6 +863,21 @@ eal_parse_common_option(int opt, const char *optarg,
 		conf->no_pci = 1;
 		break;
 
+#ifdef RTE_LIBRTE_HV_PMD
+	case OPT_NO_VMBUS_NUM:
+		conf->no_vmbus = 1;
+		break;
+	case OPT_VMBUS_BLACKLIST_NUM:
+		if (rte_eal_devargs_add(RTE_DEVTYPE_BLACKLISTED_VMBUS,
+					optarg) < 0)
+			return -1;
+		break;
+	case OPT_VMBUS_WHITELIST_NUM:
+		if (rte_eal_devargs_add(RTE_DEVTYPE_WHITELISTED_VMBUS,
+				optarg) < 0)
+			return -1;
+		break;
+#endif
 	case OPT_NO_HPET_NUM:
 		conf->no_hpet = 1;
 		break;
@@ -1017,6 +1037,14 @@ eal_check_common_options(struct internal_config *internal_cfg)
 		return -1;
 	}
 
+#ifdef RTE_LIBRTE_HV_PMD
+	if (rte_eal_devargs_type_count(RTE_DEVTYPE_WHITELISTED_VMBUS) != 0 &&
+		rte_eal_devargs_type_count(RTE_DEVTYPE_BLACKLISTED_VMBUS) != 0) {
+		RTE_LOG(ERR, EAL, "Options vmbus blacklist and whitelist "
+			"cannot be used at the same time\n");
+		return -1;
+	}
+#endif
 	return 0;
 }
 
@@ -1066,5 +1094,15 @@ eal_common_usage(void)
 	       "  --"OPT_NO_PCI"            Disable PCI\n"
 	       "  --"OPT_NO_HPET"           Disable HPET\n"
 	       "  --"OPT_NO_SHCONF"         No shared config (mmap'd files)\n"
+#ifdef RTE_LIBRTE_HV_PMD
+	       "  --"OPT_NO_VMBUS"          Disable VMBUS\n"
+	       "  --"OPT_VMBUS_BLACKLIST" Add a VMBUS device to black list.\n"
+	       "                      Prevent EAL from using this PCI device. The argument\n"
+	       "                      format is device UUID.\n"
+	       "  --"OPT_VMBUS_WHITELIST" Add a VMBUS device to white list.\n"
+	       "                      Only use the specified VMBUS devices. The argument format\n"
+	       "                      is device UUID This option can be present\n"
+	       "                      several times (once per device).\n"
+#endif
 	       "\n", RTE_MAX_LCORE);
 }
diff --git a/lib/librte_eal/common/eal_internal_cfg.h b/lib/librte_eal/common/eal_internal_cfg.h
index 5f1367eb..18271943 100644
--- a/lib/librte_eal/common/eal_internal_cfg.h
+++ b/lib/librte_eal/common/eal_internal_cfg.h
@@ -69,7 +69,8 @@ struct internal_config {
 	volatile unsigned no_pci;         /**< true to disable PCI */
 	volatile unsigned no_hpet;        /**< true to disable HPET */
 	volatile unsigned vmware_tsc_map; /**< true to use VMware TSC mapping
-										* instead of native TSC */
+					   * instead of native TSC */
+	volatile unsigned no_vmbus;       /**< true to disable VMBUS */
 	volatile unsigned no_shconf;      /**< true if there is no shared config */
 	volatile unsigned create_uio_dev; /**< true to create /dev/uioX devices */
 	volatile enum rte_proc_type_t process_type; /**< multi-process proc type */
diff --git a/lib/librte_eal/common/eal_options.h b/lib/librte_eal/common/eal_options.h
index a881c62e..156727e7 100644
--- a/lib/librte_eal/common/eal_options.h
+++ b/lib/librte_eal/common/eal_options.h
@@ -83,6 +83,12 @@ enum {
 	OPT_VMWARE_TSC_MAP_NUM,
 #define OPT_XEN_DOM0          "xen-dom0"
 	OPT_XEN_DOM0_NUM,
+#define OPT_NO_VMBUS          "no-vmbus"
+	OPT_NO_VMBUS_NUM,
+#define OPT_VMBUS_BLACKLIST   "vmbus-blacklist"
+	OPT_VMBUS_BLACKLIST_NUM,
+#define OPT_VMBUS_WHITELIST   "vmbus-whitelist"
+	OPT_VMBUS_WHITELIST_NUM,
 	OPT_LONG_MAX_NUM
 };
 
diff --git a/lib/librte_eal/common/eal_private.h b/lib/librte_eal/common/eal_private.h
index 9e7d8f6b..c856c63e 100644
--- a/lib/librte_eal/common/eal_private.h
+++ b/lib/librte_eal/common/eal_private.h
@@ -210,6 +210,11 @@ int pci_uio_map_resource_by_index(struct rte_pci_device *dev, int res_idx,
 		struct mapped_pci_resource *uio_res, int map_idx);
 
 /**
+ * VMBUS related functions and structures
+ */
+int rte_eal_vmbus_init(void);
+
+/**
  * Init tail queues for non-EAL library structures. This is to allow
  * the rings, mempools, etc. lists to be shared among multiple processes
  *
diff --git a/lib/librte_eal/common/include/rte_devargs.h b/lib/librte_eal/common/include/rte_devargs.h
index 88120a1c..c079d289 100644
--- a/lib/librte_eal/common/include/rte_devargs.h
+++ b/lib/librte_eal/common/include/rte_devargs.h
@@ -51,6 +51,9 @@ extern "C" {
 #include <stdio.h>
 #include <sys/queue.h>
 #include <rte_pci.h>
+#ifdef RTE_LIBRTE_HV_PMD
+#include <uuid/uuid.h>
+#endif
 
 /**
  * Type of generic device
@@ -59,6 +62,8 @@ enum rte_devtype {
 	RTE_DEVTYPE_WHITELISTED_PCI,
 	RTE_DEVTYPE_BLACKLISTED_PCI,
 	RTE_DEVTYPE_VIRTUAL,
+	RTE_DEVTYPE_WHITELISTED_VMBUS,
+	RTE_DEVTYPE_BLACKLISTED_VMBUS,
 };
 
 /**
@@ -88,6 +93,9 @@ struct rte_devargs {
 			/** Driver name. */
 			char drv_name[32];
 		} virt;
+#ifdef RTE_LIBRTE_HV_PMD
+		uuid_t uuid;
+#endif
 	};
 	/** Arguments string as given by user or "" for no argument. */
 	char *args;
diff --git a/lib/librte_eal/common/include/rte_vmbus.h b/lib/librte_eal/common/include/rte_vmbus.h
new file mode 100644
index 00000000..f96d753e
--- /dev/null
+++ b/lib/librte_eal/common/include/rte_vmbus.h
@@ -0,0 +1,249 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2013-2016 Brocade Communications Systems, Inc.
+ *   Copyright(c) 2016 Microsoft Corporation
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ */
+
+#ifndef _RTE_VMBUS_H_
+#define _RTE_VMBUS_H_
+
+/**
+ * @file
+ *
+ * RTE VMBUS Interface
+ */
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <limits.h>
+#include <errno.h>
+#include <uuid/uuid.h>
+#include <sys/queue.h>
+#include <stdint.h>
+#include <inttypes.h>
+
+#include <rte_debug.h>
+#include <rte_interrupts.h>
+#include <rte_dev.h>
+
+TAILQ_HEAD(vmbus_device_list, rte_vmbus_device);
+TAILQ_HEAD(vmbus_driver_list, rte_vmbus_driver);
+
+extern struct vmbus_driver_list vmbus_driver_list;
+extern struct vmbus_device_list vmbus_device_list;
+
+/** Pathname of VMBUS devices directory. */
+#define SYSFS_VMBUS_DEVICES "/sys/bus/vmbus/devices"
+
+#define UUID_BUF_SZ	(36 + 1)
+
+
+/** Maximum number of VMBUS resources. */
+#define VMBUS_MAX_RESOURCE 7
+
+/**
+ * A structure describing a VMBUS device.
+ */
+struct rte_vmbus_device {
+	TAILQ_ENTRY(rte_vmbus_device) next;     /**< Next probed VMBUS device. */
+	struct rte_device device;               /**< Inherit core device */
+	uuid_t device_id;			/**< VMBUS device id */
+	uuid_t class_id;			/**< VMBUS device type */
+	uint32_t relid;				/**< VMBUS id for notification */
+	uint8_t	monitor_id;
+	struct rte_intr_handle intr_handle;     /**< Interrupt handle */
+	const struct rte_vmbus_driver *driver;  /**< Associated driver */
+
+	struct rte_mem_resource mem_resource[VMBUS_MAX_RESOURCE];
+						/**< VMBUS Memory Resource */
+	char sysfs_name[];			/**< Name in sysfs bus directory */
+};
+
+struct rte_vmbus_driver;
+
+/**
+ * Initialisation function for the driver called during VMBUS probing.
+ */
+typedef int (vmbus_probe_t)(struct rte_vmbus_driver *,
+			    struct rte_vmbus_device *);
+
+/**
+ * Uninitialisation function for the driver called during hotplugging.
+ */
+typedef int (vmbus_remove_t)(struct rte_vmbus_device *);
+
+/**
+ * A structure describing a VMBUS driver.
+ */
+struct rte_vmbus_driver {
+	TAILQ_ENTRY(rte_vmbus_driver) next;     /**< Next in list. */
+	struct rte_driver driver;
+	vmbus_probe_t *probe;                   /**< Device Probe function. */
+	vmbus_remove_t *remove;                 /**< Device Remove function. */
+
+	const uuid_t *id_table;			/**< ID table. */
+};
+
+struct vmbus_map {
+	void *addr;
+	char *path;
+	uint64_t offset;
+	uint64_t size;
+	uint64_t phaddr;
+};
+
+/*
+ * For multi-process we need to reproduce all vmbus mappings in secondary
+ * processes, so save them in a tailq.
+ */
+struct mapped_vmbus_resource {
+	TAILQ_ENTRY(mapped_vmbus_resource) next;
+
+	uuid_t uuid;
+	char path[PATH_MAX];
+	int nb_maps;
+	struct vmbus_map maps[VMBUS_MAX_RESOURCE];
+};
+
+TAILQ_HEAD(mapped_vmbus_res_list, mapped_vmbus_resource);
+
+/**
+ * Scan the content of the VMBUS bus, and the devices in the devices list
+ *
+ * @return
+ *  0 on success, negative on error
+ */
+int rte_eal_vmbus_scan(void);
+
+/**
+ * Probe the VMBUS bus for registered drivers.
+ *
+ * Scan the content of the VMBUS bus, and call the probe() function for
+ * all registered drivers that have a matching entry in its id_table
+ * for discovered devices.
+ *
+ * @return
+ *   - 0 on success.
+ *   - Negative on error.
+ */
+int rte_eal_vmbus_probe(void);
+
+/**
+ * Map the VMBUS device resources in user space virtual memory address
+ *
+ * @param dev
+ *   A pointer to a rte_vmbus_device structure describing the device
+ *   to use
+ *
+ * @return
+ *   0 on success, negative on error and positive if no driver
+ *   is found for the device.
+ */
+int rte_eal_vmbus_map_device(struct rte_vmbus_device *dev);
+
+/**
+ * Unmap this device
+ *
+ * @param dev
+ *   A pointer to a rte_vmbus_device structure describing the device
+ *   to use
+ */
+void rte_eal_vmbus_unmap_device(struct rte_vmbus_device *dev);
+
+/**
+ * Probe the single VMBUS device.
+ *
+ * Scan the content of the VMBUS bus, and find the vmbus device
+ * specified by device uuid, then call the probe() function for
+ * registered driver that has a matching entry in its id_table for
+ * discovered device.
+ *
+ * @param id
+ *   The VMBUS device uuid.
+ * @return
+ *   - 0 on success.
+ *   - Negative on error.
+ */
+int rte_eal_vmbus_probe_one(uuid_t id);
+
+/**
+ * Close the single VMBUS device.
+ *
+ * Scan the content of the VMBUS bus, and find the vmbus device id,
+ * then call the remove() function for registered driver that has a
+ * matching entry in its id_table for discovered device.
+ *
+ * @param id
+ *   The VMBUS device uuid.
+ * @return
+ *   - 0 on success.
+ *   - Negative on error.
+ */
+int rte_eal_vmbus_detach(uuid_t id);
+
+/**
+ * Register a VMBUS driver.
+ *
+ * @param driver
+ *   A pointer to a rte_vmbus_driver structure describing the driver
+ *   to be registered.
+ */
+void rte_eal_vmbus_register(struct rte_vmbus_driver *driver);
+
+/** Helper for VMBUS device registration from driver nstance */
+#define RTE_PMD_REGISTER_VMBUS(nm, vmbus_drv) \
+RTE_INIT(vmbusinitfn_ ##nm); \
+static void vmbusinitfn_ ##nm(void) \
+{\
+	(vmbus_drv).driver.name = RTE_STR(nm);\
+	(vmbus_drv).driver.type = PMD_VMBUS; \
+	rte_eal_vmbus_register(&vmbus_drv); \
+} \
+RTE_PMD_EXPORT_NAME(nm, __COUNTER__)
+
+/**
+ * Unregister a VMBUS driver.
+ *
+ * @param driver
+ *   A pointer to a rte_vmbus_driver structure describing the driver
+ *   to be unregistered.
+ */
+void rte_eal_vmbus_unregister(struct rte_vmbus_driver *driver);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_VMBUS_H_ */
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index 4e206f09..f6ca3848 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -71,6 +71,11 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_timer.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_interrupts.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_alarm.c
 
+ifeq ($(CONFIG_RTE_LIBRTE_HV_PMD),y)
+SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_vmbus.c
+LDLIBS += -luuid
+endif
+
 # from common dir
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_lcore.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_timer.c
@@ -114,6 +119,7 @@ CFLAGS_eal_hugepage_info.o := -D_GNU_SOURCE
 CFLAGS_eal_pci.o := -D_GNU_SOURCE
 CFLAGS_eal_pci_uio.o := -D_GNU_SOURCE
 CFLAGS_eal_pci_vfio.o := -D_GNU_SOURCE
+CFLAGS_eal_vmbux.o := -D_GNU_SOURCE
 CFLAGS_eal_common_whitelist.o := -D_GNU_SOURCE
 CFLAGS_eal_common_options.o := -D_GNU_SOURCE
 CFLAGS_eal_common_thread.o := -D_GNU_SOURCE
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 16dd5b9c..1eec963f 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -70,6 +70,7 @@
 #include <rte_cpuflags.h>
 #include <rte_interrupts.h>
 #include <rte_pci.h>
+#include <rte_vmbus.h>
 #include <rte_dev.h>
 #include <rte_devargs.h>
 #include <rte_common.h>
@@ -830,6 +831,11 @@ rte_eal_init(int argc, char **argv)
 
 	eal_check_mem_on_local_socket();
 
+#ifdef RTE_LIBRTE_HV_PMD
+	if (rte_eal_vmbus_init() < 0)
+		RTE_LOG(ERR, EAL, "Cannot init VMBUS\n");
+#endif
+
 	if (eal_plugins_init() < 0)
 		rte_panic("Cannot init plugins\n");
 
@@ -884,6 +890,11 @@ rte_eal_init(int argc, char **argv)
 	if (rte_eal_pci_probe())
 		rte_panic("Cannot probe PCI\n");
 
+#ifdef RTE_LIBRTE_HV_PMD
+	if (rte_eal_vmbus_probe() < 0)
+		rte_panic("Cannot probe VMBUS\n");
+#endif
+
 	if (rte_eal_dev_init() < 0)
 		rte_panic("Cannot init pmd devices\n");
 
diff --git a/lib/librte_eal/linuxapp/eal/eal_vmbus.c b/lib/librte_eal/linuxapp/eal/eal_vmbus.c
new file mode 100644
index 00000000..f298e04e
--- /dev/null
+++ b/lib/librte_eal/linuxapp/eal/eal_vmbus.c
@@ -0,0 +1,907 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2013-2016 Brocade Communications Systems, Inc.
+ *   Copyright(c) 2016 Microsoft Corporation
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *	 notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *	 notice, this list of conditions and the following disclaimer in
+ *	 the documentation and/or other materials provided with the
+ *	 distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *	 contributors may be used to endorse or promote products derived
+ *	 from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ */
+
+#include <string.h>
+#include <unistd.h>
+#include <dirent.h>
+#include <fcntl.h>
+#include <sys/mman.h>
+
+#include <rte_eal.h>
+#include <rte_tailq.h>
+#include <rte_log.h>
+#include <rte_devargs.h>
+#include <rte_vmbus.h>
+#include <rte_malloc.h>
+
+#include "eal_private.h"
+#include "eal_pci_init.h"
+#include "eal_filesystem.h"
+
+struct vmbus_driver_list vmbus_driver_list =
+	TAILQ_HEAD_INITIALIZER(vmbus_driver_list);
+struct vmbus_device_list vmbus_device_list =
+	TAILQ_HEAD_INITIALIZER(vmbus_device_list);
+
+static void *vmbus_map_addr;
+
+static struct rte_tailq_elem rte_vmbus_uio_tailq = {
+	.name = "UIO_RESOURCE_LIST",
+};
+EAL_REGISTER_TAILQ(rte_vmbus_uio_tailq);
+
+/*
+ * parse a sysfs file containing one integer value
+ * different to the eal version, as it needs to work with 64-bit values
+ */
+static int
+vmbus_get_sysfs_uuid(const char *filename, uuid_t uu)
+{
+	char buf[BUFSIZ];
+	char *cp = NULL;
+	FILE *f;
+
+	f = fopen(filename, "r");
+	if (f == NULL) {
+		RTE_LOG(ERR, EAL, "%s(): cannot open sysfs value %s\n",
+				__func__, filename);
+		return -1;
+	}
+
+	if (fgets(buf, sizeof(buf), f) == NULL) {
+		RTE_LOG(ERR, EAL, "%s(): cannot read sysfs value %s\n",
+				__func__, filename);
+		fclose(f);
+		return -1;
+	}
+	fclose(f);
+
+	cp = strchr(cp, '\n');
+	if (cp)
+		*cp = '\0';
+
+	/* strip { } notation */
+	if (buf[0] == '{' && (cp = strchr(buf, '}')))
+		*cp = '\0';
+
+	if (uuid_parse(buf, uu) < 0) {
+		RTE_LOG(ERR, EAL, "%s %s not a valid UUID\n",
+			filename, buf);
+		return -1;
+	}
+
+	return 0;
+}
+
+/* map a particular resource from a file */
+static void *
+vmbus_map_resource(void *requested_addr, int fd, off_t offset, size_t size,
+		   int flags)
+{
+	void *mapaddr;
+
+	/* Map the memory resource of device */
+	mapaddr = mmap(requested_addr, size, PROT_READ | PROT_WRITE,
+		       MAP_SHARED | flags, fd, offset);
+	if (mapaddr == MAP_FAILED ||
+	    (requested_addr != NULL && mapaddr != requested_addr)) {
+		RTE_LOG(ERR, EAL,
+			"%s(): cannot mmap(%d, %p, 0x%lx, 0x%lx): %s)\n",
+			__func__, fd, requested_addr,
+			(unsigned long)size, (unsigned long)offset,
+			strerror(errno));
+	} else
+		RTE_LOG(DEBUG, EAL, "  VMBUS memory mapped at %p\n", mapaddr);
+
+	return mapaddr;
+}
+
+/* unmap a particular resource */
+static void
+vmbus_unmap_resource(void *requested_addr, size_t size)
+{
+	if (requested_addr == NULL)
+		return;
+
+	/* Unmap the VMBUS memory resource of device */
+	if (munmap(requested_addr, size)) {
+		RTE_LOG(ERR, EAL, "%s(): cannot munmap(%p, 0x%lx): %s\n",
+			__func__, requested_addr, (unsigned long)size,
+			strerror(errno));
+	} else
+		RTE_LOG(DEBUG, EAL, "  VMBUS memory unmapped at %p\n",
+				requested_addr);
+}
+
+/* Only supports current kernel version
+ * Unlike PCI there is no option (or need) to create UIO device.
+ */
+static int vmbus_get_uio_dev(const char *name,
+			     char *dstbuf, size_t buflen)
+{
+	char dirname[PATH_MAX];
+	unsigned int uio_num;
+	struct dirent *e;
+	DIR *dir;
+
+	snprintf(dirname, sizeof(dirname),
+		 "/sys/bus/vmbus/devices/%s/uio", name);
+
+	dir = opendir(dirname);
+	if (dir == NULL) {
+		RTE_LOG(ERR, EAL, "Cannot map uio resources for %s: %s\n",
+			name, strerror(errno));
+		return -1;
+	}
+
+	/* take the first file starting with "uio" */
+	while ((e = readdir(dir)) != NULL) {
+		if (sscanf(e->d_name, "uio%u", &uio_num) != 1)
+			continue;
+
+		snprintf(dstbuf, buflen, "%s/uio%u", dirname, uio_num);
+		break;
+	}
+	closedir(dir);
+
+	return e ? (int) uio_num : -1;
+}
+
+/*
+ * parse a sysfs file containing one integer value
+ * different to the eal version, as it needs to work with 64-bit values
+ */
+static int
+vmbus_parse_sysfs_value(const char *dir, const char *name,
+			uint64_t *val)
+{
+	char filename[PATH_MAX];
+	FILE *f;
+	char buf[BUFSIZ];
+	char *end = NULL;
+
+	snprintf(filename, sizeof(filename), "%s/%s", dir, name);
+	f = fopen(filename, "r");
+	if (f == NULL) {
+		RTE_LOG(ERR, EAL, "%s(): cannot open sysfs value %s\n",
+				__func__, filename);
+		return -1;
+	}
+
+	if (fgets(buf, sizeof(buf), f) == NULL) {
+		RTE_LOG(ERR, EAL, "%s(): cannot read sysfs value %s\n",
+				__func__, filename);
+		fclose(f);
+		return -1;
+	}
+	fclose(f);
+
+	*val = strtoull(buf, &end, 0);
+	if ((buf[0] == '\0') || (end == NULL) || (*end != '\n')) {
+		RTE_LOG(ERR, EAL, "%s(): cannot parse sysfs value %s\n",
+				__func__, filename);
+		return -1;
+	}
+	return 0;
+}
+
+/* Get mappings out of values provided by uio */
+static int
+vmbus_uio_get_mappings(const char *uioname,
+		       struct vmbus_map maps[])
+{
+	int i;
+
+	for (i = 0; i != VMBUS_MAX_RESOURCE; i++) {
+		struct vmbus_map *map = &maps[i];
+		char dirname[PATH_MAX];
+
+		/* check if map directory exists */
+		snprintf(dirname, sizeof(dirname),
+			 "%s/maps/map%d", uioname, i);
+
+		if (access(dirname, F_OK) != 0)
+			break;
+
+		/* get mapping offset */
+		if (vmbus_parse_sysfs_value(dirname, "offset",
+					    &map->offset) < 0)
+			return -1;
+
+		/* get mapping size */
+		if (vmbus_parse_sysfs_value(dirname, "size",
+					    &map->size) < 0)
+			return -1;
+
+		/* get mapping physical address */
+		if (vmbus_parse_sysfs_value(dirname, "addr",
+					    &maps->phaddr) < 0)
+			return -1;
+	}
+
+	return i;
+}
+
+static void
+vmbus_uio_free_resource(struct rte_vmbus_device *dev,
+		struct mapped_vmbus_resource *uio_res)
+{
+	rte_free(uio_res);
+
+	if (dev->intr_handle.fd) {
+		close(dev->intr_handle.fd);
+		dev->intr_handle.fd = -1;
+		dev->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN;
+	}
+}
+
+static struct mapped_vmbus_resource *
+vmbus_uio_alloc_resource(struct rte_vmbus_device *dev)
+{
+	struct mapped_vmbus_resource *uio_res;
+	char dirname[PATH_MAX], devname[PATH_MAX];
+	int uio_num, nb_maps;
+
+	uio_num = vmbus_get_uio_dev(dev->sysfs_name, dirname, sizeof(dirname));
+	if (uio_num < 0) {
+		RTE_LOG(WARNING, EAL,
+			"  %s not managed by UIO driver, skipping\n",
+			dev->sysfs_name);
+		return NULL;
+	}
+
+	/* allocate the mapping details for secondary processes*/
+	uio_res = rte_zmalloc("UIO_RES", sizeof(*uio_res), 0);
+	if (uio_res == NULL) {
+		RTE_LOG(ERR, EAL,
+			"%s(): cannot store uio mmap details\n", __func__);
+		goto error;
+	}
+
+	snprintf(devname, sizeof(devname), "/dev/uio%u", uio_num);
+	dev->intr_handle.fd = open(devname, O_RDWR);
+	if (dev->intr_handle.fd < 0) {
+		RTE_LOG(ERR, EAL, "Cannot open %s: %s\n",
+			devname, strerror(errno));
+		goto error;
+	}
+
+	dev->intr_handle.type = RTE_INTR_HANDLE_UIO_INTX;
+
+	snprintf(uio_res->path, sizeof(uio_res->path), "%s", devname);
+	uuid_copy(uio_res->uuid, dev->device_id);
+
+	nb_maps = vmbus_uio_get_mappings(dirname, uio_res->maps);
+	if (nb_maps < 0)
+		goto error;
+
+	RTE_LOG(DEBUG, EAL, "Found %d memory maps for device %s\n",
+		nb_maps, dev->sysfs_name);
+
+	return uio_res;
+
+ error:
+	vmbus_uio_free_resource(dev, uio_res);
+	return NULL;
+}
+
+static int
+vmbus_uio_map_resource_by_index(struct rte_vmbus_device *dev,
+				unsigned int res_idx,
+				struct mapped_vmbus_resource *uio_res,
+				unsigned int map_idx)
+{
+	struct vmbus_map *maps = uio_res->maps;
+	char devname[PATH_MAX];
+	void *mapaddr;
+	int fd;
+
+	snprintf(devname, sizeof(devname),
+		 "/sys/bus/vmbus/%s/resource%u", dev->sysfs_name, res_idx);
+
+	fd = open(devname, O_RDWR);
+	if (fd < 0) {
+		RTE_LOG(ERR, EAL, "Cannot open %s: %s\n",
+				devname, strerror(errno));
+		return -1;
+	}
+
+	/* allocate memory to keep path */
+	maps[map_idx].path = rte_malloc(NULL, strlen(devname) + 1, 0);
+	if (maps[map_idx].path == NULL) {
+		RTE_LOG(ERR, EAL, "Cannot allocate memory for path: %s\n",
+				strerror(errno));
+		return -1;
+	}
+
+	/* try mapping somewhere close to the end of hugepages */
+	if (vmbus_map_addr == NULL)
+		vmbus_map_addr = pci_find_max_end_va();
+
+	mapaddr = vmbus_map_resource(vmbus_map_addr, fd, 0,
+				     dev->mem_resource[res_idx].len, 0);
+	close(fd);
+	if (mapaddr == MAP_FAILED) {
+		rte_free(maps[map_idx].path);
+		return -1;
+	}
+
+	vmbus_map_addr = RTE_PTR_ADD(mapaddr,
+				     dev->mem_resource[res_idx].len);
+
+	maps[map_idx].phaddr = dev->mem_resource[res_idx].phys_addr;
+	maps[map_idx].size = dev->mem_resource[res_idx].len;
+	maps[map_idx].addr = mapaddr;
+	maps[map_idx].offset = 0;
+	strcpy(maps[map_idx].path, devname);
+	dev->mem_resource[res_idx].addr = mapaddr;
+
+	return 0;
+}
+
+static void
+vmbus_uio_unmap(struct mapped_vmbus_resource *uio_res)
+{
+	int i;
+
+	if (uio_res == NULL)
+		return;
+
+	for (i = 0; i != uio_res->nb_maps; i++) {
+		vmbus_unmap_resource(uio_res->maps[i].addr,
+				     uio_res->maps[i].size);
+
+		if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+			rte_free(uio_res->maps[i].path);
+	}
+}
+
+static struct mapped_vmbus_resource *
+vmbus_uio_find_resource(struct rte_vmbus_device *dev)
+{
+	struct mapped_vmbus_resource *uio_res;
+	struct mapped_vmbus_res_list *uio_res_list =
+			RTE_TAILQ_CAST(rte_vmbus_uio_tailq.head,
+				       mapped_vmbus_res_list);
+
+	if (dev == NULL)
+		return NULL;
+
+	TAILQ_FOREACH(uio_res, uio_res_list, next) {
+		if (uuid_compare(uio_res->uuid, dev->device_id) == 0)
+			return uio_res;
+	}
+	return NULL;
+}
+
+/* unmap the VMBUS resource of a VMBUS device in virtual memory */
+static void
+vmbus_uio_unmap_resource(struct rte_vmbus_device *dev)
+{
+	struct mapped_vmbus_resource *uio_res;
+	struct mapped_vmbus_res_list *uio_res_list =
+			RTE_TAILQ_CAST(rte_vmbus_uio_tailq.head,
+				       mapped_vmbus_res_list);
+
+	if (dev == NULL)
+		return;
+
+	/* find an entry for the device */
+	uio_res = vmbus_uio_find_resource(dev);
+	if (uio_res == NULL)
+		return;
+
+	/* secondary processes - just free maps */
+	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+		return vmbus_uio_unmap(uio_res);
+
+	TAILQ_REMOVE(uio_res_list, uio_res, next);
+
+	/* unmap all resources */
+	vmbus_uio_unmap(uio_res);
+
+	/* free uio resource */
+	rte_free(uio_res);
+
+	/* close fd if in primary process */
+	close(dev->intr_handle.fd);
+	if (dev->intr_handle.uio_cfg_fd >= 0) {
+		close(dev->intr_handle.uio_cfg_fd);
+		dev->intr_handle.uio_cfg_fd = -1;
+	}
+
+	dev->intr_handle.fd = -1;
+	dev->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN;
+}
+
+static int
+vmbus_uio_map_secondary(struct rte_vmbus_device *dev)
+{
+	struct mapped_vmbus_resource *uio_res;
+	struct mapped_vmbus_res_list *uio_res_list =
+			RTE_TAILQ_CAST(rte_vmbus_uio_tailq.head,
+				       mapped_vmbus_res_list);
+
+	TAILQ_FOREACH(uio_res, uio_res_list, next) {
+		int i;
+
+		/* skip this element if it doesn't match our id */
+		if (uuid_compare(uio_res->uuid, dev->device_id))
+			continue;
+
+		for (i = 0; i != uio_res->nb_maps; i++) {
+			void *mapaddr;
+			int fd;
+
+			fd = open(uio_res->maps[i].path, O_RDWR);
+			if (fd < 0) {
+				RTE_LOG(ERR, EAL, "Cannot open %s: %s\n",
+					uio_res->maps[i].path, strerror(errno));
+				return -1;
+			}
+
+			mapaddr = vmbus_map_resource(uio_res->maps[i].addr, fd,
+						     uio_res->maps[i].offset,
+						     uio_res->maps[i].size, 0);
+			/* fd is not needed in slave process, close it */
+			close(fd);
+
+			if (mapaddr == uio_res->maps[i].addr)
+				continue;
+
+			RTE_LOG(ERR, EAL,
+				"Cannot mmap device resource file %s to address: %p\n",
+				uio_res->maps[i].path,
+				uio_res->maps[i].addr);
+
+			/* unmap addrs correctly mapped */
+			while (i != 0) {
+				--i;
+				vmbus_unmap_resource(uio_res->maps[i].addr,
+						     uio_res->maps[i].size);
+			}
+			return -1;
+
+		}
+		return 0;
+	}
+
+	RTE_LOG(ERR, EAL, "Cannot find resource for device\n");
+	return 1;
+}
+
+/* map the resources of a vmbus device in virtual memory */
+int
+rte_eal_vmbus_map_device(struct rte_vmbus_device *dev)
+{
+	struct mapped_vmbus_resource *uio_res;
+	struct mapped_vmbus_res_list *uio_res_list =
+		RTE_TAILQ_CAST(rte_vmbus_uio_tailq.head, mapped_vmbus_res_list);
+	int i, ret, map_idx = 0;
+
+	dev->intr_handle.fd = -1;
+	dev->intr_handle.uio_cfg_fd = -1;
+	dev->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN;
+
+	/* secondary processes - use already recorded details */
+	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+		return vmbus_uio_map_secondary(dev);
+
+	/* allocate uio resource */
+	uio_res = vmbus_uio_alloc_resource(dev);
+	if (uio_res == NULL)
+		return -1;
+
+	/* Map all BARs */
+	for (i = 0; i != VMBUS_MAX_RESOURCE; i++) {
+		uint64_t phaddr;
+
+		/* skip empty BAR */
+		phaddr = dev->mem_resource[i].phys_addr;
+		if (phaddr == 0)
+			continue;
+
+		ret = vmbus_uio_map_resource_by_index(dev, i,
+						      uio_res, map_idx);
+		if (ret)
+			goto error;
+
+		map_idx++;
+	}
+
+	uio_res->nb_maps = map_idx;
+
+	TAILQ_INSERT_TAIL(uio_res_list, uio_res, next);
+
+	return 0;
+error:
+	for (i = 0; i < map_idx; i++) {
+		vmbus_unmap_resource(uio_res->maps[i].addr,
+				     uio_res->maps[i].size);
+		rte_free(uio_res->maps[i].path);
+	}
+	vmbus_uio_free_resource(dev, uio_res);
+	return -1;
+}
+
+/* Scan one vmbus sysfs entry, and fill the devices list from it. */
+static int
+vmbus_scan_one(const char *name)
+{
+	struct rte_vmbus_device *dev, *dev2;
+	char filename[PATH_MAX];
+	char dirname[PATH_MAX];
+	unsigned long tmp;
+
+	dev = malloc(sizeof(*dev) + strlen(name) + 1);
+	if (dev == NULL)
+		return -1;
+
+	memset(dev, 0, sizeof(*dev));
+	strcpy(dev->sysfs_name, name);
+	if (dev->sysfs_name == NULL)
+		goto error;
+
+	/* sysfs base directory
+	 *   /sys/bus/vmbus/devices/7a08391f-f5a0-4ac0-9802-d13fd964f8df
+	 * or on older kernel
+	 *   /sys/bus/vmbus/devices/vmbus_1
+	 */
+	snprintf(dirname, sizeof(dirname), "%s/%s",
+		 SYSFS_VMBUS_DEVICES, name);
+
+	/* get device id */
+	snprintf(filename, sizeof(filename), "%s/device_id", dirname);
+	if (vmbus_get_sysfs_uuid(filename, dev->device_id) < 0)
+		goto error;
+
+	/* get device class  */
+	snprintf(filename, sizeof(filename), "%s/class_id", dirname);
+	if (vmbus_get_sysfs_uuid(filename, dev->class_id) < 0)
+		goto error;
+
+	/* get relid */
+	snprintf(filename, sizeof(filename), "%s/id", dirname);
+	if (eal_parse_sysfs_value(filename, &tmp) < 0)
+		goto error;
+	dev->relid = tmp;
+
+	/* get monitor id */
+	snprintf(filename, sizeof(filename), "%s/monitor_id", dirname);
+	if (eal_parse_sysfs_value(filename, &tmp) < 0)
+		goto error;
+	dev->monitor_id = tmp;
+
+	/* get numa node */
+	snprintf(filename, sizeof(filename), "%s/numa_node",
+		 dirname);
+	if (eal_parse_sysfs_value(filename, &tmp) < 0)
+		/* if no NUMA support, set default to 0 */
+		dev->device.numa_node = 0;
+	else
+		dev->device.numa_node = tmp;
+
+	/* device is valid, add in list (sorted) */
+	RTE_LOG(DEBUG, EAL, "Adding vmbus device %s\n", name);
+
+	TAILQ_FOREACH(dev2, &vmbus_device_list, next) {
+		int ret;
+
+		ret = uuid_compare(dev->device_id, dev->device_id);
+		if (ret > 0)
+			continue;
+
+		if (ret < 0) {
+			TAILQ_INSERT_BEFORE(dev2, dev, next);
+			rte_eal_device_insert(&dev->device);
+		} else { /* already registered */
+			memmove(dev2->mem_resource, dev->mem_resource,
+				sizeof(dev->mem_resource));
+			free(dev);
+		}
+		return 0;
+	}
+
+	rte_eal_device_insert(&dev->device);
+	TAILQ_INSERT_TAIL(&vmbus_device_list, dev, next);
+
+	return 0;
+error:
+	free(dev);
+	return -1;
+}
+
+/*
+ * Scan the content of the vmbus, and the devices in the devices list
+ */
+static int
+vmbus_scan(void)
+{
+	struct dirent *e;
+	DIR *dir;
+
+	dir = opendir(SYSFS_VMBUS_DEVICES);
+	if (dir == NULL) {
+		if (errno == ENOENT)
+			return 0;
+
+		RTE_LOG(ERR, EAL, "%s(): opendir failed: %s\n",
+			__func__, strerror(errno));
+		return -1;
+	}
+
+	while ((e = readdir(dir)) != NULL) {
+		if (e->d_name[0] == '.')
+			continue;
+
+		if (vmbus_scan_one(e->d_name) < 0)
+			goto error;
+	}
+	closedir(dir);
+	return 0;
+
+error:
+	closedir(dir);
+	return -1;
+}
+
+/* Init the VMBUS EAL subsystem */
+int rte_eal_vmbus_init(void)
+{
+	/* VMBUS can be disabled */
+	if (internal_config.no_vmbus)
+		return 0;
+
+	if (vmbus_scan() < 0) {
+		RTE_LOG(ERR, EAL, "%s(): Cannot scan vmbus\n", __func__);
+		return -1;
+	}
+	return 0;
+}
+
+/* Below is PROBE part of eal_vmbus library */
+
+/*
+ * If device ID match, call the devinit() function of the driver.
+ */
+static int
+rte_eal_vmbus_probe_one_driver(struct rte_vmbus_driver *dr,
+			       struct rte_vmbus_device *dev)
+{
+	const uuid_t *id_table;
+
+	RTE_LOG(DEBUG, EAL, "  probe driver: %s\n", dr->driver.name);
+
+	for (id_table = dr->id_table; !uuid_is_null(*id_table); ++id_table) {
+		struct rte_devargs *args;
+		char guid[UUID_BUF_SZ];
+		int ret;
+
+		/* skip devices not assocaited with this device class */
+		if (uuid_compare(*id_table, dev->class_id) != 0)
+			continue;
+
+		uuid_unparse(dev->device_id, guid);
+		RTE_LOG(INFO, EAL, "VMBUS device %s on NUMA socket %i\n",
+			guid, dev->device.numa_node);
+
+		/* no initialization when blacklisted, return without error */
+		args = dev->device.devargs;
+		if (args && args->type == RTE_DEVTYPE_BLACKLISTED_VMBUS) {
+			RTE_LOG(INFO, EAL, "  Device is blacklisted, not initializing\n");
+			return 1;
+		}
+
+		RTE_LOG(INFO, EAL, "  probe driver: %s\n", dr->driver.name);
+
+		/* map resources for device */
+		ret = rte_eal_vmbus_map_device(dev);
+		if (ret != 0)
+			return ret;
+
+		/* reference driver structure */
+		dev->driver = dr;
+
+		/* call the driver probe() function */
+		ret = dr->probe(dr, dev);
+		if (ret)
+			dev->driver = NULL;
+
+		return ret;
+	}
+
+	/* return positive value if driver doesn't support this device */
+	return 1;
+}
+
+
+/*
+ * If vendor/device ID match, call the remove() function of the
+ * driver.
+ */
+static int
+vmbus_detach_dev(struct rte_vmbus_driver *dr,
+		 struct rte_vmbus_device *dev)
+{
+	const uuid_t *id_table;
+
+	for (id_table = dr->id_table; !uuid_is_null(*id_table); ++id_table) {
+		char guid[UUID_BUF_SZ];
+
+		/* skip devices not assocaited with this device class */
+		if (uuid_compare(*id_table, dev->class_id) != 0)
+			continue;
+
+		uuid_unparse(dev->device_id, guid);
+		RTE_LOG(INFO, EAL, "VMBUS device %s on NUMA socket %i\n",
+			guid, dev->device.numa_node);
+
+		RTE_LOG(DEBUG, EAL, "  remove driver: %s\n", dr->driver.name);
+
+		if (dr->remove && (dr->remove(dev) < 0))
+			return -1;	/* negative value is an error */
+
+		/* clear driver structure */
+		dev->driver = NULL;
+
+		vmbus_uio_unmap_resource(dev);
+		return 0;
+	}
+
+	/* return positive value if driver doesn't support this device */
+	return 1;
+}
+
+/*
+ * call the devinit() function of all
+ * registered drivers for the vmbus device. Return -1 if no driver is
+ * found for this class of vmbus device.
+ * The present assumption is that we have drivers only for vmbus network
+ * devices. That's why we don't check driver's id_table now.
+ */
+static int
+vmbus_probe_all_drivers(struct rte_vmbus_device *dev)
+{
+	struct rte_vmbus_driver *dr = NULL;
+	int ret;
+
+	TAILQ_FOREACH(dr, &vmbus_driver_list, next) {
+		ret = rte_eal_vmbus_probe_one_driver(dr, dev);
+		if (ret < 0) {
+			/* negative value is an error */
+			RTE_LOG(ERR, EAL, "Failed to probe driver %s\n",
+				dr->driver.name);
+			return -1;
+		}
+		/* positive value means driver doesn't support it */
+		if (ret > 0)
+			continue;
+
+		return 0;
+	}
+
+	return 1;
+}
+
+
+/*
+ * If device ID matches, call the remove() function of all
+ * registered driver for the given device. Return -1 if initialization
+ * failed, return 1 if no driver is found for this device.
+ */
+static int
+vmbus_detach_all_drivers(struct rte_vmbus_device *dev)
+{
+	struct rte_vmbus_driver *dr;
+	int rc = 0;
+
+	if (dev == NULL)
+		return -1;
+
+	TAILQ_FOREACH(dr, &vmbus_driver_list, next) {
+		rc = vmbus_detach_dev(dr, dev);
+		if (rc < 0)
+			/* negative value is an error */
+			return -1;
+		if (rc > 0)
+			/* positive value means driver doesn't support it */
+			continue;
+		return 0;
+	}
+	return 1;
+}
+
+/* Detach device specified by its VMBUS id */
+int
+rte_eal_vmbus_detach(uuid_t device_id)
+{
+	struct rte_vmbus_device *dev;
+	char ubuf[UUID_BUF_SZ];
+
+	TAILQ_FOREACH(dev, &vmbus_device_list, next) {
+		if (uuid_compare(dev->device_id, device_id) != 0)
+			continue;
+
+		if (vmbus_detach_all_drivers(dev) < 0)
+			goto err_return;
+
+		TAILQ_REMOVE(&vmbus_device_list, dev, next);
+		free(dev);
+		return 0;
+	}
+	return -1;
+
+err_return:
+	uuid_unparse(device_id, ubuf);
+	RTE_LOG(WARNING, EAL, "Requested device %s cannot be used\n",
+		ubuf);
+	return -1;
+}
+
+/*
+ * Scan the vmbus, and call the devinit() function for
+ * all registered drivers that have a matching entry in its id_table
+ * for discovered devices.
+ */
+int
+rte_eal_vmbus_probe(void)
+{
+	struct rte_vmbus_device *dev = NULL;
+
+	TAILQ_FOREACH(dev, &vmbus_device_list, next) {
+		char ubuf[UUID_BUF_SZ];
+
+		uuid_unparse(dev->device_id, ubuf);
+
+		RTE_LOG(DEBUG, EAL, "Probing driver for device %s ...\n",
+			ubuf);
+		vmbus_probe_all_drivers(dev);
+	}
+	return 0;
+}
+
+/* register vmbus driver */
+void
+rte_eal_vmbus_register(struct rte_vmbus_driver *driver)
+{
+	TAILQ_INSERT_TAIL(&vmbus_driver_list, driver, next);
+}
+
+/* unregister vmbus driver */
+void
+rte_eal_vmbus_unregister(struct rte_vmbus_driver *driver)
+{
+	TAILQ_REMOVE(&vmbus_driver_list, driver, next);
+}
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index acc0e556..92f296a0 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -3331,3 +3331,93 @@ rte_eth_dev_l2_tunnel_offload_set(uint8_t port_id,
 				-ENOTSUP);
 	return (*dev->dev_ops->l2_tunnel_offload_set)(dev, l2_tunnel, mask, en);
 }
+
+
+#ifdef RTE_LIBRTE_HV_PMD
+int
+rte_eth_dev_vmbus_probe(struct rte_vmbus_driver *vmbus_drv,
+			struct rte_vmbus_device *vmbus_dev)
+{
+	struct eth_driver  *eth_drv = (struct eth_driver *)vmbus_drv;
+	struct rte_eth_dev *eth_dev;
+	char ustr[UUID_BUF_SZ];
+	int diag;
+
+	uuid_unparse(vmbus_dev->device_id, ustr);
+
+	eth_dev = rte_eth_dev_allocate(ustr);
+	if (eth_dev == NULL)
+		return -ENOMEM;
+
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		eth_dev->data->dev_private = rte_zmalloc("ethdev private structure",
+				  eth_drv->dev_private_size,
+				  RTE_CACHE_LINE_SIZE);
+		if (eth_dev->data->dev_private == NULL)
+			rte_panic("Cannot allocate memzone for private port data\n");
+	}
+
+	eth_dev->device = &vmbus_dev->device;
+	eth_dev->driver = eth_drv;
+	eth_dev->data->rx_mbuf_alloc_failed = 0;
+
+	/* init user callbacks */
+	TAILQ_INIT(&(eth_dev->link_intr_cbs));
+
+	/*
+	 * Set the default maximum frame size.
+	 */
+	eth_dev->data->mtu = ETHER_MTU;
+
+	/* Invoke PMD device initialization function */
+	diag = (*eth_drv->eth_dev_init)(eth_dev);
+	if (diag == 0)
+		return 0;
+
+	RTE_PMD_DEBUG_TRACE("driver %s: eth_dev_init(%s) failed\n",
+			    vmbus_drv->driver.name, ustr);
+
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+		rte_free(eth_dev->data->dev_private);
+
+	return diag;
+}
+
+int
+rte_eth_dev_vmbus_remove(struct rte_vmbus_device *vmbus_dev)
+{
+	const struct eth_driver *eth_drv;
+	struct rte_eth_dev *eth_dev;
+	char ustr[UUID_BUF_SZ];
+	int ret;
+
+	if (vmbus_dev == NULL)
+		return -EINVAL;
+
+	uuid_unparse(vmbus_dev->device_id, ustr);
+	eth_dev = rte_eth_dev_allocated(ustr);
+	if (eth_dev == NULL)
+		return -ENODEV;
+
+	eth_drv = (const struct eth_driver *)vmbus_dev->driver;
+
+	/* Invoke PMD device uninit function */
+	if (*eth_drv->eth_dev_uninit) {
+		ret = (*eth_drv->eth_dev_uninit)(eth_dev);
+		if (ret)
+			return ret;
+	}
+
+	/* free ether device */
+	rte_eth_dev_release_port(eth_dev);
+
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+		rte_free(eth_dev->data->dev_private);
+
+	eth_dev->device = NULL;
+	eth_dev->driver = NULL;
+	eth_dev->data = NULL;
+
+	return 0;
+}
+#endif
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index d310541a..d5f06529 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -180,6 +180,7 @@ extern "C" {
 #include <rte_log.h>
 #include <rte_interrupts.h>
 #include <rte_pci.h>
+#include <rte_vmbus.h>
 #include <rte_dev.h>
 #include <rte_devargs.h>
 #include "rte_ether.h"
@@ -1876,6 +1877,15 @@ struct rte_pci_eth_driver {
 };
 
 /**
+ * @internal
+ * The structure associated with a PMD VMBUS Ethernet driver.
+ */
+struct rte_vmbus_eth_driver {
+	struct rte_vmbus_driver vmbus_drv;	/**< Underlying VMBUS driver. */
+	struct eth_driver	eth_drv;	/**< Ethernet driver. */
+};
+
+/**
  * Convert a numerical speed in Mbps to a bitmap flag that can be used in
  * the bitmap link_speeds of the struct rte_eth_conf
  *
@@ -4399,6 +4409,21 @@ int rte_eth_dev_pci_probe(struct rte_pci_driver *pci_drv,
  */
 int rte_eth_dev_pci_remove(struct rte_pci_device *pci_dev);
 
+/**
+ * @internal
+ * Wrapper for use by vmbus drivers as a .probe function to attach to a ethdev
+ * interface.
+ */
+int rte_eth_dev_vmbus_probe(struct rte_vmbus_driver *vmbus_drv,
+			  struct rte_vmbus_device *vmbus_dev);
+
+/**
+ * @internal
+ * Wrapper for use by vmbus drivers as a .remove function to detach a ethdev
+ * interface.
+ */
+int rte_eth_dev_vmbus_remove(struct rte_vmbus_device *vmbus_dev);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index f75f0e24..6b304084 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -130,6 +130,7 @@ ifeq ($(CONFIG_RTE_LIBRTE_VHOST),y)
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_VHOST)      += -lrte_pmd_vhost
 endif # $(CONFIG_RTE_LIBRTE_VHOST)
 _LDLIBS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD)    += -lrte_pmd_vmxnet3_uio
+_LDLIBS-$(CONFIG_RTE_LIBRTE_HV_PMD)	    += -luuid
 
 ifeq ($(CONFIG_RTE_LIBRTE_CRYPTODEV),y)
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_AESNI_MB)    += -lrte_pmd_aesni_mb
-- 
2.11.0

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [dpdk-dev] [PATCH 2/6] i40e: don't refer to eth_dev->pci_dev
  2017-01-02 23:08 ` [dpdk-dev] [PATCH 2/6] i40e: don't refer to eth_dev->pci_dev Stephen Hemminger
@ 2017-01-06  1:50   ` Wu, Jingjing
  0 siblings, 0 replies; 8+ messages in thread
From: Wu, Jingjing @ 2017-01-06  1:50 UTC (permalink / raw)
  To: Stephen Hemminger, dev; +Cc: Stephen Hemminger



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Stephen Hemminger
> Sent: Tuesday, January 3, 2017 7:09 AM
> To: dev@dpdk.org
> Cc: Stephen Hemminger <sthemmin@microsoft.com>
> Subject: [dpdk-dev] [PATCH 2/6] i40e: don't refer to eth_dev->pci_dev
> 
> Later patches remove pci_dev from the ethernet device structure.
> Fix the i40e code to just use it's own name when forming zone name.
> 
> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>

Acked-by: Jingjing Wu <jingjing.wu@intel.com>

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2017-01-06  1:50 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-02 23:08 [dpdk-dev] [PATCH 0/6] VMBUS and driver infrastructure Stephen Hemminger
2017-01-02 23:08 ` [dpdk-dev] [PATCH 1/6] ethdev: increase length ethernet device internal name Stephen Hemminger
2017-01-02 23:08 ` [dpdk-dev] [PATCH 2/6] i40e: don't refer to eth_dev->pci_dev Stephen Hemminger
2017-01-06  1:50   ` Wu, Jingjing
2017-01-02 23:08 ` [dpdk-dev] [PATCH 3/6] vmxnet3: " Stephen Hemminger
2017-01-02 23:08 ` [dpdk-dev] [PATCH 4/6] cxgbe: " Stephen Hemminger
2017-01-02 23:08 ` [dpdk-dev] [PATCH 5/6] ethdev: break ethernet driver and pci_driver connection Stephen Hemminger
2017-01-02 23:08 ` [dpdk-dev] [PATCH 6/6] eal: VMBUS infrastructure Stephen Hemminger

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).