- * [dpdk-dev] [PATCH 1/8] ethdev: increase length ethernet device internal name
  2017-01-07 18:17 [dpdk-dev] [PATCH v2 0/8] device abstraction and VMBUS support infrastructure Stephen Hemminger
@ 2017-01-07 18:17 ` Stephen Hemminger
  2017-01-07 18:17 ` [dpdk-dev] [PATCH 2/8] i40e: don't refer to eth_dev->pci_dev Stephen Hemminger
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 30+ messages in thread
From: Stephen Hemminger @ 2017-01-07 18:17 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger
Allow sufficient space for a UUID in string form (36+1).
Needed to use UUID with Hyper-V
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
---
 doc/guides/rel_notes/deprecation.rst | 3 +++
 lib/librte_ether/rte_ethdev.h        | 6 +++++-
 2 files changed, 8 insertions(+), 1 deletion(-)
diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 1438c777..69669e44 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -58,6 +58,9 @@ Deprecation Notices
   ``port`` field, may be moved or removed as part of this mbuf work. A
   ``timestamp`` will also be added.
 
+* ethdev: for 17.02 the size of internal device name will be increased
+  to 40 characters to allow for storing UUID.
+
 * The mbuf flags PKT_RX_VLAN_PKT and PKT_RX_QINQ_PKT are deprecated and
   are respectively replaced by PKT_RX_VLAN_STRIPPED and
   PKT_RX_QINQ_STRIPPED, that are better described. The old flags and
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 1c356c1b..b4168830 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1682,7 +1682,11 @@ struct rte_eth_dev_sriov {
 };
 #define RTE_ETH_DEV_SRIOV(dev)         ((dev)->data->sriov)
 
-#define RTE_ETH_NAME_MAX_LEN (32)
+/*
+ * Internal identifier length
+ * Sufficiently large to allow for UUID or PCI address
+ */
+#define RTE_ETH_NAME_MAX_LEN 40
 
 /**
  * @internal
-- 
2.11.0
- * [dpdk-dev] [PATCH 2/8] i40e: don't refer to eth_dev->pci_dev
  2017-01-07 18:17 [dpdk-dev] [PATCH v2 0/8] device abstraction and VMBUS support infrastructure Stephen Hemminger
  2017-01-07 18:17 ` [dpdk-dev] [PATCH 1/8] ethdev: increase length ethernet device internal name Stephen Hemminger
@ 2017-01-07 18:17 ` Stephen Hemminger
  2017-01-10 12:08   ` Jan Blunck
  2017-01-07 18:17 ` [dpdk-dev] [PATCH 3/8] vmxnet3: " Stephen Hemminger
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 30+ messages in thread
From: Stephen Hemminger @ 2017-01-07 18:17 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger
Later patches remove pci_dev from the ethernet device structure.
Fix the i40e code to just use its own name when forming the zone name.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
---
 drivers/net/i40e/i40e_fdir.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/net/i40e/i40e_fdir.c b/drivers/net/i40e/i40e_fdir.c
index 335bf15c..68a2523c 100644
--- a/drivers/net/i40e/i40e_fdir.c
+++ b/drivers/net/i40e/i40e_fdir.c
@@ -250,8 +250,7 @@ i40e_fdir_setup(struct i40e_pf *pf)
 	}
 
 	/* reserve memory for the fdir programming packet */
-	snprintf(z_name, sizeof(z_name), "%s_%s_%d",
-			eth_dev->driver->pci_drv.driver.name,
+	snprintf(z_name, sizeof(z_name), "i40e_%s_%d",
 			I40E_FDIR_MZ_NAME,
 			eth_dev->data->port_id);
 	mz = i40e_memzone_reserve(z_name, I40E_FDIR_PKT_LEN, SOCKET_ID_ANY);
-- 
2.11.0
- * Re: [dpdk-dev] [PATCH 2/8] i40e: don't refer to eth_dev->pci_dev
  2017-01-07 18:17 ` [dpdk-dev] [PATCH 2/8] i40e: don't refer to eth_dev->pci_dev Stephen Hemminger
@ 2017-01-10 12:08   ` Jan Blunck
  2017-01-10 17:57     ` Stephen Hemminger
  0 siblings, 1 reply; 30+ messages in thread
From: Jan Blunck @ 2017-01-10 12:08 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev, Stephen Hemminger
On Sat, Jan 7, 2017 at 7:17 PM, Stephen Hemminger
<stephen@networkplumber.org> wrote:
> Later patches remove pci_dev from the ethernet device structure.
> Fix the i40e code to just use its own name when forming the zone name.
>
> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
> ---
>  drivers/net/i40e/i40e_fdir.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/net/i40e/i40e_fdir.c b/drivers/net/i40e/i40e_fdir.c
> index 335bf15c..68a2523c 100644
> --- a/drivers/net/i40e/i40e_fdir.c
> +++ b/drivers/net/i40e/i40e_fdir.c
> @@ -250,8 +250,7 @@ i40e_fdir_setup(struct i40e_pf *pf)
>         }
>
>         /* reserve memory for the fdir programming packet */
> -       snprintf(z_name, sizeof(z_name), "%s_%s_%d",
> -                       eth_dev->driver->pci_drv.driver.name,
> +       snprintf(z_name, sizeof(z_name), "i40e_%s_%d",
The driver is called 'net_i40e'.
>                         I40E_FDIR_MZ_NAME,
>                         eth_dev->data->port_id);
>         mz = i40e_memzone_reserve(z_name, I40E_FDIR_PKT_LEN, SOCKET_ID_ANY);
> --
> 2.11.0
>
- * Re: [dpdk-dev] [PATCH 2/8] i40e: don't refer to eth_dev->pci_dev
  2017-01-10 12:08   ` Jan Blunck
@ 2017-01-10 17:57     ` Stephen Hemminger
  2017-01-11  7:55       ` Jan Blunck
  0 siblings, 1 reply; 30+ messages in thread
From: Stephen Hemminger @ 2017-01-10 17:57 UTC (permalink / raw)
  To: Jan Blunck; +Cc: dev, Stephen Hemminger
On Tue, 10 Jan 2017 13:08:30 +0100
Jan Blunck <jblunck@infradead.org> wrote:
> On Sat, Jan 7, 2017 at 7:17 PM, Stephen Hemminger
> <stephen@networkplumber.org> wrote:
> > Later patches remove pci_dev from the ethernet device structure.
> > Fix the i40e code to just use its own name when forming the zone name.
> >
> > Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
> > ---
> >  drivers/net/i40e/i40e_fdir.c | 3 +--
> >  1 file changed, 1 insertion(+), 2 deletions(-)
> >
> > diff --git a/drivers/net/i40e/i40e_fdir.c b/drivers/net/i40e/i40e_fdir.c
> > index 335bf15c..68a2523c 100644
> > --- a/drivers/net/i40e/i40e_fdir.c
> > +++ b/drivers/net/i40e/i40e_fdir.c
> > @@ -250,8 +250,7 @@ i40e_fdir_setup(struct i40e_pf *pf)
> >         }
> >
> >         /* reserve memory for the fdir programming packet */
> > -       snprintf(z_name, sizeof(z_name), "%s_%s_%d",
> > -                       eth_dev->driver->pci_drv.driver.name,
> > +       snprintf(z_name, sizeof(z_name), "i40e_%s_%d",  
> 
> The driver is called 'net_i40e'.
It really doesn't matter. The memory name is just so that primary and secondary
find the same resources.  Having net_ on the front doesn't change or help.
- * Re: [dpdk-dev] [PATCH 2/8] i40e: don't refer to eth_dev->pci_dev
  2017-01-10 17:57     ` Stephen Hemminger
@ 2017-01-11  7:55       ` Jan Blunck
  0 siblings, 0 replies; 30+ messages in thread
From: Jan Blunck @ 2017-01-11  7:55 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev, Stephen Hemminger
On Tue, Jan 10, 2017 at 6:57 PM, Stephen Hemminger
<stephen@networkplumber.org> wrote:
> On Tue, 10 Jan 2017 13:08:30 +0100
> Jan Blunck <jblunck@infradead.org> wrote:
>
>> On Sat, Jan 7, 2017 at 7:17 PM, Stephen Hemminger
>> <stephen@networkplumber.org> wrote:
>> > Later patches remove pci_dev from the ethernet device structure.
>> > Fix the i40e code to just use its own name when forming the zone name.
>> >
>> > Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
>> > ---
>> >  drivers/net/i40e/i40e_fdir.c | 3 +--
>> >  1 file changed, 1 insertion(+), 2 deletions(-)
>> >
>> > diff --git a/drivers/net/i40e/i40e_fdir.c b/drivers/net/i40e/i40e_fdir.c
>> > index 335bf15c..68a2523c 100644
>> > --- a/drivers/net/i40e/i40e_fdir.c
>> > +++ b/drivers/net/i40e/i40e_fdir.c
>> > @@ -250,8 +250,7 @@ i40e_fdir_setup(struct i40e_pf *pf)
>> >         }
>> >
>> >         /* reserve memory for the fdir programming packet */
>> > -       snprintf(z_name, sizeof(z_name), "%s_%s_%d",
>> > -                       eth_dev->driver->pci_drv.driver.name,
>> > +       snprintf(z_name, sizeof(z_name), "i40e_%s_%d",
>>
>> The driver is called 'net_i40e'.
>
> It really doesn't matter. The memory name is just so that primary and secondary
> find the same resources.  Having net_ on the front doesn't change or help.
I understand. Still David Marchand just recently went through the
exercise to align all driver names and their usage.
Is there a reason why you didn't choose to use eth_dev->data->drv_name?
- * [dpdk-dev] [PATCH 3/8] vmxnet3: don't refer to eth_dev->pci_dev
  2017-01-07 18:17 [dpdk-dev] [PATCH v2 0/8] device abstraction and VMBUS support infrastructure Stephen Hemminger
  2017-01-07 18:17 ` [dpdk-dev] [PATCH 1/8] ethdev: increase length ethernet device internal name Stephen Hemminger
  2017-01-07 18:17 ` [dpdk-dev] [PATCH 2/8] i40e: don't refer to eth_dev->pci_dev Stephen Hemminger
@ 2017-01-07 18:17 ` Stephen Hemminger
  2017-01-10 12:10   ` Jan Blunck
  2017-01-07 18:17 ` [dpdk-dev] [PATCH 4/8] cxgbe: " Stephen Hemminger
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 30+ messages in thread
From: Stephen Hemminger @ 2017-01-07 18:17 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger
Fix the vmxnet3 code to just use its own name when forming the zone name.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
---
 drivers/net/vmxnet3/vmxnet3_rxtx.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/net/vmxnet3/vmxnet3_rxtx.c b/drivers/net/vmxnet3/vmxnet3_rxtx.c
index 36513693..8df4c8ea 100644
--- a/drivers/net/vmxnet3/vmxnet3_rxtx.c
+++ b/drivers/net/vmxnet3/vmxnet3_rxtx.c
@@ -849,9 +849,8 @@ ring_dma_zone_reserve(struct rte_eth_dev *dev, const char *ring_name,
 	char z_name[RTE_MEMZONE_NAMESIZE];
 	const struct rte_memzone *mz;
 
-	snprintf(z_name, sizeof(z_name), "%s_%s_%d_%d",
-		 dev->driver->pci_drv.driver.name, ring_name,
-		 dev->data->port_id, queue_id);
+	snprintf(z_name, sizeof(z_name), "vmxnet3_%s_%d_%d",
+		 ring_name, dev->data->port_id, queue_id);
 
 	mz = rte_memzone_lookup(z_name);
 	if (mz)
-- 
2.11.0
- * Re: [dpdk-dev] [PATCH 3/8] vmxnet3: don't refer to eth_dev->pci_dev
  2017-01-07 18:17 ` [dpdk-dev] [PATCH 3/8] vmxnet3: " Stephen Hemminger
@ 2017-01-10 12:10   ` Jan Blunck
  0 siblings, 0 replies; 30+ messages in thread
From: Jan Blunck @ 2017-01-10 12:10 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev, Stephen Hemminger
On Sat, Jan 7, 2017 at 7:17 PM, Stephen Hemminger
<stephen@networkplumber.org> wrote:
> Fix the vmxnet3 code to just use its own name when forming the zone name.
>
> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
> ---
>  drivers/net/vmxnet3/vmxnet3_rxtx.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/vmxnet3/vmxnet3_rxtx.c b/drivers/net/vmxnet3/vmxnet3_rxtx.c
> index 36513693..8df4c8ea 100644
> --- a/drivers/net/vmxnet3/vmxnet3_rxtx.c
> +++ b/drivers/net/vmxnet3/vmxnet3_rxtx.c
> @@ -849,9 +849,8 @@ ring_dma_zone_reserve(struct rte_eth_dev *dev, const char *ring_name,
>         char z_name[RTE_MEMZONE_NAMESIZE];
>         const struct rte_memzone *mz;
>
> -       snprintf(z_name, sizeof(z_name), "%s_%s_%d_%d",
> -                dev->driver->pci_drv.driver.name, ring_name,
> -                dev->data->port_id, queue_id);
> +       snprintf(z_name, sizeof(z_name), "vmxnet3_%s_%d_%d",
> +                ring_name, dev->data->port_id, queue_id);
The driver is called 'net_vmxnet3'.
>
>         mz = rte_memzone_lookup(z_name);
>         if (mz)
> --
> 2.11.0
>
- * [dpdk-dev] [PATCH 4/8] cxgbe: don't refer to eth_dev->pci_dev
  2017-01-07 18:17 [dpdk-dev] [PATCH v2 0/8] device abstraction and VMBUS support infrastructure Stephen Hemminger
                   ` (2 preceding siblings ...)
  2017-01-07 18:17 ` [dpdk-dev] [PATCH 3/8] vmxnet3: " Stephen Hemminger
@ 2017-01-07 18:17 ` Stephen Hemminger
  2017-01-10 12:12   ` Jan Blunck
  2017-01-07 18:17 ` [dpdk-dev] [PATCH 5/8] nfp: " Stephen Hemminger
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 30+ messages in thread
From: Stephen Hemminger @ 2017-01-07 18:17 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger
Later patches remove pci_dev from the ethernet device structure.
Fix the cxgbe code to just use its own name when forming the zone name.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
---
 drivers/net/cxgbe/sge.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)
diff --git a/drivers/net/cxgbe/sge.c b/drivers/net/cxgbe/sge.c
index 736f08ce..e935dc42 100644
--- a/drivers/net/cxgbe/sge.c
+++ b/drivers/net/cxgbe/sge.c
@@ -1644,8 +1644,7 @@ int t4_sge_alloc_rxq(struct adapter *adap, struct sge_rspq *iq, bool fwevtq,
 	/* Size needs to be multiple of 16, including status entry. */
 	iq->size = cxgbe_roundup(iq->size, 16);
 
-	snprintf(z_name, sizeof(z_name), "%s_%s_%d_%d",
-		 eth_dev->driver->pci_drv.driver.name,
+	snprintf(z_name, sizeof(z_name), "cxgbe_%s_%d_%d",
 		 fwevtq ? "fwq_ring" : "rx_ring",
 		 eth_dev->data->port_id, queue_id);
 	snprintf(z_name_sw, sizeof(z_name_sw), "%s_sw_ring", z_name);
@@ -1697,8 +1696,7 @@ int t4_sge_alloc_rxq(struct adapter *adap, struct sge_rspq *iq, bool fwevtq,
 			fl->size = s->fl_starve_thres - 1 + 2 * 8;
 		fl->size = cxgbe_roundup(fl->size, 8);
 
-		snprintf(z_name, sizeof(z_name), "%s_%s_%d_%d",
-			 eth_dev->driver->pci_drv.driver.name,
+		snprintf(z_name, sizeof(z_name), "cxgbe_%s_%d_%d",
 			 fwevtq ? "fwq_ring" : "fl_ring",
 			 eth_dev->data->port_id, queue_id);
 		snprintf(z_name_sw, sizeof(z_name_sw), "%s_sw_ring", z_name);
@@ -1893,8 +1891,7 @@ int t4_sge_alloc_eth_txq(struct adapter *adap, struct sge_eth_txq *txq,
 	/* Add status entries */
 	nentries = txq->q.size + s->stat_len / sizeof(struct tx_desc);
 
-	snprintf(z_name, sizeof(z_name), "%s_%s_%d_%d",
-		 eth_dev->driver->pci_drv.driver.name, "tx_ring",
+	snprintf(z_name, sizeof(z_name), "cxgbe_%d_%d",
 		 eth_dev->data->port_id, queue_id);
 	snprintf(z_name_sw, sizeof(z_name_sw), "%s_sw_ring", z_name);
 
-- 
2.11.0
- * Re: [dpdk-dev] [PATCH 4/8] cxgbe: don't refer to eth_dev->pci_dev
  2017-01-07 18:17 ` [dpdk-dev] [PATCH 4/8] cxgbe: " Stephen Hemminger
@ 2017-01-10 12:12   ` Jan Blunck
  0 siblings, 0 replies; 30+ messages in thread
From: Jan Blunck @ 2017-01-10 12:12 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev, Stephen Hemminger
On Sat, Jan 7, 2017 at 7:17 PM, Stephen Hemminger
<stephen@networkplumber.org> wrote:
> Later patches remove pci_dev from the ethernet device structure.
> Fix the cxgbe code to just use its own name when forming the zone name.
>
> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
> ---
>  drivers/net/cxgbe/sge.c | 9 +++------
>  1 file changed, 3 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/net/cxgbe/sge.c b/drivers/net/cxgbe/sge.c
> index 736f08ce..e935dc42 100644
> --- a/drivers/net/cxgbe/sge.c
> +++ b/drivers/net/cxgbe/sge.c
> @@ -1644,8 +1644,7 @@ int t4_sge_alloc_rxq(struct adapter *adap, struct sge_rspq *iq, bool fwevtq,
>         /* Size needs to be multiple of 16, including status entry. */
>         iq->size = cxgbe_roundup(iq->size, 16);
>
> -       snprintf(z_name, sizeof(z_name), "%s_%s_%d_%d",
> -                eth_dev->driver->pci_drv.driver.name,
> +       snprintf(z_name, sizeof(z_name), "cxgbe_%s_%d_%d",
The driver is called 'net_cxgbe'.
>                  fwevtq ? "fwq_ring" : "rx_ring",
>                  eth_dev->data->port_id, queue_id);
>         snprintf(z_name_sw, sizeof(z_name_sw), "%s_sw_ring", z_name);
> @@ -1697,8 +1696,7 @@ int t4_sge_alloc_rxq(struct adapter *adap, struct sge_rspq *iq, bool fwevtq,
>                         fl->size = s->fl_starve_thres - 1 + 2 * 8;
>                 fl->size = cxgbe_roundup(fl->size, 8);
>
> -               snprintf(z_name, sizeof(z_name), "%s_%s_%d_%d",
> -                        eth_dev->driver->pci_drv.driver.name,
> +               snprintf(z_name, sizeof(z_name), "cxgbe_%s_%d_%d",
>                          fwevtq ? "fwq_ring" : "fl_ring",
>                          eth_dev->data->port_id, queue_id);
>                 snprintf(z_name_sw, sizeof(z_name_sw), "%s_sw_ring", z_name);
> @@ -1893,8 +1891,7 @@ int t4_sge_alloc_eth_txq(struct adapter *adap, struct sge_eth_txq *txq,
>         /* Add status entries */
>         nentries = txq->q.size + s->stat_len / sizeof(struct tx_desc);
>
> -       snprintf(z_name, sizeof(z_name), "%s_%s_%d_%d",
> -                eth_dev->driver->pci_drv.driver.name, "tx_ring",
> +       snprintf(z_name, sizeof(z_name), "cxgbe_%d_%d",
>                  eth_dev->data->port_id, queue_id);
>         snprintf(z_name_sw, sizeof(z_name_sw), "%s_sw_ring", z_name);
>
> --
> 2.11.0
>
- * [dpdk-dev] [PATCH 5/8] nfp: don't refer to eth_dev->pci_dev
  2017-01-07 18:17 [dpdk-dev] [PATCH v2 0/8] device abstraction and VMBUS support infrastructure Stephen Hemminger
                   ` (3 preceding siblings ...)
  2017-01-07 18:17 ` [dpdk-dev] [PATCH 4/8] cxgbe: " Stephen Hemminger
@ 2017-01-07 18:17 ` Stephen Hemminger
  2017-01-10 12:13   ` Jan Blunck
  2017-01-07 18:17 ` [dpdk-dev] [PATCH 6/8] qat: " Stephen Hemminger
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 30+ messages in thread
From: Stephen Hemminger @ 2017-01-07 18:17 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger
Later patches remove pci_dev from the ethernet device structure.
Fix the nfp code to just use its own name when forming the zone name.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
---
 drivers/net/nfp/nfp_net.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index e85315f1..970b5c84 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -213,8 +213,7 @@ ring_dma_zone_reserve(struct rte_eth_dev *dev, const char *ring_name,
 	char z_name[RTE_MEMZONE_NAMESIZE];
 	const struct rte_memzone *mz;
 
-	snprintf(z_name, sizeof(z_name), "%s_%s_%d_%d",
-		 dev->driver->pci_drv.driver.name,
+	snprintf(z_name, sizeof(z_name), "nfp_%s_%u_%u",
 		 ring_name, dev->data->port_id, queue_id);
 
 	mz = rte_memzone_lookup(z_name);
@@ -1009,7 +1008,6 @@ nfp_net_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
 	hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
 
 	dev_info->pci_dev = RTE_DEV_TO_PCI(dev->device);
-	dev_info->driver_name = dev->driver->pci_drv.driver.name;
 	dev_info->max_rx_queues = (uint16_t)hw->max_rx_queues;
 	dev_info->max_tx_queues = (uint16_t)hw->max_tx_queues;
 	dev_info->min_rx_bufsize = ETHER_MIN_MTU;
-- 
2.11.0
- * Re: [dpdk-dev] [PATCH 5/8] nfp: don't refer to eth_dev->pci_dev
  2017-01-07 18:17 ` [dpdk-dev] [PATCH 5/8] nfp: " Stephen Hemminger
@ 2017-01-10 12:13   ` Jan Blunck
  0 siblings, 0 replies; 30+ messages in thread
From: Jan Blunck @ 2017-01-10 12:13 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev, Stephen Hemminger
On Sat, Jan 7, 2017 at 7:17 PM, Stephen Hemminger
<stephen@networkplumber.org> wrote:
> Later patches remove pci_dev from the ethernet device structure.
> Fix the nfp code to just use its own name when forming the zone name.
>
> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
> ---
>  drivers/net/nfp/nfp_net.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
> index e85315f1..970b5c84 100644
> --- a/drivers/net/nfp/nfp_net.c
> +++ b/drivers/net/nfp/nfp_net.c
> @@ -213,8 +213,7 @@ ring_dma_zone_reserve(struct rte_eth_dev *dev, const char *ring_name,
>         char z_name[RTE_MEMZONE_NAMESIZE];
>         const struct rte_memzone *mz;
>
> -       snprintf(z_name, sizeof(z_name), "%s_%s_%d_%d",
> -                dev->driver->pci_drv.driver.name,
> +       snprintf(z_name, sizeof(z_name), "nfp_%s_%u_%u",
>                  ring_name, dev->data->port_id, queue_id);
>
The driver is called 'net_nfp'.
>         mz = rte_memzone_lookup(z_name);
> @@ -1009,7 +1008,6 @@ nfp_net_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
>         hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
>
>         dev_info->pci_dev = RTE_DEV_TO_PCI(dev->device);
> -       dev_info->driver_name = dev->driver->pci_drv.driver.name;
>         dev_info->max_rx_queues = (uint16_t)hw->max_rx_queues;
>         dev_info->max_tx_queues = (uint16_t)hw->max_tx_queues;
>         dev_info->min_rx_bufsize = ETHER_MIN_MTU;
> --
> 2.11.0
>
- * [dpdk-dev] [PATCH 6/8] qat:  don't refer to eth_dev->pci_dev
  2017-01-07 18:17 [dpdk-dev] [PATCH v2 0/8] device abstraction and VMBUS support infrastructure Stephen Hemminger
                   ` (4 preceding siblings ...)
  2017-01-07 18:17 ` [dpdk-dev] [PATCH 5/8] nfp: " Stephen Hemminger
@ 2017-01-07 18:17 ` Stephen Hemminger
  2017-01-10 12:15   ` Jan Blunck
  2017-01-07 18:17 ` [dpdk-dev] [PATCH 7/8] ethdev: break ethernet driver and pci_driver connection Stephen Hemminger
  2017-01-07 18:17 ` [dpdk-dev] [PATCH 8/8] eal: VMBUS infrastructure Stephen Hemminger
  7 siblings, 1 reply; 30+ messages in thread
From: Stephen Hemminger @ 2017-01-07 18:17 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger
Later patches remove pci_dev from the ethernet device structure.
Fix the quick assist code to just use its own name when forming the zone name.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
---
 drivers/crypto/qat/qat_qp.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/crypto/qat/qat_qp.c b/drivers/crypto/qat/qat_qp.c
index 2e7188bd..fe76e04a 100644
--- a/drivers/crypto/qat/qat_qp.c
+++ b/drivers/crypto/qat/qat_qp.c
@@ -299,9 +299,9 @@ qat_queue_create(struct rte_cryptodev *dev, struct qat_queue *queue,
 	/*
 	 * Allocate a memzone for the queue - create a unique name.
 	 */
-	snprintf(queue->memz_name, sizeof(queue->memz_name), "%s_%s_%d_%d_%d",
-		dev->driver->pci_drv.driver.name, "qp_mem", dev->data->dev_id,
-		queue->hw_bundle_number, queue->hw_queue_number);
+	snprintf(queue->memz_name, sizeof(queue->memz_name),
+		 "qat_qp_mem_%d_%u_%u", dev->data->dev_id,
+		 queue->hw_bundle_number, queue->hw_queue_number);
 	qp_mz = queue_dma_zone_reserve(queue->memz_name, queue_size_bytes,
 			socket_id);
 	if (qp_mz == NULL) {
-- 
2.11.0
- * Re: [dpdk-dev] [PATCH 6/8] qat: don't refer to eth_dev->pci_dev
  2017-01-07 18:17 ` [dpdk-dev] [PATCH 6/8] qat: " Stephen Hemminger
@ 2017-01-10 12:15   ` Jan Blunck
  0 siblings, 0 replies; 30+ messages in thread
From: Jan Blunck @ 2017-01-10 12:15 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev, Stephen Hemminger
On Sat, Jan 7, 2017 at 7:17 PM, Stephen Hemminger
<stephen@networkplumber.org> wrote:
> Later patches remove pci_dev from the ethernet device structure.
> Fix the quick assist code to just use its own name when forming the zone name.
>
> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
> ---
>  drivers/crypto/qat/qat_qp.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/crypto/qat/qat_qp.c b/drivers/crypto/qat/qat_qp.c
> index 2e7188bd..fe76e04a 100644
> --- a/drivers/crypto/qat/qat_qp.c
> +++ b/drivers/crypto/qat/qat_qp.c
> @@ -299,9 +299,9 @@ qat_queue_create(struct rte_cryptodev *dev, struct qat_queue *queue,
>         /*
>          * Allocate a memzone for the queue - create a unique name.
>          */
> -       snprintf(queue->memz_name, sizeof(queue->memz_name), "%s_%s_%d_%d_%d",
> -               dev->driver->pci_drv.driver.name, "qp_mem", dev->data->dev_id,
> -               queue->hw_bundle_number, queue->hw_queue_number);
> +       snprintf(queue->memz_name, sizeof(queue->memz_name),
> +                "qat_qp_mem_%d_%u_%u", dev->data->dev_id,
This driver is called 'crypto_qat'.
> +                queue->hw_bundle_number, queue->hw_queue_number);
>         qp_mz = queue_dma_zone_reserve(queue->memz_name, queue_size_bytes,
>                         socket_id);
>         if (qp_mz == NULL) {
> --
> 2.11.0
>
- * [dpdk-dev] [PATCH 7/8] ethdev: break ethernet driver and pci_driver connection
  2017-01-07 18:17 [dpdk-dev] [PATCH v2 0/8] device abstraction and VMBUS support infrastructure Stephen Hemminger
                   ` (5 preceding siblings ...)
  2017-01-07 18:17 ` [dpdk-dev] [PATCH 6/8] qat: " Stephen Hemminger
@ 2017-01-07 18:17 ` Stephen Hemminger
  2017-01-10 13:59   ` Ferruh Yigit
  2017-01-10 16:11   ` [dpdk-dev] [PATCH 7/8] ethdev: break ethernet driver and pci_driver connection Jan Blunck
  2017-01-07 18:17 ` [dpdk-dev] [PATCH 8/8] eal: VMBUS infrastructure Stephen Hemminger
  7 siblings, 2 replies; 30+ messages in thread
From: Stephen Hemminger @ 2017-01-07 18:17 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger
There are multiple buses and device types now. Therefore it no longer
makes sense that PCI driver information is part of the Ethernet driver
structure.
This patch removes pci_driver from eth_driver and introduces a
new combined structure for use in all existing PMDs. The rationale
is that although all existing PCI drivers are Ethernet drivers,
it makes sense that future projects may want to support PCI devices
that are not Ethernet.
It also removes the requirement that the driver is the first element in
the PCI driver structure.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
---
 app/test/virtual_pmd.c                  | 22 ++++++++---------
 drivers/net/bnx2x/bnx2x_ethdev.c        | 16 ++++++++-----
 drivers/net/bnxt/bnxt_ethdev.c          | 22 +++++++++--------
 drivers/net/cxgbe/cxgbe_ethdev.c        |  8 ++++---
 drivers/net/e1000/em_ethdev.c           | 10 ++++----
 drivers/net/e1000/igb_ethdev.c          | 20 +++++++++-------
 drivers/net/ena/ena_ethdev.c            |  8 ++++---
 drivers/net/enic/enic_ethdev.c          |  8 ++++---
 drivers/net/fm10k/fm10k_ethdev.c        | 10 ++++----
 drivers/net/i40e/i40e_ethdev.c          | 10 ++++----
 drivers/net/i40e/i40e_ethdev_vf.c       | 10 ++++----
 drivers/net/ixgbe/ixgbe_ethdev.c        | 20 +++++++++-------
 drivers/net/mlx4/mlx4.c                 |  8 ++++---
 drivers/net/mlx5/mlx5.c                 |  8 ++++---
 drivers/net/nfp/nfp_net.c               |  8 ++++---
 drivers/net/qede/qede_ethdev.c          | 42 +++++++++++++++++----------------
 drivers/net/szedata2/rte_eth_szedata2.c | 10 ++++----
 drivers/net/thunderx/nicvf_ethdev.c     |  8 ++++---
 drivers/net/virtio/virtio_ethdev.c      | 10 ++++----
 drivers/net/vmxnet3/vmxnet3_ethdev.c    | 10 ++++----
 lib/librte_ether/rte_ethdev.c           |  9 +++----
 lib/librte_ether/rte_ethdev.h           | 18 +++++++++-----
 22 files changed, 172 insertions(+), 123 deletions(-)
diff --git a/app/test/virtual_pmd.c b/app/test/virtual_pmd.c
index 6e4dcd8f..e7f56527 100644
--- a/app/test/virtual_pmd.c
+++ b/app/test/virtual_pmd.c
@@ -533,7 +533,7 @@ virtual_ethdev_create(const char *name, struct ether_addr *mac_addr,
 	struct rte_pci_device *pci_dev = NULL;
 	struct rte_eth_dev *eth_dev = NULL;
 	struct eth_driver *eth_drv = NULL;
-	struct rte_pci_driver *pci_drv = NULL;
+	struct rte_pci_eth_driver *pci_eth_drv = NULL;
 	struct rte_pci_id *id_table = NULL;
 	struct virtual_ethdev_private *dev_private = NULL;
 	char name_buf[RTE_RING_NAMESIZE];
@@ -554,8 +554,8 @@ virtual_ethdev_create(const char *name, struct ether_addr *mac_addr,
 	if (eth_drv == NULL)
 		goto err;
 
-	pci_drv = rte_zmalloc_socket(name, sizeof(*pci_drv), 0, socket_id);
-	if (pci_drv == NULL)
+	pci_eth_drv = rte_zmalloc_socket(name, sizeof(*pci_eth_drv), 0, socket_id);
+	if (pci_eth_drv == NULL)
 		goto err;
 
 	id_table = rte_zmalloc_socket(name, sizeof(*id_table), 0, socket_id);
@@ -585,17 +585,15 @@ virtual_ethdev_create(const char *name, struct ether_addr *mac_addr,
 		goto err;
 
 	pci_dev->device.numa_node = socket_id;
-	pci_drv->driver.name = virtual_ethdev_driver_name;
-	pci_drv->id_table = id_table;
+	pci_eth_drv->pci_drv.driver.name = virtual_ethdev_driver_name;
+	pci_eth_drv->pci_drv.id_table = id_table;
 
 	if (isr_support)
-		pci_drv->drv_flags |= RTE_PCI_DRV_INTR_LSC;
+		pci_eth_drv->pci_drv.drv_flags |= RTE_PCI_DRV_INTR_LSC;
 	else
-		pci_drv->drv_flags &= ~RTE_PCI_DRV_INTR_LSC;
+		pci_eth_drv->pci_drv.drv_flags &= ~RTE_PCI_DRV_INTR_LSC;
 
-
-	eth_drv->pci_drv = (struct rte_pci_driver)(*pci_drv);
-	eth_dev->driver = eth_drv;
+	eth_dev->driver = &pci_eth_drv->eth_drv;
 
 	eth_dev->data->nb_rx_queues = (uint16_t)1;
 	eth_dev->data->nb_tx_queues = (uint16_t)1;
@@ -622,7 +620,7 @@ virtual_ethdev_create(const char *name, struct ether_addr *mac_addr,
 	dev_private->dev_ops = virtual_ethdev_default_dev_ops;
 	eth_dev->dev_ops = &dev_private->dev_ops;
 
-	pci_dev->device.driver = &eth_drv->pci_drv.driver;
+	pci_dev->device.driver = &pci_eth_drv->pci_drv.driver;
 	eth_dev->device = &pci_dev->device;
 
 	eth_dev->rx_pkt_burst = virtual_ethdev_rx_burst_success;
@@ -632,7 +630,7 @@ virtual_ethdev_create(const char *name, struct ether_addr *mac_addr,
 
 err:
 	rte_free(pci_dev);
-	rte_free(pci_drv);
+	rte_free(pci_eth_drv);
 	rte_free(eth_drv);
 	rte_free(id_table);
 	rte_free(dev_private);
diff --git a/drivers/net/bnx2x/bnx2x_ethdev.c b/drivers/net/bnx2x/bnx2x_ethdev.c
index 7140118f..ef704d72 100644
--- a/drivers/net/bnx2x/bnx2x_ethdev.c
+++ b/drivers/net/bnx2x/bnx2x_ethdev.c
@@ -618,29 +618,33 @@ eth_bnx2xvf_dev_init(struct rte_eth_dev *eth_dev)
 	return bnx2x_common_dev_init(eth_dev, 1);
 }
 
-static struct eth_driver rte_bnx2x_pmd = {
+static struct rte_pci_eth_driver rte_bnx2x_pmd = {
 	.pci_drv = {
 		.id_table = pci_id_bnx2x_map,
 		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
 		.probe = rte_eth_dev_pci_probe,
 		.remove = rte_eth_dev_pci_remove,
 	},
-	.eth_dev_init = eth_bnx2x_dev_init,
-	.dev_private_size = sizeof(struct bnx2x_softc),
+	.eth_drv = {
+		.eth_dev_init = eth_bnx2x_dev_init,
+		.dev_private_size = sizeof(struct bnx2x_softc),
+	},
 };
 
 /*
  * virtual function driver struct
  */
-static struct eth_driver rte_bnx2xvf_pmd = {
+static struct rte_pci_eth_driver rte_bnx2xvf_pmd = {
 	.pci_drv = {
 		.id_table = pci_id_bnx2xvf_map,
 		.drv_flags = RTE_PCI_DRV_NEED_MAPPING,
 		.probe = rte_eth_dev_pci_probe,
 		.remove = rte_eth_dev_pci_remove,
 	},
-	.eth_dev_init = eth_bnx2xvf_dev_init,
-	.dev_private_size = sizeof(struct bnx2x_softc),
+	.eth_drv = {
+		.eth_dev_init = eth_bnx2xvf_dev_init,
+		.dev_private_size = sizeof(struct bnx2x_softc),
+	},
 };
 
 RTE_PMD_REGISTER_PCI(net_bnx2x, rte_bnx2x_pmd.pci_drv);
diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index 7518b6b7..9017825b 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -1164,17 +1164,19 @@ bnxt_dev_uninit(struct rte_eth_dev *eth_dev) {
 	return rc;
 }
 
-static struct eth_driver bnxt_rte_pmd = {
+static struct rte_pci_eth_driver bnxt_rte_pmd = {
 	.pci_drv = {
-		    .id_table = bnxt_pci_id_map,
-		    .drv_flags = RTE_PCI_DRV_NEED_MAPPING |
-			    RTE_PCI_DRV_DETACHABLE | RTE_PCI_DRV_INTR_LSC,
-		    .probe = rte_eth_dev_pci_probe,
-		    .remove = rte_eth_dev_pci_remove
-		    },
-	.eth_dev_init = bnxt_dev_init,
-	.eth_dev_uninit = bnxt_dev_uninit,
-	.dev_private_size = sizeof(struct bnxt),
+		.id_table = bnxt_pci_id_map,
+		.drv_flags = RTE_PCI_DRV_NEED_MAPPING |
+			     RTE_PCI_DRV_DETACHABLE | RTE_PCI_DRV_INTR_LSC,
+		.probe = rte_eth_dev_pci_probe,
+		.remove = rte_eth_dev_pci_remove
+	},
+	.eth_drv = {
+		.eth_dev_init = bnxt_dev_init,
+		.eth_dev_uninit = bnxt_dev_uninit,
+		.dev_private_size = sizeof(struct bnxt),
+	},
 };
 
 RTE_PMD_REGISTER_PCI(net_bnxt, bnxt_rte_pmd.pci_drv);
diff --git a/drivers/net/cxgbe/cxgbe_ethdev.c b/drivers/net/cxgbe/cxgbe_ethdev.c
index 64345e37..ccf93904 100644
--- a/drivers/net/cxgbe/cxgbe_ethdev.c
+++ b/drivers/net/cxgbe/cxgbe_ethdev.c
@@ -1039,15 +1039,17 @@ static int eth_cxgbe_dev_init(struct rte_eth_dev *eth_dev)
 	return err;
 }
 
-static struct eth_driver rte_cxgbe_pmd = {
+static struct rte_pci_eth_driver rte_cxgbe_pmd = {
 	.pci_drv = {
 		.id_table = cxgb4_pci_tbl,
 		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
 		.probe = rte_eth_dev_pci_probe,
 		.remove = rte_eth_dev_pci_remove,
 	},
-	.eth_dev_init = eth_cxgbe_dev_init,
-	.dev_private_size = sizeof(struct port_info),
+	.eth_drv = {
+		.eth_dev_init = eth_cxgbe_dev_init,
+		.dev_private_size = sizeof(struct port_info),
+	},
 };
 
 RTE_PMD_REGISTER_PCI(net_cxgbe, rte_cxgbe_pmd.pci_drv);
diff --git a/drivers/net/e1000/em_ethdev.c b/drivers/net/e1000/em_ethdev.c
index 5f6e66dd..5b87d729 100644
--- a/drivers/net/e1000/em_ethdev.c
+++ b/drivers/net/e1000/em_ethdev.c
@@ -389,7 +389,7 @@ eth_em_dev_uninit(struct rte_eth_dev *eth_dev)
 	return 0;
 }
 
-static struct eth_driver rte_em_pmd = {
+static struct rte_pci_eth_driver rte_em_pmd = {
 	.pci_drv = {
 		.id_table = pci_id_em_map,
 		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
@@ -397,9 +397,11 @@ static struct eth_driver rte_em_pmd = {
 		.probe = rte_eth_dev_pci_probe,
 		.remove = rte_eth_dev_pci_remove,
 	},
-	.eth_dev_init = eth_em_dev_init,
-	.eth_dev_uninit = eth_em_dev_uninit,
-	.dev_private_size = sizeof(struct e1000_adapter),
+	.eth_drv = {
+		.eth_dev_init = eth_em_dev_init,
+		.eth_dev_uninit = eth_em_dev_uninit,
+		.dev_private_size = sizeof(struct e1000_adapter),
+	},
 };
 
 static int
diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/e1000/igb_ethdev.c
index 2bb57f54..4a2d3b3f 100644
--- a/drivers/net/e1000/igb_ethdev.c
+++ b/drivers/net/e1000/igb_ethdev.c
@@ -1082,7 +1082,7 @@ eth_igbvf_dev_uninit(struct rte_eth_dev *eth_dev)
 	return 0;
 }
 
-static struct eth_driver rte_igb_pmd = {
+static struct rte_pci_eth_driver rte_igb_pmd = {
 	.pci_drv = {
 		.id_table = pci_id_igb_map,
 		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
@@ -1090,24 +1090,28 @@ static struct eth_driver rte_igb_pmd = {
 		.probe = rte_eth_dev_pci_probe,
 		.remove = rte_eth_dev_pci_remove,
 	},
-	.eth_dev_init = eth_igb_dev_init,
-	.eth_dev_uninit = eth_igb_dev_uninit,
-	.dev_private_size = sizeof(struct e1000_adapter),
+	.eth_drv = {
+		.eth_dev_init = eth_igb_dev_init,
+		.eth_dev_uninit = eth_igb_dev_uninit,
+		.dev_private_size = sizeof(struct e1000_adapter),
+	},
 };
 
 /*
  * virtual function driver struct
  */
-static struct eth_driver rte_igbvf_pmd = {
+static struct rte_pci_eth_driver rte_igbvf_pmd = {
 	.pci_drv = {
 		.id_table = pci_id_igbvf_map,
 		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_DETACHABLE,
 		.probe = rte_eth_dev_pci_probe,
 		.remove = rte_eth_dev_pci_remove,
 	},
-	.eth_dev_init = eth_igbvf_dev_init,
-	.eth_dev_uninit = eth_igbvf_dev_uninit,
-	.dev_private_size = sizeof(struct e1000_adapter),
+	.eth_drv = {
+		.eth_dev_init = eth_igbvf_dev_init,
+		.eth_dev_uninit = eth_igbvf_dev_uninit,
+		.dev_private_size = sizeof(struct e1000_adapter),
+	},
 };
 
 static void
diff --git a/drivers/net/ena/ena_ethdev.c b/drivers/net/ena/ena_ethdev.c
index e99bf299..d6406fa1 100644
--- a/drivers/net/ena/ena_ethdev.c
+++ b/drivers/net/ena/ena_ethdev.c
@@ -1756,15 +1756,17 @@ static uint16_t eth_ena_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
 	return sent_idx;
 }
 
-static struct eth_driver rte_ena_pmd = {
+static struct rte_pci_eth_driver rte_ena_pmd = {
 	.pci_drv = {
 		.id_table = pci_id_ena_map,
 		.drv_flags = RTE_PCI_DRV_NEED_MAPPING,
 		.probe = rte_eth_dev_pci_probe,
 		.remove = rte_eth_dev_pci_remove,
 	},
-	.eth_dev_init = eth_ena_dev_init,
-	.dev_private_size = sizeof(struct ena_adapter),
+	.eth_drv = {
+		.eth_dev_init = eth_ena_dev_init,
+		.dev_private_size = sizeof(struct ena_adapter),
+	},
 };
 
 RTE_PMD_REGISTER_PCI(net_ena, rte_ena_pmd.pci_drv);
diff --git a/drivers/net/enic/enic_ethdev.c b/drivers/net/enic/enic_ethdev.c
index e5ceb98e..b47975d1 100644
--- a/drivers/net/enic/enic_ethdev.c
+++ b/drivers/net/enic/enic_ethdev.c
@@ -633,15 +633,17 @@ static int eth_enicpmd_dev_init(struct rte_eth_dev *eth_dev)
 	return enic_probe(enic);
 }
 
-static struct eth_driver rte_enic_pmd = {
+static struct rte_pci_eth_driver rte_enic_pmd = {
 	.pci_drv = {
 		.id_table = pci_id_enic_map,
 		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
 		.probe = rte_eth_dev_pci_probe,
 		.remove = rte_eth_dev_pci_remove,
 	},
-	.eth_dev_init = eth_enicpmd_dev_init,
-	.dev_private_size = sizeof(struct enic),
+	.eth_drv = {
+		.eth_dev_init = eth_enicpmd_dev_init,
+		.dev_private_size = sizeof(struct enic),
+	},
 };
 
 RTE_PMD_REGISTER_PCI(net_enic, rte_enic_pmd.pci_drv);
diff --git a/drivers/net/fm10k/fm10k_ethdev.c b/drivers/net/fm10k/fm10k_ethdev.c
index d8353e9d..4dea1fd6 100644
--- a/drivers/net/fm10k/fm10k_ethdev.c
+++ b/drivers/net/fm10k/fm10k_ethdev.c
@@ -3077,7 +3077,7 @@ static const struct rte_pci_id pci_id_fm10k_map[] = {
 	{ .vendor_id = 0, /* sentinel */ },
 };
 
-static struct eth_driver rte_pmd_fm10k = {
+static struct rte_pci_eth_driver rte_pmd_fm10k = {
 	.pci_drv = {
 		.id_table = pci_id_fm10k_map,
 		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
@@ -3085,9 +3085,11 @@ static struct eth_driver rte_pmd_fm10k = {
 		.probe = rte_eth_dev_pci_probe,
 		.remove = rte_eth_dev_pci_remove,
 	},
-	.eth_dev_init = eth_fm10k_dev_init,
-	.eth_dev_uninit = eth_fm10k_dev_uninit,
-	.dev_private_size = sizeof(struct fm10k_adapter),
+	.eth_drv = {
+		.eth_dev_init = eth_fm10k_dev_init,
+		.eth_dev_uninit = eth_fm10k_dev_uninit,
+		.dev_private_size = sizeof(struct fm10k_adapter),
+	},
 };
 
 RTE_PMD_REGISTER_PCI(net_fm10k, rte_pmd_fm10k.pci_drv);
diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 0eb4c990..8b4c6079 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -668,7 +668,7 @@ static const struct rte_i40e_xstats_name_off rte_i40e_txq_prio_strings[] = {
 #define I40E_NB_TXQ_PRIO_XSTATS (sizeof(rte_i40e_txq_prio_strings) / \
 		sizeof(rte_i40e_txq_prio_strings[0]))
 
-static struct eth_driver rte_i40e_pmd = {
+static struct rte_pci_eth_driver rte_i40e_pmd = {
 	.pci_drv = {
 		.id_table = pci_id_i40e_map,
 		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
@@ -676,9 +676,11 @@ static struct eth_driver rte_i40e_pmd = {
 		.probe = rte_eth_dev_pci_probe,
 		.remove = rte_eth_dev_pci_remove,
 	},
-	.eth_dev_init = eth_i40e_dev_init,
-	.eth_dev_uninit = eth_i40e_dev_uninit,
-	.dev_private_size = sizeof(struct i40e_adapter),
+	.eth_drv = {
+		.eth_dev_init = eth_i40e_dev_init,
+		.eth_dev_uninit = eth_i40e_dev_uninit,
+		.dev_private_size = sizeof(struct i40e_adapter),
+	},
 };
 
 static inline int
diff --git a/drivers/net/i40e/i40e_ethdev_vf.c b/drivers/net/i40e/i40e_ethdev_vf.c
index 0dc0af52..6dbcc88c 100644
--- a/drivers/net/i40e/i40e_ethdev_vf.c
+++ b/drivers/net/i40e/i40e_ethdev_vf.c
@@ -1526,16 +1526,18 @@ i40evf_dev_uninit(struct rte_eth_dev *eth_dev)
 /*
  * virtual function driver struct
  */
-static struct eth_driver rte_i40evf_pmd = {
+static struct rte_pci_eth_driver rte_i40evf_pmd = {
 	.pci_drv = {
 		.id_table = pci_id_i40evf_map,
 		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_DETACHABLE,
 		.probe = rte_eth_dev_pci_probe,
 		.remove = rte_eth_dev_pci_remove,
 	},
-	.eth_dev_init = i40evf_dev_init,
-	.eth_dev_uninit = i40evf_dev_uninit,
-	.dev_private_size = sizeof(struct i40e_adapter),
+	.eth_drv = {
+		.eth_dev_init = i40evf_dev_init,
+		.eth_dev_uninit = i40evf_dev_uninit,
+		.dev_private_size = sizeof(struct i40e_adapter),
+	},
 };
 
 RTE_PMD_REGISTER_PCI(net_i40e_vf, rte_i40evf_pmd.pci_drv);
diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index 060772d4..6fdf227e 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -1563,7 +1563,7 @@ eth_ixgbevf_dev_uninit(struct rte_eth_dev *eth_dev)
 	return 0;
 }
 
-static struct eth_driver rte_ixgbe_pmd = {
+static struct rte_pci_eth_driver rte_ixgbe_pmd = {
 	.pci_drv = {
 		.id_table = pci_id_ixgbe_map,
 		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
@@ -1571,24 +1571,28 @@ static struct eth_driver rte_ixgbe_pmd = {
 		.probe = rte_eth_dev_pci_probe,
 		.remove = rte_eth_dev_pci_remove,
 	},
-	.eth_dev_init = eth_ixgbe_dev_init,
-	.eth_dev_uninit = eth_ixgbe_dev_uninit,
-	.dev_private_size = sizeof(struct ixgbe_adapter),
+	.eth_drv = {
+		.eth_dev_init = eth_ixgbe_dev_init,
+		.eth_dev_uninit = eth_ixgbe_dev_uninit,
+		.dev_private_size = sizeof(struct ixgbe_adapter),
+	},
 };
 
 /*
  * virtual function driver struct
  */
-static struct eth_driver rte_ixgbevf_pmd = {
+static struct rte_pci_eth_driver rte_ixgbevf_pmd = {
 	.pci_drv = {
 		.id_table = pci_id_ixgbevf_map,
 		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_DETACHABLE,
 		.probe = rte_eth_dev_pci_probe,
 		.remove = rte_eth_dev_pci_remove,
 	},
-	.eth_dev_init = eth_ixgbevf_dev_init,
-	.eth_dev_uninit = eth_ixgbevf_dev_uninit,
-	.dev_private_size = sizeof(struct ixgbe_adapter),
+	.eth_drv = {
+		.eth_dev_init = eth_ixgbevf_dev_init,
+		.eth_dev_uninit = eth_ixgbevf_dev_uninit,
+		.dev_private_size = sizeof(struct ixgbe_adapter),
+	},
 };
 
 static int
diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index eb06f56a..7b184019 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -5524,7 +5524,7 @@ priv_dev_interrupt_handler_install(struct priv *priv, struct rte_eth_dev *dev)
 	}
 }
 
-static struct eth_driver mlx4_driver;
+static struct rte_pci_eth_driver mlx4_driver;
 
 /**
  * DPDK callback to register a PCI device.
@@ -5903,7 +5903,7 @@ static const struct rte_pci_id mlx4_pci_id_map[] = {
 	}
 };
 
-static struct eth_driver mlx4_driver = {
+static struct rte_pci_eth_driver mlx4_driver = {
 	.pci_drv = {
 		.driver = {
 			.name = MLX4_DRIVER_NAME
@@ -5912,7 +5912,9 @@ static struct eth_driver mlx4_driver = {
 		.probe = mlx4_pci_probe,
 		.drv_flags = RTE_PCI_DRV_INTR_LSC,
 	},
-	.dev_private_size = sizeof(struct priv)
+	.eth_drv = {
+		.dev_private_size = sizeof(struct priv),
+	},
 };
 
 /**
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index b97b6d16..efc0430c 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -338,7 +338,7 @@ mlx5_args(struct priv *priv, struct rte_devargs *devargs)
 	return 0;
 }
 
-static struct eth_driver mlx5_driver;
+static struct rte_pci_eth_driver mlx5_driver;
 
 /**
  * DPDK callback to register a PCI device.
@@ -723,7 +723,7 @@ static const struct rte_pci_id mlx5_pci_id_map[] = {
 	}
 };
 
-static struct eth_driver mlx5_driver = {
+static struct rte_pci_eth_driver mlx5_driver = {
 	.pci_drv = {
 		.driver = {
 			.name = MLX5_DRIVER_NAME
@@ -732,7 +732,9 @@ static struct eth_driver mlx5_driver = {
 		.probe = mlx5_pci_probe,
 		.drv_flags = RTE_PCI_DRV_INTR_LSC,
 	},
-	.dev_private_size = sizeof(struct priv)
+	.eth_drv = {
+		.dev_private_size = sizeof(struct priv),
+	},
 };
 
 /**
diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index 970b5c84..f5c6634f 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -2470,7 +2470,7 @@ static struct rte_pci_id pci_id_nfp_net_map[] = {
 	},
 };
 
-static struct eth_driver rte_nfp_net_pmd = {
+static struct rte_pci_eth_driver rte_nfp_net_pmd = {
 	.pci_drv = {
 		.id_table = pci_id_nfp_net_map,
 		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
@@ -2478,8 +2478,10 @@ static struct eth_driver rte_nfp_net_pmd = {
 		.probe = rte_eth_dev_pci_probe,
 		.remove = rte_eth_dev_pci_remove,
 	},
-	.eth_dev_init = nfp_net_init,
-	.dev_private_size = sizeof(struct nfp_net_adapter),
+	.eth_drv = {
+		.eth_dev_init = nfp_net_init,
+		.dev_private_size = sizeof(struct nfp_net_adapter),
+	},
 };
 
 RTE_PMD_REGISTER_PCI(net_nfp, rte_nfp_net_pmd.pci_drv);
diff --git a/drivers/net/qede/qede_ethdev.c b/drivers/net/qede/qede_ethdev.c
index edc5b43b..13d76a6d 100644
--- a/drivers/net/qede/qede_ethdev.c
+++ b/drivers/net/qede/qede_ethdev.c
@@ -1643,30 +1643,32 @@ static struct rte_pci_id pci_id_qede_map[] = {
 	{.vendor_id = 0,}
 };
 
-static struct eth_driver rte_qedevf_pmd = {
+static struct rte_pci_eth_driver rte_qedevf_pmd = {
 	.pci_drv = {
-		    .id_table = pci_id_qedevf_map,
-		    .drv_flags =
-		    RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
-		    .probe = rte_eth_dev_pci_probe,
-		    .remove = rte_eth_dev_pci_remove,
-		   },
-	.eth_dev_init = qedevf_eth_dev_init,
-	.eth_dev_uninit = qedevf_eth_dev_uninit,
-	.dev_private_size = sizeof(struct qede_dev),
+		.id_table = pci_id_qedevf_map,
+		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
+		.probe = rte_eth_dev_pci_probe,
+		.remove = rte_eth_dev_pci_remove,
+	},
+	.eth_drv = {
+		.eth_dev_init = qedevf_eth_dev_init,
+		.eth_dev_uninit = qedevf_eth_dev_uninit,
+		.dev_private_size = sizeof(struct qede_dev),
+	},
 };
 
-static struct eth_driver rte_qede_pmd = {
+static struct rte_pci_eth_driver rte_qede_pmd = {
 	.pci_drv = {
-		    .id_table = pci_id_qede_map,
-		    .drv_flags =
-		    RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
-		    .probe = rte_eth_dev_pci_probe,
-		    .remove = rte_eth_dev_pci_remove,
-		   },
-	.eth_dev_init = qede_eth_dev_init,
-	.eth_dev_uninit = qede_eth_dev_uninit,
-	.dev_private_size = sizeof(struct qede_dev),
+		.id_table = pci_id_qede_map,
+		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
+		.probe = rte_eth_dev_pci_probe,
+		.remove = rte_eth_dev_pci_remove,
+	},
+	.eth_drv = {
+		.eth_dev_init = qede_eth_dev_init,
+		.eth_dev_uninit = qede_eth_dev_uninit,
+		.dev_private_size = sizeof(struct qede_dev),
+	},
 };
 
 RTE_PMD_REGISTER_PCI(net_qede, rte_qede_pmd.pci_drv);
diff --git a/drivers/net/szedata2/rte_eth_szedata2.c b/drivers/net/szedata2/rte_eth_szedata2.c
index fe7a6b3b..b9054671 100644
--- a/drivers/net/szedata2/rte_eth_szedata2.c
+++ b/drivers/net/szedata2/rte_eth_szedata2.c
@@ -1587,15 +1587,17 @@ static const struct rte_pci_id rte_szedata2_pci_id_table[] = {
 	}
 };
 
-static struct eth_driver szedata2_eth_driver = {
+static struct rte_pci_eth_driver szedata2_eth_driver = {
 	.pci_drv = {
 		.id_table = rte_szedata2_pci_id_table,
 		.probe = rte_eth_dev_pci_probe,
 		.remove = rte_eth_dev_pci_remove,
 	},
-	.eth_dev_init     = rte_szedata2_eth_dev_init,
-	.eth_dev_uninit   = rte_szedata2_eth_dev_uninit,
-	.dev_private_size = sizeof(struct pmd_internals),
+	.eth_drv = {
+		.eth_dev_init     = rte_szedata2_eth_dev_init,
+		.eth_dev_uninit   = rte_szedata2_eth_dev_uninit,
+		.dev_private_size = sizeof(struct pmd_internals),
+	},
 };
 
 RTE_PMD_REGISTER_PCI(RTE_SZEDATA2_DRIVER_NAME, szedata2_eth_driver.pci_drv);
diff --git a/drivers/net/thunderx/nicvf_ethdev.c b/drivers/net/thunderx/nicvf_ethdev.c
index 10603197..f13fad90 100644
--- a/drivers/net/thunderx/nicvf_ethdev.c
+++ b/drivers/net/thunderx/nicvf_ethdev.c
@@ -2111,15 +2111,17 @@ static const struct rte_pci_id pci_id_nicvf_map[] = {
 	},
 };
 
-static struct eth_driver rte_nicvf_pmd = {
+static struct rte_pci_eth_driver rte_nicvf_pmd = {
 	.pci_drv = {
 		.id_table = pci_id_nicvf_map,
 		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
 		.probe = rte_eth_dev_pci_probe,
 		.remove = rte_eth_dev_pci_remove,
 	},
-	.eth_dev_init = nicvf_eth_dev_init,
-	.dev_private_size = sizeof(struct nicvf),
+	.eth_drv = {
+		.eth_dev_init = nicvf_eth_dev_init,
+		.dev_private_size = sizeof(struct nicvf),
+	},
 };
 
 RTE_PMD_REGISTER_PCI(net_thunderx, rte_nicvf_pmd.pci_drv);
diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index 54ea7d77..e6f241ad 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -1377,7 +1377,7 @@ eth_virtio_dev_uninit(struct rte_eth_dev *eth_dev)
 	return 0;
 }
 
-static struct eth_driver rte_virtio_pmd = {
+static struct rte_pci_eth_driver rte_virtio_pmd = {
 	.pci_drv = {
 		.driver = {
 			.name = "net_virtio",
@@ -1387,9 +1387,11 @@ static struct eth_driver rte_virtio_pmd = {
 		.probe = rte_eth_dev_pci_probe,
 		.remove = rte_eth_dev_pci_remove,
 	},
-	.eth_dev_init = eth_virtio_dev_init,
-	.eth_dev_uninit = eth_virtio_dev_uninit,
-	.dev_private_size = sizeof(struct virtio_hw),
+	.eth_drv = {
+		.eth_dev_init = eth_virtio_dev_init,
+		.eth_dev_uninit = eth_virtio_dev_uninit,
+		.dev_private_size = sizeof(struct virtio_hw),
+	},
 };
 
 RTE_INIT(rte_virtio_pmd_init);
diff --git a/drivers/net/vmxnet3/vmxnet3_ethdev.c b/drivers/net/vmxnet3/vmxnet3_ethdev.c
index 54533ca5..cb9221e6 100644
--- a/drivers/net/vmxnet3/vmxnet3_ethdev.c
+++ b/drivers/net/vmxnet3/vmxnet3_ethdev.c
@@ -337,16 +337,18 @@ eth_vmxnet3_dev_uninit(struct rte_eth_dev *eth_dev)
 	return 0;
 }
 
-static struct eth_driver rte_vmxnet3_pmd = {
+static struct rte_pci_eth_driver rte_vmxnet3_pmd = {
 	.pci_drv = {
 		.id_table = pci_id_vmxnet3_map,
 		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_DETACHABLE,
 		.probe = rte_eth_dev_pci_probe,
 		.remove = rte_eth_dev_pci_remove,
 	},
-	.eth_dev_init = eth_vmxnet3_dev_init,
-	.eth_dev_uninit = eth_vmxnet3_dev_uninit,
-	.dev_private_size = sizeof(struct vmxnet3_hw),
+	.eth_drv = {
+		.eth_dev_init = eth_vmxnet3_dev_init,
+		.eth_dev_uninit = eth_vmxnet3_dev_uninit,
+		.dev_private_size = sizeof(struct vmxnet3_hw),
+	},
 };
 
 static int
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 9dea1f15..7c212096 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -239,13 +239,14 @@ int
 rte_eth_dev_pci_probe(struct rte_pci_driver *pci_drv,
 		      struct rte_pci_device *pci_dev)
 {
-	struct eth_driver    *eth_drv;
+	const struct rte_pci_eth_driver *pci_eth_drv;
+	const struct eth_driver *eth_drv;
 	struct rte_eth_dev *eth_dev;
 	char ethdev_name[RTE_ETH_NAME_MAX_LEN];
-
 	int diag;
 
-	eth_drv = (struct eth_driver *)pci_drv;
+	pci_eth_drv = container_of(pci_drv, struct rte_pci_eth_driver, pci_drv);
+	eth_drv = &pci_eth_drv->eth_drv;
 
 	rte_eal_pci_device_name(&pci_dev->addr, ethdev_name,
 			sizeof(ethdev_name));
@@ -263,7 +264,7 @@ rte_eth_dev_pci_probe(struct rte_pci_driver *pci_drv,
 	}
 	eth_dev->device = &pci_dev->device;
 	eth_dev->intr_handle = &pci_dev->intr_handle;
-	eth_dev->driver = eth_drv;
+	eth_dev->driver = &pci_eth_drv->eth_drv;
 
 	/* Invoke PMD device initialization function */
 	diag = (*eth_drv->eth_dev_init)(eth_dev);
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index b4168830..1a62a322 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1884,25 +1884,31 @@ typedef int (*eth_dev_uninit_t)(struct rte_eth_dev *eth_dev);
  * @internal
  * The structure associated with a PMD Ethernet driver.
  *
- * Each Ethernet driver acts as a PCI driver and is represented by a generic
+ * Each Ethernet driver is represented by a generic
  * *eth_driver* structure that holds:
  *
- * - An *rte_pci_driver* structure (which must be the first field).
+ * - The *eth_dev_init* function invoked for each matching device.
  *
- * - The *eth_dev_init* function invoked for each matching PCI device.
- *
- * - The *eth_dev_uninit* function invoked for each matching PCI device.
+ * - The *eth_dev_uninit* function invoked for each matching device.
  *
  * - The size of the private data to allocate for each matching device.
  */
 struct eth_driver {
-	struct rte_pci_driver pci_drv;    /**< The PMD is also a PCI driver. */
 	eth_dev_init_t eth_dev_init;      /**< Device init function. */
 	eth_dev_uninit_t eth_dev_uninit;  /**< Device uninit function. */
 	unsigned int dev_private_size;    /**< Size of device private data. */
 };
 
 /**
+ * @internal
+ * The structure associated with a PMD PCI Ethernet driver.
+ */
+struct rte_pci_eth_driver {
+	struct rte_pci_driver	pci_drv;	/**< Underlying PCI driver. */
+	struct eth_driver	eth_drv;	/**< Ethernet driver. */
+};
+
+/**
  * Convert a numerical speed in Mbps to a bitmap flag that can be used in
  * the bitmap link_speeds of the struct rte_eth_conf
  *
-- 
2.11.0
- * Re: [dpdk-dev] [PATCH 7/8] ethdev: break ethernet driver and pci_driver connection
  2017-01-07 18:17 ` [dpdk-dev] [PATCH 7/8] ethdev: break ethernet driver and pci_driver connection Stephen Hemminger
@ 2017-01-10 13:59   ` Ferruh Yigit
  2017-01-10 17:58     ` Ferruh Yigit
  2017-01-10 16:11   ` [dpdk-dev] [PATCH 7/8] ethdev: break ethernet driver and pci_driver connection Jan Blunck
  1 sibling, 1 reply; 30+ messages in thread
From: Ferruh Yigit @ 2017-01-10 13:59 UTC (permalink / raw)
  To: Stephen Hemminger, dev; +Cc: Stephen Hemminger, Shreyansh Jain
On 1/7/2017 6:17 PM, Stephen Hemminger wrote:
> There are multiple buses and device types now. Therefore it no longer
> makes sense that PCI driver information is part of the Ethernet driver
> structure.
> 
> This patch removes pci_driver from eth_driver and introduces a
> new combined structure for use in all existing PMD's. The rationale
> is that although all existing PCI drivers are Ethernet drivers,
> it makes sense that future projects may want to support PCI devices
> that are not Ethernet.
> 
> It also removes the requirement that driver is first element in
> PCI driver structure.
> 
> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
> ---
<...>
>  /**
> + * @internal
> + * The structure associated with a PMD PCI Ethernet driver.
> + */
> +struct rte_pci_eth_driver {
> +	struct rte_pci_driver	pci_drv;	/**< Underlying PCI driver. */
> +	struct eth_driver	eth_drv;	/**< Ethernet driver. */
> +};
So do we need to add an rte_vdev_eth_driver struct for virtual drivers, or
an rte_pci_cryptodev_driver struct for PCI crypto devices?
Can this be done in a more generic way? After Shreyansh's patches, there
will be rte_device and rte_driver abstractions; could they be useful here?
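For reference, the combined struct in the quoted patch relies on container_of to get from the PCI driver back to the Ethernet driver. Below is a minimal sketch of that lookup, not code from the patch: the demo_ types and the demo_container_of macro are hypothetical stand-ins for the real DPDK definitions, and the PCI part is deliberately placed second to show that field order no longer matters. Every further bus/function pair would need its own wrapper of this shape.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for rte_pci_driver and eth_driver. */
struct demo_pci_driver {
	unsigned int drv_flags;
};

struct demo_eth_driver {
	unsigned int dev_private_size;
};

/* Same idea as rte_pci_eth_driver, but with the PCI driver NOT first. */
struct demo_pci_eth_driver {
	struct demo_eth_driver eth_drv;
	struct demo_pci_driver pci_drv;
};

/* Recover the enclosing struct from a pointer to one of its members. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* What a PCI probe callback would do with the pci_drv it is handed. */
static const struct demo_eth_driver *
demo_eth_from_pci(struct demo_pci_driver *pci_drv)
{
	struct demo_pci_eth_driver *wrapper =
		demo_container_of(pci_drv, struct demo_pci_eth_driver, pci_drv);

	return &wrapper->eth_drv;
}
```

A plain cast from the PCI driver pointer to the wrapper type would only be valid if pci_drv were the first member, which is the requirement the patch removes.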
<...>
- * Re: [dpdk-dev] [PATCH 7/8] ethdev: break ethernet driver and pci_driver connection
  2017-01-10 13:59   ` Ferruh Yigit
@ 2017-01-10 17:58     ` Ferruh Yigit
  2017-01-10 18:02       ` [dpdk-dev] [PATCH 1/2] add rte_bus->probe Ferruh Yigit
  0 siblings, 1 reply; 30+ messages in thread
From: Ferruh Yigit @ 2017-01-10 17:58 UTC (permalink / raw)
  To: Stephen Hemminger, dev; +Cc: Stephen Hemminger, Shreyansh Jain
On 1/10/2017 1:59 PM, Ferruh Yigit wrote:
> On 1/7/2017 6:17 PM, Stephen Hemminger wrote:
>> There are multiple buses and device types now. Therefore it no longer
>> makes sense that PCI driver information is part of the Ethernet driver
>> structure.
>>
>> This patch removes pci_driver from eth_driver and introduces a
>> new combined structure for use in all existing PMD's. The rationale
>> is that although all existing PCI drivers are Ethernet drivers,
>> it makes sense that future projects may want to support PCI devices
>> that are not Ethernet.
>>
>> It also removes the requirement that driver is first element in
>> PCI driver structure.
>>
>> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
>> ---
> 
> <...>
> 
>>  /**
>> + * @internal
>> + * The structure associated with a PMD PCI Ethernet driver.
>> + */
>> +struct rte_pci_eth_driver {
>> +	struct rte_pci_driver	pci_drv;	/**< Underlying PCI driver. */
>> +	struct eth_driver	eth_drv;	/**< Ethernet driver. */
>> +};
> 
> So do we need to add an rte_vdev_eth_driver struct for virtual drivers, or
> an rte_pci_cryptodev_driver struct for PCI crypto devices?
> 
> Can this be done in a more generic way? After Shreyansh's patches, there
> will be rte_device and rte_driver abstractions; could they be useful here?
What do you think about separating the bus (PCI) and functionality
(eth/crypto) driver structs, to make them less coupled? This would make it
easy to combine bus/function pairs.
I will send a patch as a reply to this mail. It is not a complete patch,
just enough to give the idea. It is based on Shreyansh's patchset.
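The idea can be sketched roughly as follows. This is an illustration only: the demo_ types are hypothetical stand-ins, and the layout (a generic driver embedded in the Ethernet driver, plus a func_drv pointer in the PCI driver) is an assumption drawn from the initializers in the patch that follows, not the actual DPDK definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the generic rte_driver. */
struct demo_driver {
	const char *name;
};

/* Functionality driver: knows nothing about the bus it sits on. */
struct demo_eth_driver {
	struct demo_driver driver;
	unsigned int dev_private_size;
};

/* Bus driver: only points at whichever function driver it is paired with. */
struct demo_pci_driver {
	struct demo_driver driver;
	unsigned int drv_flags;
	struct demo_driver *func_drv;
};

#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* What an ethdev PCI probe could do to find its Ethernet driver. */
static struct demo_eth_driver *
demo_eth_of(const struct demo_pci_driver *pci_drv)
{
	return demo_container_of(pci_drv->func_drv,
				 struct demo_eth_driver, driver);
}

/* A PMD then declares two independent objects and links them. */
static struct demo_eth_driver demo_eth_drv = {
	.driver = { .name = "demo_eth" },
	.dev_private_size = 128,
};

static struct demo_pci_driver demo_pci_drv = {
	.driver = { .name = "demo_pci" },
	.func_drv = &demo_eth_drv.driver,
};
```

With this shape no per-pair wrapper struct is needed: a crypto function driver could be linked to the same kind of PCI driver object through func_drv.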
> 
> <...>
> 
- * [dpdk-dev] [PATCH 1/2] add rte_bus->probe
  2017-01-10 17:58     ` Ferruh Yigit
@ 2017-01-10 18:02       ` Ferruh Yigit
  2017-01-10 18:02         ` [dpdk-dev] [PATCH 2/2] separate bus and functionality driver structs Ferruh Yigit
  2017-01-11  4:53         ` [dpdk-dev] [PATCH 1/2] add rte_bus->probe Shreyansh Jain
  0 siblings, 2 replies; 30+ messages in thread
From: Ferruh Yigit @ 2017-01-10 18:02 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Shreyansh Jain, Jan Blunck, Ferruh Yigit
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
---
 lib/librte_eal/common/eal_common_bus.c  | 7 ++++---
 lib/librte_eal/common/include/rte_bus.h | 3 +++
 lib/librte_eal/linuxapp/eal/eal_pci.c   | 1 +
 3 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/lib/librte_eal/common/eal_common_bus.c b/lib/librte_eal/common/eal_common_bus.c
index f8c2e03..e8d1143 100644
--- a/lib/librte_eal/common/eal_common_bus.c
+++ b/lib/librte_eal/common/eal_common_bus.c
@@ -145,6 +145,7 @@ rte_eal_bus_register(struct rte_bus *bus)
 	/* A bus should mandatorily have the scan and match implemented */
 	RTE_VERIFY(bus->scan);
 	RTE_VERIFY(bus->match);
+	RTE_VERIFY(bus->probe);
 
 	/* Initialize the driver and device list associated with the bus */
 	TAILQ_INIT(&(bus->driver_list));
@@ -195,19 +196,19 @@ rte_eal_bus_scan(void)
 }
 
 static int
-perform_probe(struct rte_bus *bus __rte_unused, struct rte_driver *driver,
+perform_probe(struct rte_bus *bus, struct rte_driver *driver,
 	      struct rte_device *device)
 {
 	int ret;
 
-	if (!driver->probe) {
+	if (!bus->probe) {
 		RTE_LOG(ERR, EAL, "Driver (%s) doesn't support probe.\n",
 			driver->name);
 		/* This is not an error - just a badly implemented PMD */
 		return 0;
 	}
 
-	ret = driver->probe(driver, device);
+	ret = bus->probe(driver, device);
 	if (ret < 0)
 		/* One of the probes failed */
 		RTE_LOG(ERR, EAL, "Probe failed for (%s).\n", driver->name);
diff --git a/lib/librte_eal/common/include/rte_bus.h b/lib/librte_eal/common/include/rte_bus.h
index 07c30c4..ce1f56a 100644
--- a/lib/librte_eal/common/include/rte_bus.h
+++ b/lib/librte_eal/common/include/rte_bus.h
@@ -135,6 +135,8 @@ typedef int (*bus_scan_t)(struct rte_bus *bus);
  */
 typedef int (*bus_match_t)(struct rte_driver *drv, struct rte_device *dev);
 
+typedef int (*bus_probe_t)(struct rte_driver *drv, struct rte_device *dev);
+
 /**
  * A structure describing a generic bus.
  */
@@ -147,6 +149,7 @@ struct rte_bus {
 	const char *name;            /**< Name of the bus */
 	bus_scan_t scan;            /**< Scan for devices attached to bus */
 	bus_match_t match;
 	/**< Match device with drivers associated with the bus */
+	bus_probe_t probe;          /**< Probe a device with a matched driver */
 };
 
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c b/lib/librte_eal/linuxapp/eal/eal_pci.c
index 314effa..837adf6 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -726,6 +726,7 @@ rte_eal_pci_ioport_unmap(struct rte_pci_ioport *p)
 struct rte_bus pci_bus = {
 	.scan = rte_eal_pci_scan,
 	.match = rte_eal_pci_match,
+	.probe = rte_eal_pci_probe,
 };
 
 RTE_REGISTER_BUS(pci, pci_bus);
-- 
2.9.3
- * [dpdk-dev] [PATCH 2/2] separate bus and functionality driver structs
  2017-01-10 18:02       ` [dpdk-dev] [PATCH 1/2] add rte_bus->probe Ferruh Yigit
@ 2017-01-10 18:02         ` Ferruh Yigit
  2017-01-11  4:53         ` [dpdk-dev] [PATCH 1/2] add rte_bus->probe Shreyansh Jain
  1 sibling, 0 replies; 30+ messages in thread
From: Ferruh Yigit @ 2017-01-10 18:02 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Shreyansh Jain, Jan Blunck, Ferruh Yigit
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
---
 drivers/net/bnx2x/bnx2x_ethdev.c        | 44 ++++++++++++++---------------
 drivers/net/cxgbe/cxgbe_ethdev.c        | 22 +++++++--------
 drivers/net/e1000/em_ethdev.c           | 26 ++++++++---------
 drivers/net/e1000/igb_ethdev.c          | 50 ++++++++++++++++-----------------
 drivers/net/ena/ena_ethdev.c            | 22 +++++++--------
 drivers/net/enic/enic_ethdev.c          | 22 +++++++--------
 drivers/net/fm10k/fm10k_ethdev.c        | 26 ++++++++---------
 drivers/net/i40e/i40e_ethdev.c          | 26 ++++++++---------
 drivers/net/i40e/i40e_ethdev_vf.c       | 24 ++++++++--------
 drivers/net/ixgbe/ixgbe_ethdev.c        | 50 ++++++++++++++++-----------------
 lib/librte_eal/common/eal_common_pci.c  |  4 +--
 lib/librte_eal/common/include/rte_pci.h |  3 +-
 lib/librte_ether/rte_ethdev.c           | 26 +++++++++++++----
 lib/librte_ether/rte_ethdev.h           |  7 +++--
 14 files changed, 183 insertions(+), 169 deletions(-)
diff --git a/drivers/net/bnx2x/bnx2x_ethdev.c b/drivers/net/bnx2x/bnx2x_ethdev.c
index 2dbd782..bb1937e 100644
--- a/drivers/net/bnx2x/bnx2x_ethdev.c
+++ b/drivers/net/bnx2x/bnx2x_ethdev.c
@@ -618,42 +618,42 @@ eth_bnx2xvf_dev_init(struct rte_eth_dev *eth_dev)
 	return bnx2x_common_dev_init(eth_dev, 1);
 }
 
-static struct eth_driver rte_bnx2x_pmd = {
-	.pci_drv = {
-		.driver = {
-			.probe = rte_eal_pci_probe,
-			.remove = rte_eal_pci_remove,
-		},
-		.id_table = pci_id_bnx2x_map,
-		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
+static struct eth_driver rte_bnx2x_pmd_eth_drv = {
+	.eth_dev_init = eth_bnx2x_dev_init,
+	.dev_private_size = sizeof(struct bnx2x_softc),
+};
+
+static struct rte_pci_driver rte_bnx2x_pmd_pci_drv = {
+	.driver = {
 		.probe = rte_eth_dev_pci_probe,
 		.remove = rte_eth_dev_pci_remove,
 	},
-	.eth_dev_init = eth_bnx2x_dev_init,
-	.dev_private_size = sizeof(struct bnx2x_softc),
+	.id_table = pci_id_bnx2x_map,
+	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
+	.func_drv = &rte_bnx2x_pmd_eth_drv.driver,
 };
 
 /*
  * virtual function driver struct
  */
-static struct eth_driver rte_bnx2xvf_pmd = {
-	.pci_drv = {
-		.driver = {
-			.probe = rte_eal_pci_probe,
-			.remove = rte_eal_pci_remove,
-		},
-		.id_table = pci_id_bnx2xvf_map,
-		.drv_flags = RTE_PCI_DRV_NEED_MAPPING,
+static struct eth_driver rte_bnx2xvf_pmd_eth_drv = {
+	.eth_dev_init = eth_bnx2xvf_dev_init,
+	.dev_private_size = sizeof(struct bnx2x_softc),
+};
+
+static struct rte_pci_driver rte_bnx2xvf_pmd_pci_drv = {
+	.driver = {
 		.probe = rte_eth_dev_pci_probe,
 		.remove = rte_eth_dev_pci_remove,
 	},
-	.eth_dev_init = eth_bnx2xvf_dev_init,
-	.dev_private_size = sizeof(struct bnx2x_softc),
+	.id_table = pci_id_bnx2xvf_map,
+	.drv_flags = RTE_PCI_DRV_NEED_MAPPING,
+	.func_drv = &rte_bnx2xvf_pmd_eth_drv.driver,
 };
 
-RTE_PMD_REGISTER_PCI(net_bnx2x, rte_bnx2x_pmd.pci_drv);
+RTE_PMD_REGISTER_PCI(net_bnx2x, rte_bnx2x_pmd_pci_drv);
 RTE_PMD_REGISTER_PCI_TABLE(net_bnx2x, pci_id_bnx2x_map);
 RTE_PMD_REGISTER_KMOD_DEP(net_bnx2x, "* igb_uio | uio_pci_generic | vfio");
-RTE_PMD_REGISTER_PCI(net_bnx2xvf, rte_bnx2xvf_pmd.pci_drv);
+RTE_PMD_REGISTER_PCI(net_bnx2xvf, rte_bnx2xvf_pmd_pci_drv);
 RTE_PMD_REGISTER_PCI_TABLE(net_bnx2xvf, pci_id_bnx2xvf_map);
 RTE_PMD_REGISTER_KMOD_DEP(net_bnx2xvf, "* igb_uio | vfio");
diff --git a/drivers/net/cxgbe/cxgbe_ethdev.c b/drivers/net/cxgbe/cxgbe_ethdev.c
index 7718d02..df9a324 100644
--- a/drivers/net/cxgbe/cxgbe_ethdev.c
+++ b/drivers/net/cxgbe/cxgbe_ethdev.c
@@ -1039,21 +1039,21 @@ static int eth_cxgbe_dev_init(struct rte_eth_dev *eth_dev)
 	return err;
 }
 
-static struct eth_driver rte_cxgbe_pmd = {
-	.pci_drv = {
-		.driver = {
-			.probe = rte_eal_pci_probe,
-			.remove = rte_eal_pci_remove,
-		},
-		.id_table = cxgb4_pci_tbl,
-		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
+static struct eth_driver rte_cxgbe_pmd_eth_drv = {
+	.eth_dev_init = eth_cxgbe_dev_init,
+	.dev_private_size = sizeof(struct port_info),
+};
+
+static struct rte_pci_driver rte_cxgbe_pmd_pci_drv = {
+	.driver = {
 		.probe = rte_eth_dev_pci_probe,
 		.remove = rte_eth_dev_pci_remove,
 	},
-	.eth_dev_init = eth_cxgbe_dev_init,
-	.dev_private_size = sizeof(struct port_info),
+	.id_table = cxgb4_pci_tbl,
+	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
+	.func_drv = &rte_cxgbe_pmd_eth_drv.driver,
 };
 
-RTE_PMD_REGISTER_PCI(net_cxgbe, rte_cxgbe_pmd.pci_drv);
+RTE_PMD_REGISTER_PCI(net_cxgbe, rte_cxgbe_pmd_pci_drv);
 RTE_PMD_REGISTER_PCI_TABLE(net_cxgbe, cxgb4_pci_tbl);
 RTE_PMD_REGISTER_KMOD_DEP(net_cxgbe, "* igb_uio | uio_pci_generic | vfio");
diff --git a/drivers/net/e1000/em_ethdev.c b/drivers/net/e1000/em_ethdev.c
index 8758aaa..fa5f650 100644
--- a/drivers/net/e1000/em_ethdev.c
+++ b/drivers/net/e1000/em_ethdev.c
@@ -415,23 +415,23 @@ eth_em_dev_uninit(struct rte_eth_dev *eth_dev)
 	return 0;
 }
 
-static struct eth_driver rte_em_pmd = {
-	.pci_drv = {
-		.driver = {
-			.probe = rte_eal_pci_probe,
-			.remove = rte_eal_pci_remove,
-		},
-		.id_table = pci_id_em_map,
-		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
-			RTE_PCI_DRV_DETACHABLE,
-		.probe = rte_eth_dev_pci_probe,
-		.remove = rte_eth_dev_pci_remove,
-	},
+static struct eth_driver rte_em_pmd_eth_drv = {
 	.eth_dev_init = eth_em_dev_init,
 	.eth_dev_uninit = eth_em_dev_uninit,
 	.dev_private_size = sizeof(struct e1000_adapter),
 };
 
+static struct rte_pci_driver rte_em_pmd_pci_drv = {
+	.driver = {
+		.probe = rte_eth_dev_pci_probe,
+		.remove = rte_eth_dev_pci_remove,
+	},
+	.id_table = pci_id_em_map,
+	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
+		RTE_PCI_DRV_DETACHABLE,
+	.func_drv = &rte_em_pmd_eth_drv.driver,
+};
+
 static int
 em_hw_init(struct e1000_hw *hw)
 {
@@ -1851,6 +1851,6 @@ eth_em_set_mc_addr_list(struct rte_eth_dev *dev,
 	return 0;
 }
 
-RTE_PMD_REGISTER_PCI(net_e1000_em, rte_em_pmd.pci_drv);
+RTE_PMD_REGISTER_PCI(net_e1000_em, rte_em_pmd_pci_drv);
 RTE_PMD_REGISTER_PCI_TABLE(net_e1000_em, pci_id_em_map);
 RTE_PMD_REGISTER_KMOD_DEP(net_e1000_em, "* igb_uio | uio_pci_generic | vfio");
diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/e1000/igb_ethdev.c
index 76d73cd..9563c46 100644
--- a/drivers/net/e1000/igb_ethdev.c
+++ b/drivers/net/e1000/igb_ethdev.c
@@ -1082,42 +1082,42 @@ eth_igbvf_dev_uninit(struct rte_eth_dev *eth_dev)
 	return 0;
 }
 
-static struct eth_driver rte_igb_pmd = {
-	.pci_drv = {
-		.driver = {
-			.probe = rte_eal_pci_probe,
-			.remove = rte_eal_pci_remove,
-		},
-		.id_table = pci_id_igb_map,
-		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
-			RTE_PCI_DRV_DETACHABLE,
-		.probe = rte_eth_dev_pci_probe,
-		.remove = rte_eth_dev_pci_remove,
-	},
+static struct eth_driver rte_igb_pmd_eth_drv = {
 	.eth_dev_init = eth_igb_dev_init,
 	.eth_dev_uninit = eth_igb_dev_uninit,
 	.dev_private_size = sizeof(struct e1000_adapter),
 };
 
-/*
- * virtual function driver struct
- */
-static struct eth_driver rte_igbvf_pmd = {
-	.pci_drv = {
-		.driver = {
-			.probe = rte_eal_pci_probe,
-			.remove = rte_eal_pci_remove,
-		},
-		.id_table = pci_id_igbvf_map,
-		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_DETACHABLE,
+static struct rte_pci_driver rte_igb_pmd_pci_drv = {
+	.driver = {
 		.probe = rte_eth_dev_pci_probe,
 		.remove = rte_eth_dev_pci_remove,
 	},
+	.id_table = pci_id_igb_map,
+	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
+		RTE_PCI_DRV_DETACHABLE,
+	.func_drv = &rte_igb_pmd_eth_drv.driver,
+};
+
+/*
+ * virtual function driver struct
+ */
+static struct eth_driver rte_igbvf_pmd_eth_drv = {
 	.eth_dev_init = eth_igbvf_dev_init,
 	.eth_dev_uninit = eth_igbvf_dev_uninit,
 	.dev_private_size = sizeof(struct e1000_adapter),
 };
 
+static struct rte_pci_driver rte_igbvf_pmd_pci_drv = {
+	.driver = {
+		.probe = rte_eth_dev_pci_probe,
+		.remove = rte_eth_dev_pci_remove,
+	},
+	.id_table = pci_id_igbvf_map,
+	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_DETACHABLE,
+	.func_drv = &rte_igbvf_pmd_eth_drv.driver,
+};
+
 static void
 igb_vmdq_vlan_hw_filter_enable(struct rte_eth_dev *dev)
 {
@@ -5261,9 +5261,9 @@ eth_igb_configure_msix_intr(struct rte_eth_dev *dev)
 	E1000_WRITE_FLUSH(hw);
 }
 
-RTE_PMD_REGISTER_PCI(net_e1000_igb, rte_igb_pmd.pci_drv);
+RTE_PMD_REGISTER_PCI(net_e1000_igb, rte_igb_pmd_pci_drv);
 RTE_PMD_REGISTER_PCI_TABLE(net_e1000_igb, pci_id_igb_map);
 RTE_PMD_REGISTER_KMOD_DEP(net_e1000_igb, "* igb_uio | uio_pci_generic | vfio");
-RTE_PMD_REGISTER_PCI(net_e1000_igb_vf, rte_igbvf_pmd.pci_drv);
+RTE_PMD_REGISTER_PCI(net_e1000_igb_vf, rte_igbvf_pmd_pci_drv);
 RTE_PMD_REGISTER_PCI_TABLE(net_e1000_igb_vf, pci_id_igbvf_map);
 RTE_PMD_REGISTER_KMOD_DEP(net_e1000_igb_vf, "* igb_uio | vfio");
diff --git a/drivers/net/ena/ena_ethdev.c b/drivers/net/ena/ena_ethdev.c
index ecdd015..ad5b0a9 100644
--- a/drivers/net/ena/ena_ethdev.c
+++ b/drivers/net/ena/ena_ethdev.c
@@ -1756,21 +1756,21 @@ static uint16_t eth_ena_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
 	return sent_idx;
 }
 
-static struct eth_driver rte_ena_pmd = {
-	.pci_drv = {
-		.driver = {
-			.probe = rte_eal_pci_probe,
-			.remove = rte_eal_pci_remove,
-		},
-		.id_table = pci_id_ena_map,
-		.drv_flags = RTE_PCI_DRV_NEED_MAPPING,
+static struct eth_driver rte_ena_pmd_eth_drv = {
+	.eth_dev_init = eth_ena_dev_init,
+	.dev_private_size = sizeof(struct ena_adapter),
+};
+
+static struct rte_pci_driver rte_ena_pmd_pci_drv = {
+	.driver = {
 		.probe = rte_eth_dev_pci_probe,
 		.remove = rte_eth_dev_pci_remove,
 	},
-	.eth_dev_init = eth_ena_dev_init,
-	.dev_private_size = sizeof(struct ena_adapter),
+	.id_table = pci_id_ena_map,
+	.drv_flags = RTE_PCI_DRV_NEED_MAPPING,
+	.func_drv = &rte_ena_pmd_eth_drv.driver,
 };
 
-RTE_PMD_REGISTER_PCI(net_ena, rte_ena_pmd.pci_drv);
+RTE_PMD_REGISTER_PCI(net_ena, rte_ena_pmd_pci_drv);
 RTE_PMD_REGISTER_PCI_TABLE(net_ena, pci_id_ena_map);
 RTE_PMD_REGISTER_KMOD_DEP(net_ena, "* igb_uio | uio_pci_generic | vfio");
diff --git a/drivers/net/enic/enic_ethdev.c b/drivers/net/enic/enic_ethdev.c
index 00cf67b..0cb6400 100644
--- a/drivers/net/enic/enic_ethdev.c
+++ b/drivers/net/enic/enic_ethdev.c
@@ -634,21 +634,21 @@ static int eth_enicpmd_dev_init(struct rte_eth_dev *eth_dev)
 	return enic_probe(enic);
 }
 
-static struct eth_driver rte_enic_pmd = {
-	.pci_drv = {
-		.driver = {
-			.probe = rte_eal_pci_probe,
-			.remove = rte_eal_pci_remove,
-		},
-		.id_table = pci_id_enic_map,
-		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
+static struct eth_driver rte_enic_pmd_eth_drv = {
+	.eth_dev_init = eth_enicpmd_dev_init,
+	.dev_private_size = sizeof(struct enic),
+};
+
+static struct rte_pci_driver rte_enic_pmd_pci_drv = {
+	.driver = {
 		.probe = rte_eth_dev_pci_probe,
 		.remove = rte_eth_dev_pci_remove,
 	},
-	.eth_dev_init = eth_enicpmd_dev_init,
-	.dev_private_size = sizeof(struct enic),
+	.id_table = pci_id_enic_map,
+	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
+	.func_drv = &rte_enic_pmd_eth_drv.driver,
 };
 
-RTE_PMD_REGISTER_PCI(net_enic, rte_enic_pmd.pci_drv);
+RTE_PMD_REGISTER_PCI(net_enic, rte_enic_pmd_pci_drv);
 RTE_PMD_REGISTER_PCI_TABLE(net_enic, pci_id_enic_map);
 RTE_PMD_REGISTER_KMOD_DEP(net_enic, "* igb_uio | uio_pci_generic | vfio");
diff --git a/drivers/net/fm10k/fm10k_ethdev.c b/drivers/net/fm10k/fm10k_ethdev.c
index 9760fb7..4c84484 100644
--- a/drivers/net/fm10k/fm10k_ethdev.c
+++ b/drivers/net/fm10k/fm10k_ethdev.c
@@ -3077,23 +3077,23 @@ static const struct rte_pci_id pci_id_fm10k_map[] = {
 	{ .vendor_id = 0, /* sentinel */ },
 };
 
-static struct eth_driver rte_pmd_fm10k = {
-	.pci_drv = {
-		.driver = {
-			.probe = rte_eal_pci_probe,
-			.remove = rte_eal_pci_remove,
-		},
-		.id_table = pci_id_fm10k_map,
-		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
-			RTE_PCI_DRV_DETACHABLE,
-		.probe = rte_eth_dev_pci_probe,
-		.remove = rte_eth_dev_pci_remove,
-	},
+static struct eth_driver rte_pmd_fm10k_eth_drv = {
 	.eth_dev_init = eth_fm10k_dev_init,
 	.eth_dev_uninit = eth_fm10k_dev_uninit,
 	.dev_private_size = sizeof(struct fm10k_adapter),
 };
 
-RTE_PMD_REGISTER_PCI(net_fm10k, rte_pmd_fm10k.pci_drv);
+static struct rte_pci_driver rte_pmd_fm10k_pci_drv = {
+	.driver = {
+		.probe = rte_eth_dev_pci_probe,
+		.remove = rte_eth_dev_pci_remove,
+	},
+	.id_table = pci_id_fm10k_map,
+	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
+		RTE_PCI_DRV_DETACHABLE,
+	.func_drv = &rte_pmd_fm10k_eth_drv.driver,
+};
+
+RTE_PMD_REGISTER_PCI(net_fm10k, rte_pmd_fm10k_pci_drv);
 RTE_PMD_REGISTER_PCI_TABLE(net_fm10k, pci_id_fm10k_map);
 RTE_PMD_REGISTER_KMOD_DEP(net_fm10k, "* igb_uio | uio_pci_generic | vfio");
diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 24683a9..9443c51 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -623,23 +623,23 @@ static const struct rte_i40e_xstats_name_off rte_i40e_txq_prio_strings[] = {
 #define I40E_NB_TXQ_PRIO_XSTATS (sizeof(rte_i40e_txq_prio_strings) / \
 		sizeof(rte_i40e_txq_prio_strings[0]))
 
-static struct eth_driver rte_i40e_pmd = {
-	.pci_drv = {
-		.driver = {
-			.probe = rte_eal_pci_probe,
-			.remove = rte_eal_pci_remove,
-		},
-		.id_table = pci_id_i40e_map,
-		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
-			RTE_PCI_DRV_DETACHABLE,
-		.probe = rte_eth_dev_pci_probe,
-		.remove = rte_eth_dev_pci_remove,
-	},
+static struct eth_driver rte_i40e_pmd_eth_drv = {
 	.eth_dev_init = eth_i40e_dev_init,
 	.eth_dev_uninit = eth_i40e_dev_uninit,
 	.dev_private_size = sizeof(struct i40e_adapter),
 };
 
+static struct rte_pci_driver rte_i40e_pmd_pci_drv = {
+	.driver = {
+		.probe = rte_eth_dev_pci_probe,
+		.remove = rte_eth_dev_pci_remove,
+	},
+	.id_table = pci_id_i40e_map,
+	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
+		RTE_PCI_DRV_DETACHABLE,
+	.func_drv = &rte_i40e_pmd_eth_drv.driver,
+};
+
 static inline int
 rte_i40e_dev_atomic_read_link_status(struct rte_eth_dev *dev,
 				     struct rte_eth_link *link)
@@ -668,7 +668,7 @@ rte_i40e_dev_atomic_write_link_status(struct rte_eth_dev *dev,
 	return 0;
 }
 
-RTE_PMD_REGISTER_PCI(net_i40e, rte_i40e_pmd.pci_drv);
+RTE_PMD_REGISTER_PCI(net_i40e, rte_i40e_pmd_pci_drv);
 RTE_PMD_REGISTER_PCI_TABLE(net_i40e, pci_id_i40e_map);
 RTE_PMD_REGISTER_KMOD_DEP(net_i40e, "* igb_uio | uio_pci_generic | vfio");
 
diff --git a/drivers/net/i40e/i40e_ethdev_vf.c b/drivers/net/i40e/i40e_ethdev_vf.c
index 7b97ed3..c548955 100644
--- a/drivers/net/i40e/i40e_ethdev_vf.c
+++ b/drivers/net/i40e/i40e_ethdev_vf.c
@@ -1533,23 +1533,23 @@ i40evf_dev_uninit(struct rte_eth_dev *eth_dev)
 /*
  * virtual function driver struct
  */
-static struct eth_driver rte_i40evf_pmd = {
-	.pci_drv = {
-		.driver = {
-			.probe = rte_eal_pci_probe,
-			.remove = rte_eal_pci_remove,
-		},
-		.id_table = pci_id_i40evf_map,
-		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_DETACHABLE,
-		.probe = rte_eth_dev_pci_probe,
-		.remove = rte_eth_dev_pci_remove,
-	},
+static struct eth_driver rte_i40evf_pmd_eth_drv = {
 	.eth_dev_init = i40evf_dev_init,
 	.eth_dev_uninit = i40evf_dev_uninit,
 	.dev_private_size = sizeof(struct i40e_adapter),
 };
 
-RTE_PMD_REGISTER_PCI(net_i40e_vf, rte_i40evf_pmd.pci_drv);
+static struct rte_pci_driver rte_i40evf_pmd_pci_drv = {
+	.driver = {
+		.probe = rte_eth_dev_pci_probe,
+		.remove = rte_eth_dev_pci_remove,
+	},
+	.id_table = pci_id_i40evf_map,
+	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_DETACHABLE,
+	.func_drv = &rte_i40evf_pmd_eth_drv.driver,
+};
+
+RTE_PMD_REGISTER_PCI(net_i40e_vf, rte_i40evf_pmd_pci_drv);
 RTE_PMD_REGISTER_PCI_TABLE(net_i40e_vf, pci_id_i40evf_map);
 RTE_PMD_REGISTER_KMOD_DEP(net_i40e_vf, "* igb_uio | vfio");
 
diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index b17ed1a..520b2af 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -1550,42 +1550,42 @@ eth_ixgbevf_dev_uninit(struct rte_eth_dev *eth_dev)
 	return 0;
 }
 
-static struct eth_driver rte_ixgbe_pmd = {
-	.pci_drv = {
-		.driver = {
-			.probe = rte_eal_pci_probe,
-			.remove = rte_eal_pci_remove,
-		},
-		.id_table = pci_id_ixgbe_map,
-		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
-			RTE_PCI_DRV_DETACHABLE,
-		.probe = rte_eth_dev_pci_probe,
-		.remove = rte_eth_dev_pci_remove,
-	},
+static struct eth_driver rte_ixgbe_eth_drv = {
 	.eth_dev_init = eth_ixgbe_dev_init,
 	.eth_dev_uninit = eth_ixgbe_dev_uninit,
 	.dev_private_size = sizeof(struct ixgbe_adapter),
 };
 
-/*
- * virtual function driver struct
- */
-static struct eth_driver rte_ixgbevf_pmd = {
-	.pci_drv = {
-		.driver = {
-			.probe = rte_eal_pci_probe,
-			.remove = rte_eal_pci_remove,
-		},
-		.id_table = pci_id_ixgbevf_map,
-		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_DETACHABLE,
+static struct rte_pci_driver rte_ixgbe_pci_drv = {
+	.driver = {
 		.probe = rte_eth_dev_pci_probe,
 		.remove = rte_eth_dev_pci_remove,
 	},
+	.id_table = pci_id_ixgbe_map,
+	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
+		RTE_PCI_DRV_DETACHABLE,
+	.func_drv = &rte_ixgbe_eth_drv.driver,
+};
+
+/*
+ * virtual function driver struct
+ */
+static struct eth_driver rte_ixgbevf_eth_drv = {
 	.eth_dev_init = eth_ixgbevf_dev_init,
 	.eth_dev_uninit = eth_ixgbevf_dev_uninit,
 	.dev_private_size = sizeof(struct ixgbe_adapter),
 };
 
+static struct rte_pci_driver rte_ixgbevf_pci_drv = {
+	.driver = {
+		.probe = rte_eth_dev_pci_probe,
+		.remove = rte_eth_dev_pci_remove,
+	},
+	.id_table = pci_id_ixgbevf_map,
+	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_DETACHABLE,
+	.func_drv = &rte_ixgbevf_eth_drv.driver,
+};
+
 static int
 ixgbe_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on)
 {
@@ -7695,9 +7695,9 @@ ixgbevf_dev_interrupt_handler(__rte_unused struct rte_intr_handle *handle,
 	ixgbevf_dev_interrupt_action(dev);
 }
 
-RTE_PMD_REGISTER_PCI(net_ixgbe, rte_ixgbe_pmd.pci_drv);
+RTE_PMD_REGISTER_PCI(net_ixgbe, rte_ixgbe_pci_drv);
 RTE_PMD_REGISTER_PCI_TABLE(net_ixgbe, pci_id_ixgbe_map);
 RTE_PMD_REGISTER_KMOD_DEP(net_ixgbe, "* igb_uio | uio_pci_generic | vfio");
-RTE_PMD_REGISTER_PCI(net_ixgbe_vf, rte_ixgbevf_pmd.pci_drv);
+RTE_PMD_REGISTER_PCI(net_ixgbe_vf, rte_ixgbevf_pci_drv);
 RTE_PMD_REGISTER_PCI_TABLE(net_ixgbe_vf, pci_id_ixgbevf_map);
 RTE_PMD_REGISTER_KMOD_DEP(net_ixgbe_vf, "* igb_uio | vfio");
diff --git a/lib/librte_eal/common/eal_common_pci.c b/lib/librte_eal/common/eal_common_pci.c
index 2d5a399..ea2f598 100644
--- a/lib/librte_eal/common/eal_common_pci.c
+++ b/lib/librte_eal/common/eal_common_pci.c
@@ -220,7 +220,7 @@ rte_eal_pci_probe_one_driver(struct rte_pci_driver *dr,
 	dev->driver = dr;
 
 	/* call the driver probe() function */
-	ret = dr->probe(dr, dev);
+	ret = dr->driver.probe(&dr->driver, &dev->device);
 	if (ret) {
 		RTE_LOG(DEBUG, EAL, "Driver (%s) probe failed.\n",
 			dr->driver.name);
@@ -252,7 +252,7 @@ rte_eal_pci_detach_dev(struct rte_pci_driver *dr,
 	RTE_LOG(DEBUG, EAL, "  remove driver: %x:%x %s\n", dev->id.vendor_id,
 			dev->id.device_id, dr->driver.name);
 
-	if (dr->remove && (dr->remove(dev) < 0))
+	if (dr->driver.remove && (dr->driver.remove(&dev->device) < 0))
 		return -1;	/* negative value is an error */
 
 	/* clear driver structure */
diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index 1647672..949ed3e 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -200,8 +200,7 @@ typedef int (pci_remove_t)(struct rte_pci_device *);
 struct rte_pci_driver {
 	TAILQ_ENTRY(rte_pci_driver) next;       /**< Next in list. */
 	struct rte_driver driver;               /**< Inherit core driver. */
-	pci_probe_t *probe;                     /**< Device Probe function. */
-	pci_remove_t *remove;                   /**< Device Remove function. */
+	struct rte_driver *func_drv;
 	const struct rte_pci_id *id_table;	/**< ID table, NULL terminated. */
 	uint32_t drv_flags;                     /**< Flags contolling handling of device. */
 };
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 917557a..3369864 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -236,16 +236,23 @@ rte_eth_dev_release_port(struct rte_eth_dev *eth_dev)
 }
 
 int
-rte_eth_dev_pci_probe(struct rte_pci_driver *pci_drv,
-		      struct rte_pci_device *pci_dev)
+rte_eth_dev_pci_probe(struct rte_driver *drv,
+		      struct rte_device *dev)
 {
-	struct eth_driver    *eth_drv;
+	struct rte_pci_driver *pci_drv;
+	struct rte_pci_device *pci_dev;
+	struct eth_driver *eth_drv;
 	struct rte_eth_dev *eth_dev;
 	char ethdev_name[RTE_ETH_NAME_MAX_LEN];
+	struct rte_driver *func_drv;
 
 	int diag;
 
-	eth_drv = (struct eth_driver *)pci_drv;
+	pci_drv = container_of(drv, struct rte_pci_driver, driver);
+	pci_dev = container_of(dev, struct rte_pci_device, device);
+
+	func_drv = pci_drv->func_drv;
+	eth_drv = container_of(func_drv, struct eth_driver, driver);
 
 	rte_eal_pci_device_name(&pci_dev->addr, ethdev_name,
 			sizeof(ethdev_name));
@@ -281,13 +288,19 @@ rte_eth_dev_pci_probe(struct rte_pci_driver *pci_drv,
 }
 
 int
-rte_eth_dev_pci_remove(struct rte_pci_device *pci_dev)
+rte_eth_dev_pci_remove(struct rte_device *dev)
 {
+	struct rte_pci_device *pci_dev;
+	struct rte_pci_driver *pci_drv;
 	const struct eth_driver *eth_drv;
 	struct rte_eth_dev *eth_dev;
 	char ethdev_name[RTE_ETH_NAME_MAX_LEN];
+	struct rte_driver *func_drv;
 	int ret;
 
+	pci_dev = container_of(dev, struct rte_pci_device, device);
+	pci_drv = pci_dev->driver;
+
 	if (pci_dev == NULL)
 		return -EINVAL;
 
@@ -298,7 +311,8 @@ rte_eth_dev_pci_remove(struct rte_pci_device *pci_dev)
 	if (eth_dev == NULL)
 		return -ENODEV;
 
-	eth_drv = (const struct eth_driver *)pci_dev->driver;
+	func_drv = pci_drv->func_drv;
+	eth_drv = container_of(func_drv, struct eth_driver, driver);
 
 	/* Invoke PMD device uninit function */
 	if (*eth_drv->eth_dev_uninit) {
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index ded43d7..203210d 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1862,6 +1862,7 @@ struct eth_driver {
 	eth_dev_init_t eth_dev_init;      /**< Device init function. */
 	eth_dev_uninit_t eth_dev_uninit;  /**< Device uninit function. */
 	unsigned int dev_private_size;    /**< Size of device private data. */
+	struct rte_driver driver;
 };
 
 /**
@@ -4382,15 +4383,15 @@ rte_eth_dev_get_name_by_port(uint8_t port_id, char *name);
  * Wrapper for use by pci drivers as a .probe function to attach to a ethdev
  * interface.
  */
-int rte_eth_dev_pci_probe(struct rte_pci_driver *pci_drv,
-			  struct rte_pci_device *pci_dev);
+int rte_eth_dev_pci_probe(struct rte_driver *drv,
+			  struct rte_device *dev);
 
 /**
  * @internal
  * Wrapper for use by pci drivers as a .remove function to detach a ethdev
  * interface.
  */
-int rte_eth_dev_pci_remove(struct rte_pci_device *pci_dev);
+int rte_eth_dev_pci_remove(struct rte_device *dev);
 
 #ifdef __cplusplus
 }
-- 
2.9.3
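
The probe path in this patch reaches the ethdev driver in two hops: from the generic rte_driver to the enclosing rte_pci_driver, then through func_drv to the enclosing eth_driver. A minimal, self-contained sketch of that pointer walk follows. The struct layouts are trimmed stand-ins rather than the real DPDK definitions, the container_of macro is a simplified local version, and the two helper functions are hypothetical names used only for illustration.

```c
#include <stddef.h>

/* Simplified local container_of: recover the enclosing struct
 * from a pointer to one of its members. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Trimmed stand-ins for the DPDK types (assumption: only the
 * fields needed for the illustration are kept). */
struct rte_driver { const char *name; };
struct rte_device { int numa_node; };

struct rte_pci_driver {
	unsigned int drv_flags;
	struct rte_driver driver;    /* generic part, deliberately not first */
	struct rte_driver *func_drv; /* link to the functional driver */
};

struct rte_pci_device {
	int addr;
	struct rte_device device;
};

struct eth_driver {
	unsigned int dev_private_size;
	struct rte_driver driver;
};

/* Hypothetical helper: generic driver -> PCI driver -> eth driver,
 * the same two steps the patched rte_eth_dev_pci_probe() performs. */
static struct eth_driver *
eth_drv_from_generic(struct rte_driver *drv)
{
	struct rte_pci_driver *pci_drv =
		container_of(drv, struct rte_pci_driver, driver);

	return container_of(pci_drv->func_drv, struct eth_driver, driver);
}

/* Hypothetical helper: generic device -> PCI device. */
static struct rte_pci_device *
pci_dev_from_generic(struct rte_device *dev)
{
	return container_of(dev, struct rte_pci_device, device);
}
```

Because the lookup is offset-based, it does not matter where the embedded rte_driver sits inside each structure, which is what lets the old pointer casts go away.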
- * Re: [dpdk-dev] [PATCH 1/2] add rte_bus->probe
  2017-01-10 18:02       ` [dpdk-dev] [PATCH 1/2] add rte_bus->probe Ferruh Yigit
  2017-01-10 18:02         ` [dpdk-dev] [PATCH 2/2] separate bus and functionality driver structs Ferruh Yigit
@ 2017-01-11  4:53         ` Shreyansh Jain
  2017-01-11 15:03           ` Ferruh Yigit
  1 sibling, 1 reply; 30+ messages in thread
From: Shreyansh Jain @ 2017-01-11  4:53 UTC (permalink / raw)
  To: Ferruh Yigit, dev; +Cc: Stephen Hemminger, Jan Blunck
On Tuesday 10 January 2017 11:32 PM, Ferruh Yigit wrote:
> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
> ---
>  lib/librte_eal/common/eal_common_bus.c  | 7 ++++---
>  lib/librte_eal/common/include/rte_bus.h | 3 +++
>  lib/librte_eal/linuxapp/eal/eal_pci.c   | 1 +
>  3 files changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/lib/librte_eal/common/eal_common_bus.c b/lib/librte_eal/common/eal_common_bus.c
> index f8c2e03..e8d1143 100644
> --- a/lib/librte_eal/common/eal_common_bus.c
> +++ b/lib/librte_eal/common/eal_common_bus.c
> @@ -145,6 +145,7 @@ rte_eal_bus_register(struct rte_bus *bus)
>  	/* A bus should mandatorily have the scan and match implemented */
>  	RTE_VERIFY(bus->scan);
>  	RTE_VERIFY(bus->match);
> +	RTE_VERIFY(bus->probe);
v6 of my patches would include the above.
>
>  	/* Initialize the driver and device list associated with the bus */
>  	TAILQ_INIT(&(bus->driver_list));
> @@ -195,19 +196,19 @@ rte_eal_bus_scan(void)
>  }
>
>  static int
> -perform_probe(struct rte_bus *bus __rte_unused, struct rte_driver *driver,
> +perform_probe(struct rte_bus *bus, struct rte_driver *driver,
>  	      struct rte_device *device)
>  {
>  	int ret;
>
> -	if (!driver->probe) {
> +	if (!bus->probe) {
>  		RTE_LOG(ERR, EAL, "Driver (%s) doesn't support probe.\n",
>  			driver->name);
>  		/* This is not an error - just a badly implemented PMD */
>  		return 0;
>  	}
>
> -	ret = driver->probe(driver, device);
> +	ret = bus->probe(driver, device);
>  	if (ret < 0)
>  		/* One of the probes failed */
>  		RTE_LOG(ERR, EAL, "Probe failed for (%s).\n", driver->name);
A substantial amount of code has been reshuffled in v6, including removal of this function.
But again, I agree with your changes.
> diff --git a/lib/librte_eal/common/include/rte_bus.h b/lib/librte_eal/common/include/rte_bus.h
> index 07c30c4..ce1f56a 100644
> --- a/lib/librte_eal/common/include/rte_bus.h
> +++ b/lib/librte_eal/common/include/rte_bus.h
> @@ -135,6 +135,8 @@ typedef int (*bus_scan_t)(struct rte_bus *bus);
>   */
>  typedef int (*bus_match_t)(struct rte_driver *drv, struct rte_device *dev);
>
> +typedef int (*bus_probe_t)(struct rte_driver *drv, struct rte_device *dev);
> +
>  /**
>   * A structure describing a generic bus.
>   */
> @@ -147,6 +149,7 @@ struct rte_bus {
>  	const char *name;            /**< Name of the bus */
>  	bus_scan_t scan;            /**< Scan for devices attached to bus */
>  	bus_match_t match;
> +	bus_probe_t probe;
>  	/**< Match device with drivers associated with the bus */
>  };
>
> diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c b/lib/librte_eal/linuxapp/eal/eal_pci.c
> index 314effa..837adf6 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_pci.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
> @@ -726,6 +726,7 @@ rte_eal_pci_ioport_unmap(struct rte_pci_ioport *p)
>  struct rte_bus pci_bus = {
>  	.scan = rte_eal_pci_scan,
>  	.match = rte_eal_pci_match,
> +	.probe = rte_eal_pci_probe,
>  };
>
>  RTE_REGISTER_BUS(pci, pci_bus);
>
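
The change discussed above moves probe dispatch from the driver to the bus. A minimal sketch of that idea, under stated assumptions: the types are cut-down stand-ins for the DPDK ones, bus_register_check() is a hypothetical substitute for the RTE_VERIFY() calls (it returns an error instead of panicking), and the toy callbacks exist only to make the sketch self-contained.

```c
#include <stddef.h>

/* Cut-down stand-ins for the DPDK types (assumption). */
struct rte_driver { const char *name; };
struct rte_device { const char *name; };

typedef int (*bus_match_t)(struct rte_driver *drv, struct rte_device *dev);
typedef int (*bus_probe_t)(struct rte_driver *drv, struct rte_device *dev);

struct rte_bus {
	const char *name;
	bus_match_t match;
	bus_probe_t probe;
};

/* Hypothetical registration check: reject a bus that lacks a
 * mandatory callback. The patch uses RTE_VERIFY() instead. */
static int
bus_register_check(const struct rte_bus *bus)
{
	if (bus == NULL || bus->match == NULL || bus->probe == NULL)
		return -1;
	return 0;
}

/* Dispatch through the bus rather than the driver, mirroring the
 * patched perform_probe(). A missing callback is not treated as
 * an error, as in the quoted code. */
static int
perform_probe(struct rte_bus *bus, struct rte_driver *driver,
	      struct rte_device *device)
{
	if (bus->probe == NULL)
		return 0;
	return bus->probe(driver, device);
}

/* Toy callbacks, for illustration only. */
static int probe_calls;

static int
toy_match(struct rte_driver *drv, struct rte_device *dev)
{
	(void)drv;
	(void)dev;
	return 0;
}

static int
toy_probe(struct rte_driver *drv, struct rte_device *dev)
{
	(void)drv;
	(void)dev;
	probe_calls++;
	return 0;
}
```

With this shape the driver no longer needs its own probe pointer for the generic path; the bus decides how a matched driver/device pair is probed.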
- * Re: [dpdk-dev] [PATCH 1/2] add rte_bus->probe
  2017-01-11  4:53         ` [dpdk-dev] [PATCH 1/2] add rte_bus->probe Shreyansh Jain
@ 2017-01-11 15:03           ` Ferruh Yigit
  2017-01-12  5:28             ` Shreyansh Jain
  0 siblings, 1 reply; 30+ messages in thread
From: Ferruh Yigit @ 2017-01-11 15:03 UTC (permalink / raw)
  To: Shreyansh Jain, dev; +Cc: Stephen Hemminger, Jan Blunck
On 1/11/2017 4:53 AM, Shreyansh Jain wrote:
> On Tuesday 10 January 2017 11:32 PM, Ferruh Yigit wrote:
>> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
>> ---
>>  lib/librte_eal/common/eal_common_bus.c  | 7 ++++---
>>  lib/librte_eal/common/include/rte_bus.h | 3 +++
>>  lib/librte_eal/linuxapp/eal/eal_pci.c   | 1 +
>>  3 files changed, 8 insertions(+), 3 deletions(-)
>>
>> diff --git a/lib/librte_eal/common/eal_common_bus.c b/lib/librte_eal/common/eal_common_bus.c
>> index f8c2e03..e8d1143 100644
>> --- a/lib/librte_eal/common/eal_common_bus.c
>> +++ b/lib/librte_eal/common/eal_common_bus.c
>> @@ -145,6 +145,7 @@ rte_eal_bus_register(struct rte_bus *bus)
>>  	/* A bus should mandatorily have the scan and match implemented */
>>  	RTE_VERIFY(bus->scan);
>>  	RTE_VERIFY(bus->match);
>> +	RTE_VERIFY(bus->probe);
> 
> v6 of my patches would include the above.
Since I am aware that you are working on something similar, I added this
(in a dirty way) just to be able to test the next patch.
Thanks,
ferruh
<...>
- * Re: [dpdk-dev] [PATCH 1/2] add rte_bus->probe
  2017-01-11 15:03           ` Ferruh Yigit
@ 2017-01-12  5:28             ` Shreyansh Jain
  0 siblings, 0 replies; 30+ messages in thread
From: Shreyansh Jain @ 2017-01-12  5:28 UTC (permalink / raw)
  To: Ferruh Yigit; +Cc: dev
> -----Original Message-----
> From: Ferruh Yigit [mailto:ferruh.yigit@intel.com]
> Sent: Wednesday, January 11, 2017 8:34 PM
> To: Shreyansh Jain <shreyansh.jain@nxp.com>; dev@dpdk.org
> Cc: Stephen Hemminger <sthemmin@microsoft.com>; Jan Blunck
> <jblunck@infradead.org>
> Subject: Re: [PATCH 1/2] add rte_bus->probe
> 
> On 1/11/2017 4:53 AM, Shreyansh Jain wrote:
> > On Tuesday 10 January 2017 11:32 PM, Ferruh Yigit wrote:
> >> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
> >> ---
> >>  lib/librte_eal/common/eal_common_bus.c  | 7 ++++---
> >>  lib/librte_eal/common/include/rte_bus.h | 3 +++
> >>  lib/librte_eal/linuxapp/eal/eal_pci.c   | 1 +
> >>  3 files changed, 8 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/lib/librte_eal/common/eal_common_bus.c b/lib/librte_eal/common/eal_common_bus.c
> >> index f8c2e03..e8d1143 100644
> >> --- a/lib/librte_eal/common/eal_common_bus.c
> >> +++ b/lib/librte_eal/common/eal_common_bus.c
> >> @@ -145,6 +145,7 @@ rte_eal_bus_register(struct rte_bus *bus)
> >>  	/* A bus should mandatorily have the scan and match implemented */
> >>  	RTE_VERIFY(bus->scan);
> >>  	RTE_VERIFY(bus->match);
> >> +	RTE_VERIFY(bus->probe);
> >
> > v6 of my patches would include the above.
> 
> Since I am aware that you are working on something similar, I added this
> (in a dirty way) just to be able to test the next patch.
:) I understood this after sending my mail. I understand your point.
> 
> Thanks,
> ferruh
> 
> <...>
- * Re: [dpdk-dev] [PATCH 7/8] ethdev: break ethernet driver and pci_driver connection
  2017-01-07 18:17 ` [dpdk-dev] [PATCH 7/8] ethdev: break ethernet driver and pci_driver connection Stephen Hemminger
  2017-01-10 13:59   ` Ferruh Yigit
@ 2017-01-10 16:11   ` Jan Blunck
  2017-01-10 18:03     ` Stephen Hemminger
  1 sibling, 1 reply; 30+ messages in thread
From: Jan Blunck @ 2017-01-10 16:11 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev, Stephen Hemminger
On Sat, Jan 7, 2017 at 7:17 PM, Stephen Hemminger
<stephen@networkplumber.org> wrote:
> There are multiple buses and device types now. Therefore it no longer
> makes sense that PCI driver information is part of the Ethernet driver
> structure.
The Ethernet driver itself doesn't offer a lot of value from an
abstraction point of view. It's questionable whether there will ever be an
Ethernet driver that is able to operate on different types of
low-level devices. The virtual devices are able to operate
without an Ethernet driver structure anyway. Most of that functionality
should be moved either into the bus abstraction or into the low-level
device probe function.
>
> This patch removes pci_driver from eth_driver and introduces a
> new combined structure for use in all existing PMDs. The rationale
> is that although all existing PCI drivers are Ethernet drivers,
> it makes sense that future projects may want to support PCI devices
> that are not Ethernet.
>
> It also removes the requirement that the driver be the first element in
> the PCI driver structure.
>
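
To illustrate the combined structure described in the commit message: a minimal sketch, assuming the layout implied by the pci_drv and eth_drv member names used in the diff below. The member order, the trimmed field lists, the local container_of macro, and the helper eth_drv_from_pci_drv() are assumptions made for illustration, not the definitions from rte_ethdev.h.

```c
#include <stddef.h>

/* Simplified local container_of (assumption, not the DPDK macro). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Trimmed stand-ins for the two driver structures. */
struct rte_pci_driver {
	const char *name;
	unsigned int drv_flags;
};

struct eth_driver {
	unsigned int dev_private_size;
};

/* Combined structure: both drivers embedded side by side.
 * The member order here is an assumption. */
struct rte_pci_eth_driver {
	struct rte_pci_driver pci_drv;
	struct eth_driver eth_drv;
};

/* Hypothetical helper: from the PCI driver handed to a probe
 * callback, reach the sibling eth_driver without a pointer cast. */
static struct eth_driver *
eth_drv_from_pci_drv(struct rte_pci_driver *pci_drv)
{
	struct rte_pci_eth_driver *combined =
		container_of(pci_drv, struct rte_pci_eth_driver, pci_drv);

	return &combined->eth_drv;
}
```

Since the lookup goes through the member offset, it keeps working if the members are reordered, which matches the stated goal of dropping the first-element requirement.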
> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
> ---
>  app/test/virtual_pmd.c                  | 22 ++++++++---------
>  drivers/net/bnx2x/bnx2x_ethdev.c        | 16 ++++++++-----
>  drivers/net/bnxt/bnxt_ethdev.c          | 22 +++++++++--------
>  drivers/net/cxgbe/cxgbe_ethdev.c        |  8 ++++---
>  drivers/net/e1000/em_ethdev.c           | 10 ++++----
>  drivers/net/e1000/igb_ethdev.c          | 20 +++++++++-------
>  drivers/net/ena/ena_ethdev.c            |  8 ++++---
>  drivers/net/enic/enic_ethdev.c          |  8 ++++---
>  drivers/net/fm10k/fm10k_ethdev.c        | 10 ++++----
>  drivers/net/i40e/i40e_ethdev.c          | 10 ++++----
>  drivers/net/i40e/i40e_ethdev_vf.c       | 10 ++++----
>  drivers/net/ixgbe/ixgbe_ethdev.c        | 20 +++++++++-------
>  drivers/net/mlx4/mlx4.c                 |  8 ++++---
>  drivers/net/mlx5/mlx5.c                 |  8 ++++---
>  drivers/net/nfp/nfp_net.c               |  8 ++++---
>  drivers/net/qede/qede_ethdev.c          | 42 +++++++++++++++++----------------
>  drivers/net/szedata2/rte_eth_szedata2.c | 10 ++++----
>  drivers/net/thunderx/nicvf_ethdev.c     |  8 ++++---
>  drivers/net/virtio/virtio_ethdev.c      | 10 ++++----
>  drivers/net/vmxnet3/vmxnet3_ethdev.c    | 10 ++++----
>  lib/librte_ether/rte_ethdev.c           |  9 +++----
>  lib/librte_ether/rte_ethdev.h           | 18 +++++++++-----
>  22 files changed, 172 insertions(+), 123 deletions(-)
>
> diff --git a/app/test/virtual_pmd.c b/app/test/virtual_pmd.c
> index 6e4dcd8f..e7f56527 100644
> --- a/app/test/virtual_pmd.c
> +++ b/app/test/virtual_pmd.c
> @@ -533,7 +533,7 @@ virtual_ethdev_create(const char *name, struct ether_addr *mac_addr,
>         struct rte_pci_device *pci_dev = NULL;
>         struct rte_eth_dev *eth_dev = NULL;
>         struct eth_driver *eth_drv = NULL;
> -       struct rte_pci_driver *pci_drv = NULL;
> +       struct rte_pci_eth_driver *pci_eth_drv = NULL;
>         struct rte_pci_id *id_table = NULL;
>         struct virtual_ethdev_private *dev_private = NULL;
>         char name_buf[RTE_RING_NAMESIZE];
> @@ -554,8 +554,8 @@ virtual_ethdev_create(const char *name, struct ether_addr *mac_addr,
>         if (eth_drv == NULL)
>                 goto err;
>
> -       pci_drv = rte_zmalloc_socket(name, sizeof(*pci_drv), 0, socket_id);
> -       if (pci_drv == NULL)
> +       pci_eth_drv = rte_zmalloc_socket(name, sizeof(*pci_eth_drv), 0, socket_id);
> +       if (pci_eth_drv == NULL)
>                 goto err;
>
>         id_table = rte_zmalloc_socket(name, sizeof(*id_table), 0, socket_id);
> @@ -585,17 +585,15 @@ virtual_ethdev_create(const char *name, struct ether_addr *mac_addr,
>                 goto err;
>
>         pci_dev->device.numa_node = socket_id;
> -       pci_drv->driver.name = virtual_ethdev_driver_name;
> -       pci_drv->id_table = id_table;
> +       pci_eth_drv->pci_drv.driver.name = virtual_ethdev_driver_name;
> +       pci_eth_drv->pci_drv.id_table = id_table;
>
>         if (isr_support)
> -               pci_drv->drv_flags |= RTE_PCI_DRV_INTR_LSC;
> +               pci_eth_drv->pci_drv.drv_flags |= RTE_PCI_DRV_INTR_LSC;
>         else
> -               pci_drv->drv_flags &= ~RTE_PCI_DRV_INTR_LSC;
> +               pci_eth_drv->pci_drv.drv_flags &= ~RTE_PCI_DRV_INTR_LSC;
>
> -
> -       eth_drv->pci_drv = (struct rte_pci_driver)(*pci_drv);
> -       eth_dev->driver = eth_drv;
> +       eth_dev->driver = &pci_eth_drv->eth_drv;
>
>         eth_dev->data->nb_rx_queues = (uint16_t)1;
>         eth_dev->data->nb_tx_queues = (uint16_t)1;
> @@ -622,7 +620,7 @@ virtual_ethdev_create(const char *name, struct ether_addr *mac_addr,
>         dev_private->dev_ops = virtual_ethdev_default_dev_ops;
>         eth_dev->dev_ops = &dev_private->dev_ops;
>
> -       pci_dev->device.driver = &eth_drv->pci_drv.driver;
> +       pci_dev->device.driver = &pci_eth_drv->pci_drv.driver;
>         eth_dev->device = &pci_dev->device;
>
>         eth_dev->rx_pkt_burst = virtual_ethdev_rx_burst_success;
> @@ -632,7 +630,7 @@ virtual_ethdev_create(const char *name, struct ether_addr *mac_addr,
>
>  err:
>         rte_free(pci_dev);
> -       rte_free(pci_drv);
> +       rte_free(pci_eth_drv);
>         rte_free(eth_drv);
>         rte_free(id_table);
>         rte_free(dev_private);
> diff --git a/drivers/net/bnx2x/bnx2x_ethdev.c b/drivers/net/bnx2x/bnx2x_ethdev.c
> index 7140118f..ef704d72 100644
> --- a/drivers/net/bnx2x/bnx2x_ethdev.c
> +++ b/drivers/net/bnx2x/bnx2x_ethdev.c
> @@ -618,29 +618,33 @@ eth_bnx2xvf_dev_init(struct rte_eth_dev *eth_dev)
>         return bnx2x_common_dev_init(eth_dev, 1);
>  }
>
> -static struct eth_driver rte_bnx2x_pmd = {
> +static struct rte_pci_eth_driver rte_bnx2x_pmd = {
>         .pci_drv = {
>                 .id_table = pci_id_bnx2x_map,
>                 .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
>                 .probe = rte_eth_dev_pci_probe,
>                 .remove = rte_eth_dev_pci_remove,
>         },
> -       .eth_dev_init = eth_bnx2x_dev_init,
> -       .dev_private_size = sizeof(struct bnx2x_softc),
> +       .eth_drv = {
> +               .eth_dev_init = eth_bnx2x_dev_init,
> +               .dev_private_size = sizeof(struct bnx2x_softc),
> +       },
>  };
>
>  /*
>   * virtual function driver struct
>   */
> -static struct eth_driver rte_bnx2xvf_pmd = {
> +static struct rte_pci_eth_driver rte_bnx2xvf_pmd = {
>         .pci_drv = {
>                 .id_table = pci_id_bnx2xvf_map,
>                 .drv_flags = RTE_PCI_DRV_NEED_MAPPING,
>                 .probe = rte_eth_dev_pci_probe,
>                 .remove = rte_eth_dev_pci_remove,
>         },
> -       .eth_dev_init = eth_bnx2xvf_dev_init,
> -       .dev_private_size = sizeof(struct bnx2x_softc),
> +       .eth_drv = {
> +               .eth_dev_init = eth_bnx2xvf_dev_init,
> +               .dev_private_size = sizeof(struct bnx2x_softc),
> +       },
>  };
>
>  RTE_PMD_REGISTER_PCI(net_bnx2x, rte_bnx2x_pmd.pci_drv);
> diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
> index 7518b6b7..9017825b 100644
> --- a/drivers/net/bnxt/bnxt_ethdev.c
> +++ b/drivers/net/bnxt/bnxt_ethdev.c
> @@ -1164,17 +1164,19 @@ bnxt_dev_uninit(struct rte_eth_dev *eth_dev) {
>         return rc;
>  }
>
> -static struct eth_driver bnxt_rte_pmd = {
> +static struct rte_pci_eth_driver bnxt_rte_pmd = {
>         .pci_drv = {
> -                   .id_table = bnxt_pci_id_map,
> -                   .drv_flags = RTE_PCI_DRV_NEED_MAPPING |
> -                           RTE_PCI_DRV_DETACHABLE | RTE_PCI_DRV_INTR_LSC,
> -                   .probe = rte_eth_dev_pci_probe,
> -                   .remove = rte_eth_dev_pci_remove
> -                   },
> -       .eth_dev_init = bnxt_dev_init,
> -       .eth_dev_uninit = bnxt_dev_uninit,
> -       .dev_private_size = sizeof(struct bnxt),
> +               .id_table = bnxt_pci_id_map,
> +               .drv_flags = RTE_PCI_DRV_NEED_MAPPING |
> +                            RTE_PCI_DRV_DETACHABLE | RTE_PCI_DRV_INTR_LSC,
> +               .probe = rte_eth_dev_pci_probe,
> +               .remove = rte_eth_dev_pci_remove
> +       },
> +       .eth_drv = {
> +               .eth_dev_init = bnxt_dev_init,
> +               .eth_dev_uninit = bnxt_dev_uninit,
> +               .dev_private_size = sizeof(struct bnxt),
> +       },
>  };
>
>  RTE_PMD_REGISTER_PCI(net_bnxt, bnxt_rte_pmd.pci_drv);
> diff --git a/drivers/net/cxgbe/cxgbe_ethdev.c b/drivers/net/cxgbe/cxgbe_ethdev.c
> index 64345e37..ccf93904 100644
> --- a/drivers/net/cxgbe/cxgbe_ethdev.c
> +++ b/drivers/net/cxgbe/cxgbe_ethdev.c
> @@ -1039,15 +1039,17 @@ static int eth_cxgbe_dev_init(struct rte_eth_dev *eth_dev)
>         return err;
>  }
>
> -static struct eth_driver rte_cxgbe_pmd = {
> +static struct rte_pci_eth_driver rte_cxgbe_pmd = {
>         .pci_drv = {
>                 .id_table = cxgb4_pci_tbl,
>                 .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
>                 .probe = rte_eth_dev_pci_probe,
>                 .remove = rte_eth_dev_pci_remove,
>         },
> -       .eth_dev_init = eth_cxgbe_dev_init,
> -       .dev_private_size = sizeof(struct port_info),
> +       .eth_drv = {
> +               .eth_dev_init = eth_cxgbe_dev_init,
> +               .dev_private_size = sizeof(struct port_info),
> +       },
>  };
>
>  RTE_PMD_REGISTER_PCI(net_cxgbe, rte_cxgbe_pmd.pci_drv);
> diff --git a/drivers/net/e1000/em_ethdev.c b/drivers/net/e1000/em_ethdev.c
> index 5f6e66dd..5b87d729 100644
> --- a/drivers/net/e1000/em_ethdev.c
> +++ b/drivers/net/e1000/em_ethdev.c
> @@ -389,7 +389,7 @@ eth_em_dev_uninit(struct rte_eth_dev *eth_dev)
>         return 0;
>  }
>
> -static struct eth_driver rte_em_pmd = {
> +static struct rte_pci_eth_driver rte_em_pmd = {
>         .pci_drv = {
>                 .id_table = pci_id_em_map,
>                 .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
> @@ -397,9 +397,11 @@ static struct eth_driver rte_em_pmd = {
>                 .probe = rte_eth_dev_pci_probe,
>                 .remove = rte_eth_dev_pci_remove,
>         },
> -       .eth_dev_init = eth_em_dev_init,
> -       .eth_dev_uninit = eth_em_dev_uninit,
> -       .dev_private_size = sizeof(struct e1000_adapter),
> +       .eth_drv = {
> +               .eth_dev_init = eth_em_dev_init,
> +               .eth_dev_uninit = eth_em_dev_uninit,
> +               .dev_private_size = sizeof(struct e1000_adapter),
> +       },
>  };
>
>  static int
> diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/e1000/igb_ethdev.c
> index 2bb57f54..4a2d3b3f 100644
> --- a/drivers/net/e1000/igb_ethdev.c
> +++ b/drivers/net/e1000/igb_ethdev.c
> @@ -1082,7 +1082,7 @@ eth_igbvf_dev_uninit(struct rte_eth_dev *eth_dev)
>         return 0;
>  }
>
> -static struct eth_driver rte_igb_pmd = {
> +static struct rte_pci_eth_driver rte_igb_pmd = {
>         .pci_drv = {
>                 .id_table = pci_id_igb_map,
>                 .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
> @@ -1090,24 +1090,28 @@ static struct eth_driver rte_igb_pmd = {
>                 .probe = rte_eth_dev_pci_probe,
>                 .remove = rte_eth_dev_pci_remove,
>         },
> -       .eth_dev_init = eth_igb_dev_init,
> -       .eth_dev_uninit = eth_igb_dev_uninit,
> -       .dev_private_size = sizeof(struct e1000_adapter),
> +       .eth_drv = {
> +               .eth_dev_init = eth_igb_dev_init,
> +               .eth_dev_uninit = eth_igb_dev_uninit,
> +               .dev_private_size = sizeof(struct e1000_adapter),
> +       },
>  };
>
>  /*
>   * virtual function driver struct
>   */
> -static struct eth_driver rte_igbvf_pmd = {
> +static struct rte_pci_eth_driver rte_igbvf_pmd = {
>         .pci_drv = {
>                 .id_table = pci_id_igbvf_map,
>                 .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_DETACHABLE,
>                 .probe = rte_eth_dev_pci_probe,
>                 .remove = rte_eth_dev_pci_remove,
>         },
> -       .eth_dev_init = eth_igbvf_dev_init,
> -       .eth_dev_uninit = eth_igbvf_dev_uninit,
> -       .dev_private_size = sizeof(struct e1000_adapter),
> +       .eth_drv = {
> +               .eth_dev_init = eth_igbvf_dev_init,
> +               .eth_dev_uninit = eth_igbvf_dev_uninit,
> +               .dev_private_size = sizeof(struct e1000_adapter),
> +       },
>  };
>
>  static void
> diff --git a/drivers/net/ena/ena_ethdev.c b/drivers/net/ena/ena_ethdev.c
> index e99bf299..d6406fa1 100644
> --- a/drivers/net/ena/ena_ethdev.c
> +++ b/drivers/net/ena/ena_ethdev.c
> @@ -1756,15 +1756,17 @@ static uint16_t eth_ena_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
>         return sent_idx;
>  }
>
> -static struct eth_driver rte_ena_pmd = {
> +static struct rte_pci_eth_driver rte_ena_pmd = {
>         .pci_drv = {
>                 .id_table = pci_id_ena_map,
>                 .drv_flags = RTE_PCI_DRV_NEED_MAPPING,
>                 .probe = rte_eth_dev_pci_probe,
>                 .remove = rte_eth_dev_pci_remove,
>         },
> -       .eth_dev_init = eth_ena_dev_init,
> -       .dev_private_size = sizeof(struct ena_adapter),
> +       .eth_drv = {
> +               .eth_dev_init = eth_ena_dev_init,
> +               .dev_private_size = sizeof(struct ena_adapter),
> +       },
>  };
>
>  RTE_PMD_REGISTER_PCI(net_ena, rte_ena_pmd.pci_drv);
> diff --git a/drivers/net/enic/enic_ethdev.c b/drivers/net/enic/enic_ethdev.c
> index e5ceb98e..b47975d1 100644
> --- a/drivers/net/enic/enic_ethdev.c
> +++ b/drivers/net/enic/enic_ethdev.c
> @@ -633,15 +633,17 @@ static int eth_enicpmd_dev_init(struct rte_eth_dev *eth_dev)
>         return enic_probe(enic);
>  }
>
> -static struct eth_driver rte_enic_pmd = {
> +static struct rte_pci_eth_driver rte_enic_pmd = {
>         .pci_drv = {
>                 .id_table = pci_id_enic_map,
>                 .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
>                 .probe = rte_eth_dev_pci_probe,
>                 .remove = rte_eth_dev_pci_remove,
>         },
> -       .eth_dev_init = eth_enicpmd_dev_init,
> -       .dev_private_size = sizeof(struct enic),
> +       .eth_drv = {
> +               .eth_dev_init = eth_enicpmd_dev_init,
> +               .dev_private_size = sizeof(struct enic),
> +       },
>  };
>
>  RTE_PMD_REGISTER_PCI(net_enic, rte_enic_pmd.pci_drv);
> diff --git a/drivers/net/fm10k/fm10k_ethdev.c b/drivers/net/fm10k/fm10k_ethdev.c
> index d8353e9d..4dea1fd6 100644
> --- a/drivers/net/fm10k/fm10k_ethdev.c
> +++ b/drivers/net/fm10k/fm10k_ethdev.c
> @@ -3077,7 +3077,7 @@ static const struct rte_pci_id pci_id_fm10k_map[] = {
>         { .vendor_id = 0, /* sentinel */ },
>  };
>
> -static struct eth_driver rte_pmd_fm10k = {
> +static struct rte_pci_eth_driver rte_pmd_fm10k = {
>         .pci_drv = {
>                 .id_table = pci_id_fm10k_map,
>                 .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
> @@ -3085,9 +3085,11 @@ static struct eth_driver rte_pmd_fm10k = {
>                 .probe = rte_eth_dev_pci_probe,
>                 .remove = rte_eth_dev_pci_remove,
>         },
> -       .eth_dev_init = eth_fm10k_dev_init,
> -       .eth_dev_uninit = eth_fm10k_dev_uninit,
> -       .dev_private_size = sizeof(struct fm10k_adapter),
> +       .eth_drv = {
> +               .eth_dev_init = eth_fm10k_dev_init,
> +               .eth_dev_uninit = eth_fm10k_dev_uninit,
> +               .dev_private_size = sizeof(struct fm10k_adapter),
> +       },
>  };
>
>  RTE_PMD_REGISTER_PCI(net_fm10k, rte_pmd_fm10k.pci_drv);
> diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
> index 0eb4c990..8b4c6079 100644
> --- a/drivers/net/i40e/i40e_ethdev.c
> +++ b/drivers/net/i40e/i40e_ethdev.c
> @@ -668,7 +668,7 @@ static const struct rte_i40e_xstats_name_off rte_i40e_txq_prio_strings[] = {
>  #define I40E_NB_TXQ_PRIO_XSTATS (sizeof(rte_i40e_txq_prio_strings) / \
>                 sizeof(rte_i40e_txq_prio_strings[0]))
>
> -static struct eth_driver rte_i40e_pmd = {
> +static struct rte_pci_eth_driver rte_i40e_pmd = {
>         .pci_drv = {
>                 .id_table = pci_id_i40e_map,
>                 .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
> @@ -676,9 +676,11 @@ static struct eth_driver rte_i40e_pmd = {
>                 .probe = rte_eth_dev_pci_probe,
>                 .remove = rte_eth_dev_pci_remove,
>         },
> -       .eth_dev_init = eth_i40e_dev_init,
> -       .eth_dev_uninit = eth_i40e_dev_uninit,
> -       .dev_private_size = sizeof(struct i40e_adapter),
> +       .eth_drv = {
> +               .eth_dev_init = eth_i40e_dev_init,
> +               .eth_dev_uninit = eth_i40e_dev_uninit,
> +               .dev_private_size = sizeof(struct i40e_adapter),
> +       },
>  };
>
>  static inline int
> diff --git a/drivers/net/i40e/i40e_ethdev_vf.c b/drivers/net/i40e/i40e_ethdev_vf.c
> index 0dc0af52..6dbcc88c 100644
> --- a/drivers/net/i40e/i40e_ethdev_vf.c
> +++ b/drivers/net/i40e/i40e_ethdev_vf.c
> @@ -1526,16 +1526,18 @@ i40evf_dev_uninit(struct rte_eth_dev *eth_dev)
>  /*
>   * virtual function driver struct
>   */
> -static struct eth_driver rte_i40evf_pmd = {
> +static struct rte_pci_eth_driver rte_i40evf_pmd = {
>         .pci_drv = {
>                 .id_table = pci_id_i40evf_map,
>                 .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_DETACHABLE,
>                 .probe = rte_eth_dev_pci_probe,
>                 .remove = rte_eth_dev_pci_remove,
>         },
> -       .eth_dev_init = i40evf_dev_init,
> -       .eth_dev_uninit = i40evf_dev_uninit,
> -       .dev_private_size = sizeof(struct i40e_adapter),
> +       .eth_drv = {
> +               .eth_dev_init = i40evf_dev_init,
> +               .eth_dev_uninit = i40evf_dev_uninit,
> +               .dev_private_size = sizeof(struct i40e_adapter),
> +       },
>  };
>
>  RTE_PMD_REGISTER_PCI(net_i40e_vf, rte_i40evf_pmd.pci_drv);
> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
> index 060772d4..6fdf227e 100644
> --- a/drivers/net/ixgbe/ixgbe_ethdev.c
> +++ b/drivers/net/ixgbe/ixgbe_ethdev.c
> @@ -1563,7 +1563,7 @@ eth_ixgbevf_dev_uninit(struct rte_eth_dev *eth_dev)
>         return 0;
>  }
>
> -static struct eth_driver rte_ixgbe_pmd = {
> +static struct rte_pci_eth_driver rte_ixgbe_pmd = {
>         .pci_drv = {
>                 .id_table = pci_id_ixgbe_map,
>                 .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
> @@ -1571,24 +1571,28 @@ static struct eth_driver rte_ixgbe_pmd = {
>                 .probe = rte_eth_dev_pci_probe,
>                 .remove = rte_eth_dev_pci_remove,
>         },
> -       .eth_dev_init = eth_ixgbe_dev_init,
> -       .eth_dev_uninit = eth_ixgbe_dev_uninit,
> -       .dev_private_size = sizeof(struct ixgbe_adapter),
> +       .eth_drv = {
> +               .eth_dev_init = eth_ixgbe_dev_init,
> +               .eth_dev_uninit = eth_ixgbe_dev_uninit,
> +               .dev_private_size = sizeof(struct ixgbe_adapter),
> +       },
>  };
>
>  /*
>   * virtual function driver struct
>   */
> -static struct eth_driver rte_ixgbevf_pmd = {
> +static struct rte_pci_eth_driver rte_ixgbevf_pmd = {
>         .pci_drv = {
>                 .id_table = pci_id_ixgbevf_map,
>                 .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_DETACHABLE,
>                 .probe = rte_eth_dev_pci_probe,
>                 .remove = rte_eth_dev_pci_remove,
>         },
> -       .eth_dev_init = eth_ixgbevf_dev_init,
> -       .eth_dev_uninit = eth_ixgbevf_dev_uninit,
> -       .dev_private_size = sizeof(struct ixgbe_adapter),
> +       .eth_drv = {
> +               .eth_dev_init = eth_ixgbevf_dev_init,
> +               .eth_dev_uninit = eth_ixgbevf_dev_uninit,
> +               .dev_private_size = sizeof(struct ixgbe_adapter),
> +       },
>  };
>
>  static int
> diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
> index eb06f56a..7b184019 100644
> --- a/drivers/net/mlx4/mlx4.c
> +++ b/drivers/net/mlx4/mlx4.c
> @@ -5524,7 +5524,7 @@ priv_dev_interrupt_handler_install(struct priv *priv, struct rte_eth_dev *dev)
>         }
>  }
>
> -static struct eth_driver mlx4_driver;
> +static struct rte_pci_eth_driver mlx4_driver;
>
>  /**
>   * DPDK callback to register a PCI device.
> @@ -5903,7 +5903,7 @@ static const struct rte_pci_id mlx4_pci_id_map[] = {
>         }
>  };
>
> -static struct eth_driver mlx4_driver = {
> +static struct rte_pci_eth_driver mlx4_driver = {
>         .pci_drv = {
>                 .driver = {
>                         .name = MLX4_DRIVER_NAME
> @@ -5912,7 +5912,9 @@ static struct eth_driver mlx4_driver = {
>                 .probe = mlx4_pci_probe,
>                 .drv_flags = RTE_PCI_DRV_INTR_LSC,
>         },
> -       .dev_private_size = sizeof(struct priv)
> +       .eth_drv = {
> +               .dev_private_size = sizeof(struct priv),
> +       },
>  };
>
>  /**
> diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
> index b97b6d16..efc0430c 100644
> --- a/drivers/net/mlx5/mlx5.c
> +++ b/drivers/net/mlx5/mlx5.c
> @@ -338,7 +338,7 @@ mlx5_args(struct priv *priv, struct rte_devargs *devargs)
>         return 0;
>  }
>
> -static struct eth_driver mlx5_driver;
> +static struct rte_pci_eth_driver mlx5_driver;
>
>  /**
>   * DPDK callback to register a PCI device.
> @@ -723,7 +723,7 @@ static const struct rte_pci_id mlx5_pci_id_map[] = {
>         }
>  };
>
> -static struct eth_driver mlx5_driver = {
> +static struct rte_pci_eth_driver mlx5_driver = {
>         .pci_drv = {
>                 .driver = {
>                         .name = MLX5_DRIVER_NAME
> @@ -732,7 +732,9 @@ static struct eth_driver mlx5_driver = {
>                 .probe = mlx5_pci_probe,
>                 .drv_flags = RTE_PCI_DRV_INTR_LSC,
>         },
> -       .dev_private_size = sizeof(struct priv)
> +       .eth_drv = {
> +               .dev_private_size = sizeof(struct priv),
> +       },
>  };
>
>  /**
> diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
> index 970b5c84..f5c6634f 100644
> --- a/drivers/net/nfp/nfp_net.c
> +++ b/drivers/net/nfp/nfp_net.c
> @@ -2470,7 +2470,7 @@ static struct rte_pci_id pci_id_nfp_net_map[] = {
>         },
>  };
>
> -static struct eth_driver rte_nfp_net_pmd = {
> +static struct rte_pci_eth_driver rte_nfp_net_pmd = {
>         .pci_drv = {
>                 .id_table = pci_id_nfp_net_map,
>                 .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
> @@ -2478,8 +2478,10 @@ static struct eth_driver rte_nfp_net_pmd = {
>                 .probe = rte_eth_dev_pci_probe,
>                 .remove = rte_eth_dev_pci_remove,
>         },
> -       .eth_dev_init = nfp_net_init,
> -       .dev_private_size = sizeof(struct nfp_net_adapter),
> +       .eth_drv = {
> +               .eth_dev_init = nfp_net_init,
> +               .dev_private_size = sizeof(struct nfp_net_adapter),
> +       },
>  };
>
>  RTE_PMD_REGISTER_PCI(net_nfp, rte_nfp_net_pmd.pci_drv);
> diff --git a/drivers/net/qede/qede_ethdev.c b/drivers/net/qede/qede_ethdev.c
> index edc5b43b..13d76a6d 100644
> --- a/drivers/net/qede/qede_ethdev.c
> +++ b/drivers/net/qede/qede_ethdev.c
> @@ -1643,30 +1643,32 @@ static struct rte_pci_id pci_id_qede_map[] = {
>         {.vendor_id = 0,}
>  };
>
> -static struct eth_driver rte_qedevf_pmd = {
> +static struct rte_pci_eth_driver rte_qedevf_pmd = {
>         .pci_drv = {
> -                   .id_table = pci_id_qedevf_map,
> -                   .drv_flags =
> -                   RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
> -                   .probe = rte_eth_dev_pci_probe,
> -                   .remove = rte_eth_dev_pci_remove,
> -                  },
> -       .eth_dev_init = qedevf_eth_dev_init,
> -       .eth_dev_uninit = qedevf_eth_dev_uninit,
> -       .dev_private_size = sizeof(struct qede_dev),
> +               .id_table = pci_id_qedevf_map,
> +               .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
> +               .probe = rte_eth_dev_pci_probe,
> +               .remove = rte_eth_dev_pci_remove,
> +       },
> +       .eth_drv = {
> +               .eth_dev_init = qedevf_eth_dev_init,
> +               .eth_dev_uninit = qedevf_eth_dev_uninit,
> +               .dev_private_size = sizeof(struct qede_dev),
> +       },
>  };
>
> -static struct eth_driver rte_qede_pmd = {
> +static struct rte_pci_eth_driver rte_qede_pmd = {
>         .pci_drv = {
> -                   .id_table = pci_id_qede_map,
> -                   .drv_flags =
> -                   RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
> -                   .probe = rte_eth_dev_pci_probe,
> -                   .remove = rte_eth_dev_pci_remove,
> -                  },
> -       .eth_dev_init = qede_eth_dev_init,
> -       .eth_dev_uninit = qede_eth_dev_uninit,
> -       .dev_private_size = sizeof(struct qede_dev),
> +               .id_table = pci_id_qede_map,
> +               .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
> +               .probe = rte_eth_dev_pci_probe,
> +               .remove = rte_eth_dev_pci_remove,
> +       },
> +       .eth_drv = {
> +               .eth_dev_init = qede_eth_dev_init,
> +               .eth_dev_uninit = qede_eth_dev_uninit,
> +               .dev_private_size = sizeof(struct qede_dev),
> +       },
>  };
>
>  RTE_PMD_REGISTER_PCI(net_qede, rte_qede_pmd.pci_drv);
> diff --git a/drivers/net/szedata2/rte_eth_szedata2.c b/drivers/net/szedata2/rte_eth_szedata2.c
> index fe7a6b3b..b9054671 100644
> --- a/drivers/net/szedata2/rte_eth_szedata2.c
> +++ b/drivers/net/szedata2/rte_eth_szedata2.c
> @@ -1587,15 +1587,17 @@ static const struct rte_pci_id rte_szedata2_pci_id_table[] = {
>         }
>  };
>
> -static struct eth_driver szedata2_eth_driver = {
> +static struct rte_pci_eth_driver szedata2_eth_driver = {
>         .pci_drv = {
>                 .id_table = rte_szedata2_pci_id_table,
>                 .probe = rte_eth_dev_pci_probe,
>                 .remove = rte_eth_dev_pci_remove,
>         },
> -       .eth_dev_init     = rte_szedata2_eth_dev_init,
> -       .eth_dev_uninit   = rte_szedata2_eth_dev_uninit,
> -       .dev_private_size = sizeof(struct pmd_internals),
> +       .eth_drv = {
> +               .eth_dev_init     = rte_szedata2_eth_dev_init,
> +               .eth_dev_uninit   = rte_szedata2_eth_dev_uninit,
> +               .dev_private_size = sizeof(struct pmd_internals),
> +       },
>  };
>
>  RTE_PMD_REGISTER_PCI(RTE_SZEDATA2_DRIVER_NAME, szedata2_eth_driver.pci_drv);
> diff --git a/drivers/net/thunderx/nicvf_ethdev.c b/drivers/net/thunderx/nicvf_ethdev.c
> index 10603197..f13fad90 100644
> --- a/drivers/net/thunderx/nicvf_ethdev.c
> +++ b/drivers/net/thunderx/nicvf_ethdev.c
> @@ -2111,15 +2111,17 @@ static const struct rte_pci_id pci_id_nicvf_map[] = {
>         },
>  };
>
> -static struct eth_driver rte_nicvf_pmd = {
> +static struct rte_pci_eth_driver rte_nicvf_pmd = {
>         .pci_drv = {
>                 .id_table = pci_id_nicvf_map,
>                 .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
>                 .probe = rte_eth_dev_pci_probe,
>                 .remove = rte_eth_dev_pci_remove,
>         },
> -       .eth_dev_init = nicvf_eth_dev_init,
> -       .dev_private_size = sizeof(struct nicvf),
> +       .eth_drv = {
> +               .eth_dev_init = nicvf_eth_dev_init,
> +               .dev_private_size = sizeof(struct nicvf),
> +       },
>  };
>
>  RTE_PMD_REGISTER_PCI(net_thunderx, rte_nicvf_pmd.pci_drv);
> diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
> index 54ea7d77..e6f241ad 100644
> --- a/drivers/net/virtio/virtio_ethdev.c
> +++ b/drivers/net/virtio/virtio_ethdev.c
> @@ -1377,7 +1377,7 @@ eth_virtio_dev_uninit(struct rte_eth_dev *eth_dev)
>         return 0;
>  }
>
> -static struct eth_driver rte_virtio_pmd = {
> +static struct rte_pci_eth_driver rte_virtio_pmd = {
>         .pci_drv = {
>                 .driver = {
>                         .name = "net_virtio",
> @@ -1387,9 +1387,11 @@ static struct eth_driver rte_virtio_pmd = {
>                 .probe = rte_eth_dev_pci_probe,
>                 .remove = rte_eth_dev_pci_remove,
>         },
> -       .eth_dev_init = eth_virtio_dev_init,
> -       .eth_dev_uninit = eth_virtio_dev_uninit,
> -       .dev_private_size = sizeof(struct virtio_hw),
> +       .eth_drv = {
> +               .eth_dev_init = eth_virtio_dev_init,
> +               .eth_dev_uninit = eth_virtio_dev_uninit,
> +               .dev_private_size = sizeof(struct virtio_hw),
> +       },
>  };
>
>  RTE_INIT(rte_virtio_pmd_init);
> diff --git a/drivers/net/vmxnet3/vmxnet3_ethdev.c b/drivers/net/vmxnet3/vmxnet3_ethdev.c
> index 54533ca5..cb9221e6 100644
> --- a/drivers/net/vmxnet3/vmxnet3_ethdev.c
> +++ b/drivers/net/vmxnet3/vmxnet3_ethdev.c
> @@ -337,16 +337,18 @@ eth_vmxnet3_dev_uninit(struct rte_eth_dev *eth_dev)
>         return 0;
>  }
>
> -static struct eth_driver rte_vmxnet3_pmd = {
> +static struct rte_pci_eth_driver rte_vmxnet3_pmd = {
>         .pci_drv = {
>                 .id_table = pci_id_vmxnet3_map,
>                 .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_DETACHABLE,
>                 .probe = rte_eth_dev_pci_probe,
>                 .remove = rte_eth_dev_pci_remove,
>         },
> -       .eth_dev_init = eth_vmxnet3_dev_init,
> -       .eth_dev_uninit = eth_vmxnet3_dev_uninit,
> -       .dev_private_size = sizeof(struct vmxnet3_hw),
> +       .eth_drv = {
> +               .eth_dev_init = eth_vmxnet3_dev_init,
> +               .eth_dev_uninit = eth_vmxnet3_dev_uninit,
> +               .dev_private_size = sizeof(struct vmxnet3_hw),
> +       },
>  };
>
>  static int
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index 9dea1f15..7c212096 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -239,13 +239,14 @@ int
>  rte_eth_dev_pci_probe(struct rte_pci_driver *pci_drv,
>                       struct rte_pci_device *pci_dev)
>  {
> -       struct eth_driver    *eth_drv;
> +       const struct rte_pci_eth_driver *pci_eth_drv;
> +       const struct eth_driver *eth_drv;
>         struct rte_eth_dev *eth_dev;
>         char ethdev_name[RTE_ETH_NAME_MAX_LEN];
> -
>         int diag;
>
> -       eth_drv = (struct eth_driver *)pci_drv;
> +       pci_eth_drv = container_of(pci_drv, struct rte_pci_eth_driver, pci_drv);
> +       eth_drv = &pci_eth_drv->eth_drv;
>
>         rte_eal_pci_device_name(&pci_dev->addr, ethdev_name,
>                         sizeof(ethdev_name));
> @@ -263,7 +264,7 @@ rte_eth_dev_pci_probe(struct rte_pci_driver *pci_drv,
>         }
>         eth_dev->device = &pci_dev->device;
>         eth_dev->intr_handle = &pci_dev->intr_handle;
> -       eth_dev->driver = eth_drv;
> +       eth_dev->driver = &pci_eth_drv->eth_drv;
>
>         /* Invoke PMD device initialization function */
>         diag = (*eth_drv->eth_dev_init)(eth_dev);
> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> index b4168830..1a62a322 100644
> --- a/lib/librte_ether/rte_ethdev.h
> +++ b/lib/librte_ether/rte_ethdev.h
> @@ -1884,25 +1884,31 @@ typedef int (*eth_dev_uninit_t)(struct rte_eth_dev *eth_dev);
>   * @internal
>   * The structure associated with a PMD Ethernet driver.
>   *
> - * Each Ethernet driver acts as a PCI driver and is represented by a generic
> + * Each Ethernet driver is represented by a generic
>   * *eth_driver* structure that holds:
>   *
> - * - An *rte_pci_driver* structure (which must be the first field).
> + * - The *eth_dev_init* function invoked for each matching device.
>   *
> - * - The *eth_dev_init* function invoked for each matching PCI device.
> - *
> - * - The *eth_dev_uninit* function invoked for each matching PCI device.
> + * - The *eth_dev_uninit* function invoked for each matching device.
>   *
>   * - The size of the private data to allocate for each matching device.
>   */
>  struct eth_driver {
> -       struct rte_pci_driver pci_drv;    /**< The PMD is also a PCI driver. */
>         eth_dev_init_t eth_dev_init;      /**< Device init function. */
>         eth_dev_uninit_t eth_dev_uninit;  /**< Device uninit function. */
>         unsigned int dev_private_size;    /**< Size of device private data. */
>  };
>
>  /**
> + * @internal
> + * The structure associated with a PMD PCI Ethernet driver.
> + */
> +struct rte_pci_eth_driver {
> +       struct rte_pci_driver   pci_drv;        /**< Underlying PCI driver. */
> +       struct eth_driver       eth_drv;        /**< Ethernet driver. */
> +};
> +
> +/**
>   * Convert a numerical speed in Mbps to a bitmap flag that can be used in
>   * the bitmap link_speeds of the struct rte_eth_conf
>   *
> --
> 2.11.0
>
- * Re: [dpdk-dev] [PATCH 7/8] ethdev: break ethernet driver and pci_driver connection
  2017-01-10 16:11   ` [dpdk-dev] [PATCH 7/8] ethdev: break ethernet driver and pci_driver connection Jan Blunck
@ 2017-01-10 18:03     ` Stephen Hemminger
  0 siblings, 0 replies; 30+ messages in thread
From: Stephen Hemminger @ 2017-01-10 18:03 UTC (permalink / raw)
  To: Jan Blunck; +Cc: dev, Stephen Hemminger
On Tue, 10 Jan 2017 17:11:15 +0100
Jan Blunck <jblunck@infradead.org> wrote:
> On Sat, Jan 7, 2017 at 7:17 PM, Stephen Hemminger
> <stephen@networkplumber.org> wrote:
> > There are multiple buses and device types now. Therefore it no longer
> > makes sense that PCI driver information is part of the Ethernet driver
> > structure.  
> 
> The Ethernet driver itself doesn't offer a lot of value from an
> abstraction point of view. It's questionable if there ever will be an
> Ethernet driver that is able to operate on different types of
> low-level devices. The virtual devices are anyway able to operate
> without an Ethernet driver structure. Most of that functionality
> should get moved either into the bus abstraction or the low-level
> device probe function.
I agree that 'struct eth_driver' is not adding a lot now.
It should really be all folded back into 'struct rte_driver'.
The concepts of init, uninit and private data are all generic and not
really specific to Ethernet in any way.
If we kill off eth_driver then PCI devices only have rte_pci_driver
and VMBUS can have rte_vmbus_driver.
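
For readers following the patch, the key mechanical change in
rte_eth_dev_pci_probe() is that the Ethernet driver is no longer found by
casting the PCI driver pointer; it is found by stepping back to the enclosing
wrapper with container_of(). The sketch below illustrates that pattern in
isolation. All demo_* names are hypothetical stand-ins with only the fields
needed for the illustration, not the real DPDK definitions, and the macro is a
plain offsetof-based version rather than the one DPDK ships.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the structures in the patch. */
struct demo_pci_driver {
	const char *name;
};

struct demo_eth_driver {
	unsigned int dev_private_size;
};

/* Wrapper pairing a bus-specific driver with the generic Ethernet part. */
struct demo_pci_eth_driver {
	struct demo_pci_driver pci_drv;
	struct demo_eth_driver eth_drv;
};

/* Step back from a pointer to a member to the struct that contains it. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Recover the Ethernet driver from the PCI driver handed to probe(). */
static const struct demo_eth_driver *
demo_eth_driver_from_pci(struct demo_pci_driver *pci_drv)
{
	struct demo_pci_eth_driver *wrapper =
		demo_container_of(pci_drv, struct demo_pci_eth_driver, pci_drv);

	return &wrapper->eth_drv;
}
```

Unlike the old cast, this lookup does not depend on pci_drv being the first
field of the wrapper, which is what would let a VMBUS wrapper embed a different
bus driver next to the same Ethernet part.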
 
 
- * [dpdk-dev] [PATCH 8/8] eal: VMBUS infrastructure
  2017-01-07 18:17 [dpdk-dev] [PATCH v2 0/8] device abstraction and VMBUS support infrastructure Stephen Hemminger
                   ` (6 preceding siblings ...)
  2017-01-07 18:17 ` [dpdk-dev] [PATCH 7/8] ethdev: break ethernet driver and pci_driver connection Stephen Hemminger
@ 2017-01-07 18:17 ` Stephen Hemminger
  2017-01-10 17:27   ` Jan Blunck
  2017-01-11 14:49   ` Jan Blunck
  7 siblings, 2 replies; 30+ messages in thread
From: Stephen Hemminger @ 2017-01-07 18:17 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger
Add support for VMBUS on Hyper-V/Azure. VMBUS is similar to PCI
but has different addressing and internal APIs.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
---
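
Note on addressing: VMBUS devices are identified by a UUID rather than a PCI
domain:bus:devid.function address, which is why patch 1/8 grows the internal
name buffer to 40 bytes (36 characters plus the terminator). The sketch below
shows, under stated assumptions, what a textual UUID check looks like. The
patch itself calls libuuid's uuid_parse(); the demo_* names here are
hypothetical and only avoid the library dependency for illustration.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

#define DEMO_UUID_STR_LEN 36	/* 8-4-4-4-12 hex digits plus four dashes */
#define DEMO_NAME_MAX_LEN 40	/* mirrors RTE_ETH_NAME_MAX_LEN after patch 1/8 */

/* Return true if s is a UUID in canonical 8-4-4-4-12 textual form. */
static bool
demo_uuid_name_ok(const char *s)
{
	size_t i;

	if (s == NULL || strlen(s) != DEMO_UUID_STR_LEN)
		return false;

	for (i = 0; i < DEMO_UUID_STR_LEN; i++) {
		if (i == 8 || i == 13 || i == 18 || i == 23) {
			/* Dashes separate the five groups. */
			if (s[i] != '-')
				return false;
		} else if (!isxdigit((unsigned char)s[i])) {
			return false;
		}
	}
	return true;
}
```

A PCI-style string such as "0000:00:03.0" fails this check, while any
canonical UUID string passes and, at 36 characters, fits in the 40-byte name
buffer with room for the terminator.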
 lib/librte_eal/common/Makefile              |   2 +-
 lib/librte_eal/common/eal_common_devargs.c  |   7 +
 lib/librte_eal/common/eal_common_options.c  |  38 ++
 lib/librte_eal/common/eal_internal_cfg.h    |   1 +
 lib/librte_eal/common/eal_options.h         |   6 +
 lib/librte_eal/common/eal_private.h         |   5 +
 lib/librte_eal/common/include/rte_devargs.h |   8 +
 lib/librte_eal/common/include/rte_vmbus.h   | 249 ++++++++
 lib/librte_eal/linuxapp/eal/Makefile        |   6 +
 lib/librte_eal/linuxapp/eal/eal.c           |  13 +
 lib/librte_eal/linuxapp/eal/eal_vmbus.c     | 911 ++++++++++++++++++++++++++++
 lib/librte_ether/rte_ethdev.c               |  90 +++
 lib/librte_ether/rte_ethdev.h               |  31 +
 mk/rte.app.mk                               |   1 +
 14 files changed, 1367 insertions(+), 1 deletion(-)
 create mode 100644 lib/librte_eal/common/include/rte_vmbus.h
 create mode 100644 lib/librte_eal/linuxapp/eal/eal_vmbus.c
diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile
index 09a3d3af..ceb77bed 100644
--- a/lib/librte_eal/common/Makefile
+++ b/lib/librte_eal/common/Makefile
@@ -33,7 +33,7 @@ include $(RTE_SDK)/mk/rte.vars.mk
 
 INC := rte_branch_prediction.h rte_common.h
 INC += rte_debug.h rte_eal.h rte_errno.h rte_launch.h rte_lcore.h
-INC += rte_log.h rte_memory.h rte_memzone.h rte_pci.h
+INC += rte_log.h rte_memory.h rte_memzone.h rte_pci.h rte_vmbus.h
 INC += rte_per_lcore.h rte_random.h
 INC += rte_tailq.h rte_interrupts.h rte_alarm.h
 INC += rte_string_fns.h rte_version.h
diff --git a/lib/librte_eal/common/eal_common_devargs.c b/lib/librte_eal/common/eal_common_devargs.c
index e403717b..934ca840 100644
--- a/lib/librte_eal/common/eal_common_devargs.c
+++ b/lib/librte_eal/common/eal_common_devargs.c
@@ -113,6 +113,13 @@ rte_eal_devargs_add(enum rte_devtype devtype, const char *devargs_str)
 			goto fail;
 
 		break;
+	case RTE_DEVTYPE_WHITELISTED_VMBUS:
+	case RTE_DEVTYPE_BLACKLISTED_VMBUS:
+#ifdef RTE_LIBRTE_HV_PMD
+		if (uuid_parse(buf, devargs->uuid) == 0)
+			break;
+#endif
+		goto fail;
 	}
 
 	free(buf);
diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
index f36bc556..1a2b418c 100644
--- a/lib/librte_eal/common/eal_common_options.c
+++ b/lib/librte_eal/common/eal_common_options.c
@@ -95,6 +95,11 @@ eal_long_options[] = {
 	{OPT_VFIO_INTR,         1, NULL, OPT_VFIO_INTR_NUM        },
 	{OPT_VMWARE_TSC_MAP,    0, NULL, OPT_VMWARE_TSC_MAP_NUM   },
 	{OPT_XEN_DOM0,          0, NULL, OPT_XEN_DOM0_NUM         },
+#ifdef RTE_LIBRTE_HV_PMD
+	{OPT_NO_VMBUS,          0, NULL, OPT_NO_VMBUS_NUM         },
+	{OPT_VMBUS_BLACKLIST,   1, NULL, OPT_VMBUS_BLACKLIST_NUM  },
+	{OPT_VMBUS_WHITELIST,   1, NULL, OPT_VMBUS_WHITELIST_NUM  },
+#endif
 	{0,                     0, NULL, 0                        }
 };
 
@@ -858,6 +863,21 @@ eal_parse_common_option(int opt, const char *optarg,
 		conf->no_pci = 1;
 		break;
 
+#ifdef RTE_LIBRTE_HV_PMD
+	case OPT_NO_VMBUS_NUM:
+		conf->no_vmbus = 1;
+		break;
+	case OPT_VMBUS_BLACKLIST_NUM:
+		if (rte_eal_devargs_add(RTE_DEVTYPE_BLACKLISTED_VMBUS,
+					optarg) < 0)
+			return -1;
+		break;
+	case OPT_VMBUS_WHITELIST_NUM:
+		if (rte_eal_devargs_add(RTE_DEVTYPE_WHITELISTED_VMBUS,
+				optarg) < 0)
+			return -1;
+		break;
+#endif
 	case OPT_NO_HPET_NUM:
 		conf->no_hpet = 1;
 		break;
@@ -1017,6 +1037,14 @@ eal_check_common_options(struct internal_config *internal_cfg)
 		return -1;
 	}
 
+#ifdef RTE_LIBRTE_HV_PMD
+	if (rte_eal_devargs_type_count(RTE_DEVTYPE_WHITELISTED_VMBUS) != 0 &&
+		rte_eal_devargs_type_count(RTE_DEVTYPE_BLACKLISTED_VMBUS) != 0) {
+		RTE_LOG(ERR, EAL, "Options vmbus blacklist and whitelist "
+			"cannot be used at the same time\n");
+		return -1;
+	}
+#endif
 	return 0;
 }
 
@@ -1066,5 +1094,15 @@ eal_common_usage(void)
 	       "  --"OPT_NO_PCI"            Disable PCI\n"
 	       "  --"OPT_NO_HPET"           Disable HPET\n"
 	       "  --"OPT_NO_SHCONF"         No shared config (mmap'd files)\n"
+#ifdef RTE_LIBRTE_HV_PMD
+	       "  --"OPT_NO_VMBUS"          Disable VMBUS\n"
+	       "  --"OPT_VMBUS_BLACKLIST" Add a VMBUS device to black list.\n"
+	       "                      Prevent EAL from using this VMBUS device. The argument\n"
+	       "                      format is device UUID.\n"
+	       "  --"OPT_VMBUS_WHITELIST" Add a VMBUS device to white list.\n"
+	       "                      Only use the specified VMBUS devices. The argument format\n"
+	       "                      is device UUID. This option can be present\n"
+	       "                      several times (once per device).\n"
+#endif
 	       "\n", RTE_MAX_LCORE);
 }
diff --git a/lib/librte_eal/common/eal_internal_cfg.h b/lib/librte_eal/common/eal_internal_cfg.h
index 5f1367eb..4b6af937 100644
--- a/lib/librte_eal/common/eal_internal_cfg.h
+++ b/lib/librte_eal/common/eal_internal_cfg.h
@@ -67,6 +67,7 @@ struct internal_config {
 	unsigned hugepage_unlink;         /**< true to unlink backing files */
 	volatile unsigned xen_dom0_support; /**< support app running on Xen Dom0*/
 	volatile unsigned no_pci;         /**< true to disable PCI */
+	volatile unsigned no_vmbus;       /**< true to disable VMBUS */
 	volatile unsigned no_hpet;        /**< true to disable HPET */
 	volatile unsigned vmware_tsc_map; /**< true to use VMware TSC mapping
 										* instead of native TSC */
diff --git a/lib/librte_eal/common/eal_options.h b/lib/librte_eal/common/eal_options.h
index a881c62e..156727e7 100644
--- a/lib/librte_eal/common/eal_options.h
+++ b/lib/librte_eal/common/eal_options.h
@@ -83,6 +83,12 @@ enum {
 	OPT_VMWARE_TSC_MAP_NUM,
 #define OPT_XEN_DOM0          "xen-dom0"
 	OPT_XEN_DOM0_NUM,
+#define OPT_NO_VMBUS          "no-vmbus"
+	OPT_NO_VMBUS_NUM,
+#define OPT_VMBUS_BLACKLIST   "vmbus-blacklist"
+	OPT_VMBUS_BLACKLIST_NUM,
+#define OPT_VMBUS_WHITELIST   "vmbus-whitelist"
+	OPT_VMBUS_WHITELIST_NUM,
 	OPT_LONG_MAX_NUM
 };
 
diff --git a/lib/librte_eal/common/eal_private.h b/lib/librte_eal/common/eal_private.h
index 9e7d8f6b..c856c63e 100644
--- a/lib/librte_eal/common/eal_private.h
+++ b/lib/librte_eal/common/eal_private.h
@@ -210,6 +210,11 @@ int pci_uio_map_resource_by_index(struct rte_pci_device *dev, int res_idx,
 		struct mapped_pci_resource *uio_res, int map_idx);
 
 /**
+ * VMBUS related functions and structures
+ */
+int rte_eal_vmbus_init(void);
+
+/**
  * Init tail queues for non-EAL library structures. This is to allow
  * the rings, mempools, etc. lists to be shared among multiple processes
  *
diff --git a/lib/librte_eal/common/include/rte_devargs.h b/lib/librte_eal/common/include/rte_devargs.h
index 88120a1c..c079d289 100644
--- a/lib/librte_eal/common/include/rte_devargs.h
+++ b/lib/librte_eal/common/include/rte_devargs.h
@@ -51,6 +51,9 @@ extern "C" {
 #include <stdio.h>
 #include <sys/queue.h>
 #include <rte_pci.h>
+#ifdef RTE_LIBRTE_HV_PMD
+#include <uuid/uuid.h>
+#endif
 
 /**
  * Type of generic device
@@ -59,6 +62,8 @@ enum rte_devtype {
 	RTE_DEVTYPE_WHITELISTED_PCI,
 	RTE_DEVTYPE_BLACKLISTED_PCI,
 	RTE_DEVTYPE_VIRTUAL,
+	RTE_DEVTYPE_WHITELISTED_VMBUS,
+	RTE_DEVTYPE_BLACKLISTED_VMBUS,
 };
 
 /**
@@ -88,6 +93,9 @@ struct rte_devargs {
 			/** Driver name. */
 			char drv_name[32];
 		} virt;
+#ifdef RTE_LIBRTE_HV_PMD
+		uuid_t uuid;
+#endif
 	};
 	/** Arguments string as given by user or "" for no argument. */
 	char *args;
diff --git a/lib/librte_eal/common/include/rte_vmbus.h b/lib/librte_eal/common/include/rte_vmbus.h
new file mode 100644
index 00000000..f96d753e
--- /dev/null
+++ b/lib/librte_eal/common/include/rte_vmbus.h
@@ -0,0 +1,249 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2013-2016 Brocade Communications Systems, Inc.
+ *   Copyright(c) 2016 Microsoft Corporation
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ */
+
+#ifndef _RTE_VMBUS_H_
+#define _RTE_VMBUS_H_
+
+/**
+ * @file
+ *
+ * RTE VMBUS Interface
+ */
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <limits.h>
+#include <errno.h>
+#include <uuid/uuid.h>
+#include <sys/queue.h>
+#include <stdint.h>
+#include <inttypes.h>
+
+#include <rte_debug.h>
+#include <rte_interrupts.h>
+#include <rte_dev.h>
+
+TAILQ_HEAD(vmbus_device_list, rte_vmbus_device);
+TAILQ_HEAD(vmbus_driver_list, rte_vmbus_driver);
+
+extern struct vmbus_driver_list vmbus_driver_list;
+extern struct vmbus_device_list vmbus_device_list;
+
+/** Pathname of VMBUS devices directory. */
+#define SYSFS_VMBUS_DEVICES "/sys/bus/vmbus/devices"
+
+#define UUID_BUF_SZ	(36 + 1)
+
+
+/** Maximum number of VMBUS resources. */
+#define VMBUS_MAX_RESOURCE 7
+
+/**
+ * A structure describing a VMBUS device.
+ */
+struct rte_vmbus_device {
+	TAILQ_ENTRY(rte_vmbus_device) next;     /**< Next probed VMBUS device. */
+	struct rte_device device;               /**< Inherit core device */
+	uuid_t device_id;			/**< VMBUS device id */
+	uuid_t class_id;			/**< VMBUS device type */
+	uint32_t relid;				/**< VMBUS id for notification */
+	uint8_t	monitor_id;
+	struct rte_intr_handle intr_handle;     /**< Interrupt handle */
+	const struct rte_vmbus_driver *driver;  /**< Associated driver */
+
+	struct rte_mem_resource mem_resource[VMBUS_MAX_RESOURCE];
+						/**< VMBUS Memory Resource */
+	char sysfs_name[];			/**< Name in sysfs bus directory */
+};
+
+struct rte_vmbus_driver;
+
+/**
+ * Initialisation function for the driver called during VMBUS probing.
+ */
+typedef int (vmbus_probe_t)(struct rte_vmbus_driver *,
+			    struct rte_vmbus_device *);
+
+/**
+ * Uninitialisation function for the driver called during hotplugging.
+ */
+typedef int (vmbus_remove_t)(struct rte_vmbus_device *);
+
+/**
+ * A structure describing a VMBUS driver.
+ */
+struct rte_vmbus_driver {
+	TAILQ_ENTRY(rte_vmbus_driver) next;     /**< Next in list. */
+	struct rte_driver driver;
+	vmbus_probe_t *probe;                   /**< Device Probe function. */
+	vmbus_remove_t *remove;                 /**< Device Remove function. */
+
+	const uuid_t *id_table;			/**< ID table. */
+};
+
+struct vmbus_map {
+	void *addr;
+	char *path;
+	uint64_t offset;
+	uint64_t size;
+	uint64_t phaddr;
+};
+
+/*
+ * For multi-process we need to reproduce all vmbus mappings in secondary
+ * processes, so save them in a tailq.
+ */
+struct mapped_vmbus_resource {
+	TAILQ_ENTRY(mapped_vmbus_resource) next;
+
+	uuid_t uuid;
+	char path[PATH_MAX];
+	int nb_maps;
+	struct vmbus_map maps[VMBUS_MAX_RESOURCE];
+};
+
+TAILQ_HEAD(mapped_vmbus_res_list, mapped_vmbus_resource);
+
+/**
+ * Scan the content of the VMBUS bus, and add the devices to the devices list
+ *
+ * @return
+ *  0 on success, negative on error
+ */
+int rte_eal_vmbus_scan(void);
+
+/**
+ * Probe the VMBUS bus for registered drivers.
+ *
+ * Scan the content of the VMBUS bus, and call the probe() function for
+ * all registered drivers that have a matching entry in their id_table
+ * for discovered devices.
+ *
+ * @return
+ *   - 0 on success.
+ *   - Negative on error.
+ */
+int rte_eal_vmbus_probe(void);
+
+/**
+ * Map the VMBUS device resources in user space virtual memory address
+ *
+ * @param dev
+ *   A pointer to a rte_vmbus_device structure describing the device
+ *   to use
+ *
+ * @return
+ *   0 on success, negative on error and positive if no driver
+ *   is found for the device.
+ */
+int rte_eal_vmbus_map_device(struct rte_vmbus_device *dev);
+
+/**
+ * Unmap this device
+ *
+ * @param dev
+ *   A pointer to a rte_vmbus_device structure describing the device
+ *   to use
+ */
+void rte_eal_vmbus_unmap_device(struct rte_vmbus_device *dev);
+
+/**
+ * Probe the single VMBUS device.
+ *
+ * Scan the content of the VMBUS bus, and find the vmbus device
+ * specified by device uuid, then call the probe() function for
+ * registered driver that has a matching entry in its id_table for
+ * discovered device.
+ *
+ * @param id
+ *   The VMBUS device uuid.
+ * @return
+ *   - 0 on success.
+ *   - Negative on error.
+ */
+int rte_eal_vmbus_probe_one(uuid_t id);
+
+/**
+ * Close the single VMBUS device.
+ *
+ * Scan the content of the VMBUS bus, and find the vmbus device id,
+ * then call the remove() function for registered driver that has a
+ * matching entry in its id_table for discovered device.
+ *
+ * @param id
+ *   The VMBUS device uuid.
+ * @return
+ *   - 0 on success.
+ *   - Negative on error.
+ */
+int rte_eal_vmbus_detach(uuid_t id);
+
+/**
+ * Register a VMBUS driver.
+ *
+ * @param driver
+ *   A pointer to a rte_vmbus_driver structure describing the driver
+ *   to be registered.
+ */
+void rte_eal_vmbus_register(struct rte_vmbus_driver *driver);
+
+/** Helper for VMBUS device registration from driver instance */
+#define RTE_PMD_REGISTER_VMBUS(nm, vmbus_drv) \
+RTE_INIT(vmbusinitfn_ ##nm); \
+static void vmbusinitfn_ ##nm(void) \
+{\
+	(vmbus_drv).driver.name = RTE_STR(nm);\
+	(vmbus_drv).driver.type = PMD_VMBUS; \
+	rte_eal_vmbus_register(&vmbus_drv); \
+} \
+RTE_PMD_EXPORT_NAME(nm, __COUNTER__)
+
+/**
+ * Unregister a VMBUS driver.
+ *
+ * @param driver
+ *   A pointer to a rte_vmbus_driver structure describing the driver
+ *   to be unregistered.
+ */
+void rte_eal_vmbus_unregister(struct rte_vmbus_driver *driver);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_VMBUS_H_ */
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index 4e206f09..f6ca3848 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -71,6 +71,11 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_timer.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_interrupts.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_alarm.c
 
+ifeq ($(CONFIG_RTE_LIBRTE_HV_PMD),y)
+SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_vmbus.c
+LDLIBS += -luuid
+endif
+
 # from common dir
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_lcore.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_timer.c
@@ -114,6 +119,7 @@ CFLAGS_eal_hugepage_info.o := -D_GNU_SOURCE
 CFLAGS_eal_pci.o := -D_GNU_SOURCE
 CFLAGS_eal_pci_uio.o := -D_GNU_SOURCE
 CFLAGS_eal_pci_vfio.o := -D_GNU_SOURCE
+CFLAGS_eal_vmbus.o := -D_GNU_SOURCE
 CFLAGS_eal_common_whitelist.o := -D_GNU_SOURCE
 CFLAGS_eal_common_options.o := -D_GNU_SOURCE
 CFLAGS_eal_common_thread.o := -D_GNU_SOURCE
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 16dd5b9c..1bc0814a 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -70,6 +70,9 @@
 #include <rte_cpuflags.h>
 #include <rte_interrupts.h>
 #include <rte_pci.h>
+#ifdef RTE_LIBRTE_HV_PMD
+#include <rte_vmbus.h>
+#endif
 #include <rte_dev.h>
 #include <rte_devargs.h>
 #include <rte_common.h>
@@ -830,6 +833,11 @@ rte_eal_init(int argc, char **argv)
 
 	eal_check_mem_on_local_socket();
 
+#ifdef RTE_LIBRTE_HV_PMD
+	if (rte_eal_vmbus_init() < 0)
+		RTE_LOG(ERR, EAL, "Cannot init VMBUS\n");
+#endif
+
 	if (eal_plugins_init() < 0)
 		rte_panic("Cannot init plugins\n");
 
@@ -884,6 +892,11 @@ rte_eal_init(int argc, char **argv)
 	if (rte_eal_pci_probe())
 		rte_panic("Cannot probe PCI\n");
 
+#ifdef RTE_LIBRTE_HV_PMD
+	if (rte_eal_vmbus_probe() < 0)
+		rte_panic("Cannot probe VMBUS\n");
+#endif
+
 	if (rte_eal_dev_init() < 0)
 		rte_panic("Cannot init pmd devices\n");
 
diff --git a/lib/librte_eal/linuxapp/eal/eal_vmbus.c b/lib/librte_eal/linuxapp/eal/eal_vmbus.c
new file mode 100644
index 00000000..729f93a9
--- /dev/null
+++ b/lib/librte_eal/linuxapp/eal/eal_vmbus.c
@@ -0,0 +1,911 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2013-2016 Brocade Communications Systems, Inc.
+ *   Copyright(c) 2016 Microsoft Corporation
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *	 notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *	 notice, this list of conditions and the following disclaimer in
+ *	 the documentation and/or other materials provided with the
+ *	 distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *	 contributors may be used to endorse or promote products derived
+ *	 from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ */
+
+#include <string.h>
+#include <unistd.h>
+#include <dirent.h>
+#include <fcntl.h>
+#include <sys/mman.h>
+
+#include <rte_eal.h>
+#include <rte_tailq.h>
+#include <rte_log.h>
+#include <rte_devargs.h>
+#include <rte_vmbus.h>
+#include <rte_malloc.h>
+
+#include "eal_private.h"
+#include "eal_pci_init.h"
+#include "eal_filesystem.h"
+
+struct vmbus_driver_list vmbus_driver_list =
+	TAILQ_HEAD_INITIALIZER(vmbus_driver_list);
+struct vmbus_device_list vmbus_device_list =
+	TAILQ_HEAD_INITIALIZER(vmbus_device_list);
+
+static void *vmbus_map_addr;
+
+static struct rte_tailq_elem rte_vmbus_uio_tailq = {
+	.name = "UIO_RESOURCE_LIST",
+};
+EAL_REGISTER_TAILQ(rte_vmbus_uio_tailq);
+
+/*
+ * parse a sysfs file containing one UUID value,
+ * optionally wrapped in { } notation
+ */
+static int
+vmbus_get_sysfs_uuid(const char *filename, uuid_t uu)
+{
+	char buf[BUFSIZ];
+	char *cp, *in = buf;
+	FILE *f;
+
+	f = fopen(filename, "r");
+	if (f == NULL) {
+		RTE_LOG(ERR, EAL, "%s(): cannot open sysfs value %s\n",
+				__func__, filename);
+		return -1;
+	}
+
+	if (fgets(buf, sizeof(buf), f) == NULL) {
+		RTE_LOG(ERR, EAL, "%s(): cannot read sysfs value %s\n",
+				__func__, filename);
+		fclose(f);
+		return -1;
+	}
+	fclose(f);
+
+	cp = strchr(buf, '\n');
+	if (cp)
+		*cp = '\0';
+
+	/* strip { } notation */
+	if (buf[0] == '{') {
+		in = buf + 1;
+		cp = strchr(in, '}');
+		if (cp)
+			*cp = '\0';
+	}
+
+	if (uuid_parse(in, uu) < 0) {
+		RTE_LOG(ERR, EAL, "%s %s not a valid UUID\n",
+			filename, buf);
+		return -1;
+	}
+
+	return 0;
+}
+
+/* map a particular resource from a file */
+static void *
+vmbus_map_resource(void *requested_addr, int fd, off_t offset, size_t size,
+		   int flags)
+{
+	void *mapaddr;
+
+	/* Map the memory resource of device */
+	mapaddr = mmap(requested_addr, size, PROT_READ | PROT_WRITE,
+		       MAP_SHARED | flags, fd, offset);
+	if (mapaddr == MAP_FAILED ||
+	    (requested_addr != NULL && mapaddr != requested_addr)) {
+		RTE_LOG(ERR, EAL,
+			"%s(): cannot mmap(%d, %p, 0x%lx, 0x%lx): %s\n",
+			__func__, fd, requested_addr,
+			(unsigned long)size, (unsigned long)offset,
+			strerror(errno));
+	} else
+		RTE_LOG(DEBUG, EAL, "  VMBUS memory mapped at %p\n", mapaddr);
+
+	return mapaddr;
+}
+
+/* unmap a particular resource */
+static void
+vmbus_unmap_resource(void *requested_addr, size_t size)
+{
+	if (requested_addr == NULL)
+		return;
+
+	/* Unmap the VMBUS memory resource of device */
+	if (munmap(requested_addr, size)) {
+		RTE_LOG(ERR, EAL, "%s(): cannot munmap(%p, 0x%lx): %s\n",
+			__func__, requested_addr, (unsigned long)size,
+			strerror(errno));
+	} else
+		RTE_LOG(DEBUG, EAL, "  VMBUS memory unmapped at %p\n",
+				requested_addr);
+}
+
+/* Only supports current kernel version
+ * Unlike PCI there is no option (or need) to create UIO device.
+ */
+static int vmbus_get_uio_dev(const char *name,
+			     char *dstbuf, size_t buflen)
+{
+	char dirname[PATH_MAX];
+	unsigned int uio_num;
+	struct dirent *e;
+	DIR *dir;
+
+	snprintf(dirname, sizeof(dirname),
+		 "/sys/bus/vmbus/devices/%s/uio", name);
+
+	dir = opendir(dirname);
+	if (dir == NULL) {
+		RTE_LOG(ERR, EAL, "Cannot map uio resources for %s: %s\n",
+			name, strerror(errno));
+		return -1;
+	}
+
+	/* take the first file starting with "uio" */
+	while ((e = readdir(dir)) != NULL) {
+		if (sscanf(e->d_name, "uio%u", &uio_num) != 1)
+			continue;
+
+		snprintf(dstbuf, buflen, "%s/uio%u", dirname, uio_num);
+		break;
+	}
+	closedir(dir);
+
+	return e ? (int) uio_num : -1;
+}
+
+/*
+ * parse a sysfs file containing one integer value
+ * different to the eal version, as it needs to work with 64-bit values
+ */
+static int
+vmbus_parse_sysfs_value(const char *dir, const char *name,
+			uint64_t *val)
+{
+	char filename[PATH_MAX];
+	FILE *f;
+	char buf[BUFSIZ];
+	char *end = NULL;
+
+	snprintf(filename, sizeof(filename), "%s/%s", dir, name);
+	f = fopen(filename, "r");
+	if (f == NULL) {
+		RTE_LOG(ERR, EAL, "%s(): cannot open sysfs value %s\n",
+				__func__, filename);
+		return -1;
+	}
+
+	if (fgets(buf, sizeof(buf), f) == NULL) {
+		RTE_LOG(ERR, EAL, "%s(): cannot read sysfs value %s\n",
+				__func__, filename);
+		fclose(f);
+		return -1;
+	}
+	fclose(f);
+
+	*val = strtoull(buf, &end, 0);
+	if ((buf[0] == '\0') || (end == NULL) || (*end != '\n')) {
+		RTE_LOG(ERR, EAL, "%s(): cannot parse sysfs value %s\n",
+				__func__, filename);
+		return -1;
+	}
+	return 0;
+}
+
+/* Get mappings out of values provided by uio */
+static int
+vmbus_uio_get_mappings(const char *uioname,
+		       struct vmbus_map maps[])
+{
+	int i;
+
+	for (i = 0; i != VMBUS_MAX_RESOURCE; i++) {
+		struct vmbus_map *map = &maps[i];
+		char dirname[PATH_MAX];
+
+		/* check if map directory exists */
+		snprintf(dirname, sizeof(dirname),
+			 "%s/maps/map%d", uioname, i);
+
+		if (access(dirname, F_OK) != 0)
+			break;
+
+		/* get mapping offset */
+		if (vmbus_parse_sysfs_value(dirname, "offset",
+					    &map->offset) < 0)
+			return -1;
+
+		/* get mapping size */
+		if (vmbus_parse_sysfs_value(dirname, "size",
+					    &map->size) < 0)
+			return -1;
+
+		/* get mapping physical address */
+		if (vmbus_parse_sysfs_value(dirname, "addr",
+					    &map->phaddr) < 0)
+			return -1;
+	}
+
+	return i;
+}
+
+static void
+vmbus_uio_free_resource(struct rte_vmbus_device *dev,
+		struct mapped_vmbus_resource *uio_res)
+{
+	rte_free(uio_res);
+
+	if (dev->intr_handle.fd >= 0) {
+		close(dev->intr_handle.fd);
+		dev->intr_handle.fd = -1;
+		dev->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN;
+	}
+}
+
+static struct mapped_vmbus_resource *
+vmbus_uio_alloc_resource(struct rte_vmbus_device *dev)
+{
+	struct mapped_vmbus_resource *uio_res;
+	char dirname[PATH_MAX], devname[PATH_MAX];
+	int uio_num, nb_maps;
+
+	uio_num = vmbus_get_uio_dev(dev->sysfs_name, dirname, sizeof(dirname));
+	if (uio_num < 0) {
+		RTE_LOG(WARNING, EAL,
+			"  %s not managed by UIO driver, skipping\n",
+			dev->sysfs_name);
+		return NULL;
+	}
+
+	/* allocate the mapping details for secondary processes*/
+	uio_res = rte_zmalloc("UIO_RES", sizeof(*uio_res), 0);
+	if (uio_res == NULL) {
+		RTE_LOG(ERR, EAL,
+			"%s(): cannot store uio mmap details\n", __func__);
+		goto error;
+	}
+
+	snprintf(devname, sizeof(devname), "/dev/uio%u", uio_num);
+	dev->intr_handle.fd = open(devname, O_RDWR);
+	if (dev->intr_handle.fd < 0) {
+		RTE_LOG(ERR, EAL, "Cannot open %s: %s\n",
+			devname, strerror(errno));
+		goto error;
+	}
+
+	dev->intr_handle.type = RTE_INTR_HANDLE_UIO_INTX;
+
+	snprintf(uio_res->path, sizeof(uio_res->path), "%s", devname);
+	uuid_copy(uio_res->uuid, dev->device_id);
+
+	nb_maps = vmbus_uio_get_mappings(dirname, uio_res->maps);
+	if (nb_maps < 0)
+		goto error;
+
+	RTE_LOG(DEBUG, EAL, "Found %d memory maps for device %s\n",
+		nb_maps, dev->sysfs_name);
+
+	return uio_res;
+
+ error:
+	vmbus_uio_free_resource(dev, uio_res);
+	return NULL;
+}
+
+static int
+vmbus_uio_map_resource_by_index(struct rte_vmbus_device *dev,
+				unsigned int res_idx,
+				struct mapped_vmbus_resource *uio_res,
+				unsigned int map_idx)
+{
+	struct vmbus_map *maps = uio_res->maps;
+	char devname[PATH_MAX];
+	void *mapaddr;
+	int fd;
+
+	snprintf(devname, sizeof(devname),
+		 "/sys/bus/vmbus/%s/resource%u", dev->sysfs_name, res_idx);
+
+	fd = open(devname, O_RDWR);
+	if (fd < 0) {
+		RTE_LOG(ERR, EAL, "Cannot open %s: %s\n",
+				devname, strerror(errno));
+		return -1;
+	}
+
+	/* allocate memory to keep path */
+	maps[map_idx].path = rte_malloc(NULL, strlen(devname) + 1, 0);
+	if (maps[map_idx].path == NULL) {
+		RTE_LOG(ERR, EAL, "Cannot allocate memory for path: %s\n",
+				strerror(errno));
+		return -1;
+	}
+
+	/* try mapping somewhere close to the end of hugepages */
+	if (vmbus_map_addr == NULL)
+		vmbus_map_addr = pci_find_max_end_va();
+
+	mapaddr = vmbus_map_resource(vmbus_map_addr, fd, 0,
+				     dev->mem_resource[res_idx].len, 0);
+	close(fd);
+	if (mapaddr == MAP_FAILED) {
+		rte_free(maps[map_idx].path);
+		return -1;
+	}
+
+	vmbus_map_addr = RTE_PTR_ADD(mapaddr,
+				     dev->mem_resource[res_idx].len);
+
+	maps[map_idx].phaddr = dev->mem_resource[res_idx].phys_addr;
+	maps[map_idx].size = dev->mem_resource[res_idx].len;
+	maps[map_idx].addr = mapaddr;
+	maps[map_idx].offset = 0;
+	strcpy(maps[map_idx].path, devname);
+	dev->mem_resource[res_idx].addr = mapaddr;
+
+	return 0;
+}
+
+static void
+vmbus_uio_unmap(struct mapped_vmbus_resource *uio_res)
+{
+	int i;
+
+	if (uio_res == NULL)
+		return;
+
+	for (i = 0; i != uio_res->nb_maps; i++) {
+		vmbus_unmap_resource(uio_res->maps[i].addr,
+				     uio_res->maps[i].size);
+
+		if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+			rte_free(uio_res->maps[i].path);
+	}
+}
+
+static struct mapped_vmbus_resource *
+vmbus_uio_find_resource(struct rte_vmbus_device *dev)
+{
+	struct mapped_vmbus_resource *uio_res;
+	struct mapped_vmbus_res_list *uio_res_list =
+			RTE_TAILQ_CAST(rte_vmbus_uio_tailq.head,
+				       mapped_vmbus_res_list);
+
+	if (dev == NULL)
+		return NULL;
+
+	TAILQ_FOREACH(uio_res, uio_res_list, next) {
+		if (uuid_compare(uio_res->uuid, dev->device_id) == 0)
+			return uio_res;
+	}
+	return NULL;
+}
+
+/* unmap the VMBUS resource of a VMBUS device in virtual memory */
+static void
+vmbus_uio_unmap_resource(struct rte_vmbus_device *dev)
+{
+	struct mapped_vmbus_resource *uio_res;
+	struct mapped_vmbus_res_list *uio_res_list =
+			RTE_TAILQ_CAST(rte_vmbus_uio_tailq.head,
+				       mapped_vmbus_res_list);
+
+	if (dev == NULL)
+		return;
+
+	/* find an entry for the device */
+	uio_res = vmbus_uio_find_resource(dev);
+	if (uio_res == NULL)
+		return;
+
+	/* secondary processes - just free maps */
+	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+		return vmbus_uio_unmap(uio_res);
+
+	TAILQ_REMOVE(uio_res_list, uio_res, next);
+
+	/* unmap all resources */
+	vmbus_uio_unmap(uio_res);
+
+	/* free uio resource */
+	rte_free(uio_res);
+
+	/* close fd if in primary process */
+	close(dev->intr_handle.fd);
+	if (dev->intr_handle.uio_cfg_fd >= 0) {
+		close(dev->intr_handle.uio_cfg_fd);
+		dev->intr_handle.uio_cfg_fd = -1;
+	}
+
+	dev->intr_handle.fd = -1;
+	dev->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN;
+}
+
+static int
+vmbus_uio_map_secondary(struct rte_vmbus_device *dev)
+{
+	struct mapped_vmbus_resource *uio_res;
+	struct mapped_vmbus_res_list *uio_res_list =
+			RTE_TAILQ_CAST(rte_vmbus_uio_tailq.head,
+				       mapped_vmbus_res_list);
+
+	TAILQ_FOREACH(uio_res, uio_res_list, next) {
+		int i;
+
+		/* skip this element if it doesn't match our id */
+		if (uuid_compare(uio_res->uuid, dev->device_id))
+			continue;
+
+		for (i = 0; i != uio_res->nb_maps; i++) {
+			void *mapaddr;
+			int fd;
+
+			fd = open(uio_res->maps[i].path, O_RDWR);
+			if (fd < 0) {
+				RTE_LOG(ERR, EAL, "Cannot open %s: %s\n",
+					uio_res->maps[i].path, strerror(errno));
+				return -1;
+			}
+
+			mapaddr = vmbus_map_resource(uio_res->maps[i].addr, fd,
+						     uio_res->maps[i].offset,
+						     uio_res->maps[i].size, 0);
+			/* fd is not needed in slave process, close it */
+			close(fd);
+
+			if (mapaddr == uio_res->maps[i].addr)
+				continue;
+
+			RTE_LOG(ERR, EAL,
+				"Cannot mmap device resource file %s to address: %p\n",
+				uio_res->maps[i].path,
+				uio_res->maps[i].addr);
+
+			/* unmap addrs correctly mapped */
+			while (i != 0) {
+				--i;
+				vmbus_unmap_resource(uio_res->maps[i].addr,
+						     uio_res->maps[i].size);
+			}
+			return -1;
+
+		}
+		return 0;
+	}
+
+	RTE_LOG(ERR, EAL, "Cannot find resource for device\n");
+	return 1;
+}
+
+/* map the resources of a vmbus device in virtual memory */
+int
+rte_eal_vmbus_map_device(struct rte_vmbus_device *dev)
+{
+	struct mapped_vmbus_resource *uio_res;
+	struct mapped_vmbus_res_list *uio_res_list =
+		RTE_TAILQ_CAST(rte_vmbus_uio_tailq.head, mapped_vmbus_res_list);
+	int i, ret, map_idx = 0;
+
+	dev->intr_handle.fd = -1;
+	dev->intr_handle.uio_cfg_fd = -1;
+	dev->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN;
+
+	/* secondary processes - use already recorded details */
+	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+		return vmbus_uio_map_secondary(dev);
+
+	/* allocate uio resource */
+	uio_res = vmbus_uio_alloc_resource(dev);
+	if (uio_res == NULL)
+		return -1;
+
+	/* Map all BARs */
+	for (i = 0; i != VMBUS_MAX_RESOURCE; i++) {
+		uint64_t phaddr;
+
+		/* skip empty BAR */
+		phaddr = dev->mem_resource[i].phys_addr;
+		if (phaddr == 0)
+			continue;
+
+		ret = vmbus_uio_map_resource_by_index(dev, i,
+						      uio_res, map_idx);
+		if (ret)
+			goto error;
+
+		map_idx++;
+	}
+
+	uio_res->nb_maps = map_idx;
+
+	TAILQ_INSERT_TAIL(uio_res_list, uio_res, next);
+
+	return 0;
+error:
+	for (i = 0; i < map_idx; i++) {
+		vmbus_unmap_resource(uio_res->maps[i].addr,
+				     uio_res->maps[i].size);
+		rte_free(uio_res->maps[i].path);
+	}
+	vmbus_uio_free_resource(dev, uio_res);
+	return -1;
+}
+
+/* Scan one vmbus sysfs entry, and fill the devices list from it. */
+static int
+vmbus_scan_one(const char *name)
+{
+	struct rte_vmbus_device *dev, *dev2;
+	char filename[PATH_MAX];
+	char dirname[PATH_MAX];
+	unsigned long tmp;
+
+	dev = malloc(sizeof(*dev) + strlen(name) + 1);
+	if (dev == NULL)
+		return -1;
+
+	memset(dev, 0, sizeof(*dev));
+	strcpy(dev->sysfs_name, name);
+	if (dev->sysfs_name == NULL)
+		goto error;
+
+	/* sysfs base directory
+	 *   /sys/bus/vmbus/devices/7a08391f-f5a0-4ac0-9802-d13fd964f8df
+	 * or on older kernel
+	 *   /sys/bus/vmbus/devices/vmbus_1
+	 */
+	snprintf(dirname, sizeof(dirname), "%s/%s",
+		 SYSFS_VMBUS_DEVICES, name);
+
+	/* get device id */
+	snprintf(filename, sizeof(filename), "%s/device_id", dirname);
+	if (vmbus_get_sysfs_uuid(filename, dev->device_id) < 0)
+		goto error;
+
+	/* get device class  */
+	snprintf(filename, sizeof(filename), "%s/class_id", dirname);
+	if (vmbus_get_sysfs_uuid(filename, dev->class_id) < 0)
+		goto error;
+
+	/* get relid */
+	snprintf(filename, sizeof(filename), "%s/id", dirname);
+	if (eal_parse_sysfs_value(filename, &tmp) < 0)
+		goto error;
+	dev->relid = tmp;
+
+	/* get monitor id */
+	snprintf(filename, sizeof(filename), "%s/monitor_id", dirname);
+	if (eal_parse_sysfs_value(filename, &tmp) < 0)
+		goto error;
+	dev->monitor_id = tmp;
+
+	/* get numa node */
+	snprintf(filename, sizeof(filename), "%s/numa_node",
+		 dirname);
+	if (eal_parse_sysfs_value(filename, &tmp) < 0)
+		/* if no NUMA support, set default to 0 */
+		dev->device.numa_node = 0;
+	else
+		dev->device.numa_node = tmp;
+
+	/* device is valid, add in list (sorted) */
+	RTE_LOG(DEBUG, EAL, "Adding vmbus device %s\n", name);
+
+	TAILQ_FOREACH(dev2, &vmbus_device_list, next) {
+		int ret;
+
+		ret = uuid_compare(dev->device_id, dev2->device_id);
+		if (ret > 0)
+			continue;
+
+		if (ret < 0) {
+			TAILQ_INSERT_BEFORE(dev2, dev, next);
+			rte_eal_device_insert(&dev->device);
+		} else { /* already registered */
+			memmove(dev2->mem_resource, dev->mem_resource,
+				sizeof(dev->mem_resource));
+			free(dev);
+		}
+		return 0;
+	}
+
+	rte_eal_device_insert(&dev->device);
+	TAILQ_INSERT_TAIL(&vmbus_device_list, dev, next);
+
+	return 0;
+error:
+	free(dev);
+	return -1;
+}
+
+/*
+ * Scan the content of the vmbus, and add the devices to the devices list
+ */
+static int
+vmbus_scan(void)
+{
+	struct dirent *e;
+	DIR *dir;
+
+	dir = opendir(SYSFS_VMBUS_DEVICES);
+	if (dir == NULL) {
+		if (errno == ENOENT)
+			return 0;
+
+		RTE_LOG(ERR, EAL, "%s(): opendir failed: %s\n",
+			__func__, strerror(errno));
+		return -1;
+	}
+
+	while ((e = readdir(dir)) != NULL) {
+		if (e->d_name[0] == '.')
+			continue;
+
+		if (vmbus_scan_one(e->d_name) < 0)
+			goto error;
+	}
+	closedir(dir);
+	return 0;
+
+error:
+	closedir(dir);
+	return -1;
+}
+
+/* Init the VMBUS EAL subsystem */
+int rte_eal_vmbus_init(void)
+{
+	/* VMBUS can be disabled */
+	if (internal_config.no_vmbus)
+		return 0;
+
+	if (vmbus_scan() < 0) {
+		RTE_LOG(ERR, EAL, "%s(): Cannot scan vmbus\n", __func__);
+		return -1;
+	}
+	return 0;
+}
+
+/* Below is PROBE part of eal_vmbus library */
+
+/*
+ * If the device class ID matches, call the probe() function of the driver.
+ */
+static int
+rte_eal_vmbus_probe_one_driver(struct rte_vmbus_driver *dr,
+			       struct rte_vmbus_device *dev)
+{
+	const uuid_t *id_table;
+
+	RTE_LOG(DEBUG, EAL, "  probe driver: %s\n", dr->driver.name);
+
+	for (id_table = dr->id_table; !uuid_is_null(*id_table); ++id_table) {
+		struct rte_devargs *args;
+		char guid[UUID_BUF_SZ];
+		int ret;
+
+		/* skip devices not associated with this device class */
+		if (uuid_compare(*id_table, dev->class_id) != 0)
+			continue;
+
+		uuid_unparse(dev->device_id, guid);
+		RTE_LOG(INFO, EAL, "VMBUS device %s on NUMA socket %i\n",
+			guid, dev->device.numa_node);
+
+		/* no initialization when blacklisted, return without error */
+		args = dev->device.devargs;
+		if (args && args->type == RTE_DEVTYPE_BLACKLISTED_VMBUS) {
+			RTE_LOG(INFO, EAL, "  Device is blacklisted, not initializing\n");
+			return 1;
+		}
+
+		RTE_LOG(INFO, EAL, "  probe driver: %s\n", dr->driver.name);
+
+		/* map resources for device */
+		ret = rte_eal_vmbus_map_device(dev);
+		if (ret != 0)
+			return ret;
+
+		/* reference driver structure */
+		dev->driver = dr;
+
+		/* call the driver probe() function */
+		ret = dr->probe(dr, dev);
+		if (ret)
+			dev->driver = NULL;
+
+		return ret;
+	}
+
+	/* return positive value if driver doesn't support this device */
+	return 1;
+}
+
+
+/*
+ * If the device class ID matches, call the remove() function of the
+ * driver.
+ */
+static int
+vmbus_detach_dev(struct rte_vmbus_driver *dr,
+		 struct rte_vmbus_device *dev)
+{
+	const uuid_t *id_table;
+
+	for (id_table = dr->id_table; !uuid_is_null(*id_table); ++id_table) {
+		char guid[UUID_BUF_SZ];
+
+		/* skip devices not associated with this device class */
+		if (uuid_compare(*id_table, dev->class_id) != 0)
+			continue;
+
+		uuid_unparse(dev->device_id, guid);
+		RTE_LOG(INFO, EAL, "VMBUS device %s on NUMA socket %i\n",
+			guid, dev->device.numa_node);
+
+		RTE_LOG(DEBUG, EAL, "  remove driver: %s\n", dr->driver.name);
+
+		if (dr->remove && (dr->remove(dev) < 0))
+			return -1;	/* negative value is an error */
+
+		/* clear driver structure */
+		dev->driver = NULL;
+
+		vmbus_uio_unmap_resource(dev);
+		return 0;
+	}
+
+	/* return positive value if driver doesn't support this device */
+	return 1;
+}
+
+/*
+ * Call the probe() function of all registered drivers for the vmbus
+ * device. Return 0 once a driver claims the device, -1 if a driver
+ * fails to probe it, and 1 if no driver is found for this class of
+ * vmbus device. Each driver's id_table is matched against the
+ * device's class_id in rte_eal_vmbus_probe_one_driver().
+ */
+static int
+vmbus_probe_all_drivers(struct rte_vmbus_device *dev)
+{
+	struct rte_vmbus_driver *dr = NULL;
+	int ret;
+
+	TAILQ_FOREACH(dr, &vmbus_driver_list, next) {
+		ret = rte_eal_vmbus_probe_one_driver(dr, dev);
+		if (ret < 0) {
+			/* negative value is an error */
+			RTE_LOG(ERR, EAL, "Failed to probe driver %s\n",
+				dr->driver.name);
+			return -1;
+		}
+		/* positive value means driver doesn't support it */
+		if (ret > 0)
+			continue;
+
+		return 0;
+	}
+
+	return 1;
+}
+
+
+/*
+ * If the device class ID matches, call the remove() function of the
+ * registered drivers for the given device. Return -1 if removal
+ * failed, return 1 if no driver is found for this device.
+ */
+static int
+vmbus_detach_all_drivers(struct rte_vmbus_device *dev)
+{
+	struct rte_vmbus_driver *dr;
+	int rc = 0;
+
+	if (dev == NULL)
+		return -1;
+
+	TAILQ_FOREACH(dr, &vmbus_driver_list, next) {
+		rc = vmbus_detach_dev(dr, dev);
+		if (rc < 0)
+			/* negative value is an error */
+			return -1;
+		if (rc > 0)
+			/* positive value means driver doesn't support it */
+			continue;
+		return 0;
+	}
+	return 1;
+}
+
+/* Detach device specified by its VMBUS id */
+int
+rte_eal_vmbus_detach(uuid_t device_id)
+{
+	struct rte_vmbus_device *dev;
+	char ubuf[UUID_BUF_SZ];
+
+	TAILQ_FOREACH(dev, &vmbus_device_list, next) {
+		if (uuid_compare(dev->device_id, device_id) != 0)
+			continue;
+
+		if (vmbus_detach_all_drivers(dev) < 0)
+			goto err_return;
+
+		TAILQ_REMOVE(&vmbus_device_list, dev, next);
+		free(dev);
+		return 0;
+	}
+	return -1;
+
+err_return:
+	uuid_unparse(device_id, ubuf);
+	RTE_LOG(WARNING, EAL, "Requested device %s cannot be used\n",
+		ubuf);
+	return -1;
+}
+
+/*
+ * Scan the vmbus, and call the probe() function for
+ * all registered drivers that have a matching entry in their id_table
+ * for discovered devices.
+ */
+int
+rte_eal_vmbus_probe(void)
+{
+	struct rte_vmbus_device *dev = NULL;
+
+	TAILQ_FOREACH(dev, &vmbus_device_list, next) {
+		char ubuf[UUID_BUF_SZ];
+
+		uuid_unparse(dev->device_id, ubuf);
+
+		RTE_LOG(DEBUG, EAL, "Probing driver for device %s ...\n",
+			ubuf);
+		vmbus_probe_all_drivers(dev);
+	}
+	return 0;
+}
+
+/* register vmbus driver */
+void
+rte_eal_vmbus_register(struct rte_vmbus_driver *driver)
+{
+	TAILQ_INSERT_TAIL(&vmbus_driver_list, driver, next);
+}
+
+/* unregister vmbus driver */
+void
+rte_eal_vmbus_unregister(struct rte_vmbus_driver *driver)
+{
+	TAILQ_REMOVE(&vmbus_driver_list, driver, next);
+}
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 7c212096..b69af0f0 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -3334,3 +3334,93 @@ rte_eth_dev_l2_tunnel_offload_set(uint8_t port_id,
 				-ENOTSUP);
 	return (*dev->dev_ops->l2_tunnel_offload_set)(dev, l2_tunnel, mask, en);
 }
+
+
+#ifdef RTE_LIBRTE_HV_PMD
+int
+rte_eth_dev_vmbus_probe(struct rte_vmbus_driver *vmbus_drv,
+			struct rte_vmbus_device *vmbus_dev)
+{
+	struct eth_driver  *eth_drv = (struct eth_driver *)vmbus_drv;
+	struct rte_eth_dev *eth_dev;
+	char ustr[UUID_BUF_SZ];
+	int diag;
+
+	uuid_unparse(vmbus_dev->device_id, ustr);
+
+	eth_dev = rte_eth_dev_allocate(ustr);
+	if (eth_dev == NULL)
+		return -ENOMEM;
+
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		eth_dev->data->dev_private = rte_zmalloc("ethdev private structure",
+				  eth_drv->dev_private_size,
+				  RTE_CACHE_LINE_SIZE);
+		if (eth_dev->data->dev_private == NULL)
+			rte_panic("Cannot allocate memzone for private port data\n");
+	}
+
+	eth_dev->device = &vmbus_dev->device;
+	eth_dev->driver = eth_drv;
+	eth_dev->data->rx_mbuf_alloc_failed = 0;
+
+	/* init user callbacks */
+	TAILQ_INIT(&(eth_dev->link_intr_cbs));
+
+	/*
+	 * Set the default maximum frame size.
+	 */
+	eth_dev->data->mtu = ETHER_MTU;
+
+	/* Invoke PMD device initialization function */
+	diag = (*eth_drv->eth_dev_init)(eth_dev);
+	if (diag == 0)
+		return 0;
+
+	RTE_PMD_DEBUG_TRACE("driver %s: eth_dev_init(%s) failed\n",
+			    vmbus_drv->driver.name, ustr);
+
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+		rte_free(eth_dev->data->dev_private);
+
+	return diag;
+}
+
+int
+rte_eth_dev_vmbus_remove(struct rte_vmbus_device *vmbus_dev)
+{
+	const struct eth_driver *eth_drv;
+	struct rte_eth_dev *eth_dev;
+	char ustr[UUID_BUF_SZ];
+	int ret;
+
+	if (vmbus_dev == NULL)
+		return -EINVAL;
+
+	uuid_unparse(vmbus_dev->device_id, ustr);
+	eth_dev = rte_eth_dev_allocated(ustr);
+	if (eth_dev == NULL)
+		return -ENODEV;
+
+	eth_drv = (const struct eth_driver *)vmbus_dev->driver;
+
+	/* Invoke PMD device uninit function */
+	if (*eth_drv->eth_dev_uninit) {
+		ret = (*eth_drv->eth_dev_uninit)(eth_dev);
+		if (ret)
+			return ret;
+	}
+
+	/* free ether device */
+	rte_eth_dev_release_port(eth_dev);
+
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+		rte_free(eth_dev->data->dev_private);
+
+	eth_dev->device = NULL;
+	eth_dev->driver = NULL;
+	eth_dev->data = NULL;
+
+	return 0;
+}
+#endif
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 1a62a322..2a8c1eed 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -180,6 +180,9 @@ extern "C" {
 #include <rte_log.h>
 #include <rte_interrupts.h>
 #include <rte_pci.h>
+#ifdef RTE_LIBRTE_HV_PMD
+#include <rte_vmbus.h>
+#endif
 #include <rte_dev.h>
 #include <rte_devargs.h>
 #include <rte_errno.h>
@@ -1908,6 +1911,17 @@ struct rte_pci_eth_driver {
 	struct eth_driver	eth_drv;	/**< Ethernet driver. */
 };
 
+#ifdef RTE_LIBRTE_HV_PMD
+/**
+ * @internal
+ * The structure associated with a PMD VMBUS Ethernet driver.
+ */
+struct rte_vmbus_eth_driver {
+	struct rte_vmbus_driver vmbus_drv;	/**< Underlying VMBUS driver. */
+	struct eth_driver	eth_drv;	/**< Ethernet driver. */
+};
+#endif
+
 /**
  * Convert a numerical speed in Mbps to a bitmap flag that can be used in
  * the bitmap link_speeds of the struct rte_eth_conf
@@ -4543,6 +4557,23 @@ int rte_eth_dev_pci_probe(struct rte_pci_driver *pci_drv,
  */
 int rte_eth_dev_pci_remove(struct rte_pci_device *pci_dev);
 
+#ifdef RTE_LIBRTE_HV_PMD
+/**
+ * @internal
+ * Wrapper for use by vmbus drivers as a .probe function to attach to an ethdev
+ * interface.
+ */
+int rte_eth_dev_vmbus_probe(struct rte_vmbus_driver *vmbus_drv,
+			  struct rte_vmbus_device *vmbus_dev);
+
+/**
+ * @internal
+ * Wrapper for use by vmbus drivers as a .remove function to detach an ethdev
+ * interface.
+ */
+int rte_eth_dev_vmbus_remove(struct rte_vmbus_device *vmbus_dev);
+#endif
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index f75f0e24..6b304084 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -130,6 +130,7 @@ ifeq ($(CONFIG_RTE_LIBRTE_VHOST),y)
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_VHOST)      += -lrte_pmd_vhost
 endif # $(CONFIG_RTE_LIBRTE_VHOST)
 _LDLIBS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD)    += -lrte_pmd_vmxnet3_uio
+_LDLIBS-$(CONFIG_RTE_LIBRTE_HV_PMD)	    += -luuid
 
 ifeq ($(CONFIG_RTE_LIBRTE_CRYPTODEV),y)
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_AESNI_MB)    += -lrte_pmd_aesni_mb
-- 
2.11.0
^ permalink raw reply	[flat|nested] 30+ messages in thread
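
For reference, the sorted insertion into vmbus_device_list done at the end
of the scan helper in the patch above (apparently vmbus_scan_one()) can be
sketched in isolation. This is only a minimal sketch under stated
assumptions, not the patch's code: struct node, id_compare(),
sorted_insert() and make_node() are hypothetical names, and memcmp() on a
16-byte array stands in for libuuid's uuid_compare() so that the example
builds without -luuid. What it illustrates is that each comparison has to
be between the new entry and the list element currently being visited,
otherwise the list is never kept in order.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/queue.h>

#define ID_LEN 16	/* same size as a binary UUID */

struct node {
	TAILQ_ENTRY(node) next;
	unsigned char id[ID_LEN];
};

TAILQ_HEAD(node_list, node);

/* Stand-in for uuid_compare(): returns <0, 0 or >0 like memcmp(). */
static int
id_compare(const unsigned char *a, const unsigned char *b)
{
	return memcmp(a, b, ID_LEN);
}

/*
 * Insert n into the list in ascending id order.
 * Returns 0 if n was inserted, 1 if an entry with the same id was
 * already present (n is freed in that case).
 */
static int
sorted_insert(struct node_list *list, struct node *n)
{
	struct node *cur;

	assert(list != NULL && n != NULL);

	TAILQ_FOREACH(cur, list, next) {
		/* compare the new entry against the visited element */
		int ret = id_compare(n->id, cur->id);

		if (ret > 0)
			continue;	/* n sorts after cur, keep walking */
		if (ret < 0) {
			TAILQ_INSERT_BEFORE(cur, n, next);
			return 0;
		}
		free(n);		/* duplicate id */
		return 1;
	}

	/* larger than every existing entry, or the list was empty */
	TAILQ_INSERT_TAIL(list, n, next);
	return 0;
}

/* Hypothetical helper: allocate a node whose id is filled with one byte. */
static struct node *
make_node(unsigned char fill)
{
	struct node *n = calloc(1, sizeof(*n));

	if (n != NULL)
		memset(n->id, fill, ID_LEN);
	return n;
}
```

Inserting nodes built with make_node(3), make_node(1) and make_node(2), in
that order, should leave the list ordered 1, 2, 3, and a second
make_node(2) should be reported as a duplicate and freed.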
- * Re: [dpdk-dev] [PATCH 8/8] eal: VMBUS infrastructure
  2017-01-07 18:17 ` [dpdk-dev] [PATCH 8/8] eal: VMBUS infrastructure Stephen Hemminger
@ 2017-01-10 17:27   ` Jan Blunck
  2017-01-10 18:05     ` Stephen Hemminger
  2017-01-11 14:49   ` Jan Blunck
  1 sibling, 1 reply; 30+ messages in thread
From: Jan Blunck @ 2017-01-10 17:27 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev, Stephen Hemminger
On Sat, Jan 7, 2017 at 7:17 PM, Stephen Hemminger
<stephen@networkplumber.org> wrote:
> Add support for VMBUS on Hyper-V/Azure. VMBUS is similar to PCI
> but has different addressing and internal API's.
>
> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
> ---
>  lib/librte_eal/common/Makefile              |   2 +-
>  lib/librte_eal/common/eal_common_devargs.c  |   7 +
>  lib/librte_eal/common/eal_common_options.c  |  38 ++
>  lib/librte_eal/common/eal_internal_cfg.h    |   1 +
>  lib/librte_eal/common/eal_options.h         |   6 +
>  lib/librte_eal/common/eal_private.h         |   5 +
>  lib/librte_eal/common/include/rte_devargs.h |   8 +
>  lib/librte_eal/common/include/rte_vmbus.h   | 249 ++++++++
>  lib/librte_eal/linuxapp/eal/Makefile        |   6 +
>  lib/librte_eal/linuxapp/eal/eal.c           |  13 +
>  lib/librte_eal/linuxapp/eal/eal_vmbus.c     | 911 ++++++++++++++++++++++++++++
>  lib/librte_ether/rte_ethdev.c               |  90 +++
>  lib/librte_ether/rte_ethdev.h               |  31 +
>  mk/rte.app.mk                               |   1 +
>  14 files changed, 1367 insertions(+), 1 deletion(-)
>  create mode 100644 lib/librte_eal/common/include/rte_vmbus.h
>  create mode 100644 lib/librte_eal/linuxapp/eal/eal_vmbus.c
>
> diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile
> index 09a3d3af..ceb77bed 100644
> --- a/lib/librte_eal/common/Makefile
> +++ b/lib/librte_eal/common/Makefile
> @@ -33,7 +33,7 @@ include $(RTE_SDK)/mk/rte.vars.mk
>
>  INC := rte_branch_prediction.h rte_common.h
>  INC += rte_debug.h rte_eal.h rte_errno.h rte_launch.h rte_lcore.h
> -INC += rte_log.h rte_memory.h rte_memzone.h rte_pci.h
> +INC += rte_log.h rte_memory.h rte_memzone.h rte_pci.h rte_vmbus.h
>  INC += rte_per_lcore.h rte_random.h
>  INC += rte_tailq.h rte_interrupts.h rte_alarm.h
>  INC += rte_string_fns.h rte_version.h
> diff --git a/lib/librte_eal/common/eal_common_devargs.c b/lib/librte_eal/common/eal_common_devargs.c
> index e403717b..934ca840 100644
> --- a/lib/librte_eal/common/eal_common_devargs.c
> +++ b/lib/librte_eal/common/eal_common_devargs.c
> @@ -113,6 +113,13 @@ rte_eal_devargs_add(enum rte_devtype devtype, const char *devargs_str)
>                         goto fail;
>
>                 break;
> +       case RTE_DEVTYPE_WHITELISTED_VMBUS:
> +       case RTE_DEVTYPE_BLACKLISTED_VMBUS:
> +#ifdef RTE_LIBRTE_HV_PMD
> +               if (uuid_parse(buf, devargs->uuid) == 0)
> +                       break;
> +#endif
> +               goto fail;
>         }
>
>         free(buf);
> diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
> index f36bc556..1a2b418c 100644
> --- a/lib/librte_eal/common/eal_common_options.c
> +++ b/lib/librte_eal/common/eal_common_options.c
> @@ -95,6 +95,11 @@ eal_long_options[] = {
>         {OPT_VFIO_INTR,         1, NULL, OPT_VFIO_INTR_NUM        },
>         {OPT_VMWARE_TSC_MAP,    0, NULL, OPT_VMWARE_TSC_MAP_NUM   },
>         {OPT_XEN_DOM0,          0, NULL, OPT_XEN_DOM0_NUM         },
> +#ifdef RTE_LIBRTE_HV_PMD
> +       {OPT_NO_VMBUS,          0, NULL, OPT_NO_VMBUS_NUM         },
> +       {OPT_VMBUS_BLACKLIST,   1, NULL, OPT_VMBUS_BLACKLIST_NUM  },
> +       {OPT_VMBUS_WHITELIST,   1, NULL, OPT_VMBUS_WHITELIST_NUM  },
> +#endif
>         {0,                     0, NULL, 0                        }
>  };
>
> @@ -858,6 +863,21 @@ eal_parse_common_option(int opt, const char *optarg,
>                 conf->no_pci = 1;
>                 break;
>
> +#ifdef RTE_LIBRTE_HV_PMD
> +       case OPT_NO_VMBUS_NUM:
> +               conf->no_vmbus = 1;
> +               break;
> +       case OPT_VMBUS_BLACKLIST_NUM:
> +               if (rte_eal_devargs_add(RTE_DEVTYPE_BLACKLISTED_VMBUS,
> +                                       optarg) < 0)
> +                       return -1;
> +               break;
> +       case OPT_VMBUS_WHITELIST_NUM:
> +               if (rte_eal_devargs_add(RTE_DEVTYPE_WHITELISTED_VMBUS,
> +                               optarg) < 0)
> +                       return -1;
> +               break;
> +#endif
>         case OPT_NO_HPET_NUM:
>                 conf->no_hpet = 1;
>                 break;
> @@ -1017,6 +1037,14 @@ eal_check_common_options(struct internal_config *internal_cfg)
>                 return -1;
>         }
>
> +#ifdef RTE_LIBRTE_HV_PMD
> +       if (rte_eal_devargs_type_count(RTE_DEVTYPE_WHITELISTED_VMBUS) != 0 &&
> +               rte_eal_devargs_type_count(RTE_DEVTYPE_BLACKLISTED_VMBUS) != 0) {
> +               RTE_LOG(ERR, EAL, "Options vmbus blacklist and whitelist "
> +                       "cannot be used at the same time\n");
> +               return -1;
> +       }
> +#endif
>         return 0;
>  }
>
> @@ -1066,5 +1094,15 @@ eal_common_usage(void)
>                "  --"OPT_NO_PCI"            Disable PCI\n"
>                "  --"OPT_NO_HPET"           Disable HPET\n"
>                "  --"OPT_NO_SHCONF"         No shared config (mmap'd files)\n"
> +#ifdef RTE_LIBRTE_HV_PMD
> +              "  --"OPT_NO_VMBUS"          Disable VMBUS\n"
> +              "  --"OPT_VMBUS_BLACKLIST" Add a VMBUS device to black list.\n"
> +              "                      Prevent EAL from using this VMBUS device. The argument\n"
> +              "                      format is device UUID.\n"
> +              "  --"OPT_VMBUS_WHITELIST" Add a VMBUS device to white list.\n"
> +              "                      Only use the specified VMBUS devices. The argument format\n"
> +              "                      is device UUID. This option can be present\n"
> +              "                      several times (once per device).\n"
> +#endif
>                "\n", RTE_MAX_LCORE);
>  }
> diff --git a/lib/librte_eal/common/eal_internal_cfg.h b/lib/librte_eal/common/eal_internal_cfg.h
> index 5f1367eb..4b6af937 100644
> --- a/lib/librte_eal/common/eal_internal_cfg.h
> +++ b/lib/librte_eal/common/eal_internal_cfg.h
> @@ -67,6 +67,7 @@ struct internal_config {
>         unsigned hugepage_unlink;         /**< true to unlink backing files */
>         volatile unsigned xen_dom0_support; /**< support app running on Xen Dom0*/
>         volatile unsigned no_pci;         /**< true to disable PCI */
> +       volatile unsigned no_vmbus;       /**< true to disable VMBUS */
>         volatile unsigned no_hpet;        /**< true to disable HPET */
>         volatile unsigned vmware_tsc_map; /**< true to use VMware TSC mapping
>                                                                                 * instead of native TSC */
> diff --git a/lib/librte_eal/common/eal_options.h b/lib/librte_eal/common/eal_options.h
> index a881c62e..156727e7 100644
> --- a/lib/librte_eal/common/eal_options.h
> +++ b/lib/librte_eal/common/eal_options.h
> @@ -83,6 +83,12 @@ enum {
>         OPT_VMWARE_TSC_MAP_NUM,
>  #define OPT_XEN_DOM0          "xen-dom0"
>         OPT_XEN_DOM0_NUM,
> +#define OPT_NO_VMBUS          "no-vmbus"
> +       OPT_NO_VMBUS_NUM,
> +#define OPT_VMBUS_BLACKLIST   "vmbus-blacklist"
> +       OPT_VMBUS_BLACKLIST_NUM,
> +#define OPT_VMBUS_WHITELIST   "vmbus-whitelist"
> +       OPT_VMBUS_WHITELIST_NUM,
>         OPT_LONG_MAX_NUM
>  };
>
> diff --git a/lib/librte_eal/common/eal_private.h b/lib/librte_eal/common/eal_private.h
> index 9e7d8f6b..c856c63e 100644
> --- a/lib/librte_eal/common/eal_private.h
> +++ b/lib/librte_eal/common/eal_private.h
> @@ -210,6 +210,11 @@ int pci_uio_map_resource_by_index(struct rte_pci_device *dev, int res_idx,
>                 struct mapped_pci_resource *uio_res, int map_idx);
>
>  /**
> + * VMBUS related functions and structures
> + */
> +int rte_eal_vmbus_init(void);
> +
> +/**
>   * Init tail queues for non-EAL library structures. This is to allow
>   * the rings, mempools, etc. lists to be shared among multiple processes
>   *
> diff --git a/lib/librte_eal/common/include/rte_devargs.h b/lib/librte_eal/common/include/rte_devargs.h
> index 88120a1c..c079d289 100644
> --- a/lib/librte_eal/common/include/rte_devargs.h
> +++ b/lib/librte_eal/common/include/rte_devargs.h
> @@ -51,6 +51,9 @@ extern "C" {
>  #include <stdio.h>
>  #include <sys/queue.h>
>  #include <rte_pci.h>
> +#ifdef RTE_LIBRTE_HV_PMD
> +#include <uuid/uuid.h>
> +#endif
>
>  /**
>   * Type of generic device
> @@ -59,6 +62,8 @@ enum rte_devtype {
>         RTE_DEVTYPE_WHITELISTED_PCI,
>         RTE_DEVTYPE_BLACKLISTED_PCI,
>         RTE_DEVTYPE_VIRTUAL,
> +       RTE_DEVTYPE_WHITELISTED_VMBUS,
> +       RTE_DEVTYPE_BLACKLISTED_VMBUS,
>  };
>
>  /**
> @@ -88,6 +93,9 @@ struct rte_devargs {
>                         /** Driver name. */
>                         char drv_name[32];
>                 } virt;
> +#ifdef RTE_LIBRTE_HV_PMD
> +               uuid_t uuid;
> +#endif
>         };
>         /** Arguments string as given by user or "" for no argument. */
>         char *args;
> diff --git a/lib/librte_eal/common/include/rte_vmbus.h b/lib/librte_eal/common/include/rte_vmbus.h
> new file mode 100644
> index 00000000..f96d753e
> --- /dev/null
> +++ b/lib/librte_eal/common/include/rte_vmbus.h
> @@ -0,0 +1,249 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2013-2016 Brocade Communications Systems, Inc.
> + *   Copyright(c) 2016 Microsoft Corporation
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + *
> + */
> +
> +#ifndef _RTE_VMBUS_H_
> +#define _RTE_VMBUS_H_
> +
> +/**
> + * @file
> + *
> + * RTE VMBUS Interface
> + */
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <limits.h>
> +#include <errno.h>
> +#include <uuid/uuid.h>
> +#include <sys/queue.h>
> +#include <stdint.h>
> +#include <inttypes.h>
> +
> +#include <rte_debug.h>
> +#include <rte_interrupts.h>
> +#include <rte_dev.h>
> +
> +TAILQ_HEAD(vmbus_device_list, rte_vmbus_device);
> +TAILQ_HEAD(vmbus_driver_list, rte_vmbus_driver);
> +
> +extern struct vmbus_driver_list vmbus_driver_list;
> +extern struct vmbus_device_list vmbus_device_list;
> +
> +/** Pathname of VMBUS devices directory. */
> +#define SYSFS_VMBUS_DEVICES "/sys/bus/vmbus/devices"
> +
> +#define UUID_BUF_SZ    (36 + 1)
> +
> +
> +/** Maximum number of VMBUS resources. */
> +#define VMBUS_MAX_RESOURCE 7
> +
> +/**
> + * A structure describing a VMBUS device.
> + */
> +struct rte_vmbus_device {
> +       TAILQ_ENTRY(rte_vmbus_device) next;     /**< Next probed VMBUS device. */
> +       struct rte_device device;               /**< Inherit core device */
> +       uuid_t device_id;                       /**< VMBUS device id */
> +       uuid_t class_id;                        /**< VMBUS device type */
> +       uint32_t relid;                         /**< VMBUS id for notification */
> +       uint8_t monitor_id;
> +       struct rte_intr_handle intr_handle;     /**< Interrupt handle */
> +       const struct rte_vmbus_driver *driver;  /**< Associated driver */
> +
> +       struct rte_mem_resource mem_resource[VMBUS_MAX_RESOURCE];
> +                                               /**< VMBUS Memory Resource */
> +       char sysfs_name[];                      /**< Name in sysfs bus directory */
> +};
> +
> +struct rte_vmbus_driver;
> +
> +/**
> + * Initialisation function for the driver called during VMBUS probing.
> + */
> +typedef int (vmbus_probe_t)(struct rte_vmbus_driver *,
> +                           struct rte_vmbus_device *);
> +
> +/**
> + * Uninitialisation function for the driver called during hotplugging.
> + */
> +typedef int (vmbus_remove_t)(struct rte_vmbus_device *);
> +
> +/**
> + * A structure describing a VMBUS driver.
> + */
> +struct rte_vmbus_driver {
> +       TAILQ_ENTRY(rte_vmbus_driver) next;     /**< Next in list. */
> +       struct rte_driver driver;
> +       vmbus_probe_t *probe;                   /**< Device Probe function. */
> +       vmbus_remove_t *remove;                 /**< Device Remove function. */
> +
> +       const uuid_t *id_table;                 /**< ID table. */
> +};
> +
> +struct vmbus_map {
> +       void *addr;
> +       char *path;
> +       uint64_t offset;
> +       uint64_t size;
> +       uint64_t phaddr;
> +};
> +
> +/*
> + * For multi-process we need to reproduce all vmbus mappings in secondary
> + * processes, so save them in a tailq.
> + */
> +struct mapped_vmbus_resource {
> +       TAILQ_ENTRY(mapped_vmbus_resource) next;
> +
> +       uuid_t uuid;
> +       char path[PATH_MAX];
> +       int nb_maps;
> +       struct vmbus_map maps[VMBUS_MAX_RESOURCE];
> +};
> +
> +TAILQ_HEAD(mapped_vmbus_res_list, mapped_vmbus_resource);
> +
> +/**
> + * Scan the content of the VMBUS bus, and add the devices to the devices list
> + *
> + * @return
> + *  0 on success, negative on error
> + */
> +int rte_eal_vmbus_scan(void);
> +
> +/**
> + * Probe the VMBUS bus for registered drivers.
> + *
> + * Scan the content of the VMBUS bus, and call the probe() function for
> + * all registered drivers that have a matching entry in its id_table
> + * for discovered devices.
> + *
> + * @return
> + *   - 0 on success.
> + *   - Negative on error.
> + */
> +int rte_eal_vmbus_probe(void);
> +
> +/**
> + * Map the VMBUS device resources in user space virtual memory address
> + *
> + * @param dev
> + *   A pointer to a rte_vmbus_device structure describing the device
> + *   to use
> + *
> + * @return
> + *   0 on success, negative on error and positive if no driver
> + *   is found for the device.
> + */
> +int rte_eal_vmbus_map_device(struct rte_vmbus_device *dev);
> +
> +/**
> + * Unmap this device
> + *
> + * @param dev
> + *   A pointer to a rte_vmbus_device structure describing the device
> + *   to use
> + */
> +void rte_eal_vmbus_unmap_device(struct rte_vmbus_device *dev);
> +
> +/**
> + * Probe the single VMBUS device.
> + *
> + * Scan the content of the VMBUS bus, and find the vmbus device
> + * specified by device uuid, then call the probe() function for
> + * registered driver that has a matching entry in its id_table for
> + * discovered device.
> + *
> + * @param id
> + *   The VMBUS device uuid.
> + * @return
> + *   - 0 on success.
> + *   - Negative on error.
> + */
> +int rte_eal_vmbus_probe_one(uuid_t id);
> +
> +/**
> + * Close the single VMBUS device.
> + *
> + * Scan the content of the VMBUS bus, and find the vmbus device id,
> + * then call the remove() function for registered driver that has a
> + * matching entry in its id_table for discovered device.
> + *
> + * @param id
> + *   The VMBUS device uuid.
> + * @return
> + *   - 0 on success.
> + *   - Negative on error.
> + */
> +int rte_eal_vmbus_detach(uuid_t id);
> +
> +/**
> + * Register a VMBUS driver.
> + *
> + * @param driver
> + *   A pointer to a rte_vmbus_driver structure describing the driver
> + *   to be registered.
> + */
> +void rte_eal_vmbus_register(struct rte_vmbus_driver *driver);
> +
> +/** Helper for VMBUS device registration from driver instance */
> +#define RTE_PMD_REGISTER_VMBUS(nm, vmbus_drv) \
> +RTE_INIT(vmbusinitfn_ ##nm); \
> +static void vmbusinitfn_ ##nm(void) \
> +{\
> +       (vmbus_drv).driver.name = RTE_STR(nm);\
> +       (vmbus_drv).driver.type = PMD_VMBUS; \
> +       rte_eal_vmbus_register(&vmbus_drv); \
> +} \
> +RTE_PMD_EXPORT_NAME(nm, __COUNTER__)
> +
> +/**
> + * Unregister a VMBUS driver.
> + *
> + * @param driver
> + *   A pointer to a rte_vmbus_driver structure describing the driver
> + *   to be unregistered.
> + */
> +void rte_eal_vmbus_unregister(struct rte_vmbus_driver *driver);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_VMBUS_H_ */
> diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
> index 4e206f09..f6ca3848 100644
> --- a/lib/librte_eal/linuxapp/eal/Makefile
> +++ b/lib/librte_eal/linuxapp/eal/Makefile
> @@ -71,6 +71,11 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_timer.c
>  SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_interrupts.c
>  SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_alarm.c
>
> +ifeq ($(CONFIG_RTE_LIBRTE_HV_PMD),y)
> +SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_vmbus.c
> +LDLIBS += -luuid
> +endif
> +
>  # from common dir
>  SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_lcore.c
>  SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_timer.c
> @@ -114,6 +119,7 @@ CFLAGS_eal_hugepage_info.o := -D_GNU_SOURCE
>  CFLAGS_eal_pci.o := -D_GNU_SOURCE
>  CFLAGS_eal_pci_uio.o := -D_GNU_SOURCE
>  CFLAGS_eal_pci_vfio.o := -D_GNU_SOURCE
> +CFLAGS_eal_vmbus.o := -D_GNU_SOURCE
>  CFLAGS_eal_common_whitelist.o := -D_GNU_SOURCE
>  CFLAGS_eal_common_options.o := -D_GNU_SOURCE
>  CFLAGS_eal_common_thread.o := -D_GNU_SOURCE
> diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
> index 16dd5b9c..1bc0814a 100644
> --- a/lib/librte_eal/linuxapp/eal/eal.c
> +++ b/lib/librte_eal/linuxapp/eal/eal.c
> @@ -70,6 +70,9 @@
>  #include <rte_cpuflags.h>
>  #include <rte_interrupts.h>
>  #include <rte_pci.h>
> +#ifdef RTE_LIBRTE_HV_PMD
> +#include <rte_vmbus.h>
> +#endif
>  #include <rte_dev.h>
>  #include <rte_devargs.h>
>  #include <rte_common.h>
> @@ -830,6 +833,11 @@ rte_eal_init(int argc, char **argv)
>
>         eal_check_mem_on_local_socket();
>
> +#ifdef RTE_LIBRTE_HV_PMD
> +       if (rte_eal_vmbus_init() < 0)
> +               RTE_LOG(ERR, EAL, "Cannot init VMBUS\n");
> +#endif
> +
>         if (eal_plugins_init() < 0)
>                 rte_panic("Cannot init plugins\n");
>
> @@ -884,6 +892,11 @@ rte_eal_init(int argc, char **argv)
>         if (rte_eal_pci_probe())
>                 rte_panic("Cannot probe PCI\n");
>
> +#ifdef RTE_LIBRTE_HV_PMD
> +       if (rte_eal_vmbus_probe() < 0)
> +               rte_panic("Cannot probe VMBUS\n");
> +#endif
> +
>         if (rte_eal_dev_init() < 0)
>                 rte_panic("Cannot init pmd devices\n");
>
> diff --git a/lib/librte_eal/linuxapp/eal/eal_vmbus.c b/lib/librte_eal/linuxapp/eal/eal_vmbus.c
> new file mode 100644
> index 00000000..729f93a9
> --- /dev/null
> +++ b/lib/librte_eal/linuxapp/eal/eal_vmbus.c
> @@ -0,0 +1,911 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2013-2016 Brocade Communications Systems, Inc.
> + *   Copyright(c) 2016 Microsoft Corporation
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *      notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *      notice, this list of conditions and the following disclaimer in
> + *      the documentation and/or other materials provided with the
> + *      distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *      contributors may be used to endorse or promote products derived
> + *      from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + *
> + */
> +
> +#include <string.h>
> +#include <unistd.h>
> +#include <dirent.h>
> +#include <fcntl.h>
> +#include <sys/mman.h>
> +
> +#include <rte_eal.h>
> +#include <rte_tailq.h>
> +#include <rte_log.h>
> +#include <rte_devargs.h>
> +#include <rte_vmbus.h>
> +#include <rte_malloc.h>
> +
> +#include "eal_private.h"
> +#include "eal_pci_init.h"
> +#include "eal_filesystem.h"
> +
> +struct vmbus_driver_list vmbus_driver_list =
> +       TAILQ_HEAD_INITIALIZER(vmbus_driver_list);
> +struct vmbus_device_list vmbus_device_list =
> +       TAILQ_HEAD_INITIALIZER(vmbus_device_list);
> +
> +static void *vmbus_map_addr;
> +
> +static struct rte_tailq_elem rte_vmbus_uio_tailq = {
> +       .name = "UIO_RESOURCE_LIST",
> +};
> +EAL_REGISTER_TAILQ(rte_vmbus_uio_tailq);
> +
> +/*
> + * parse a sysfs file containing a UUID
> + * the value may be wrapped in { } braces, which are stripped
> + */
> +static int
> +vmbus_get_sysfs_uuid(const char *filename, uuid_t uu)
> +{
> +       char buf[BUFSIZ];
> +       char *cp, *in = buf;
> +       FILE *f;
> +
> +       f = fopen(filename, "r");
> +       if (f == NULL) {
> +               RTE_LOG(ERR, EAL, "%s(): cannot open sysfs value %s\n",
> +                               __func__, filename);
> +               return -1;
> +       }
> +
> +       if (fgets(buf, sizeof(buf), f) == NULL) {
> +               RTE_LOG(ERR, EAL, "%s(): cannot read sysfs value %s\n",
> +                               __func__, filename);
> +               fclose(f);
> +               return -1;
> +       }
> +       fclose(f);
> +
> +       cp = strchr(buf, '\n');
> +       if (cp)
> +               *cp = '\0';
> +
> +       /* strip { } notation */
> +       if (buf[0] == '{') {
> +               in = buf + 1;
> +               cp = strchr(in, '}');
> +               if (cp)
> +                       *cp = '\0';
> +       }
> +
> +       if (uuid_parse(in, uu) < 0) {
> +               RTE_LOG(ERR, EAL, "%s %s not a valid UUID\n",
> +                       filename, buf);
> +               return -1;
> +       }
> +
> +       return 0;
> +}
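
The normalization step in the function above (drop the trailing newline,
then strip an optional { } wrapper) can be sketched on its own. This is
only an illustration, not the patch's code: sketch_normalize_uuid() is a
hypothetical helper, and it deliberately stops before uuid_parse() so it
has no libuuid dependency.

```c
#include <string.h>

/*
 * Hypothetical helper: trim a sysfs UUID line in place and return a
 * pointer to the bare UUID text inside buf.
 */
static char *
sketch_normalize_uuid(char *buf)
{
	char *in = buf;
	char *cp;

	/* drop the trailing newline left by fgets() */
	cp = strchr(buf, '\n');
	if (cp)
		*cp = '\0';

	/* strip { } notation if present */
	if (buf[0] == '{') {
		in = buf + 1;
		cp = strchr(in, '}');
		if (cp)
			*cp = '\0';
	}
	return in;
}
```

The returned pointer is what would then be handed to uuid_parse().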
> +
> +/* map a particular resource from a file */
> +static void *
> +vmbus_map_resource(void *requested_addr, int fd, off_t offset, size_t size,
> +                  int flags)
> +{
> +       void *mapaddr;
> +
> +       /* Map the memory resource of device */
> +       mapaddr = mmap(requested_addr, size, PROT_READ | PROT_WRITE,
> +                      MAP_SHARED | flags, fd, offset);
> +       if (mapaddr == MAP_FAILED ||
> +           (requested_addr != NULL && mapaddr != requested_addr)) {
> +               RTE_LOG(ERR, EAL,
> +                       "%s(): cannot mmap(%d, %p, 0x%lx, 0x%lx): %s)\n",
> +                       __func__, fd, requested_addr,
> +                       (unsigned long)size, (unsigned long)offset,
> +                       strerror(errno));
> +       } else
> +               RTE_LOG(DEBUG, EAL, "  VMBUS memory mapped at %p\n", mapaddr);
> +
> +       return mapaddr;
> +}
> +
> +/* unmap a particular resource */
> +static void
> +vmbus_unmap_resource(void *requested_addr, size_t size)
> +{
> +       if (requested_addr == NULL)
> +               return;
> +
> +       /* Unmap the VMBUS memory resource of device */
> +       if (munmap(requested_addr, size)) {
> +               RTE_LOG(ERR, EAL, "%s(): cannot munmap(%p, 0x%lx): %s\n",
> +                       __func__, requested_addr, (unsigned long)size,
> +                       strerror(errno));
> +       } else
> +               RTE_LOG(DEBUG, EAL, "  VMBUS memory unmapped at %p\n",
> +                               requested_addr);
> +}
> +
> +/* Only supports current kernel version
> + * Unlike PCI there is no option (or need) to create UIO device.
> + */
> +static int vmbus_get_uio_dev(const char *name,
> +                            char *dstbuf, size_t buflen)
> +{
> +       char dirname[PATH_MAX];
> +       unsigned int uio_num;
> +       struct dirent *e;
> +       DIR *dir;
> +
> +       snprintf(dirname, sizeof(dirname),
> +                "/sys/bus/vmbus/devices/%s/uio", name);
> +
> +       dir = opendir(dirname);
> +       if (dir == NULL) {
> +               RTE_LOG(ERR, EAL, "Cannot map uio resources for %s: %s\n",
> +                       name, strerror(errno));
> +               return -1;
> +       }
> +
> +       /* take the first file starting with "uio" */
> +       while ((e = readdir(dir)) != NULL) {
> +               if (sscanf(e->d_name, "uio%u", &uio_num) != 1)
> +                       continue;
> +
> +               snprintf(dstbuf, buflen, "%s/uio%u", dirname, uio_num);
> +               break;
> +       }
> +       closedir(dir);
> +
> +       return e ? (int) uio_num : -1;
> +}
> +
> +/*
> + * parse a sysfs file containing one integer value
> + * different to the eal version, as it needs to work with 64-bit values
> + */
> +static int
> +vmbus_parse_sysfs_value(const char *dir, const char *name,
> +                       uint64_t *val)
> +{
> +       char filename[PATH_MAX];
> +       FILE *f;
> +       char buf[BUFSIZ];
> +       char *end = NULL;
> +
> +       snprintf(filename, sizeof(filename), "%s/%s", dir, name);
> +       f = fopen(filename, "r");
> +       if (f == NULL) {
> +               RTE_LOG(ERR, EAL, "%s(): cannot open sysfs value %s\n",
> +                               __func__, filename);
> +               return -1;
> +       }
> +
> +       if (fgets(buf, sizeof(buf), f) == NULL) {
> +               RTE_LOG(ERR, EAL, "%s(): cannot read sysfs value %s\n",
> +                               __func__, filename);
> +               fclose(f);
> +               return -1;
> +       }
> +       fclose(f);
> +
> +       *val = strtoull(buf, &end, 0);
> +       if ((buf[0] == '\0') || (end == NULL) || (*end != '\n')) {
> +               RTE_LOG(ERR, EAL, "%s(): cannot parse sysfs value %s\n",
> +                               __func__, filename);
> +               return -1;
> +       }
> +       return 0;
> +}
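
The parsing rule used above (strtoull() with base 0, value terminated by
a newline) can be tried in isolation. A minimal sketch, assuming a
hypothetical helper sketch_parse_u64(); unlike the quoted code it also
rejects a line with no digits and an out-of-range value, which is an
added assumption rather than something the patch does.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse one unsigned 64-bit value from a line of
 * text. Base 0 lets strtoull() accept decimal, octal and 0x-prefixed
 * hex. Returns 0 on success, -1 on failure.
 */
static int
sketch_parse_u64(const char *buf, uint64_t *val)
{
	char *end = NULL;
	unsigned long long v;

	errno = 0;
	v = strtoull(buf, &end, 0);

	/* assumption: reject lines with no digits and overflowed values */
	if (end == buf || errno == ERANGE)
		return -1;

	/* the value must be followed directly by the newline */
	if (*end != '\n')
		return -1;

	*val = v;
	return 0;
}
```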
> +
> +/* Get mappings out of values provided by uio */
> +static int
> +vmbus_uio_get_mappings(const char *uioname,
> +                      struct vmbus_map maps[])
> +{
> +       int i;
> +
> +       for (i = 0; i != VMBUS_MAX_RESOURCE; i++) {
> +               struct vmbus_map *map = &maps[i];
> +               char dirname[PATH_MAX];
> +
> +               /* check if map directory exists */
> +               snprintf(dirname, sizeof(dirname),
> +                        "%s/maps/map%d", uioname, i);
> +
> +               if (access(dirname, F_OK) != 0)
> +                       break;
> +
> +               /* get mapping offset */
> +               if (vmbus_parse_sysfs_value(dirname, "offset",
> +                                           &map->offset) < 0)
> +                       return -1;
> +
> +               /* get mapping size */
> +               if (vmbus_parse_sysfs_value(dirname, "size",
> +                                           &map->size) < 0)
> +                       return -1;
> +
> +               /* get mapping physical address */
> +               if (vmbus_parse_sysfs_value(dirname, "addr",
> +                                           &map->phaddr) < 0)
> +                       return -1;
> +       }
> +
> +       return i;
> +}
> +
> +static void
> +vmbus_uio_free_resource(struct rte_vmbus_device *dev,
> +               struct mapped_vmbus_resource *uio_res)
> +{
> +       rte_free(uio_res);
> +
> +       if (dev->intr_handle.fd >= 0) {
> +               close(dev->intr_handle.fd);
> +               dev->intr_handle.fd = -1;
> +               dev->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN;
> +       }
> +}
> +
> +static struct mapped_vmbus_resource *
> +vmbus_uio_alloc_resource(struct rte_vmbus_device *dev)
> +{
> +       struct mapped_vmbus_resource *uio_res;
> +       char dirname[PATH_MAX], devname[PATH_MAX];
> +       int uio_num, nb_maps;
> +
> +       uio_num = vmbus_get_uio_dev(dev->sysfs_name, dirname, sizeof(dirname));
> +       if (uio_num < 0) {
> +               RTE_LOG(WARNING, EAL,
> +                       "  %s not managed by UIO driver, skipping\n",
> +                       dev->sysfs_name);
> +               return NULL;
> +       }
> +
> +       /* allocate the mapping details for secondary processes*/
> +       uio_res = rte_zmalloc("UIO_RES", sizeof(*uio_res), 0);
> +       if (uio_res == NULL) {
> +               RTE_LOG(ERR, EAL,
> +                       "%s(): cannot store uio mmap details\n", __func__);
> +               goto error;
> +       }
> +
> +       snprintf(devname, sizeof(devname), "/dev/uio%u", uio_num);
> +       dev->intr_handle.fd = open(devname, O_RDWR);
> +       if (dev->intr_handle.fd < 0) {
> +               RTE_LOG(ERR, EAL, "Cannot open %s: %s\n",
> +                       devname, strerror(errno));
> +               goto error;
> +       }
> +
> +       dev->intr_handle.type = RTE_INTR_HANDLE_UIO_INTX;
> +
> +       snprintf(uio_res->path, sizeof(uio_res->path), "%s", devname);
> +       uuid_copy(uio_res->uuid, dev->device_id);
> +
> +       nb_maps = vmbus_uio_get_mappings(dirname, uio_res->maps);
> +       if (nb_maps < 0)
> +               goto error;
> +
> +       RTE_LOG(DEBUG, EAL, "Found %d memory maps for device %s\n",
> +               nb_maps, dev->sysfs_name);
> +
> +       return uio_res;
> +
> + error:
> +       vmbus_uio_free_resource(dev, uio_res);
> +       return NULL;
> +}
> +
> +static int
> +vmbus_uio_map_resource_by_index(struct rte_vmbus_device *dev,
> +                               unsigned int res_idx,
> +                               struct mapped_vmbus_resource *uio_res,
> +                               unsigned int map_idx)
> +{
> +       struct vmbus_map *maps = uio_res->maps;
> +       char devname[PATH_MAX];
> +       void *mapaddr;
> +       int fd;
> +
> +       snprintf(devname, sizeof(devname),
> +                "/sys/bus/vmbus/%s/resource%u", dev->sysfs_name, res_idx);
> +
> +       fd = open(devname, O_RDWR);
> +       if (fd < 0) {
> +               RTE_LOG(ERR, EAL, "Cannot open %s: %s\n",
> +                               devname, strerror(errno));
> +               return -1;
> +       }
> +
> +       /* allocate memory to keep path */
> +       maps[map_idx].path = rte_malloc(NULL, strlen(devname) + 1, 0);
> +       if (maps[map_idx].path == NULL) {
> +               RTE_LOG(ERR, EAL, "Cannot allocate memory for path: %s\n",
> +                               strerror(errno));
> +               return -1;
> +       }
> +
> +       /* try mapping somewhere close to the end of hugepages */
> +       if (vmbus_map_addr == NULL)
> +               vmbus_map_addr = pci_find_max_end_va();
> +
> +       mapaddr = vmbus_map_resource(vmbus_map_addr, fd, 0,
> +                                    dev->mem_resource[res_idx].len, 0);
> +       close(fd);
> +       if (mapaddr == MAP_FAILED) {
> +               rte_free(maps[map_idx].path);
> +               return -1;
> +       }
> +
> +       vmbus_map_addr = RTE_PTR_ADD(mapaddr,
> +                                    dev->mem_resource[res_idx].len);
> +
> +       maps[map_idx].phaddr = dev->mem_resource[res_idx].phys_addr;
> +       maps[map_idx].size = dev->mem_resource[res_idx].len;
> +       maps[map_idx].addr = mapaddr;
> +       maps[map_idx].offset = 0;
> +       strcpy(maps[map_idx].path, devname);
> +       dev->mem_resource[res_idx].addr = mapaddr;
> +
> +       return 0;
> +}
> +
> +static void
> +vmbus_uio_unmap(struct mapped_vmbus_resource *uio_res)
> +{
> +       int i;
> +
> +       if (uio_res == NULL)
> +               return;
> +
> +       for (i = 0; i != uio_res->nb_maps; i++) {
> +               vmbus_unmap_resource(uio_res->maps[i].addr,
> +                                    uio_res->maps[i].size);
> +
> +               if (rte_eal_process_type() == RTE_PROC_PRIMARY)
> +                       rte_free(uio_res->maps[i].path);
> +       }
> +}
> +
> +static struct mapped_vmbus_resource *
> +vmbus_uio_find_resource(struct rte_vmbus_device *dev)
> +{
> +       struct mapped_vmbus_resource *uio_res;
> +       struct mapped_vmbus_res_list *uio_res_list =
> +                       RTE_TAILQ_CAST(rte_vmbus_uio_tailq.head,
> +                                      mapped_vmbus_res_list);
> +
> +       if (dev == NULL)
> +               return NULL;
> +
> +       TAILQ_FOREACH(uio_res, uio_res_list, next) {
> +               if (uuid_compare(uio_res->uuid, dev->device_id) == 0)
> +                       return uio_res;
> +       }
> +       return NULL;
> +}
> +
> +/* unmap the VMBUS resource of a VMBUS device in virtual memory */
> +static void
> +vmbus_uio_unmap_resource(struct rte_vmbus_device *dev)
> +{
> +       struct mapped_vmbus_resource *uio_res;
> +       struct mapped_vmbus_res_list *uio_res_list =
> +                       RTE_TAILQ_CAST(rte_vmbus_uio_tailq.head,
> +                                      mapped_vmbus_res_list);
> +
> +       if (dev == NULL)
> +               return;
> +
> +       /* find an entry for the device */
> +       uio_res = vmbus_uio_find_resource(dev);
> +       if (uio_res == NULL)
> +               return;
> +
> +       /* secondary processes - just free maps */
> +       if (rte_eal_process_type() != RTE_PROC_PRIMARY)
> +               return vmbus_uio_unmap(uio_res);
> +
> +       TAILQ_REMOVE(uio_res_list, uio_res, next);
> +
> +       /* unmap all resources */
> +       vmbus_uio_unmap(uio_res);
> +
> +       /* free uio resource */
> +       rte_free(uio_res);
> +
> +       /* close fd if in primary process */
> +       close(dev->intr_handle.fd);
> +       if (dev->intr_handle.uio_cfg_fd >= 0) {
> +               close(dev->intr_handle.uio_cfg_fd);
> +               dev->intr_handle.uio_cfg_fd = -1;
> +       }
> +
> +       dev->intr_handle.fd = -1;
> +       dev->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN;
> +}
> +
> +static int
> +vmbus_uio_map_secondary(struct rte_vmbus_device *dev)
> +{
> +       struct mapped_vmbus_resource *uio_res;
> +       struct mapped_vmbus_res_list *uio_res_list =
> +                       RTE_TAILQ_CAST(rte_vmbus_uio_tailq.head,
> +                                      mapped_vmbus_res_list);
> +
> +       TAILQ_FOREACH(uio_res, uio_res_list, next) {
> +               int i;
> +
> +               /* skip this element if it doesn't match our id */
> +               if (uuid_compare(uio_res->uuid, dev->device_id))
> +                       continue;
> +
> +               for (i = 0; i != uio_res->nb_maps; i++) {
> +                       void *mapaddr;
> +                       int fd;
> +
> +                       fd = open(uio_res->maps[i].path, O_RDWR);
> +                       if (fd < 0) {
> +                               RTE_LOG(ERR, EAL, "Cannot open %s: %s\n",
> +                                       uio_res->maps[i].path, strerror(errno));
> +                               return -1;
> +                       }
> +
> +                       mapaddr = vmbus_map_resource(uio_res->maps[i].addr, fd,
> +                                                    uio_res->maps[i].offset,
> +                                                    uio_res->maps[i].size, 0);
> +                       /* fd is not needed in slave process, close it */
> +                       close(fd);
> +
> +                       if (mapaddr == uio_res->maps[i].addr)
> +                               continue;
> +
> +                       RTE_LOG(ERR, EAL,
> +                               "Cannot mmap device resource file %s to address: %p\n",
> +                               uio_res->maps[i].path,
> +                               uio_res->maps[i].addr);
> +
> +                       /* unmap addrs correctly mapped */
> +                       while (i != 0) {
> +                               --i;
> +                               vmbus_unmap_resource(uio_res->maps[i].addr,
> +                                                    uio_res->maps[i].size);
> +                       }
> +                       return -1;
> +
> +               }
> +               return 0;
> +       }
> +
> +       RTE_LOG(ERR, EAL, "Cannot find resource for device\n");
> +       return 1;
> +}
> +
> +/* map the resources of a vmbus device in virtual memory */
> +int
> +rte_eal_vmbus_map_device(struct rte_vmbus_device *dev)
> +{
> +       struct mapped_vmbus_resource *uio_res;
> +       struct mapped_vmbus_res_list *uio_res_list =
> +               RTE_TAILQ_CAST(rte_vmbus_uio_tailq.head, mapped_vmbus_res_list);
> +       int i, ret, map_idx = 0;
> +
> +       dev->intr_handle.fd = -1;
> +       dev->intr_handle.uio_cfg_fd = -1;
> +       dev->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN;
> +
> +       /* secondary processes - use already recorded details */
> +       if (rte_eal_process_type() != RTE_PROC_PRIMARY)
> +               return vmbus_uio_map_secondary(dev);
> +
> +       /* allocate uio resource */
> +       uio_res = vmbus_uio_alloc_resource(dev);
> +       if (uio_res == NULL)
> +               return -1;
> +
> +       /* Map all BARs */
> +       for (i = 0; i != VMBUS_MAX_RESOURCE; i++) {
> +               uint64_t phaddr;
> +
> +               /* skip empty BAR */
> +               phaddr = dev->mem_resource[i].phys_addr;
> +               if (phaddr == 0)
> +                       continue;
> +
> +               ret = vmbus_uio_map_resource_by_index(dev, i,
> +                                                     uio_res, map_idx);
> +               if (ret)
> +                       goto error;
> +
> +               map_idx++;
> +       }
> +
> +       uio_res->nb_maps = map_idx;
> +
> +       TAILQ_INSERT_TAIL(uio_res_list, uio_res, next);
> +
> +       return 0;
> +error:
> +       for (i = 0; i < map_idx; i++) {
> +               vmbus_unmap_resource(uio_res->maps[i].addr,
> +                                    uio_res->maps[i].size);
> +               rte_free(uio_res->maps[i].path);
> +       }
> +       vmbus_uio_free_resource(dev, uio_res);
> +       return -1;
> +}
> +
> +/* Scan one vmbus sysfs entry, and fill the devices list from it. */
> +static int
> +vmbus_scan_one(const char *name)
> +{
> +       struct rte_vmbus_device *dev, *dev2;
> +       char filename[PATH_MAX];
> +       char dirname[PATH_MAX];
> +       unsigned long tmp;
> +
> +       dev = malloc(sizeof(*dev) + strlen(name) + 1);
> +       if (dev == NULL)
> +               return -1;
> +
> +       memset(dev, 0, sizeof(*dev));
> +       strcpy(dev->sysfs_name, name);
> +       if (dev->sysfs_name == NULL)
> +               goto error;
> +
> +       /* sysfs base directory
> +        *   /sys/bus/vmbus/devices/7a08391f-f5a0-4ac0-9802-d13fd964f8df
> +        * or on older kernel
> +        *   /sys/bus/vmbus/devices/vmbus_1
> +        */
> +       snprintf(dirname, sizeof(dirname), "%s/%s",
> +                SYSFS_VMBUS_DEVICES, name);
> +
> +       /* get device id */
> +       snprintf(filename, sizeof(filename), "%s/device_id", dirname);
> +       if (vmbus_get_sysfs_uuid(filename, dev->device_id) < 0)
> +               goto error;
> +
> +       /* get device class  */
> +       snprintf(filename, sizeof(filename), "%s/class_id", dirname);
> +       if (vmbus_get_sysfs_uuid(filename, dev->class_id) < 0)
> +               goto error;
> +
> +       /* get relid */
> +       snprintf(filename, sizeof(filename), "%s/id", dirname);
> +       if (eal_parse_sysfs_value(filename, &tmp) < 0)
> +               goto error;
> +       dev->relid = tmp;
> +
> +       /* get monitor id */
> +       snprintf(filename, sizeof(filename), "%s/monitor_id", dirname);
> +       if (eal_parse_sysfs_value(filename, &tmp) < 0)
> +               goto error;
> +       dev->monitor_id = tmp;
> +
> +       /* get numa node */
> +       snprintf(filename, sizeof(filename), "%s/numa_node",
> +                dirname);
> +       if (eal_parse_sysfs_value(filename, &tmp) < 0)
> +               /* if no NUMA support, set default to 0 */
> +               dev->device.numa_node = 0;
> +       else
> +               dev->device.numa_node = tmp;
> +
> +       /* device is valid, add in list (sorted) */
> +       RTE_LOG(DEBUG, EAL, "Adding vmbus device %s\n", name);
> +
> +       TAILQ_FOREACH(dev2, &vmbus_device_list, next) {
> +               int ret;
> +
> +               ret = uuid_compare(dev->device_id, dev2->device_id);
> +               if (ret > 0)
> +                       continue;
> +
> +               if (ret < 0) {
> +                       TAILQ_INSERT_BEFORE(dev2, dev, next);
> +                       rte_eal_device_insert(&dev->device);
> +               } else { /* already registered */
> +                       memmove(dev2->mem_resource, dev->mem_resource,
> +                               sizeof(dev->mem_resource));
> +                       free(dev);
> +               }
> +               return 0;
> +       }
> +
> +       rte_eal_device_insert(&dev->device);
> +       TAILQ_INSERT_TAIL(&vmbus_device_list, dev, next);
> +
> +       return 0;
> +error:
> +       free(dev);
> +       return -1;
> +}
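
The sorted insertion that vmbus_scan_one() aims for can be sketched with
plain <sys/queue.h>. This is an illustration under assumptions: struct
sketch_dev and sketch_insert_sorted() are hypothetical, and memcmp() on a
16-byte id stands in for uuid_compare(). The point it shows is that the
new entry is compared against each element already on the list.

```c
#include <string.h>
#include <sys/queue.h>

struct sketch_dev {
	TAILQ_ENTRY(sketch_dev) next;
	unsigned char id[16];	/* stand-in for uuid_t */
};
TAILQ_HEAD(sketch_dev_list, sketch_dev);

/*
 * Hypothetical helper: keep the list sorted by id.
 * Returns 1 if dev was inserted, 0 if an entry with the same id
 * already exists (the caller keeps ownership of dev in that case).
 */
static int
sketch_insert_sorted(struct sketch_dev_list *list, struct sketch_dev *dev)
{
	struct sketch_dev *cur;

	TAILQ_FOREACH(cur, list, next) {
		int ret = memcmp(dev->id, cur->id, sizeof(dev->id));

		if (ret > 0)
			continue;	/* dev sorts after cur, keep walking */
		if (ret == 0)
			return 0;	/* already registered */

		TAILQ_INSERT_BEFORE(cur, dev, next);
		return 1;
	}

	/* larger than every existing entry, or the list is empty */
	TAILQ_INSERT_TAIL(list, dev, next);
	return 1;
}
```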
> +
> +/*
> + * Scan the content of the vmbus, and the devices in the devices list
> + */
> +static int
> +vmbus_scan(void)
> +{
> +       struct dirent *e;
> +       DIR *dir;
> +
> +       dir = opendir(SYSFS_VMBUS_DEVICES);
> +       if (dir == NULL) {
> +               if (errno == ENOENT)
> +                       return 0;
> +
> +               RTE_LOG(ERR, EAL, "%s(): opendir failed: %s\n",
> +                       __func__, strerror(errno));
> +               return -1;
> +       }
> +
> +       while ((e = readdir(dir)) != NULL) {
> +               if (e->d_name[0] == '.')
> +                       continue;
> +
> +               if (vmbus_scan_one(e->d_name) < 0)
> +                       goto error;
> +       }
> +       closedir(dir);
> +       return 0;
> +
> +error:
> +       closedir(dir);
> +       return -1;
> +}
> +
> +/* Init the VMBUS EAL subsystem */
> +int rte_eal_vmbus_init(void)
> +{
> +       /* VMBUS can be disabled */
> +       if (internal_config.no_vmbus)
> +               return 0;
> +
> +       if (vmbus_scan() < 0) {
> +               RTE_LOG(ERR, EAL, "%s(): Cannot scan vmbus\n", __func__);
> +               return -1;
> +       }
> +       return 0;
> +}
> +
> +/* Below is PROBE part of eal_vmbus library */
> +
> +/*
> + * If device ID match, call the devinit() function of the driver.
> + */
> +static int
> +rte_eal_vmbus_probe_one_driver(struct rte_vmbus_driver *dr,
> +                              struct rte_vmbus_device *dev)
> +{
> +       const uuid_t *id_table;
> +
> +       RTE_LOG(DEBUG, EAL, "  probe driver: %s\n", dr->driver.name);
> +
> +       for (id_table = dr->id_table; !uuid_is_null(*id_table); ++id_table) {
> +               struct rte_devargs *args;
> +               char guid[UUID_BUF_SZ];
> +               int ret;
> +
> +               /* skip devices not associated with this device class */
> +               if (uuid_compare(*id_table, dev->class_id) != 0)
> +                       continue;
> +
> +               uuid_unparse(dev->device_id, guid);
> +               RTE_LOG(INFO, EAL, "VMBUS device %s on NUMA socket %i\n",
> +                       guid, dev->device.numa_node);
> +
> +               /* no initialization when blacklisted, return without error */
> +               args = dev->device.devargs;
> +               if (args && args->type == RTE_DEVTYPE_BLACKLISTED_VMBUS) {
> +                       RTE_LOG(INFO, EAL, "  Device is blacklisted, not initializing\n");
> +                       return 1;
> +               }
> +
> +               RTE_LOG(INFO, EAL, "  probe driver: %s\n", dr->driver.name);
> +
> +               /* map resources for device */
> +               ret = rte_eal_vmbus_map_device(dev);
> +               if (ret != 0)
> +                       return ret;
> +
> +               /* reference driver structure */
> +               dev->driver = dr;
> +
> +               /* call the driver probe() function */
> +               ret = dr->probe(dr, dev);
> +               if (ret)
> +                       dev->driver = NULL;
> +
> +               return ret;
> +       }
> +
> +       /* return positive value if driver doesn't support this device */
> +       return 1;
> +}
> +
> +
> +/*
> + * If the device class ID matches, call the remove() function of the
> + * driver.
> + */
> +static int
> +vmbus_detach_dev(struct rte_vmbus_driver *dr,
> +                struct rte_vmbus_device *dev)
> +{
> +       const uuid_t *id_table;
> +
> +       for (id_table = dr->id_table; !uuid_is_null(*id_table); ++id_table) {
> +               char guid[UUID_BUF_SZ];
> +
> +               /* skip devices not associated with this device class */
> +               if (uuid_compare(*id_table, dev->class_id) != 0)
> +                       continue;
> +
> +               uuid_unparse(dev->device_id, guid);
> +               RTE_LOG(INFO, EAL, "VMBUS device %s on NUMA socket %i\n",
> +                       guid, dev->device.numa_node);
> +
> +               RTE_LOG(DEBUG, EAL, "  remove driver: %s\n", dr->driver.name);
> +
> +               if (dr->remove && (dr->remove(dev) < 0))
> +                       return -1;      /* negative value is an error */
> +
> +               /* clear driver structure */
> +               dev->driver = NULL;
> +
> +               vmbus_uio_unmap_resource(dev);
> +               return 0;
> +       }
> +
> +       /* return positive value if driver doesn't support this device */
> +       return 1;
> +}
> +
> +/*
> + * call the devinit() function of all
> + * registered drivers for the vmbus device. Return -1 if no driver is
> + * found for this class of vmbus device.
> + * The present assumption is that we have drivers only for vmbus network
> + * devices. That's why we don't check driver's id_table now.
> + */
> +static int
> +vmbus_probe_all_drivers(struct rte_vmbus_device *dev)
> +{
> +       struct rte_vmbus_driver *dr = NULL;
> +       int ret;
> +
> +       TAILQ_FOREACH(dr, &vmbus_driver_list, next) {
> +               ret = rte_eal_vmbus_probe_one_driver(dr, dev);
> +               if (ret < 0) {
> +                       /* negative value is an error */
> +                       RTE_LOG(ERR, EAL, "Failed to probe driver %s\n",
> +                               dr->driver.name);
> +                       return -1;
> +               }
> +               /* positive value means driver doesn't support it */
> +               if (ret > 0)
> +                       continue;
> +
> +               return 0;
> +       }
> +
> +       return 1;
> +}
> +
> +
> +/*
> + * If device ID matches, call the remove() function of all
> + * registered driver for the given device. Return -1 if initialization
> + * failed, return 1 if no driver is found for this device.
> + */
> +static int
> +vmbus_detach_all_drivers(struct rte_vmbus_device *dev)
> +{
> +       struct rte_vmbus_driver *dr;
> +       int rc = 0;
> +
> +       if (dev == NULL)
> +               return -1;
> +
> +       TAILQ_FOREACH(dr, &vmbus_driver_list, next) {
> +               rc = vmbus_detach_dev(dr, dev);
> +               if (rc < 0)
> +                       /* negative value is an error */
> +                       return -1;
> +               if (rc > 0)
> +                       /* positive value means driver doesn't support it */
> +                       continue;
> +               return 0;
> +       }
> +       return 1;
> +}
> +
> +/* Detach device specified by its VMBUS id */
> +int
> +rte_eal_vmbus_detach(uuid_t device_id)
> +{
> +       struct rte_vmbus_device *dev;
> +       char ubuf[UUID_BUF_SZ];
> +
> +       TAILQ_FOREACH(dev, &vmbus_device_list, next) {
> +               if (uuid_compare(dev->device_id, device_id) != 0)
> +                       continue;
> +
> +               if (vmbus_detach_all_drivers(dev) < 0)
> +                       goto err_return;
> +
> +               TAILQ_REMOVE(&vmbus_device_list, dev, next);
> +               free(dev);
> +               return 0;
> +       }
> +       return -1;
> +
> +err_return:
> +       uuid_unparse(device_id, ubuf);
> +       RTE_LOG(WARNING, EAL, "Requested device %s cannot be used\n",
> +               ubuf);
> +       return -1;
> +}
> +
> +/*
> + * Scan the vmbus, and call the devinit() function for
> + * all registered drivers that have a matching entry in its id_table
> + * for discovered devices.
> + */
> +int
> +rte_eal_vmbus_probe(void)
> +{
> +       struct rte_vmbus_device *dev = NULL;
> +
> +       TAILQ_FOREACH(dev, &vmbus_device_list, next) {
> +               char ubuf[UUID_BUF_SZ];
> +
> +               uuid_unparse(dev->device_id, ubuf);
> +
> +               RTE_LOG(DEBUG, EAL, "Probing driver for device %s ...\n",
> +                       ubuf);
> +               vmbus_probe_all_drivers(dev);
> +       }
> +       return 0;
> +}
> +
> +/* register vmbus driver */
> +void
> +rte_eal_vmbus_register(struct rte_vmbus_driver *driver)
> +{
> +       TAILQ_INSERT_TAIL(&vmbus_driver_list, driver, next);
> +}
> +
> +/* unregister vmbus driver */
> +void
> +rte_eal_vmbus_unregister(struct rte_vmbus_driver *driver)
> +{
> +       TAILQ_REMOVE(&vmbus_driver_list, driver, next);
> +}
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index 7c212096..b69af0f0 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -3334,3 +3334,93 @@ rte_eth_dev_l2_tunnel_offload_set(uint8_t port_id,
>                                 -ENOTSUP);
>         return (*dev->dev_ops->l2_tunnel_offload_set)(dev, l2_tunnel, mask, en);
>  }
> +
> +
> +#ifdef RTE_LIBRTE_HV_PMD
> +int
> +rte_eth_dev_vmbus_probe(struct rte_vmbus_driver *vmbus_drv,
> +                       struct rte_vmbus_device *vmbus_dev)
> +{
> +       struct eth_driver  *eth_drv = (struct eth_driver *)vmbus_drv;
> +       struct rte_eth_dev *eth_dev;
> +       char ustr[UUID_BUF_SZ];
> +       int diag;
> +
> +       uuid_unparse(vmbus_dev->device_id, ustr);
> +
> +       eth_dev = rte_eth_dev_allocate(ustr);
> +       if (eth_dev == NULL)
> +               return -ENOMEM;
> +
> +       if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> +               eth_dev->data->dev_private = rte_zmalloc("ethdev private structure",
> +                                 eth_drv->dev_private_size,
> +                                 RTE_CACHE_LINE_SIZE);
> +               if (eth_dev->data->dev_private == NULL)
> +                       rte_panic("Cannot allocate memzone for private port data\n");
> +       }
> +
> +       eth_dev->device = &vmbus_dev->device;
> +       eth_dev->driver = eth_drv;
> +       eth_dev->data->rx_mbuf_alloc_failed = 0;
> +
> +       /* init user callbacks */
> +       TAILQ_INIT(&(eth_dev->link_intr_cbs));
> +
> +       /*
> +        * Set the default maximum frame size.
> +        */
> +       eth_dev->data->mtu = ETHER_MTU;
Initialization of these default values has moved into rte_eth_dev_allocate(),
so it should not need to be repeated here.
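
A rough sketch of that split, with the allocator owning the defaults and
the bus probe only attaching the device and driver. Everything named
sketch_* is hypothetical and not DPDK API; the 1500-byte MTU is an
assumed stand-in for ETHER_MTU, and the 40-byte name mirrors the
RTE_ETH_NAME_MAX_LEN change in patch 1/8.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_NAME_MAX_LEN 40		/* mirrors RTE_ETH_NAME_MAX_LEN */
#define SKETCH_ETHER_MTU    1500	/* assumed stand-in for ETHER_MTU */

struct sketch_eth_dev_data {
	char name[SKETCH_NAME_MAX_LEN];
	uint16_t mtu;
	uint64_t rx_mbuf_alloc_failed;
};

struct sketch_eth_dev {
	struct sketch_eth_dev_data *data;
	const void *device;
	const void *driver;
};

/* a single static slot is enough for an illustration */
static struct sketch_eth_dev_data sketch_data;
static struct sketch_eth_dev sketch_dev;

/* Hypothetical allocator: the one place that sets default values. */
static struct sketch_eth_dev *
sketch_eth_dev_allocate(const char *name)
{
	memset(&sketch_data, 0, sizeof(sketch_data)); /* counters start at 0 */
	snprintf(sketch_data.name, sizeof(sketch_data.name), "%s", name);
	sketch_data.mtu = SKETCH_ETHER_MTU;

	sketch_dev.data = &sketch_data;
	sketch_dev.device = NULL;
	sketch_dev.driver = NULL;
	return &sketch_dev;
}

/* Bus-specific probe: no default-value setup left to do here. */
static struct sketch_eth_dev *
sketch_bus_probe(const char *name, const void *bus_dev, const void *drv)
{
	struct sketch_eth_dev *eth_dev = sketch_eth_dev_allocate(name);

	eth_dev->device = bus_dev;
	eth_dev->driver = drv;
	return eth_dev;
}
```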
> +
> +       /* Invoke PMD device initialization function */
> +       diag = (*eth_drv->eth_dev_init)(eth_dev);
> +       if (diag == 0)
> +               return 0;
> +
> +       RTE_PMD_DEBUG_TRACE("driver %s: eth_dev_init(%s) failed\n",
> +                           vmbus_drv->driver.name, ustr);
> +
> +       if (rte_eal_process_type() == RTE_PROC_PRIMARY)
> +               rte_free(eth_dev->data->dev_private);
> +
> +       return diag;
> +}
> +
> +int
> +rte_eth_dev_vmbus_remove(struct rte_vmbus_device *vmbus_dev)
> +{
> +       const struct eth_driver *eth_drv;
> +       struct rte_eth_dev *eth_dev;
> +       char ustr[UUID_BUF_SZ];
> +       int ret;
> +
> +       if (vmbus_dev == NULL)
> +               return -EINVAL;
> +
> +       uuid_unparse(vmbus_dev->device_id, ustr);
> +       eth_dev = rte_eth_dev_allocated(ustr);
> +       if (eth_dev == NULL)
> +               return -ENODEV;
> +
> +       eth_drv = (const struct eth_driver *)vmbus_dev->driver;
> +
> +       /* Invoke PMD device uninit function */
> +       if (*eth_drv->eth_dev_uninit) {
> +               ret = (*eth_drv->eth_dev_uninit)(eth_dev);
> +               if (ret)
> +                       return ret;
> +       }
> +
> +       /* free ether device */
> +       rte_eth_dev_release_port(eth_dev);
> +
> +       if (rte_eal_process_type() == RTE_PROC_PRIMARY)
> +               rte_free(eth_dev->data->dev_private);
> +
> +       eth_dev->device = NULL;
> +       eth_dev->driver = NULL;
> +       eth_dev->data = NULL;
> +
> +       return 0;
> +}
> +#endif
> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> index 1a62a322..2a8c1eed 100644
> --- a/lib/librte_ether/rte_ethdev.h
> +++ b/lib/librte_ether/rte_ethdev.h
> @@ -180,6 +180,9 @@ extern "C" {
>  #include <rte_log.h>
>  #include <rte_interrupts.h>
>  #include <rte_pci.h>
> +#ifdef RTE_LIBRTE_HV_PMD
> +#include <rte_vmbus.h>
> +#endif
>  #include <rte_dev.h>
>  #include <rte_devargs.h>
>  #include <rte_errno.h>
> @@ -1908,6 +1911,17 @@ struct rte_pci_eth_driver {
>         struct eth_driver       eth_drv;        /**< Ethernet driver. */
>  };
>
> +#ifdef RTE_LIBRTE_HV_PMD
> +/**
> + * @internal
> + * The structure associated with a PMD VMBUS Ethernet driver.
> + */
> +struct rte_vmbus_eth_driver {
> +       struct rte_vmbus_driver vmbus_drv;      /**< Underlying VMBUS driver. */
> +       struct eth_driver       eth_drv;        /**< Ethernet driver. */
> +};
> +#endif
> +
>  /**
>   * Convert a numerical speed in Mbps to a bitmap flag that can be used in
>   * the bitmap link_speeds of the struct rte_eth_conf
> @@ -4543,6 +4557,23 @@ int rte_eth_dev_pci_probe(struct rte_pci_driver *pci_drv,
>   */
>  int rte_eth_dev_pci_remove(struct rte_pci_device *pci_dev);
>
> +#ifdef RTE_LIBRTE_HV_PMD
> +/**
> + * @internal
> + * Wrapper for use by vmbus drivers as a .probe function to attach to an ethdev
> + * interface.
> + */
> +int rte_eth_dev_vmbus_probe(struct rte_vmbus_driver *vmbus_drv,
> +                         struct rte_vmbus_device *vmbus_dev);
> +
> +/**
> + * @internal
> + * Wrapper for use by vmbus drivers as a .remove function to detach an ethdev
> + * interface.
> + */
> +int rte_eth_dev_vmbus_remove(struct rte_vmbus_device *vmbus_dev);
> +#endif
I don't think that replicating the PCI probe/remove wrappers is the
right thing to do. To me it looks like this should move into the
rte_vmbus_driver's probe function instead. That way the ethdev header
can be decoupled from the low-level device implementations.
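As a rough sketch of that suggestion (stand-in types only; every sketch_*
name is hypothetical and none of this is existing DPDK API), the
UUID-to-ethdev glue could live in the driver's own probe callback while the
allocation helper stays bus-agnostic:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins for the real types; all names here are hypothetical. */
struct sketch_eth_dev {
	char name[40];		/* room for a UUID string, as in patch 1/8 */
	void *dev_private;
};

struct sketch_vmbus_device {
	char uuid_str[36 + 1];	/* UUID in string form */
};

struct sketch_vmbus_driver {
	int (*probe)(struct sketch_vmbus_driver *drv,
		     struct sketch_vmbus_device *dev);
	size_t dev_private_size;	/* assumed non-zero in this sketch */
	struct sketch_eth_dev *last_port; /* only so the sketch can be inspected */
};

/* Generic ethdev allocation: knows nothing about VMBUS. */
struct sketch_eth_dev *
sketch_eth_dev_allocate(const char *name)
{
	struct sketch_eth_dev *eth_dev = calloc(1, sizeof(*eth_dev));

	if (eth_dev == NULL)
		return NULL;
	snprintf(eth_dev->name, sizeof(eth_dev->name), "%s", name);
	return eth_dev;
}

/* The bus-specific glue lives in the driver's own probe callback. */
int
sketch_hv_probe(struct sketch_vmbus_driver *drv,
		struct sketch_vmbus_device *dev)
{
	struct sketch_eth_dev *eth_dev;

	assert(drv != NULL && dev != NULL);

	eth_dev = sketch_eth_dev_allocate(dev->uuid_str);
	if (eth_dev == NULL)
		return -ENOMEM;

	eth_dev->dev_private = calloc(1, drv->dev_private_size);
	if (eth_dev->dev_private == NULL) {
		free(eth_dev);
		return -ENOMEM;
	}

	drv->last_port = eth_dev;
	return 0;
}
```

Here sketch_eth_dev_allocate() never sees a VMBUS type, so a header
declaring it would not need to pull in rte_vmbus.h.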
> +
>  #ifdef __cplusplus
>  }
>  #endif
> diff --git a/mk/rte.app.mk b/mk/rte.app.mk
> index f75f0e24..6b304084 100644
> --- a/mk/rte.app.mk
> +++ b/mk/rte.app.mk
> @@ -130,6 +130,7 @@ ifeq ($(CONFIG_RTE_LIBRTE_VHOST),y)
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_VHOST)      += -lrte_pmd_vhost
>  endif # $(CONFIG_RTE_LIBRTE_VHOST)
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD)    += -lrte_pmd_vmxnet3_uio
> +_LDLIBS-$(CONFIG_RTE_LIBRTE_HV_PMD)        += -luuid
>
>  ifeq ($(CONFIG_RTE_LIBRTE_CRYPTODEV),y)
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_AESNI_MB)    += -lrte_pmd_aesni_mb
> --
> 2.11.0
>
- * Re: [dpdk-dev] [PATCH 8/8] eal: VMBUS infrastructure
  2017-01-10 17:27   ` Jan Blunck
@ 2017-01-10 18:05     ` Stephen Hemminger
  0 siblings, 0 replies; 30+ messages in thread
From: Stephen Hemminger @ 2017-01-10 18:05 UTC (permalink / raw)
  To: Jan Blunck; +Cc: dev, Stephen Hemminger
On Tue, 10 Jan 2017 18:27:31 +0100
Jan Blunck <jblunck@infradead.org> wrote:
> > +#ifdef RTE_LIBRTE_HV_PMD
> > +/**
> > + * @internal
> > + * Wrapper for use by vmbus drivers as a .probe function to attach to an ethdev
> > + * interface.
> > + */
> > +int rte_eth_dev_vmbus_probe(struct rte_vmbus_driver *vmbus_drv,
> > +                         struct rte_vmbus_device *vmbus_dev);
> > +
> > +/**
> > + * @internal
> > + * Wrapper for use by vmbus drivers as a .remove function to detach an ethdev
> > + * interface.
> > + */
> > +int rte_eth_dev_vmbus_remove(struct rte_vmbus_device *vmbus_dev);
> > +#endif  
> 
> I don't think that replicating the PCI probe/remove wrappers is the
> right thing to do. To me it looks like this should move into the
> rte_vmbus_driver's probe function instead. That way the ethdev header
> can be decoupled from the low-level device implementations.
With a real bus model, there would be registration of buses, and the probe would
be:
   foreach bus
       foreach device on bus
...
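
A minimal sketch of the loop outlined above, assuming a hypothetical bus
registry (the sketch_* names are made up for illustration and are not part
of the EAL):

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* All names below are hypothetical stand-ins, not EAL API. */
struct sketch_device {
	TAILQ_ENTRY(sketch_device) next;
	const char *name;
	int probed;		/* set once the bus probe hook accepted it */
};

struct sketch_bus {
	TAILQ_ENTRY(sketch_bus) next;
	const char *name;
	TAILQ_HEAD(, sketch_device) devices;
	/* bus-specific match-and-probe hook; 0 on success, negative on error */
	int (*probe)(struct sketch_device *dev);
};

static TAILQ_HEAD(, sketch_bus) sketch_bus_list =
	TAILQ_HEAD_INITIALIZER(sketch_bus_list);

/* Trivial probe hook for the sketch: accept any named device. */
int
sketch_accept_probe(struct sketch_device *dev)
{
	return dev->name != NULL ? 0 : -1;
}

/* Registration of a bus: reset its device list and queue it. */
void
sketch_bus_register(struct sketch_bus *bus)
{
	assert(bus != NULL && bus->probe != NULL);
	TAILQ_INIT(&bus->devices);
	TAILQ_INSERT_TAIL(&sketch_bus_list, bus, next);
}

/* Attach a scanned device to its bus. */
void
sketch_bus_add_device(struct sketch_bus *bus, struct sketch_device *dev)
{
	dev->probed = 0;
	TAILQ_INSERT_TAIL(&bus->devices, dev, next);
}

/* foreach bus, foreach device on bus: call the bus probe hook. */
int
sketch_probe_all(void)
{
	struct sketch_bus *bus;
	struct sketch_device *dev;
	int ret;

	TAILQ_FOREACH(bus, &sketch_bus_list, next) {
		TAILQ_FOREACH(dev, &bus->devices, next) {
			ret = bus->probe(dev);
			if (ret < 0)
				return ret;	/* stop at the first failure */
			dev->probed = 1;
		}
	}
	return 0;
}
```

With this shape, rte_eal_init() would only call the generic loop, and PCI
or VMBUS specifics would sit behind each bus's probe hook.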
 
- * Re: [dpdk-dev] [PATCH 8/8] eal: VMBUS infrastructure
  2017-01-07 18:17 ` [dpdk-dev] [PATCH 8/8] eal: VMBUS infrastructure Stephen Hemminger
  2017-01-10 17:27   ` Jan Blunck
@ 2017-01-11 14:49   ` Jan Blunck
  2017-01-11 21:13     ` Jan Blunck
  1 sibling, 1 reply; 30+ messages in thread
From: Jan Blunck @ 2017-01-11 14:49 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev, Stephen Hemminger
On Sat, Jan 7, 2017 at 7:17 PM, Stephen Hemminger
<stephen@networkplumber.org> wrote:
> Add support for VMBUS on Hyper-V/Azure. VMBUS is similar to PCI
> but has different addressing and internal APIs.
>
> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
> ---
>  lib/librte_eal/common/Makefile              |   2 +-
>  lib/librte_eal/common/eal_common_devargs.c  |   7 +
>  lib/librte_eal/common/eal_common_options.c  |  38 ++
>  lib/librte_eal/common/eal_internal_cfg.h    |   1 +
>  lib/librte_eal/common/eal_options.h         |   6 +
>  lib/librte_eal/common/eal_private.h         |   5 +
>  lib/librte_eal/common/include/rte_devargs.h |   8 +
>  lib/librte_eal/common/include/rte_vmbus.h   | 249 ++++++++
>  lib/librte_eal/linuxapp/eal/Makefile        |   6 +
>  lib/librte_eal/linuxapp/eal/eal.c           |  13 +
>  lib/librte_eal/linuxapp/eal/eal_vmbus.c     | 911 ++++++++++++++++++++++++++++
>  lib/librte_ether/rte_ethdev.c               |  90 +++
>  lib/librte_ether/rte_ethdev.h               |  31 +
>  mk/rte.app.mk                               |   1 +
>  14 files changed, 1367 insertions(+), 1 deletion(-)
>  create mode 100644 lib/librte_eal/common/include/rte_vmbus.h
>  create mode 100644 lib/librte_eal/linuxapp/eal/eal_vmbus.c
>
> diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile
> index 09a3d3af..ceb77bed 100644
> --- a/lib/librte_eal/common/Makefile
> +++ b/lib/librte_eal/common/Makefile
> @@ -33,7 +33,7 @@ include $(RTE_SDK)/mk/rte.vars.mk
>
>  INC := rte_branch_prediction.h rte_common.h
>  INC += rte_debug.h rte_eal.h rte_errno.h rte_launch.h rte_lcore.h
> -INC += rte_log.h rte_memory.h rte_memzone.h rte_pci.h
> +INC += rte_log.h rte_memory.h rte_memzone.h rte_pci.h rte_vmbus.h
>  INC += rte_per_lcore.h rte_random.h
>  INC += rte_tailq.h rte_interrupts.h rte_alarm.h
>  INC += rte_string_fns.h rte_version.h
> diff --git a/lib/librte_eal/common/eal_common_devargs.c b/lib/librte_eal/common/eal_common_devargs.c
> index e403717b..934ca840 100644
> --- a/lib/librte_eal/common/eal_common_devargs.c
> +++ b/lib/librte_eal/common/eal_common_devargs.c
> @@ -113,6 +113,13 @@ rte_eal_devargs_add(enum rte_devtype devtype, const char *devargs_str)
>                         goto fail;
>
>                 break;
> +       case RTE_DEVTYPE_WHITELISTED_VMBUS:
> +       case RTE_DEVTYPE_BLACKLISTED_VMBUS:
> +#ifdef RTE_LIBRTE_HV_PMD
> +               if (uuid_parse(buf, devargs->uuid) == 0)
> +                       break;
> +#endif
> +               goto fail;
>         }
>
>         free(buf);
> diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
> index f36bc556..1a2b418c 100644
> --- a/lib/librte_eal/common/eal_common_options.c
> +++ b/lib/librte_eal/common/eal_common_options.c
> @@ -95,6 +95,11 @@ eal_long_options[] = {
>         {OPT_VFIO_INTR,         1, NULL, OPT_VFIO_INTR_NUM        },
>         {OPT_VMWARE_TSC_MAP,    0, NULL, OPT_VMWARE_TSC_MAP_NUM   },
>         {OPT_XEN_DOM0,          0, NULL, OPT_XEN_DOM0_NUM         },
> +#ifdef RTE_LIBRTE_HV_PMD
> +       {OPT_NO_VMBUS,          0, NULL, OPT_NO_VMBUS_NUM         },
> +       {OPT_VMBUS_BLACKLIST,   1, NULL, OPT_VMBUS_BLACKLIST_NUM  },
> +       {OPT_VMBUS_WHITELIST,   1, NULL, OPT_VMBUS_WHITELIST_NUM  },
> +#endif
>         {0,                     0, NULL, 0                        }
>  };
>
> @@ -858,6 +863,21 @@ eal_parse_common_option(int opt, const char *optarg,
>                 conf->no_pci = 1;
>                 break;
>
> +#ifdef RTE_LIBRTE_HV_PMD
> +       case OPT_NO_VMBUS_NUM:
> +               conf->no_vmbus = 1;
> +               break;
> +       case OPT_VMBUS_BLACKLIST_NUM:
> +               if (rte_eal_devargs_add(RTE_DEVTYPE_BLACKLISTED_VMBUS,
> +                                       optarg) < 0)
> +                       return -1;
> +               break;
> +       case OPT_VMBUS_WHITELIST_NUM:
> +               if (rte_eal_devargs_add(RTE_DEVTYPE_WHITELISTED_VMBUS,
> +                               optarg) < 0)
> +                       return -1;
> +               break;
> +#endif
>         case OPT_NO_HPET_NUM:
>                 conf->no_hpet = 1;
>                 break;
> @@ -1017,6 +1037,14 @@ eal_check_common_options(struct internal_config *internal_cfg)
>                 return -1;
>         }
>
> +#ifdef RTE_LIBRTE_HV_PMD
> +       if (rte_eal_devargs_type_count(RTE_DEVTYPE_WHITELISTED_VMBUS) != 0 &&
> +               rte_eal_devargs_type_count(RTE_DEVTYPE_BLACKLISTED_VMBUS) != 0) {
> +               RTE_LOG(ERR, EAL, "Options vmbus blacklist and whitelist "
> +                       "cannot be used at the same time\n");
> +               return -1;
> +       }
> +#endif
>         return 0;
>  }
>
> @@ -1066,5 +1094,15 @@ eal_common_usage(void)
>                "  --"OPT_NO_PCI"            Disable PCI\n"
>                "  --"OPT_NO_HPET"           Disable HPET\n"
>                "  --"OPT_NO_SHCONF"         No shared config (mmap'd files)\n"
> +#ifdef RTE_LIBRTE_HV_PMD
> +              "  --"OPT_NO_VMBUS"          Disable VMBUS\n"
> +              "  --"OPT_VMBUS_BLACKLIST" Add a VMBUS device to black list.\n"
> +              "                      Prevent EAL from using this VMBUS device. The argument\n"
> +              "                      format is device UUID.\n"
> +              "  --"OPT_VMBUS_WHITELIST" Add a VMBUS device to white list.\n"
> +              "                      Only use the specified VMBUS devices. The argument format\n"
> +              "                      is device UUID. This option can be present\n"
> +              "                      several times (once per device).\n"
> +#endif
>                "\n", RTE_MAX_LCORE);
>  }
> diff --git a/lib/librte_eal/common/eal_internal_cfg.h b/lib/librte_eal/common/eal_internal_cfg.h
> index 5f1367eb..4b6af937 100644
> --- a/lib/librte_eal/common/eal_internal_cfg.h
> +++ b/lib/librte_eal/common/eal_internal_cfg.h
> @@ -67,6 +67,7 @@ struct internal_config {
>         unsigned hugepage_unlink;         /**< true to unlink backing files */
>         volatile unsigned xen_dom0_support; /**< support app running on Xen Dom0*/
>         volatile unsigned no_pci;         /**< true to disable PCI */
> +       volatile unsigned no_vmbus;       /**< true to disable VMBUS */
>         volatile unsigned no_hpet;        /**< true to disable HPET */
>         volatile unsigned vmware_tsc_map; /**< true to use VMware TSC mapping
>                                                                                 * instead of native TSC */
> diff --git a/lib/librte_eal/common/eal_options.h b/lib/librte_eal/common/eal_options.h
> index a881c62e..156727e7 100644
> --- a/lib/librte_eal/common/eal_options.h
> +++ b/lib/librte_eal/common/eal_options.h
> @@ -83,6 +83,12 @@ enum {
>         OPT_VMWARE_TSC_MAP_NUM,
>  #define OPT_XEN_DOM0          "xen-dom0"
>         OPT_XEN_DOM0_NUM,
> +#define OPT_NO_VMBUS          "no-vmbus"
> +       OPT_NO_VMBUS_NUM,
> +#define OPT_VMBUS_BLACKLIST   "vmbus-blacklist"
> +       OPT_VMBUS_BLACKLIST_NUM,
> +#define OPT_VMBUS_WHITELIST   "vmbus-whitelist"
> +       OPT_VMBUS_WHITELIST_NUM,
>         OPT_LONG_MAX_NUM
>  };
>
> diff --git a/lib/librte_eal/common/eal_private.h b/lib/librte_eal/common/eal_private.h
> index 9e7d8f6b..c856c63e 100644
> --- a/lib/librte_eal/common/eal_private.h
> +++ b/lib/librte_eal/common/eal_private.h
> @@ -210,6 +210,11 @@ int pci_uio_map_resource_by_index(struct rte_pci_device *dev, int res_idx,
>                 struct mapped_pci_resource *uio_res, int map_idx);
>
>  /**
> + * VMBUS related functions and structures
> + */
> +int rte_eal_vmbus_init(void);
> +
> +/**
>   * Init tail queues for non-EAL library structures. This is to allow
>   * the rings, mempools, etc. lists to be shared among multiple processes
>   *
> diff --git a/lib/librte_eal/common/include/rte_devargs.h b/lib/librte_eal/common/include/rte_devargs.h
> index 88120a1c..c079d289 100644
> --- a/lib/librte_eal/common/include/rte_devargs.h
> +++ b/lib/librte_eal/common/include/rte_devargs.h
> @@ -51,6 +51,9 @@ extern "C" {
>  #include <stdio.h>
>  #include <sys/queue.h>
>  #include <rte_pci.h>
> +#ifdef RTE_LIBRTE_HV_PMD
> +#include <uuid/uuid.h>
> +#endif
>
>  /**
>   * Type of generic device
> @@ -59,6 +62,8 @@ enum rte_devtype {
>         RTE_DEVTYPE_WHITELISTED_PCI,
>         RTE_DEVTYPE_BLACKLISTED_PCI,
>         RTE_DEVTYPE_VIRTUAL,
> +       RTE_DEVTYPE_WHITELISTED_VMBUS,
> +       RTE_DEVTYPE_BLACKLISTED_VMBUS,
>  };
>
>  /**
> @@ -88,6 +93,9 @@ struct rte_devargs {
>                         /** Driver name. */
>                         char drv_name[32];
>                 } virt;
> +#ifdef RTE_LIBRTE_HV_PMD
> +               uuid_t uuid;
> +#endif
>         };
>         /** Arguments string as given by user or "" for no argument. */
>         char *args;
> diff --git a/lib/librte_eal/common/include/rte_vmbus.h b/lib/librte_eal/common/include/rte_vmbus.h
> new file mode 100644
> index 00000000..f96d753e
> --- /dev/null
> +++ b/lib/librte_eal/common/include/rte_vmbus.h
> @@ -0,0 +1,249 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2013-2016 Brocade Communications Systems, Inc.
> + *   Copyright(c) 2016 Microsoft Corporation
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + *
> + */
> +
> +#ifndef _RTE_VMBUS_H_
> +#define _RTE_VMBUS_H_
> +
> +/**
> + * @file
> + *
> + * RTE VMBUS Interface
> + */
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <limits.h>
> +#include <errno.h>
> +#include <uuid/uuid.h>
> +#include <sys/queue.h>
> +#include <stdint.h>
> +#include <inttypes.h>
> +
> +#include <rte_debug.h>
> +#include <rte_interrupts.h>
> +#include <rte_dev.h>
> +
> +TAILQ_HEAD(vmbus_device_list, rte_vmbus_device);
> +TAILQ_HEAD(vmbus_driver_list, rte_vmbus_driver);
> +
> +extern struct vmbus_driver_list vmbus_driver_list;
> +extern struct vmbus_device_list vmbus_device_list;
> +
> +/** Pathname of VMBUS devices directory. */
> +#define SYSFS_VMBUS_DEVICES "/sys/bus/vmbus/devices"
> +
> +#define UUID_BUF_SZ    (36 + 1)
> +
> +
> +/** Maximum number of VMBUS resources. */
> +#define VMBUS_MAX_RESOURCE 7
> +
> +/**
> + * A structure describing a VMBUS device.
> + */
> +struct rte_vmbus_device {
> +       TAILQ_ENTRY(rte_vmbus_device) next;     /**< Next probed VMBUS device. */
> +       struct rte_device device;               /**< Inherit core device */
> +       uuid_t device_id;                       /**< VMBUS device id */
> +       uuid_t class_id;                        /**< VMBUS device type */
> +       uint32_t relid;                         /**< VMBUS id for notification */
> +       uint8_t monitor_id;
> +       struct rte_intr_handle intr_handle;     /**< Interrupt handle */
> +       const struct rte_vmbus_driver *driver;  /**< Associated driver */
> +
> +       struct rte_mem_resource mem_resource[VMBUS_MAX_RESOURCE];
> +                                               /**< VMBUS Memory Resource */
> +       char sysfs_name[];                      /**< Name in sysfs bus directory */
> +};
> +
> +struct rte_vmbus_driver;
> +
> +/**
> + * Initialisation function for the driver called during VMBUS probing.
> + */
> +typedef int (vmbus_probe_t)(struct rte_vmbus_driver *,
> +                           struct rte_vmbus_device *);
> +
> +/**
> + * Uninitialisation function for the driver called during hotplugging.
> + */
> +typedef int (vmbus_remove_t)(struct rte_vmbus_device *);
> +
> +/**
> + * A structure describing a VMBUS driver.
> + */
> +struct rte_vmbus_driver {
> +       TAILQ_ENTRY(rte_vmbus_driver) next;     /**< Next in list. */
> +       struct rte_driver driver;
> +       vmbus_probe_t *probe;                   /**< Device Probe function. */
> +       vmbus_remove_t *remove;                 /**< Device Remove function. */
> +
> +       const uuid_t *id_table;                 /**< ID table. */
> +};
> +
> +struct vmbus_map {
> +       void *addr;
> +       char *path;
> +       uint64_t offset;
> +       uint64_t size;
> +       uint64_t phaddr;
> +};
> +
> +/*
> + * For multi-process we need to reproduce all vmbus mappings in secondary
> + * processes, so save them in a tailq.
> + */
> +struct mapped_vmbus_resource {
> +       TAILQ_ENTRY(mapped_vmbus_resource) next;
> +
> +       uuid_t uuid;
> +       char path[PATH_MAX];
> +       int nb_maps;
> +       struct vmbus_map maps[VMBUS_MAX_RESOURCE];
> +};
> +
> +TAILQ_HEAD(mapped_vmbus_res_list, mapped_vmbus_resource);
> +
> +/**
> + * Scan the content of the VMBUS bus, and the devices in the devices list
> + *
> + * @return
> + *  0 on success, negative on error
> + */
> +int rte_eal_vmbus_scan(void);
> +
> +/**
> + * Probe the VMBUS bus for registered drivers.
> + *
> + * Scan the content of the VMBUS bus, and call the probe() function for
> + * all registered drivers that have a matching entry in its id_table
> + * for discovered devices.
> + *
> + * @return
> + *   - 0 on success.
> + *   - Negative on error.
> + */
> +int rte_eal_vmbus_probe(void);
> +
> +/**
> + * Map the VMBUS device resources in user space virtual memory address
> + *
> + * @param dev
> + *   A pointer to a rte_vmbus_device structure describing the device
> + *   to use
> + *
> + * @return
> + *   0 on success, negative on error and positive if no driver
> + *   is found for the device.
> + */
> +int rte_eal_vmbus_map_device(struct rte_vmbus_device *dev);
> +
> +/**
> + * Unmap this device
> + *
> + * @param dev
> + *   A pointer to a rte_vmbus_device structure describing the device
> + *   to use
> + */
> +void rte_eal_vmbus_unmap_device(struct rte_vmbus_device *dev);
> +
> +/**
> + * Probe the single VMBUS device.
> + *
> + * Scan the content of the VMBUS bus, and find the vmbus device
> + * specified by device uuid, then call the probe() function for
> + * registered driver that has a matching entry in its id_table for
> + * discovered device.
> + *
> + * @param id
> + *   The VMBUS device uuid.
> + * @return
> + *   - 0 on success.
> + *   - Negative on error.
> + */
> +int rte_eal_vmbus_probe_one(uuid_t id);
> +
> +/**
> + * Close the single VMBUS device.
> + *
> + * Scan the content of the VMBUS bus, and find the vmbus device id,
> + * then call the remove() function for registered driver that has a
> + * matching entry in its id_table for discovered device.
> + *
> + * @param id
> + *   The VMBUS device uuid.
> + * @return
> + *   - 0 on success.
> + *   - Negative on error.
> + */
> +int rte_eal_vmbus_detach(uuid_t id);
> +
> +/**
> + * Register a VMBUS driver.
> + *
> + * @param driver
> + *   A pointer to a rte_vmbus_driver structure describing the driver
> + *   to be registered.
> + */
> +void rte_eal_vmbus_register(struct rte_vmbus_driver *driver);
> +
> +/** Helper for VMBUS device registration from driver instance */
> +#define RTE_PMD_REGISTER_VMBUS(nm, vmbus_drv) \
> +RTE_INIT(vmbusinitfn_ ##nm); \
> +static void vmbusinitfn_ ##nm(void) \
> +{\
> +       (vmbus_drv).driver.name = RTE_STR(nm);\
> +       (vmbus_drv).driver.type = PMD_VMBUS; \
> +       rte_eal_vmbus_register(&vmbus_drv); \
> +} \
> +RTE_PMD_EXPORT_NAME(nm, __COUNTER__)
> +
> +/**
> + * Unregister a VMBUS driver.
> + *
> + * @param driver
> + *   A pointer to a rte_vmbus_driver structure describing the driver
> + *   to be unregistered.
> + */
> +void rte_eal_vmbus_unregister(struct rte_vmbus_driver *driver);
The register/unregister functions also need to be exported via the map file.
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_VMBUS_H_ */
> diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
> index 4e206f09..f6ca3848 100644
> --- a/lib/librte_eal/linuxapp/eal/Makefile
> +++ b/lib/librte_eal/linuxapp/eal/Makefile
> @@ -71,6 +71,11 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_timer.c
>  SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_interrupts.c
>  SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_alarm.c
>
> +ifeq ($(CONFIG_RTE_LIBRTE_HV_PMD),y)
> +SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_vmbus.c
> +LDLIBS += -luuid
> +endif
> +
>  # from common dir
>  SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_lcore.c
>  SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_timer.c
> @@ -114,6 +119,7 @@ CFLAGS_eal_hugepage_info.o := -D_GNU_SOURCE
>  CFLAGS_eal_pci.o := -D_GNU_SOURCE
>  CFLAGS_eal_pci_uio.o := -D_GNU_SOURCE
>  CFLAGS_eal_pci_vfio.o := -D_GNU_SOURCE
> +CFLAGS_eal_vmbus.o := -D_GNU_SOURCE
>  CFLAGS_eal_common_whitelist.o := -D_GNU_SOURCE
>  CFLAGS_eal_common_options.o := -D_GNU_SOURCE
>  CFLAGS_eal_common_thread.o := -D_GNU_SOURCE
> diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
> index 16dd5b9c..1bc0814a 100644
> --- a/lib/librte_eal/linuxapp/eal/eal.c
> +++ b/lib/librte_eal/linuxapp/eal/eal.c
> @@ -70,6 +70,9 @@
>  #include <rte_cpuflags.h>
>  #include <rte_interrupts.h>
>  #include <rte_pci.h>
> +#ifdef RTE_LIBRTE_HV_PMD
> +#include <rte_vmbus.h>
> +#endif
>  #include <rte_dev.h>
>  #include <rte_devargs.h>
>  #include <rte_common.h>
> @@ -830,6 +833,11 @@ rte_eal_init(int argc, char **argv)
>
>         eal_check_mem_on_local_socket();
>
> +#ifdef RTE_LIBRTE_HV_PMD
> +       if (rte_eal_vmbus_init() < 0)
> +               RTE_LOG(ERR, EAL, "Cannot init VMBUS\n");
> +#endif
> +
>         if (eal_plugins_init() < 0)
>                 rte_panic("Cannot init plugins\n");
>
> @@ -884,6 +892,11 @@ rte_eal_init(int argc, char **argv)
>         if (rte_eal_pci_probe())
>                 rte_panic("Cannot probe PCI\n");
>
> +#ifdef RTE_LIBRTE_HV_PMD
> +       if (rte_eal_vmbus_probe() < 0)
> +               rte_panic("Cannot probe VMBUS\n");
> +#endif
> +
>         if (rte_eal_dev_init() < 0)
>                 rte_panic("Cannot init pmd devices\n");
>
> diff --git a/lib/librte_eal/linuxapp/eal/eal_vmbus.c b/lib/librte_eal/linuxapp/eal/eal_vmbus.c
> new file mode 100644
> index 00000000..729f93a9
> --- /dev/null
> +++ b/lib/librte_eal/linuxapp/eal/eal_vmbus.c
> @@ -0,0 +1,911 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2013-2016 Brocade Communications Systems, Inc.
> + *   Copyright(c) 2016 Microsoft Corporation
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *      notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *      notice, this list of conditions and the following disclaimer in
> + *      the documentation and/or other materials provided with the
> + *      distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *      contributors may be used to endorse or promote products derived
> + *      from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + *
> + */
> +
> +#include <string.h>
> +#include <unistd.h>
> +#include <dirent.h>
> +#include <fcntl.h>
> +#include <sys/mman.h>
> +
> +#include <rte_eal.h>
> +#include <rte_tailq.h>
> +#include <rte_log.h>
> +#include <rte_devargs.h>
> +#include <rte_vmbus.h>
> +#include <rte_malloc.h>
> +
> +#include "eal_private.h"
> +#include "eal_pci_init.h"
> +#include "eal_filesystem.h"
> +
> +struct vmbus_driver_list vmbus_driver_list =
> +       TAILQ_HEAD_INITIALIZER(vmbus_driver_list);
> +struct vmbus_device_list vmbus_device_list =
> +       TAILQ_HEAD_INITIALIZER(vmbus_device_list);
> +
> +static void *vmbus_map_addr;
> +
> +static struct rte_tailq_elem rte_vmbus_uio_tailq = {
> +       .name = "UIO_RESOURCE_LIST",
> +};
> +EAL_REGISTER_TAILQ(rte_vmbus_uio_tailq);
> +
> +/*
> + * parse a sysfs file containing one UUID value
> + * the value may be wrapped in { }, which is stripped before parsing
> + */
> +static int
> +vmbus_get_sysfs_uuid(const char *filename, uuid_t uu)
> +{
> +       char buf[BUFSIZ];
> +       char *cp, *in = buf;
> +       FILE *f;
> +
> +       f = fopen(filename, "r");
> +       if (f == NULL) {
> +               RTE_LOG(ERR, EAL, "%s(): cannot open sysfs value %s\n",
> +                               __func__, filename);
> +               return -1;
> +       }
> +
> +       if (fgets(buf, sizeof(buf), f) == NULL) {
> +               RTE_LOG(ERR, EAL, "%s(): cannot read sysfs value %s\n",
> +                               __func__, filename);
> +               fclose(f);
> +               return -1;
> +       }
> +       fclose(f);
> +
> +       cp = strchr(buf, '\n');
> +       if (cp)
> +               *cp = '\0';
> +
> +       /* strip { } notation */
> +       if (buf[0] == '{') {
> +               in = buf + 1;
> +               cp = strchr(in, '}');
> +               if (cp)
> +                       *cp = '\0';
> +       }
> +
> +       if (uuid_parse(in, uu) < 0) {
> +               RTE_LOG(ERR, EAL, "%s %s not a valid UUID\n",
> +                       filename, buf);
> +               return -1;
> +       }
> +
> +       return 0;
> +}
> +
> +/* map a particular resource from a file */
> +static void *
> +vmbus_map_resource(void *requested_addr, int fd, off_t offset, size_t size,
> +                  int flags)
> +{
> +       void *mapaddr;
> +
> +       /* Map the memory resource of device */
> +       mapaddr = mmap(requested_addr, size, PROT_READ | PROT_WRITE,
> +                      MAP_SHARED | flags, fd, offset);
> +       if (mapaddr == MAP_FAILED ||
> +           (requested_addr != NULL && mapaddr != requested_addr)) {
> +               RTE_LOG(ERR, EAL,
> +                       "%s(): cannot mmap(%d, %p, 0x%lx, 0x%lx): %s)\n",
> +                       __func__, fd, requested_addr,
> +                       (unsigned long)size, (unsigned long)offset,
> +                       strerror(errno));
> +       } else
> +               RTE_LOG(DEBUG, EAL, "  VMBUS memory mapped at %p\n", mapaddr);
> +
> +       return mapaddr;
> +}
> +
> +/* unmap a particular resource */
> +static void
> +vmbus_unmap_resource(void *requested_addr, size_t size)
> +{
> +       if (requested_addr == NULL)
> +               return;
> +
> +       /* Unmap the VMBUS memory resource of device */
> +       if (munmap(requested_addr, size)) {
> +               RTE_LOG(ERR, EAL, "%s(): cannot munmap(%p, 0x%lx): %s\n",
> +                       __func__, requested_addr, (unsigned long)size,
> +                       strerror(errno));
> +       } else
> +               RTE_LOG(DEBUG, EAL, "  VMBUS memory unmapped at %p\n",
> +                               requested_addr);
> +}
> +
> +/* Only supports current kernel version
> + * Unlike PCI there is no option (or need) to create UIO device.
> + */
> +static int vmbus_get_uio_dev(const char *name,
> +                            char *dstbuf, size_t buflen)
> +{
> +       char dirname[PATH_MAX];
> +       unsigned int uio_num;
> +       struct dirent *e;
> +       DIR *dir;
> +
> +       snprintf(dirname, sizeof(dirname),
> +                "/sys/bus/vmbus/devices/%s/uio", name);
> +
> +       dir = opendir(dirname);
> +       if (dir == NULL) {
> +               RTE_LOG(ERR, EAL, "Cannot map uio resources for %s: %s\n",
> +                       name, strerror(errno));
> +               return -1;
> +       }
> +
> +       /* take the first file starting with "uio" */
> +       while ((e = readdir(dir)) != NULL) {
> +               if (sscanf(e->d_name, "uio%u", &uio_num) != 1)
> +                       continue;
> +
> +               snprintf(dstbuf, buflen, "%s/uio%u", dirname, uio_num);
> +               break;
> +       }
> +       closedir(dir);
> +
> +       return e ? (int) uio_num : -1;
> +}
> +
> +/*
> + * parse a sysfs file containing one integer value
> + * different to the eal version, as it needs to work with 64-bit values
> + */
> +static int
> +vmbus_parse_sysfs_value(const char *dir, const char *name,
> +                       uint64_t *val)
> +{
> +       char filename[PATH_MAX];
> +       FILE *f;
> +       char buf[BUFSIZ];
> +       char *end = NULL;
> +
> +       snprintf(filename, sizeof(filename), "%s/%s", dir, name);
> +       f = fopen(filename, "r");
> +       if (f == NULL) {
> +               RTE_LOG(ERR, EAL, "%s(): cannot open sysfs value %s\n",
> +                               __func__, filename);
> +               return -1;
> +       }
> +
> +       if (fgets(buf, sizeof(buf), f) == NULL) {
> +               RTE_LOG(ERR, EAL, "%s(): cannot read sysfs value %s\n",
> +                               __func__, filename);
> +               fclose(f);
> +               return -1;
> +       }
> +       fclose(f);
> +
> +       *val = strtoull(buf, &end, 0);
> +       if ((buf[0] == '\0') || (end == NULL) || (*end != '\n')) {
> +               RTE_LOG(ERR, EAL, "%s(): cannot parse sysfs value %s\n",
> +                               __func__, filename);
> +               return -1;
> +       }
> +       return 0;
> +}
> +
> +/* Get mappings out of values provided by uio */
> +static int
> +vmbus_uio_get_mappings(const char *uioname,
> +                      struct vmbus_map maps[])
> +{
> +       int i;
> +
> +       for (i = 0; i != VMBUS_MAX_RESOURCE; i++) {
> +               struct vmbus_map *map = &maps[i];
> +               char dirname[PATH_MAX];
> +
> +               /* check if map directory exists */
> +               snprintf(dirname, sizeof(dirname),
> +                        "%s/maps/map%d", uioname, i);
> +
> +               if (access(dirname, F_OK) != 0)
> +                       break;
> +
> +               /* get mapping offset */
> +               if (vmbus_parse_sysfs_value(dirname, "offset",
> +                                           &map->offset) < 0)
> +                       return -1;
> +
> +               /* get mapping size */
> +               if (vmbus_parse_sysfs_value(dirname, "size",
> +                                           &map->size) < 0)
> +                       return -1;
> +
> +               /* get mapping physical address */
> +               if (vmbus_parse_sysfs_value(dirname, "addr",
> +                                           &map->phaddr) < 0)
> +                       return -1;
> +       }
> +
> +       return i;
> +}
> +
> +static void
> +vmbus_uio_free_resource(struct rte_vmbus_device *dev,
> +               struct mapped_vmbus_resource *uio_res)
> +{
> +       rte_free(uio_res);
> +
> +       if (dev->intr_handle.fd >= 0) {
> +               close(dev->intr_handle.fd);
> +               dev->intr_handle.fd = -1;
> +               dev->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN;
> +       }
> +}
> +
> +static struct mapped_vmbus_resource *
> +vmbus_uio_alloc_resource(struct rte_vmbus_device *dev)
> +{
> +       struct mapped_vmbus_resource *uio_res;
> +       char dirname[PATH_MAX], devname[PATH_MAX];
> +       int uio_num, nb_maps;
> +
> +       uio_num = vmbus_get_uio_dev(dev->sysfs_name, dirname, sizeof(dirname));
> +       if (uio_num < 0) {
> +               RTE_LOG(WARNING, EAL,
> +                       "  %s not managed by UIO driver, skipping\n",
> +                       dev->sysfs_name);
> +               return NULL;
> +       }
> +
> +       /* allocate the mapping details for secondary processes */
> +       uio_res = rte_zmalloc("UIO_RES", sizeof(*uio_res), 0);
> +       if (uio_res == NULL) {
> +               RTE_LOG(ERR, EAL,
> +                       "%s(): cannot store uio mmap details\n", __func__);
> +               goto error;
> +       }
> +
> +       snprintf(devname, sizeof(devname), "/dev/uio%u", uio_num);
> +       dev->intr_handle.fd = open(devname, O_RDWR);
> +       if (dev->intr_handle.fd < 0) {
> +               RTE_LOG(ERR, EAL, "Cannot open %s: %s\n",
> +                       devname, strerror(errno));
> +               goto error;
> +       }
> +
> +       dev->intr_handle.type = RTE_INTR_HANDLE_UIO_INTX;
> +
> +       snprintf(uio_res->path, sizeof(uio_res->path), "%s", devname);
> +       uuid_copy(uio_res->uuid, dev->device_id);
> +
> +       nb_maps = vmbus_uio_get_mappings(dirname, uio_res->maps);
> +       if (nb_maps < 0)
> +               goto error;
> +
> +       RTE_LOG(DEBUG, EAL, "Found %d memory maps for device %s\n",
> +               nb_maps, dev->sysfs_name);
> +
> +       return uio_res;
> +
> + error:
> +       vmbus_uio_free_resource(dev, uio_res);
> +       return NULL;
> +}
> +
> +static int
> +vmbus_uio_map_resource_by_index(struct rte_vmbus_device *dev,
> +                               unsigned int res_idx,
> +                               struct mapped_vmbus_resource *uio_res,
> +                               unsigned int map_idx)
> +{
> +       struct vmbus_map *maps = uio_res->maps;
> +       char devname[PATH_MAX];
> +       void *mapaddr;
> +       int fd;
> +
> +       snprintf(devname, sizeof(devname),
> +                "/sys/bus/vmbus/%s/resource%u", dev->sysfs_name, res_idx);
> +
> +       fd = open(devname, O_RDWR);
> +       if (fd < 0) {
> +               RTE_LOG(ERR, EAL, "Cannot open %s: %s\n",
> +                               devname, strerror(errno));
> +               return -1;
> +       }
> +
> +       /* allocate memory to keep path */
> +       maps[map_idx].path = rte_malloc(NULL, strlen(devname) + 1, 0);
> +       if (maps[map_idx].path == NULL) {
> +               RTE_LOG(ERR, EAL, "Cannot allocate memory for path: %s\n",
> +                               strerror(errno));
> +               close(fd);
> +               return -1;
> +       }
> +
> +       /* try mapping somewhere close to the end of hugepages */
> +       if (vmbus_map_addr == NULL)
> +               vmbus_map_addr = pci_find_max_end_va();
> +
> +       mapaddr = vmbus_map_resource(vmbus_map_addr, fd, 0,
> +                                    dev->mem_resource[res_idx].len, 0);
> +       close(fd);
> +       if (mapaddr == MAP_FAILED) {
> +               rte_free(maps[map_idx].path);
> +               return -1;
> +       }
> +
> +       vmbus_map_addr = RTE_PTR_ADD(mapaddr,
> +                                    dev->mem_resource[res_idx].len);
> +
> +       maps[map_idx].phaddr = dev->mem_resource[res_idx].phys_addr;
> +       maps[map_idx].size = dev->mem_resource[res_idx].len;
> +       maps[map_idx].addr = mapaddr;
> +       maps[map_idx].offset = 0;
> +       strcpy(maps[map_idx].path, devname);
> +       dev->mem_resource[res_idx].addr = mapaddr;
> +
> +       return 0;
> +}
> +
> +static void
> +vmbus_uio_unmap(struct mapped_vmbus_resource *uio_res)
> +{
> +       int i;
> +
> +       if (uio_res == NULL)
> +               return;
> +
> +       for (i = 0; i != uio_res->nb_maps; i++) {
> +               vmbus_unmap_resource(uio_res->maps[i].addr,
> +                                    uio_res->maps[i].size);
> +
> +               if (rte_eal_process_type() == RTE_PROC_PRIMARY)
> +                       rte_free(uio_res->maps[i].path);
> +       }
> +}
> +
> +static struct mapped_vmbus_resource *
> +vmbus_uio_find_resource(struct rte_vmbus_device *dev)
> +{
> +       struct mapped_vmbus_resource *uio_res;
> +       struct mapped_vmbus_res_list *uio_res_list =
> +                       RTE_TAILQ_CAST(rte_vmbus_uio_tailq.head,
> +                                      mapped_vmbus_res_list);
> +
> +       if (dev == NULL)
> +               return NULL;
> +
> +       TAILQ_FOREACH(uio_res, uio_res_list, next) {
> +               if (uuid_compare(uio_res->uuid, dev->device_id) == 0)
> +                       return uio_res;
> +       }
> +       return NULL;
> +}
> +
> +/* unmap the VMBUS resource of a VMBUS device in virtual memory */
> +static void
> +vmbus_uio_unmap_resource(struct rte_vmbus_device *dev)
> +{
> +       struct mapped_vmbus_resource *uio_res;
> +       struct mapped_vmbus_res_list *uio_res_list =
> +                       RTE_TAILQ_CAST(rte_vmbus_uio_tailq.head,
> +                                      mapped_vmbus_res_list);
> +
> +       if (dev == NULL)
> +               return;
> +
> +       /* find an entry for the device */
> +       uio_res = vmbus_uio_find_resource(dev);
> +       if (uio_res == NULL)
> +               return;
> +
> +       /* secondary processes - just free maps */
> +       if (rte_eal_process_type() != RTE_PROC_PRIMARY)
> +               return vmbus_uio_unmap(uio_res);
> +
> +       TAILQ_REMOVE(uio_res_list, uio_res, next);
> +
> +       /* unmap all resources */
> +       vmbus_uio_unmap(uio_res);
> +
> +       /* free uio resource */
> +       rte_free(uio_res);
> +
> +       /* close fd if in primary process */
> +       close(dev->intr_handle.fd);
> +       if (dev->intr_handle.uio_cfg_fd >= 0) {
> +               close(dev->intr_handle.uio_cfg_fd);
> +               dev->intr_handle.uio_cfg_fd = -1;
> +       }
> +
> +       dev->intr_handle.fd = -1;
> +       dev->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN;
> +}
> +
> +static int
> +vmbus_uio_map_secondary(struct rte_vmbus_device *dev)
> +{
> +       struct mapped_vmbus_resource *uio_res;
> +       struct mapped_vmbus_res_list *uio_res_list =
> +                       RTE_TAILQ_CAST(rte_vmbus_uio_tailq.head,
> +                                      mapped_vmbus_res_list);
> +
> +       TAILQ_FOREACH(uio_res, uio_res_list, next) {
> +               int i;
> +
> +               /* skip this element if it doesn't match our id */
> +               if (uuid_compare(uio_res->uuid, dev->device_id))
> +                       continue;
> +
> +               for (i = 0; i != uio_res->nb_maps; i++) {
> +                       void *mapaddr;
> +                       int fd;
> +
> +                       fd = open(uio_res->maps[i].path, O_RDWR);
> +                       if (fd < 0) {
> +                               RTE_LOG(ERR, EAL, "Cannot open %s: %s\n",
> +                                       uio_res->maps[i].path, strerror(errno));
> +                               return -1;
> +                       }
> +
> +                       mapaddr = vmbus_map_resource(uio_res->maps[i].addr, fd,
> +                                                    uio_res->maps[i].offset,
> +                                                    uio_res->maps[i].size, 0);
> +                       /* fd is not needed in secondary process, close it */
> +                       close(fd);
> +
> +                       if (mapaddr == uio_res->maps[i].addr)
> +                               continue;
> +
> +                       RTE_LOG(ERR, EAL,
> +                               "Cannot mmap device resource file %s to address: %p\n",
> +                               uio_res->maps[i].path,
> +                               uio_res->maps[i].addr);
> +
> +                       /* unmap addrs correctly mapped */
> +                       while (i != 0) {
> +                               --i;
> +                               vmbus_unmap_resource(uio_res->maps[i].addr,
> +                                                    uio_res->maps[i].size);
> +                       }
> +                       return -1;
> +
> +               }
> +               return 0;
> +       }
> +
> +       RTE_LOG(ERR, EAL, "Cannot find resource for device\n");
> +       return 1;
> +}
> +
> +/* map the resources of a vmbus device in virtual memory */
> +int
> +rte_eal_vmbus_map_device(struct rte_vmbus_device *dev)
> +{
> +       struct mapped_vmbus_resource *uio_res;
> +       struct mapped_vmbus_res_list *uio_res_list =
> +               RTE_TAILQ_CAST(rte_vmbus_uio_tailq.head, mapped_vmbus_res_list);
> +       int i, ret, map_idx = 0;
> +
> +       dev->intr_handle.fd = -1;
> +       dev->intr_handle.uio_cfg_fd = -1;
> +       dev->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN;
> +
> +       /* secondary processes - use already recorded details */
> +       if (rte_eal_process_type() != RTE_PROC_PRIMARY)
> +               return vmbus_uio_map_secondary(dev);
> +
> +       /* allocate uio resource */
> +       uio_res = vmbus_uio_alloc_resource(dev);
> +       if (uio_res == NULL)
> +               return -1;
> +
> +       /* Map all BARs */
> +       for (i = 0; i != VMBUS_MAX_RESOURCE; i++) {
> +               uint64_t phaddr;
> +
> +               /* skip empty BAR */
> +               phaddr = dev->mem_resource[i].phys_addr;
> +               if (phaddr == 0)
> +                       continue;
> +
> +               ret = vmbus_uio_map_resource_by_index(dev, i,
> +                                                     uio_res, map_idx);
> +               if (ret)
> +                       goto error;
> +
> +               map_idx++;
> +       }
> +
> +       uio_res->nb_maps = map_idx;
> +
> +       TAILQ_INSERT_TAIL(uio_res_list, uio_res, next);
> +
> +       return 0;
> +error:
> +       for (i = 0; i < map_idx; i++) {
> +               vmbus_unmap_resource(uio_res->maps[i].addr,
> +                                    uio_res->maps[i].size);
> +               rte_free(uio_res->maps[i].path);
> +       }
> +       vmbus_uio_free_resource(dev, uio_res);
> +       return -1;
> +}
> +
> +/* Scan one vmbus sysfs entry, and fill the devices list from it. */
> +static int
> +vmbus_scan_one(const char *name)
> +{
> +       struct rte_vmbus_device *dev, *dev2;
> +       char filename[PATH_MAX];
> +       char dirname[PATH_MAX];
> +       unsigned long tmp;
> +
> +       dev = malloc(sizeof(*dev) + strlen(name) + 1);
> +       if (dev == NULL)
> +               return -1;
> +
> +       memset(dev, 0, sizeof(*dev));
> +       strcpy(dev->sysfs_name, name);
> +       if (dev->sysfs_name == NULL)
> +               goto error;
> +
> +       /* sysfs base directory
> +        *   /sys/bus/vmbus/devices/7a08391f-f5a0-4ac0-9802-d13fd964f8df
> +        * or on older kernel
> +        *   /sys/bus/vmbus/devices/vmbus_1
> +        */
> +       snprintf(dirname, sizeof(dirname), "%s/%s",
> +                SYSFS_VMBUS_DEVICES, name);
> +
> +       /* get device id */
> +       snprintf(filename, sizeof(filename), "%s/device_id", dirname);
> +       if (vmbus_get_sysfs_uuid(filename, dev->device_id) < 0)
> +               goto error;
> +
> +       /* get device class  */
> +       snprintf(filename, sizeof(filename), "%s/class_id", dirname);
> +       if (vmbus_get_sysfs_uuid(filename, dev->class_id) < 0)
> +               goto error;
> +
> +       /* get relid */
> +       snprintf(filename, sizeof(filename), "%s/id", dirname);
> +       if (eal_parse_sysfs_value(filename, &tmp) < 0)
> +               goto error;
> +       dev->relid = tmp;
> +
> +       /* get monitor id */
> +       snprintf(filename, sizeof(filename), "%s/monitor_id", dirname);
> +       if (eal_parse_sysfs_value(filename, &tmp) < 0)
> +               goto error;
> +       dev->monitor_id = tmp;
> +
> +       /* get numa node */
> +       snprintf(filename, sizeof(filename), "%s/numa_node",
> +                dirname);
> +       if (eal_parse_sysfs_value(filename, &tmp) < 0)
> +               /* if no NUMA support, set default to 0 */
> +               dev->device.numa_node = 0;
> +       else
> +               dev->device.numa_node = tmp;
> +
> +       /* device is valid, add in list (sorted) */
> +       RTE_LOG(DEBUG, EAL, "Adding vmbus device %s\n", name);
> +
> +       TAILQ_FOREACH(dev2, &vmbus_device_list, next) {
> +               int ret;
> +
> +               ret = uuid_compare(dev->device_id, dev2->device_id);
> +               if (ret > 0)
> +                       continue;
> +
> +               if (ret < 0) {
> +                       TAILQ_INSERT_BEFORE(dev2, dev, next);
> +                       rte_eal_device_insert(&dev->device);
> +               } else { /* already registered */
> +                       memmove(dev2->mem_resource, dev->mem_resource,
> +                               sizeof(dev->mem_resource));
> +                       free(dev);
> +               }
> +               return 0;
> +       }
> +
> +       rte_eal_device_insert(&dev->device);
> +       TAILQ_INSERT_TAIL(&vmbus_device_list, dev, next);
> +
> +       return 0;
> +error:
> +       free(dev);
> +       return -1;
> +}
> +
> +/*
> + * Scan the content of the vmbus, and add the devices to the devices list
> + */
> +static int
> +vmbus_scan(void)
> +{
> +       struct dirent *e;
> +       DIR *dir;
> +
> +       dir = opendir(SYSFS_VMBUS_DEVICES);
> +       if (dir == NULL) {
> +               if (errno == ENOENT)
> +                       return 0;
> +
> +               RTE_LOG(ERR, EAL, "%s(): opendir failed: %s\n",
> +                       __func__, strerror(errno));
> +               return -1;
> +       }
> +
> +       while ((e = readdir(dir)) != NULL) {
> +               if (e->d_name[0] == '.')
> +                       continue;
> +
> +               if (vmbus_scan_one(e->d_name) < 0)
> +                       goto error;
> +       }
> +       closedir(dir);
> +       return 0;
> +
> +error:
> +       closedir(dir);
> +       return -1;
> +}
> +
> +/* Init the VMBUS EAL subsystem */
> +int rte_eal_vmbus_init(void)
> +{
> +       /* VMBUS can be disabled */
> +       if (internal_config.no_vmbus)
> +               return 0;
> +
> +       if (vmbus_scan() < 0) {
> +               RTE_LOG(ERR, EAL, "%s(): Cannot scan vmbus\n", __func__);
> +               return -1;
> +       }
> +       return 0;
> +}
> +
> +/* Below is PROBE part of eal_vmbus library */
> +
> +/*
> + * If the class ID matches, call the probe() function of the driver.
> + */
> +static int
> +rte_eal_vmbus_probe_one_driver(struct rte_vmbus_driver *dr,
> +                              struct rte_vmbus_device *dev)
> +{
> +       const uuid_t *id_table;
> +
> +       RTE_LOG(DEBUG, EAL, "  probe driver: %s\n", dr->driver.name);
> +
> +       for (id_table = dr->id_table; !uuid_is_null(*id_table); ++id_table) {
> +               struct rte_devargs *args;
> +               char guid[UUID_BUF_SZ];
> +               int ret;
> +
> +               /* skip devices not associated with this device class */
> +               if (uuid_compare(*id_table, dev->class_id) != 0)
> +                       continue;
> +
> +               uuid_unparse(dev->device_id, guid);
> +               RTE_LOG(INFO, EAL, "VMBUS device %s on NUMA socket %i\n",
> +                       guid, dev->device.numa_node);
> +
> +               /* no initialization when blacklisted, return without error */
> +               args = dev->device.devargs;
> +               if (args && args->type == RTE_DEVTYPE_BLACKLISTED_VMBUS) {
> +                       RTE_LOG(INFO, EAL, "  Device is blacklisted, not initializing\n");
> +                       return 1;
> +               }
> +
> +               RTE_LOG(INFO, EAL, "  probe driver: %s\n", dr->driver.name);
> +
> +               /* map resources for device */
> +               ret = rte_eal_vmbus_map_device(dev);
> +               if (ret != 0)
> +                       return ret;
> +
> +               /* reference driver structure */
> +               dev->driver = dr;
> +
> +               /* call the driver probe() function */
> +               ret = dr->probe(dr, dev);
> +               if (ret)
> +                       dev->driver = NULL;
> +
> +               return ret;
> +       }
> +
> +       /* return positive value if driver doesn't support this device */
> +       return 1;
> +}
> +
> +
> +/*
> + * If the class ID matches, call the remove() function of the
> + * driver.
> + */
> +static int
> +vmbus_detach_dev(struct rte_vmbus_driver *dr,
> +                struct rte_vmbus_device *dev)
> +{
> +       const uuid_t *id_table;
> +
> +       for (id_table = dr->id_table; !uuid_is_null(*id_table); ++id_table) {
> +               char guid[UUID_BUF_SZ];
> +
> +               /* skip devices not associated with this device class */
> +               if (uuid_compare(*id_table, dev->class_id) != 0)
> +                       continue;
> +
> +               uuid_unparse(dev->device_id, guid);
> +               RTE_LOG(INFO, EAL, "VMBUS device %s on NUMA socket %i\n",
> +                       guid, dev->device.numa_node);
> +
> +               RTE_LOG(DEBUG, EAL, "  remove driver: %s\n", dr->driver.name);
> +
> +               if (dr->remove && (dr->remove(dev) < 0))
> +                       return -1;      /* negative value is an error */
> +
> +               /* clear driver structure */
> +               dev->driver = NULL;
> +
> +               vmbus_uio_unmap_resource(dev);
> +               return 0;
> +       }
> +
> +       /* return positive value if driver doesn't support this device */
> +       return 1;
> +}
> +
> +/*
> + * Call the probe() function of all registered drivers for the vmbus
> + * device. Return 0 once a driver claims the device, -1 if a driver
> + * fails to probe it, and 1 if no registered driver supports this
> + * class of vmbus device.
> + */
> +static int
> +vmbus_probe_all_drivers(struct rte_vmbus_device *dev)
> +{
> +       struct rte_vmbus_driver *dr = NULL;
> +       int ret;
> +
> +       TAILQ_FOREACH(dr, &vmbus_driver_list, next) {
> +               ret = rte_eal_vmbus_probe_one_driver(dr, dev);
> +               if (ret < 0) {
> +                       /* negative value is an error */
> +                       RTE_LOG(ERR, EAL, "Failed to probe driver %s\n",
> +                               dr->driver.name);
> +                       return -1;
> +               }
> +               /* positive value means driver doesn't support it */
> +               if (ret > 0)
> +                       continue;
> +
> +               return 0;
> +       }
> +
> +       return 1;
> +}
> +
> +
> +/*
> + * If the class ID matches, call the remove() function of all
> + * registered drivers for the given device. Return -1 if removal
> + * failed, return 1 if no driver is found for this device.
> + */
> +static int
> +vmbus_detach_all_drivers(struct rte_vmbus_device *dev)
> +{
> +       struct rte_vmbus_driver *dr;
> +       int rc = 0;
> +
> +       if (dev == NULL)
> +               return -1;
> +
> +       TAILQ_FOREACH(dr, &vmbus_driver_list, next) {
> +               rc = vmbus_detach_dev(dr, dev);
> +               if (rc < 0)
> +                       /* negative value is an error */
> +                       return -1;
> +               if (rc > 0)
> +                       /* positive value means driver doesn't support it */
> +                       continue;
> +               return 0;
> +       }
> +       return 1;
> +}
> +
> +/* Detach device specified by its VMBUS id */
> +int
> +rte_eal_vmbus_detach(uuid_t device_id)
> +{
> +       struct rte_vmbus_device *dev;
> +       char ubuf[UUID_BUF_SZ];
> +
> +       TAILQ_FOREACH(dev, &vmbus_device_list, next) {
> +               if (uuid_compare(dev->device_id, device_id) != 0)
> +                       continue;
> +
> +               if (vmbus_detach_all_drivers(dev) < 0)
> +                       goto err_return;
> +
> +               TAILQ_REMOVE(&vmbus_device_list, dev, next);
> +               free(dev);
> +               return 0;
> +       }
> +       return -1;
> +
> +err_return:
> +       uuid_unparse(device_id, ubuf);
> +       RTE_LOG(WARNING, EAL, "Requested device %s cannot be used\n",
> +               ubuf);
> +       return -1;
> +}
> +
> +/*
> + * Walk the list of discovered vmbus devices, and call the probe()
> + * function of every registered driver that has a matching entry in
> + * its id_table.
> + */
> +int
> +rte_eal_vmbus_probe(void)
> +{
> +       struct rte_vmbus_device *dev = NULL;
> +
> +       TAILQ_FOREACH(dev, &vmbus_device_list, next) {
> +               char ubuf[UUID_BUF_SZ];
> +
> +               uuid_unparse(dev->device_id, ubuf);
> +
> +               RTE_LOG(DEBUG, EAL, "Probing driver for device %s ...\n",
> +                       ubuf);
> +               vmbus_probe_all_drivers(dev);
> +       }
> +       return 0;
> +}
> +
> +/* register vmbus driver */
> +void
> +rte_eal_vmbus_register(struct rte_vmbus_driver *driver)
> +{
> +       TAILQ_INSERT_TAIL(&vmbus_driver_list, driver, next);
> +}
> +
> +/* unregister vmbus driver */
> +void
> +rte_eal_vmbus_unregister(struct rte_vmbus_driver *driver)
> +{
> +       TAILQ_REMOVE(&vmbus_driver_list, driver, next);
> +}
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index 7c212096..b69af0f0 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -3334,3 +3334,93 @@ rte_eth_dev_l2_tunnel_offload_set(uint8_t port_id,
>                                 -ENOTSUP);
>         return (*dev->dev_ops->l2_tunnel_offload_set)(dev, l2_tunnel, mask, en);
>  }
> +
> +
> +#ifdef RTE_LIBRTE_HV_PMD
> +int
> +rte_eth_dev_vmbus_probe(struct rte_vmbus_driver *vmbus_drv,
> +                       struct rte_vmbus_device *vmbus_dev)
> +{
> +       struct eth_driver  *eth_drv = (struct eth_driver *)vmbus_drv;
> +       struct rte_eth_dev *eth_dev;
> +       char ustr[UUID_BUF_SZ];
> +       int diag;
> +
> +       uuid_unparse(vmbus_dev->device_id, ustr);
> +
> +       eth_dev = rte_eth_dev_allocate(ustr);
> +       if (eth_dev == NULL)
> +               return -ENOMEM;
> +
> +       if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> +               eth_dev->data->dev_private = rte_zmalloc("ethdev private structure",
> +                                 eth_drv->dev_private_size,
> +                                 RTE_CACHE_LINE_SIZE);
> +               if (eth_dev->data->dev_private == NULL)
> +                       rte_panic("Cannot allocate memzone for private port data\n");
> +       }
> +
> +       eth_dev->device = &vmbus_dev->device;
> +       eth_dev->driver = eth_drv;
> +       eth_dev->data->rx_mbuf_alloc_failed = 0;
> +
> +       /* init user callbacks */
> +       TAILQ_INIT(&(eth_dev->link_intr_cbs));
> +
> +       /*
> +        * Set the default maximum frame size.
> +        */
> +       eth_dev->data->mtu = ETHER_MTU;
> +
> +       /* Invoke PMD device initialization function */
> +       diag = (*eth_drv->eth_dev_init)(eth_dev);
> +       if (diag == 0)
> +               return 0;
> +
> +       RTE_PMD_DEBUG_TRACE("driver %s: eth_dev_init(%s) failed\n",
> +                           vmbus_drv->driver.name, ustr);
> +
> +       if (rte_eal_process_type() == RTE_PROC_PRIMARY)
> +               rte_free(eth_dev->data->dev_private);
> +
> +       return diag;
> +}
> +
> +int
> +rte_eth_dev_vmbus_remove(struct rte_vmbus_device *vmbus_dev)
> +{
> +       const struct eth_driver *eth_drv;
> +       struct rte_eth_dev *eth_dev;
> +       char ustr[UUID_BUF_SZ];
> +       int ret;
> +
> +       if (vmbus_dev == NULL)
> +               return -EINVAL;
> +
> +       uuid_unparse(vmbus_dev->device_id, ustr);
> +       eth_dev = rte_eth_dev_allocated(ustr);
> +       if (eth_dev == NULL)
> +               return -ENODEV;
> +
> +       eth_drv = (const struct eth_driver *)vmbus_dev->driver;
> +
> +       /* Invoke PMD device uninit function */
> +       if (*eth_drv->eth_dev_uninit) {
> +               ret = (*eth_drv->eth_dev_uninit)(eth_dev);
> +               if (ret)
> +                       return ret;
> +       }
> +
> +       /* free ether device */
> +       rte_eth_dev_release_port(eth_dev);
> +
> +       if (rte_eal_process_type() == RTE_PROC_PRIMARY)
> +               rte_free(eth_dev->data->dev_private);
> +
> +       eth_dev->device = NULL;
> +       eth_dev->driver = NULL;
> +       eth_dev->data = NULL;
> +
> +       return 0;
> +}
> +#endif
> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> index 1a62a322..2a8c1eed 100644
> --- a/lib/librte_ether/rte_ethdev.h
> +++ b/lib/librte_ether/rte_ethdev.h
> @@ -180,6 +180,9 @@ extern "C" {
>  #include <rte_log.h>
>  #include <rte_interrupts.h>
>  #include <rte_pci.h>
> +#ifdef RTE_LIBRTE_HV_PMD
> +#include <rte_vmbus.h>
> +#endif
>  #include <rte_dev.h>
>  #include <rte_devargs.h>
>  #include <rte_errno.h>
> @@ -1908,6 +1911,17 @@ struct rte_pci_eth_driver {
>         struct eth_driver       eth_drv;        /**< Ethernet driver. */
>  };
>
> +#ifdef RTE_LIBRTE_HV_PMD
> +/**
> + * @internal
> + * The structure associated with a PMD VMBUS Ethernet driver.
> + */
> +struct rte_vmbus_eth_driver {
> +       struct rte_vmbus_driver vmbus_drv;      /**< Underlying VMBUS driver. */
> +       struct eth_driver       eth_drv;        /**< Ethernet driver. */
> +};
> +#endif
> +
>  /**
>   * Convert a numerical speed in Mbps to a bitmap flag that can be used in
>   * the bitmap link_speeds of the struct rte_eth_conf
> @@ -4543,6 +4557,23 @@ int rte_eth_dev_pci_probe(struct rte_pci_driver *pci_drv,
>   */
>  int rte_eth_dev_pci_remove(struct rte_pci_device *pci_dev);
>
> +#ifdef RTE_LIBRTE_HV_PMD
> +/**
> + * @internal
> + * Wrapper for use by vmbus drivers as a .probe function to attach to an ethdev
> + * interface.
> + */
> +int rte_eth_dev_vmbus_probe(struct rte_vmbus_driver *vmbus_drv,
> +                         struct rte_vmbus_device *vmbus_dev);
> +
> +/**
> + * @internal
> + * Wrapper for use by vmbus drivers as a .remove function to detach an ethdev
> + * interface.
> + */
> +int rte_eth_dev_vmbus_remove(struct rte_vmbus_device *vmbus_dev);
> +#endif
> +
>  #ifdef __cplusplus
>  }
>  #endif
> diff --git a/mk/rte.app.mk b/mk/rte.app.mk
> index f75f0e24..6b304084 100644
> --- a/mk/rte.app.mk
> +++ b/mk/rte.app.mk
> @@ -130,6 +130,7 @@ ifeq ($(CONFIG_RTE_LIBRTE_VHOST),y)
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_VHOST)      += -lrte_pmd_vhost
>  endif # $(CONFIG_RTE_LIBRTE_VHOST)
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD)    += -lrte_pmd_vmxnet3_uio
> +_LDLIBS-$(CONFIG_RTE_LIBRTE_HV_PMD)        += -luuid
>
>  ifeq ($(CONFIG_RTE_LIBRTE_CRYPTODEV),y)
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_AESNI_MB)    += -lrte_pmd_aesni_mb
> --
> 2.11.0
>
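
On vmbus_parse_sysfs_value() in the quoted eal_vmbus.c: the 64-bit parse
relies on strtoull() and only checks the end pointer against '\n'. A minimal
sketch of a stricter variant follows, assuming the value has already been
read into a buffer. The name parse_sysfs_u64() is hypothetical and is not
part of the patch; it additionally rejects out-of-range input, a leading
minus sign (which strtoull() would silently wrap), and input with no digits.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not in the patch): parse one line of sysfs text
 * into a 64-bit value. Accepts decimal, octal and hex (base 0), with an
 * optional trailing newline. Returns 0 on success, -1 on any error.
 */
static int
parse_sysfs_u64(const char *buf, uint64_t *val)
{
	char *end = NULL;
	unsigned long long tmp;

	if (buf == NULL || val == NULL || buf[0] == '\0')
		return -1;

	/* strtoull() accepts "-1" and wraps it around; refuse that here */
	if (strchr(buf, '-') != NULL)
		return -1;

	errno = 0;
	tmp = strtoull(buf, &end, 0);
	if (errno != 0)		/* ERANGE on overflow */
		return -1;
	if (end == buf)		/* no digits consumed */
		return -1;
	if (*end != '\n' && *end != '\0')	/* trailing garbage */
		return -1;

	*val = tmp;
	return 0;
}
```

With this shape the fopen()/fgets() handling stays in the caller, so the
parsing rules can be exercised without touching sysfs.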
^ permalink raw reply	[flat|nested] 30+ messages in thread
- * Re: [dpdk-dev] [PATCH 8/8] eal: VMBUS infrastructure
  2017-01-11 14:49   ` Jan Blunck
@ 2017-01-11 21:13     ` Jan Blunck
  2017-01-12  1:20       ` Stephen Hemminger
  0 siblings, 1 reply; 30+ messages in thread
From: Jan Blunck @ 2017-01-11 21:13 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev, Stephen Hemminger
On Wed, Jan 11, 2017 at 3:49 PM, Jan Blunck <jblunck@infradead.org> wrote:
> On Sat, Jan 7, 2017 at 7:17 PM, Stephen Hemminger
> <stephen@networkplumber.org> wrote:
>> Add support for VMBUS on Hyper-V/Azure. VMBUS is similar to PCI
>> but has different addressing and internal APIs.
>>
>> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
>> ---
>>  lib/librte_eal/common/Makefile              |   2 +-
>>  lib/librte_eal/common/eal_common_devargs.c  |   7 +
>>  lib/librte_eal/common/eal_common_options.c  |  38 ++
>>  lib/librte_eal/common/eal_internal_cfg.h    |   1 +
>>  lib/librte_eal/common/eal_options.h         |   6 +
>>  lib/librte_eal/common/eal_private.h         |   5 +
>>  lib/librte_eal/common/include/rte_devargs.h |   8 +
>>  lib/librte_eal/common/include/rte_vmbus.h   | 249 ++++++++
>>  lib/librte_eal/linuxapp/eal/Makefile        |   6 +
>>  lib/librte_eal/linuxapp/eal/eal.c           |  13 +
>>  lib/librte_eal/linuxapp/eal/eal_vmbus.c     | 911 ++++++++++++++++++++++++++++
>>  lib/librte_ether/rte_ethdev.c               |  90 +++
>>  lib/librte_ether/rte_ethdev.h               |  31 +
>>  mk/rte.app.mk                               |   1 +
>>  14 files changed, 1367 insertions(+), 1 deletion(-)
>>  create mode 100644 lib/librte_eal/common/include/rte_vmbus.h
>>  create mode 100644 lib/librte_eal/linuxapp/eal/eal_vmbus.c
>>
>> diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile
>> index 09a3d3af..ceb77bed 100644
>> --- a/lib/librte_eal/common/Makefile
>> +++ b/lib/librte_eal/common/Makefile
>> @@ -33,7 +33,7 @@ include $(RTE_SDK)/mk/rte.vars.mk
>>
>>  INC := rte_branch_prediction.h rte_common.h
>>  INC += rte_debug.h rte_eal.h rte_errno.h rte_launch.h rte_lcore.h
>> -INC += rte_log.h rte_memory.h rte_memzone.h rte_pci.h
>> +INC += rte_log.h rte_memory.h rte_memzone.h rte_pci.h rte_vmbus.h
>>  INC += rte_per_lcore.h rte_random.h
>>  INC += rte_tailq.h rte_interrupts.h rte_alarm.h
>>  INC += rte_string_fns.h rte_version.h
>> diff --git a/lib/librte_eal/common/eal_common_devargs.c b/lib/librte_eal/common/eal_common_devargs.c
>> index e403717b..934ca840 100644
>> --- a/lib/librte_eal/common/eal_common_devargs.c
>> +++ b/lib/librte_eal/common/eal_common_devargs.c
>> @@ -113,6 +113,13 @@ rte_eal_devargs_add(enum rte_devtype devtype, const char *devargs_str)
>>                         goto fail;
>>
>>                 break;
>> +       case RTE_DEVTYPE_WHITELISTED_VMBUS:
>> +       case RTE_DEVTYPE_BLACKLISTED_VMBUS:
>> +#ifdef RTE_LIBRTE_HV_PMD
>> +               if (uuid_parse(buf, devargs->uuid) == 0)
>> +                       break;
>> +#endif
>> +               goto fail;
>>         }
>>
>>         free(buf);
>> diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
>> index f36bc556..1a2b418c 100644
>> --- a/lib/librte_eal/common/eal_common_options.c
>> +++ b/lib/librte_eal/common/eal_common_options.c
>> @@ -95,6 +95,11 @@ eal_long_options[] = {
>>         {OPT_VFIO_INTR,         1, NULL, OPT_VFIO_INTR_NUM        },
>>         {OPT_VMWARE_TSC_MAP,    0, NULL, OPT_VMWARE_TSC_MAP_NUM   },
>>         {OPT_XEN_DOM0,          0, NULL, OPT_XEN_DOM0_NUM         },
>> +#ifdef RTE_LIBRTE_HV_PMD
>> +       {OPT_NO_VMBUS,          0, NULL, OPT_NO_VMBUS_NUM         },
>> +       {OPT_VMBUS_BLACKLIST,   1, NULL, OPT_VMBUS_BLACKLIST_NUM  },
>> +       {OPT_VMBUS_WHITELIST,   1, NULL, OPT_VMBUS_WHITELIST_NUM  },
>> +#endif
>>         {0,                     0, NULL, 0                        }
>>  };
>>
>> @@ -858,6 +863,21 @@ eal_parse_common_option(int opt, const char *optarg,
>>                 conf->no_pci = 1;
>>                 break;
>>
>> +#ifdef RTE_LIBRTE_HV_PMD
>> +       case OPT_NO_VMBUS_NUM:
>> +               conf->no_vmbus = 1;
>> +               break;
>> +       case OPT_VMBUS_BLACKLIST_NUM:
>> +               if (rte_eal_devargs_add(RTE_DEVTYPE_BLACKLISTED_VMBUS,
>> +                                       optarg) < 0)
>> +                       return -1;
>> +               break;
>> +       case OPT_VMBUS_WHITELIST_NUM:
>> +               if (rte_eal_devargs_add(RTE_DEVTYPE_WHITELISTED_VMBUS,
>> +                               optarg) < 0)
>> +                       return -1;
>> +               break;
>> +#endif
>>         case OPT_NO_HPET_NUM:
>>                 conf->no_hpet = 1;
>>                 break;
>> @@ -1017,6 +1037,14 @@ eal_check_common_options(struct internal_config *internal_cfg)
>>                 return -1;
>>         }
>>
>> +#ifdef RTE_LIBRTE_HV_PMD
>> +       if (rte_eal_devargs_type_count(RTE_DEVTYPE_WHITELISTED_VMBUS) != 0 &&
>> +               rte_eal_devargs_type_count(RTE_DEVTYPE_BLACKLISTED_VMBUS) != 0) {
>> +               RTE_LOG(ERR, EAL, "Options vmbus blacklist and whitelist "
>> +                       "cannot be used at the same time\n");
>> +               return -1;
>> +       }
>> +#endif
>>         return 0;
>>  }
>>
>> @@ -1066,5 +1094,15 @@ eal_common_usage(void)
>>                "  --"OPT_NO_PCI"            Disable PCI\n"
>>                "  --"OPT_NO_HPET"           Disable HPET\n"
>>                "  --"OPT_NO_SHCONF"         No shared config (mmap'd files)\n"
>> +#ifdef RTE_LIBRTE_HV_PMD
>> +              "  --"OPT_NO_VMBUS"          Disable VMBUS\n"
>> +              "  --"OPT_VMBUS_BLACKLIST" Add a VMBUS device to black list.\n"
>> +              "                      Prevent EAL from using this VMBUS device. The argument\n"
>> +              "                      format is device UUID.\n"
>> +              "  --"OPT_VMBUS_WHITELIST" Add a VMBUS device to white list.\n"
>> +              "                      Only use the specified VMBUS devices. The argument format\n"
>> +              "                      is device UUID. This option can be present\n"
>> +              "                      several times (once per device).\n"
>> +#endif
>>                "\n", RTE_MAX_LCORE);
>>  }
>> diff --git a/lib/librte_eal/common/eal_internal_cfg.h b/lib/librte_eal/common/eal_internal_cfg.h
>> index 5f1367eb..4b6af937 100644
>> --- a/lib/librte_eal/common/eal_internal_cfg.h
>> +++ b/lib/librte_eal/common/eal_internal_cfg.h
>> @@ -67,6 +67,7 @@ struct internal_config {
>>         unsigned hugepage_unlink;         /**< true to unlink backing files */
>>         volatile unsigned xen_dom0_support; /**< support app running on Xen Dom0*/
>>         volatile unsigned no_pci;         /**< true to disable PCI */
>> +       volatile unsigned no_vmbus;       /**< true to disable VMBUS */
>>         volatile unsigned no_hpet;        /**< true to disable HPET */
>>         volatile unsigned vmware_tsc_map; /**< true to use VMware TSC mapping
>>                                                                                 * instead of native TSC */
>> diff --git a/lib/librte_eal/common/eal_options.h b/lib/librte_eal/common/eal_options.h
>> index a881c62e..156727e7 100644
>> --- a/lib/librte_eal/common/eal_options.h
>> +++ b/lib/librte_eal/common/eal_options.h
>> @@ -83,6 +83,12 @@ enum {
>>         OPT_VMWARE_TSC_MAP_NUM,
>>  #define OPT_XEN_DOM0          "xen-dom0"
>>         OPT_XEN_DOM0_NUM,
>> +#define OPT_NO_VMBUS          "no-vmbus"
>> +       OPT_NO_VMBUS_NUM,
>> +#define OPT_VMBUS_BLACKLIST   "vmbus-blacklist"
>> +       OPT_VMBUS_BLACKLIST_NUM,
>> +#define OPT_VMBUS_WHITELIST   "vmbus-whitelist"
>> +       OPT_VMBUS_WHITELIST_NUM,
>>         OPT_LONG_MAX_NUM
>>  };
>>
>> diff --git a/lib/librte_eal/common/eal_private.h b/lib/librte_eal/common/eal_private.h
>> index 9e7d8f6b..c856c63e 100644
>> --- a/lib/librte_eal/common/eal_private.h
>> +++ b/lib/librte_eal/common/eal_private.h
>> @@ -210,6 +210,11 @@ int pci_uio_map_resource_by_index(struct rte_pci_device *dev, int res_idx,
>>                 struct mapped_pci_resource *uio_res, int map_idx);
>>
>>  /**
>> + * VMBUS related functions and structures
>> + */
>> +int rte_eal_vmbus_init(void);
>> +
>> +/**
>>   * Init tail queues for non-EAL library structures. This is to allow
>>   * the rings, mempools, etc. lists to be shared among multiple processes
>>   *
>> diff --git a/lib/librte_eal/common/include/rte_devargs.h b/lib/librte_eal/common/include/rte_devargs.h
>> index 88120a1c..c079d289 100644
>> --- a/lib/librte_eal/common/include/rte_devargs.h
>> +++ b/lib/librte_eal/common/include/rte_devargs.h
>> @@ -51,6 +51,9 @@ extern "C" {
>>  #include <stdio.h>
>>  #include <sys/queue.h>
>>  #include <rte_pci.h>
>> +#ifdef RTE_LIBRTE_HV_PMD
>> +#include <uuid/uuid.h>
>> +#endif
>>
>>  /**
>>   * Type of generic device
>> @@ -59,6 +62,8 @@ enum rte_devtype {
>>         RTE_DEVTYPE_WHITELISTED_PCI,
>>         RTE_DEVTYPE_BLACKLISTED_PCI,
>>         RTE_DEVTYPE_VIRTUAL,
>> +       RTE_DEVTYPE_WHITELISTED_VMBUS,
>> +       RTE_DEVTYPE_BLACKLISTED_VMBUS,
>>  };
>>
>>  /**
>> @@ -88,6 +93,9 @@ struct rte_devargs {
>>                         /** Driver name. */
>>                         char drv_name[32];
>>                 } virt;
>> +#ifdef RTE_LIBRTE_HV_PMD
>> +               uuid_t uuid;
>> +#endif
>>         };
>>         /** Arguments string as given by user or "" for no argument. */
>>         char *args;
>> diff --git a/lib/librte_eal/common/include/rte_vmbus.h b/lib/librte_eal/common/include/rte_vmbus.h
>> new file mode 100644
>> index 00000000..f96d753e
>> --- /dev/null
>> +++ b/lib/librte_eal/common/include/rte_vmbus.h
>> @@ -0,0 +1,249 @@
>> +/*-
>> + *   BSD LICENSE
>> + *
>> + *   Copyright(c) 2013-2016 Brocade Communications Systems, Inc.
>> + *   Copyright(c) 2016 Microsoft Corporation
>> + *   All rights reserved.
>> + *
>> + *   Redistribution and use in source and binary forms, with or without
>> + *   modification, are permitted provided that the following conditions
>> + *   are met:
>> + *
>> + *     * Redistributions of source code must retain the above copyright
>> + *       notice, this list of conditions and the following disclaimer.
>> + *     * Redistributions in binary form must reproduce the above copyright
>> + *       notice, this list of conditions and the following disclaimer in
>> + *       the documentation and/or other materials provided with the
>> + *       distribution.
>> + *     * Neither the name of Intel Corporation nor the names of its
>> + *       contributors may be used to endorse or promote products derived
>> + *       from this software without specific prior written permission.
>> + *
>> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
>> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
>> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
>> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
>> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
>> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
>> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
>> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
>> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
>> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
>> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
>> + *
>> + */
>> +
>> +#ifndef _RTE_VMBUS_H_
>> +#define _RTE_VMBUS_H_
>> +
>> +/**
>> + * @file
>> + *
>> + * RTE VMBUS Interface
>> + */
>> +#ifdef __cplusplus
>> +extern "C" {
>> +#endif
>> +
>> +#include <stdio.h>
>> +#include <stdlib.h>
>> +#include <limits.h>
>> +#include <errno.h>
>> +#include <uuid/uuid.h>
>> +#include <sys/queue.h>
>> +#include <stdint.h>
>> +#include <inttypes.h>
>> +
>> +#include <rte_debug.h>
>> +#include <rte_interrupts.h>
>> +#include <rte_dev.h>
>> +
>> +TAILQ_HEAD(vmbus_device_list, rte_vmbus_device);
>> +TAILQ_HEAD(vmbus_driver_list, rte_vmbus_driver);
>> +
>> +extern struct vmbus_driver_list vmbus_driver_list;
>> +extern struct vmbus_device_list vmbus_device_list;
>> +
>> +/** Pathname of VMBUS devices directory. */
>> +#define SYSFS_VMBUS_DEVICES "/sys/bus/vmbus/devices"
>> +
>> +#define UUID_BUF_SZ    (36 + 1)
>> +
>> +
>> +/** Maximum number of VMBUS resources. */
>> +#define VMBUS_MAX_RESOURCE 7
>> +
>> +/**
>> + * A structure describing a VMBUS device.
>> + */
>> +struct rte_vmbus_device {
>> +       TAILQ_ENTRY(rte_vmbus_device) next;     /**< Next probed VMBUS device. */
>> +       struct rte_device device;               /**< Inherit core device */
>> +       uuid_t device_id;                       /**< VMBUS device id */
>> +       uuid_t class_id;                        /**< VMBUS device type */
>> +       uint32_t relid;                         /**< VMBUS id for notification */
>> +       uint8_t monitor_id;
>> +       struct rte_intr_handle intr_handle;     /**< Interrupt handle */
>> +       const struct rte_vmbus_driver *driver;  /**< Associated driver */
>> +
>> +       struct rte_mem_resource mem_resource[VMBUS_MAX_RESOURCE];
>> +                                               /**< VMBUS Memory Resource */
>> +       char sysfs_name[];                      /**< Name in sysfs bus directory */
>> +};
>> +
>> +struct rte_vmbus_driver;
>> +
>> +/**
>> + * Initialisation function for the driver called during VMBUS probing.
>> + */
>> +typedef int (vmbus_probe_t)(struct rte_vmbus_driver *,
>> +                           struct rte_vmbus_device *);
>> +
>> +/**
>> + * Uninitialisation function for the driver called during hotplugging.
>> + */
>> +typedef int (vmbus_remove_t)(struct rte_vmbus_device *);
>> +
>> +/**
>> + * A structure describing a VMBUS driver.
>> + */
>> +struct rte_vmbus_driver {
>> +       TAILQ_ENTRY(rte_vmbus_driver) next;     /**< Next in list. */
>> +       struct rte_driver driver;
>> +       vmbus_probe_t *probe;                   /**< Device Probe function. */
>> +       vmbus_remove_t *remove;                 /**< Device Remove function. */
>> +
>> +       const uuid_t *id_table;                 /**< ID table. */
>> +};
>> +
>> +struct vmbus_map {
>> +       void *addr;
>> +       char *path;
>> +       uint64_t offset;
>> +       uint64_t size;
>> +       uint64_t phaddr;
>> +};
>> +
>> +/*
>> + * For multi-process we need to reproduce all vmbus mappings in secondary
>> + * processes, so save them in a tailq.
>> + */
>> +struct mapped_vmbus_resource {
>> +       TAILQ_ENTRY(mapped_vmbus_resource) next;
>> +
>> +       uuid_t uuid;
>> +       char path[PATH_MAX];
>> +       int nb_maps;
>> +       struct vmbus_map maps[VMBUS_MAX_RESOURCE];
>> +};
>> +
>> +TAILQ_HEAD(mapped_vmbus_res_list, mapped_vmbus_resource);
>> +
>> +/**
>> + * Scan the content of the VMBUS bus, and add the devices to the devices list
>> + *
>> + * @return
>> + *  0 on success, negative on error
>> + */
>> +int rte_eal_vmbus_scan(void);
>> +
>> +/**
>> + * Probe the VMBUS bus for registered drivers.
>> + *
>> + * Scan the content of the VMBUS bus, and call the probe() function for
>> + * all registered drivers that have a matching entry in its id_table
>> + * for discovered devices.
>> + *
>> + * @return
>> + *   - 0 on success.
>> + *   - Negative on error.
>> + */
>> +int rte_eal_vmbus_probe(void);
>> +
>> +/**
>> + * Map the VMBUS device resources in user space virtual memory address
>> + *
>> + * @param dev
>> + *   A pointer to a rte_vmbus_device structure describing the device
>> + *   to use
>> + *
>> + * @return
>> + *   0 on success, negative on error and positive if no driver
>> + *   is found for the device.
>> + */
>> +int rte_eal_vmbus_map_device(struct rte_vmbus_device *dev);
>> +
>> +/**
>> + * Unmap this device
>> + *
>> + * @param dev
>> + *   A pointer to a rte_vmbus_device structure describing the device
>> + *   to use
>> + */
>> +void rte_eal_vmbus_unmap_device(struct rte_vmbus_device *dev);
>> +
>> +/**
>> + * Probe the single VMBUS device.
>> + *
>> + * Scan the content of the VMBUS bus, and find the vmbus device
>> + * specified by device uuid, then call the probe() function for
>> + * registered driver that has a matching entry in its id_table for
>> + * discovered device.
>> + *
>> + * @param id
>> + *   The VMBUS device uuid.
>> + * @return
>> + *   - 0 on success.
>> + *   - Negative on error.
>> + */
>> +int rte_eal_vmbus_probe_one(uuid_t id);
>> +
>> +/**
>> + * Close the single VMBUS device.
>> + *
>> + * Scan the content of the VMBUS bus, and find the vmbus device id,
>> + * then call the remove() function for registered driver that has a
>> + * matching entry in its id_table for discovered device.
>> + *
>> + * @param id
>> + *   The VMBUS device uuid.
>> + * @return
>> + *   - 0 on success.
>> + *   - Negative on error.
>> + */
>> +int rte_eal_vmbus_detach(uuid_t id);
>> +
>> +/**
>> + * Register a VMBUS driver.
>> + *
>> + * @param driver
>> + *   A pointer to a rte_vmbus_driver structure describing the driver
>> + *   to be registered.
>> + */
>> +void rte_eal_vmbus_register(struct rte_vmbus_driver *driver);
>> +
>> +/** Helper for VMBUS device registration from driver instance */
>> +#define RTE_PMD_REGISTER_VMBUS(nm, vmbus_drv) \
>> +RTE_INIT(vmbusinitfn_ ##nm); \
>> +static void vmbusinitfn_ ##nm(void) \
>> +{\
>> +       (vmbus_drv).driver.name = RTE_STR(nm);\
>> +       (vmbus_drv).driver.type = PMD_VMBUS; \
>> +       rte_eal_vmbus_register(&vmbus_drv); \
>> +} \
>> +RTE_PMD_EXPORT_NAME(nm, __COUNTER__)
>> +
>> +/**
>> + * Unregister a VMBUS driver.
>> + *
>> + * @param driver
>> + *   A pointer to a rte_vmbus_driver structure describing the driver
>> + *   to be unregistered.
>> + */
>> +void rte_eal_vmbus_unregister(struct rte_vmbus_driver *driver);
>
> The register/unregister functions also need to be exported via the map file.
>
>> +
>> +#ifdef __cplusplus
>> +}
>> +#endif
>> +
>> +#endif /* _RTE_VMBUS_H_ */
>> diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
>> index 4e206f09..f6ca3848 100644
>> --- a/lib/librte_eal/linuxapp/eal/Makefile
>> +++ b/lib/librte_eal/linuxapp/eal/Makefile
>> @@ -71,6 +71,11 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_timer.c
>>  SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_interrupts.c
>>  SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_alarm.c
>>
>> +ifeq ($(CONFIG_RTE_LIBRTE_HV_PMD),y)
>> +SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_vmbus.c
>> +LDLIBS += -luuid
>> +endif
>> +
>>  # from common dir
>>  SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_lcore.c
>>  SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_timer.c
>> @@ -114,6 +119,7 @@ CFLAGS_eal_hugepage_info.o := -D_GNU_SOURCE
>>  CFLAGS_eal_pci.o := -D_GNU_SOURCE
>>  CFLAGS_eal_pci_uio.o := -D_GNU_SOURCE
>>  CFLAGS_eal_pci_vfio.o := -D_GNU_SOURCE
>> +CFLAGS_eal_vmbus.o := -D_GNU_SOURCE
>>  CFLAGS_eal_common_whitelist.o := -D_GNU_SOURCE
>>  CFLAGS_eal_common_options.o := -D_GNU_SOURCE
>>  CFLAGS_eal_common_thread.o := -D_GNU_SOURCE
>> diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
>> index 16dd5b9c..1bc0814a 100644
>> --- a/lib/librte_eal/linuxapp/eal/eal.c
>> +++ b/lib/librte_eal/linuxapp/eal/eal.c
>> @@ -70,6 +70,9 @@
>>  #include <rte_cpuflags.h>
>>  #include <rte_interrupts.h>
>>  #include <rte_pci.h>
>> +#ifdef RTE_LIBRTE_HV_PMD
>> +#include <rte_vmbus.h>
>> +#endif
>>  #include <rte_dev.h>
>>  #include <rte_devargs.h>
>>  #include <rte_common.h>
>> @@ -830,6 +833,11 @@ rte_eal_init(int argc, char **argv)
>>
>>         eal_check_mem_on_local_socket();
>>
>> +#ifdef RTE_LIBRTE_HV_PMD
>> +       if (rte_eal_vmbus_init() < 0)
>> +               RTE_LOG(ERR, EAL, "Cannot init VMBUS\n");
>> +#endif
>> +
>>         if (eal_plugins_init() < 0)
>>                 rte_panic("Cannot init plugins\n");
>>
>> @@ -884,6 +892,11 @@ rte_eal_init(int argc, char **argv)
>>         if (rte_eal_pci_probe())
>>                 rte_panic("Cannot probe PCI\n");
>>
>> +#ifdef RTE_LIBRTE_HV_PMD
>> +       if (rte_eal_vmbus_probe() < 0)
>> +               rte_panic("Cannot probe VMBUS\n");
>> +#endif
>> +
>>         if (rte_eal_dev_init() < 0)
>>                 rte_panic("Cannot init pmd devices\n");
>>
>> diff --git a/lib/librte_eal/linuxapp/eal/eal_vmbus.c b/lib/librte_eal/linuxapp/eal/eal_vmbus.c
>> new file mode 100644
>> index 00000000..729f93a9
>> --- /dev/null
>> +++ b/lib/librte_eal/linuxapp/eal/eal_vmbus.c
>> @@ -0,0 +1,911 @@
>> +/*-
>> + *   BSD LICENSE
>> + *
>> + *   Copyright(c) 2013-2016 Brocade Communications Systems, Inc.
>> + *   Copyright(c) 2016 Microsoft Corporation
>> + *   All rights reserved.
>> + *
>> + *   Redistribution and use in source and binary forms, with or without
>> + *   modification, are permitted provided that the following conditions
>> + *   are met:
>> + *
>> + *     * Redistributions of source code must retain the above copyright
>> + *      notice, this list of conditions and the following disclaimer.
>> + *     * Redistributions in binary form must reproduce the above copyright
>> + *      notice, this list of conditions and the following disclaimer in
>> + *      the documentation and/or other materials provided with the
>> + *      distribution.
>> + *     * Neither the name of Intel Corporation nor the names of its
>> + *      contributors may be used to endorse or promote products derived
>> + *      from this software without specific prior written permission.
>> + *
>> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
>> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
>> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
>> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
>> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
>> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
>> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
>> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
>> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
>> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
>> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
>> + *
>> + */
>> +
>> +#include <string.h>
>> +#include <unistd.h>
>> +#include <dirent.h>
>> +#include <fcntl.h>
>> +#include <sys/mman.h>
>> +
>> +#include <rte_eal.h>
>> +#include <rte_tailq.h>
>> +#include <rte_log.h>
>> +#include <rte_devargs.h>
>> +#include <rte_vmbus.h>
>> +#include <rte_malloc.h>
>> +
>> +#include "eal_private.h"
>> +#include "eal_pci_init.h"
>> +#include "eal_filesystem.h"
>> +
>> +struct vmbus_driver_list vmbus_driver_list =
>> +       TAILQ_HEAD_INITIALIZER(vmbus_driver_list);
>> +struct vmbus_device_list vmbus_device_list =
>> +       TAILQ_HEAD_INITIALIZER(vmbus_device_list);
>> +
>> +static void *vmbus_map_addr;
>> +
>> +static struct rte_tailq_elem rte_vmbus_uio_tailq = {
>> +       .name = "UIO_RESOURCE_LIST",
This should be VMBUS_UIO_RESOURCE_LIST to not collide with rte_uio_tailq.
>> +};
>> +EAL_REGISTER_TAILQ(rte_vmbus_uio_tailq);
>> +
>> +/*
>> + * parse a sysfs file containing a UUID,
>> + * optionally wrapped in { } notation
>> + */
>> +static int
>> +vmbus_get_sysfs_uuid(const char *filename, uuid_t uu)
>> +{
>> +       char buf[BUFSIZ];
>> +       char *cp, *in = buf;
>> +       FILE *f;
>> +
>> +       f = fopen(filename, "r");
>> +       if (f == NULL) {
>> +               RTE_LOG(ERR, EAL, "%s(): cannot open sysfs value %s\n",
>> +                               __func__, filename);
>> +               return -1;
>> +       }
>> +
>> +       if (fgets(buf, sizeof(buf), f) == NULL) {
>> +               RTE_LOG(ERR, EAL, "%s(): cannot read sysfs value %s\n",
>> +                               __func__, filename);
>> +               fclose(f);
>> +               return -1;
>> +       }
>> +       fclose(f);
>> +
>> +       cp = strchr(buf, '\n');
>> +       if (cp)
>> +               *cp = '\0';
>> +
>> +       /* strip { } notation */
>> +       if (buf[0] == '{') {
>> +               in = buf + 1;
>> +               cp = strchr(in, '}');
>> +               if (cp)
>> +                       *cp = '\0';
>> +       }
>> +
>> +       if (uuid_parse(in, uu) < 0) {
>> +               RTE_LOG(ERR, EAL, "%s %s not a valid UUID\n",
>> +                       filename, buf);
>> +               return -1;
>> +       }
>> +
>> +       return 0;
>> +}
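The brace handling above can be exercised on its own. A minimal sketch, assuming a hypothetical helper name strip_uuid_braces() (not part of the patch) and leaving the uuid_parse() call out so that it builds without libuuid:

```c
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: trim the trailing newline
 * and an optional { } wrapper from a sysfs line, in place.
 * Returns a pointer to the start of the bare UUID text inside buf.
 */
static char *
strip_uuid_braces(char *buf)
{
	char *in = buf;
	char *cp;

	/* drop the newline that fgets() keeps */
	cp = strchr(buf, '\n');
	if (cp)
		*cp = '\0';

	/* strip { } notation */
	if (buf[0] == '{') {
		in = buf + 1;
		cp = strchr(in, '}');
		if (cp)
			*cp = '\0';
	}
	return in;
}
```

The returned pointer is what would then be handed to uuid_parse(), as the function above does with "in".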
>> +
>> +/* map a particular resource from a file */
>> +static void *
>> +vmbus_map_resource(void *requested_addr, int fd, off_t offset, size_t size,
>> +                  int flags)
>> +{
>> +       void *mapaddr;
>> +
>> +       /* Map the memory resource of device */
>> +       mapaddr = mmap(requested_addr, size, PROT_READ | PROT_WRITE,
>> +                      MAP_SHARED | flags, fd, offset);
>> +       if (mapaddr == MAP_FAILED ||
>> +           (requested_addr != NULL && mapaddr != requested_addr)) {
>> +               RTE_LOG(ERR, EAL,
>> +                       "%s(): cannot mmap(%d, %p, 0x%lx, 0x%lx): %s)\n",
>> +                       __func__, fd, requested_addr,
>> +                       (unsigned long)size, (unsigned long)offset,
>> +                       strerror(errno));
>> +       } else
>> +               RTE_LOG(DEBUG, EAL, "  VMBUS memory mapped at %p\n", mapaddr);
>> +
>> +       return mapaddr;
>> +}
>> +
>> +/* unmap a particular resource */
>> +static void
>> +vmbus_unmap_resource(void *requested_addr, size_t size)
>> +{
>> +       if (requested_addr == NULL)
>> +               return;
>> +
>> +       /* Unmap the VMBUS memory resource of device */
>> +       if (munmap(requested_addr, size)) {
>> +               RTE_LOG(ERR, EAL, "%s(): cannot munmap(%p, 0x%lx): %s\n",
>> +                       __func__, requested_addr, (unsigned long)size,
>> +                       strerror(errno));
>> +       } else
>> +               RTE_LOG(DEBUG, EAL, "  VMBUS memory unmapped at %p\n",
>> +                               requested_addr);
>> +}
>> +
>> +/* Only supports current kernel version
>> + * Unlike PCI there is no option (or need) to create UIO device.
>> + */
>> +static int vmbus_get_uio_dev(const char *name,
>> +                            char *dstbuf, size_t buflen)
>> +{
>> +       char dirname[PATH_MAX];
>> +       unsigned int uio_num;
>> +       struct dirent *e;
>> +       DIR *dir;
>> +
>> +       snprintf(dirname, sizeof(dirname),
>> +                "/sys/bus/vmbus/devices/%s/uio", name);
>> +
>> +       dir = opendir(dirname);
>> +       if (dir == NULL) {
>> +               RTE_LOG(ERR, EAL, "Cannot map uio resources for %s: %s\n",
>> +                       name, strerror(errno));
>> +               return -1;
>> +       }
>> +
>> +       /* take the first file starting with "uio" */
>> +       while ((e = readdir(dir)) != NULL) {
>> +               if (sscanf(e->d_name, "uio%u", &uio_num) != 1)
>> +                       continue;
>> +
>> +               snprintf(dstbuf, buflen, "%s/uio%u", dirname, uio_num);
>> +               break;
>> +       }
>> +       closedir(dir);
>> +
>> +       return e ? (int) uio_num : -1;
>> +}
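The directory scan relies on sscanf() to pick out "uioN" entries and skip everything else ("." and ".." included). A small sketch of just that match, using a hypothetical helper name parse_uio_name() that is not in the patch:

```c
#include <stdio.h>

/*
 * Hypothetical helper, not part of the patch: return 1 and store the
 * index if name starts with "uio<N>", otherwise return 0.
 */
static int
parse_uio_name(const char *name, unsigned int *num)
{
	/* sscanf() yields 1 only when the literal "uio" and a number match */
	return sscanf(name, "uio%u", num) == 1;
}
```

Note that this pattern also accepts trailing characters after the digits (e.g. "uio3foo"). That is probably harmless for the entries expected in this directory, but it is an assumption rather than something the code checks.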
>> +
>> +/*
>> + * parse a sysfs file containing one integer value
>> + * different to the eal version, as it needs to work with 64-bit values
>> + */
>> +static int
>> +vmbus_parse_sysfs_value(const char *dir, const char *name,
>> +                       uint64_t *val)
>> +{
>> +       char filename[PATH_MAX];
>> +       FILE *f;
>> +       char buf[BUFSIZ];
>> +       char *end = NULL;
>> +
>> +       snprintf(filename, sizeof(filename), "%s/%s", dir, name);
>> +       f = fopen(filename, "r");
>> +       if (f == NULL) {
>> +               RTE_LOG(ERR, EAL, "%s(): cannot open sysfs value %s\n",
>> +                               __func__, filename);
>> +               return -1;
>> +       }
>> +
>> +       if (fgets(buf, sizeof(buf), f) == NULL) {
>> +               RTE_LOG(ERR, EAL, "%s(): cannot read sysfs value %s\n",
>> +                               __func__, filename);
>> +               fclose(f);
>> +               return -1;
>> +       }
>> +       fclose(f);
>> +
>> +       *val = strtoull(buf, &end, 0);
>> +       if ((buf[0] == '\0') || (end == NULL) || (*end != '\n')) {
>> +               RTE_LOG(ERR, EAL, "%s(): cannot parse sysfs value %s\n",
>> +                               __func__, filename);
>> +               return -1;
>> +       }
>> +       return 0;
>> +}
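If I read the check above correctly, a file containing only "\n" would be accepted as 0: strtoull() consumes nothing, end stays at buf, and *end is '\n'. A sketch of a slightly stricter variant, using a hypothetical helper name parse_u64_line() that is not in the patch:

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not part of the patch: parse one newline-terminated
 * unsigned 64-bit value (decimal, octal or 0x-prefixed hex).
 * Returns 0 on success, -1 if nothing was converted or junk follows.
 */
static int
parse_u64_line(const char *buf, uint64_t *val)
{
	char *end = NULL;

	*val = strtoull(buf, &end, 0);
	/* end == buf means no digits were consumed at all */
	if (end == buf || *end != '\n')
		return -1;
	return 0;
}
```

Like the version in the patch, this still requires the trailing newline, so a value read without one is rejected.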
>> +
>> +/* Get mappings out of values provided by uio */
>> +static int
>> +vmbus_uio_get_mappings(const char *uioname,
>> +                      struct vmbus_map maps[])
>> +{
>> +       int i;
>> +
>> +       for (i = 0; i != VMBUS_MAX_RESOURCE; i++) {
>> +               struct vmbus_map *map = &maps[i];
>> +               char dirname[PATH_MAX];
>> +
>> +               /* check if map directory exists */
>> +               snprintf(dirname, sizeof(dirname),
>> +                        "%s/maps/map%d", uioname, i);
>> +
>> +               if (access(dirname, F_OK) != 0)
>> +                       break;
>> +
>> +               /* get mapping offset */
>> +               if (vmbus_parse_sysfs_value(dirname, "offset",
>> +                                           &map->offset) < 0)
>> +                       return -1;
>> +
>> +               /* get mapping size */
>> +               if (vmbus_parse_sysfs_value(dirname, "size",
>> +                                           &map->size) < 0)
>> +                       return -1;
>> +
>> +               /* get mapping physical address */
>> +               if (vmbus_parse_sysfs_value(dirname, "addr",
>> +                                           &map->phaddr) < 0)
>> +                       return -1;
>> +       }
>> +
>> +       return i;
>> +}
>> +
>> +static void
>> +vmbus_uio_free_resource(struct rte_vmbus_device *dev,
>> +               struct mapped_vmbus_resource *uio_res)
>> +{
>> +       rte_free(uio_res);
>> +
>> +       if (dev->intr_handle.fd) {
>> +               close(dev->intr_handle.fd);
>> +               dev->intr_handle.fd = -1;
>> +               dev->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN;
>> +       }
>> +}
>> +
>> +static struct mapped_vmbus_resource *
>> +vmbus_uio_alloc_resource(struct rte_vmbus_device *dev)
>> +{
>> +       struct mapped_vmbus_resource *uio_res;
>> +       char dirname[PATH_MAX], devname[PATH_MAX];
>> +       int uio_num, nb_maps;
>> +
>> +       uio_num = vmbus_get_uio_dev(dev->sysfs_name, dirname, sizeof(dirname));
>> +       if (uio_num < 0) {
>> +               RTE_LOG(WARNING, EAL,
>> +                       "  %s not managed by UIO driver, skipping\n",
>> +                       dev->sysfs_name);
>> +               return NULL;
>> +       }
>> +
>> +       /* allocate the mapping details for secondary processes */
>> +       uio_res = rte_zmalloc("UIO_RES", sizeof(*uio_res), 0);
>> +       if (uio_res == NULL) {
>> +               RTE_LOG(ERR, EAL,
>> +                       "%s(): cannot store uio mmap details\n", __func__);
>> +               goto error;
>> +       }
>> +
>> +       snprintf(devname, sizeof(devname), "/dev/uio%u", uio_num);
>> +       dev->intr_handle.fd = open(devname, O_RDWR);
>> +       if (dev->intr_handle.fd < 0) {
>> +               RTE_LOG(ERR, EAL, "Cannot open %s: %s\n",
>> +                       devname, strerror(errno));
>> +               goto error;
>> +       }
>> +
>> +       dev->intr_handle.type = RTE_INTR_HANDLE_UIO_INTX;
>> +
>> +       snprintf(uio_res->path, sizeof(uio_res->path), "%s", devname);
>> +       uuid_copy(uio_res->uuid, dev->device_id);
>> +
>> +       nb_maps = vmbus_uio_get_mappings(dirname, uio_res->maps);
>> +       if (nb_maps < 0)
>> +               goto error;
>> +
>> +       RTE_LOG(DEBUG, EAL, "Found %d memory maps for device %s\n",
>> +               nb_maps, dev->sysfs_name);
>> +
>> +       return uio_res;
>> +
>> + error:
>> +       vmbus_uio_free_resource(dev, uio_res);
>> +       return NULL;
>> +}
>> +
>> +static int
>> +vmbus_uio_map_resource_by_index(struct rte_vmbus_device *dev,
>> +                               unsigned int res_idx,
>> +                               struct mapped_vmbus_resource *uio_res,
>> +                               unsigned int map_idx)
>> +{
>> +       struct vmbus_map *maps = uio_res->maps;
>> +       char devname[PATH_MAX];
>> +       void *mapaddr;
>> +       int fd;
>> +
>> +       snprintf(devname, sizeof(devname),
>> +                "/sys/bus/vmbus/%s/resource%u", dev->sysfs_name, res_idx);
>> +
>> +       fd = open(devname, O_RDWR);
>> +       if (fd < 0) {
>> +               RTE_LOG(ERR, EAL, "Cannot open %s: %s\n",
>> +                               devname, strerror(errno));
>> +               return -1;
>> +       }
>> +
>> +       /* allocate memory to keep path */
>> +       maps[map_idx].path = rte_malloc(NULL, strlen(devname) + 1, 0);
>> +       if (maps[map_idx].path == NULL) {
>> +               RTE_LOG(ERR, EAL, "Cannot allocate memory for path: %s\n",
>> +                               strerror(errno));
>> +               return -1;
>> +       }
>> +
>> +       /* try mapping somewhere close to the end of hugepages */
>> +       if (vmbus_map_addr == NULL)
>> +               vmbus_map_addr = pci_find_max_end_va();
>> +
>> +       mapaddr = vmbus_map_resource(vmbus_map_addr, fd, 0,
>> +                                    dev->mem_resource[res_idx].len, 0);
>> +       close(fd);
>> +       if (mapaddr == MAP_FAILED) {
>> +               rte_free(maps[map_idx].path);
>> +               return -1;
>> +       }
>> +
>> +       vmbus_map_addr = RTE_PTR_ADD(mapaddr,
>> +                                    dev->mem_resource[res_idx].len);
>> +
>> +       maps[map_idx].phaddr = dev->mem_resource[res_idx].phys_addr;
>> +       maps[map_idx].size = dev->mem_resource[res_idx].len;
>> +       maps[map_idx].addr = mapaddr;
>> +       maps[map_idx].offset = 0;
>> +       strcpy(maps[map_idx].path, devname);
>> +       dev->mem_resource[res_idx].addr = mapaddr;
>> +
>> +       return 0;
>> +}
>> +
>> +static void
>> +vmbus_uio_unmap(struct mapped_vmbus_resource *uio_res)
>> +{
>> +       int i;
>> +
>> +       if (uio_res == NULL)
>> +               return;
>> +
>> +       for (i = 0; i != uio_res->nb_maps; i++) {
>> +               vmbus_unmap_resource(uio_res->maps[i].addr,
>> +                                    uio_res->maps[i].size);
>> +
>> +               if (rte_eal_process_type() == RTE_PROC_PRIMARY)
>> +                       rte_free(uio_res->maps[i].path);
>> +       }
>> +}
>> +
>> +static struct mapped_vmbus_resource *
>> +vmbus_uio_find_resource(struct rte_vmbus_device *dev)
>> +{
>> +       struct mapped_vmbus_resource *uio_res;
>> +       struct mapped_vmbus_res_list *uio_res_list =
>> +                       RTE_TAILQ_CAST(rte_vmbus_uio_tailq.head,
>> +                                      mapped_vmbus_res_list);
>> +
>> +       if (dev == NULL)
>> +               return NULL;
>> +
>> +       TAILQ_FOREACH(uio_res, uio_res_list, next) {
>> +               if (uuid_compare(uio_res->uuid, dev->device_id) == 0)
>> +                       return uio_res;
>> +       }
>> +       return NULL;
>> +}
>> +
>> +/* unmap the VMBUS resource of a VMBUS device in virtual memory */
>> +static void
>> +vmbus_uio_unmap_resource(struct rte_vmbus_device *dev)
>> +{
>> +       struct mapped_vmbus_resource *uio_res;
>> +       struct mapped_vmbus_res_list *uio_res_list =
>> +                       RTE_TAILQ_CAST(rte_vmbus_uio_tailq.head,
>> +                                      mapped_vmbus_res_list);
>> +
>> +       if (dev == NULL)
>> +               return;
>> +
>> +       /* find an entry for the device */
>> +       uio_res = vmbus_uio_find_resource(dev);
>> +       if (uio_res == NULL)
>> +               return;
>> +
>> +       /* secondary processes - just free maps */
>> +       if (rte_eal_process_type() != RTE_PROC_PRIMARY)
>> +               return vmbus_uio_unmap(uio_res);
>> +
>> +       TAILQ_REMOVE(uio_res_list, uio_res, next);
>> +
>> +       /* unmap all resources */
>> +       vmbus_uio_unmap(uio_res);
>> +
>> +       /* free uio resource */
>> +       rte_free(uio_res);
>> +
>> +       /* close fd if in primary process */
>> +       close(dev->intr_handle.fd);
>> +       if (dev->intr_handle.uio_cfg_fd >= 0) {
>> +               close(dev->intr_handle.uio_cfg_fd);
>> +               dev->intr_handle.uio_cfg_fd = -1;
>> +       }
>> +
>> +       dev->intr_handle.fd = -1;
>> +       dev->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN;
>> +}
>> +
>> +static int
>> +vmbus_uio_map_secondary(struct rte_vmbus_device *dev)
>> +{
>> +       struct mapped_vmbus_resource *uio_res;
>> +       struct mapped_vmbus_res_list *uio_res_list =
>> +                       RTE_TAILQ_CAST(rte_vmbus_uio_tailq.head,
>> +                                      mapped_vmbus_res_list);
>> +
>> +       TAILQ_FOREACH(uio_res, uio_res_list, next) {
>> +               int i;
>> +
>> +               /* skip this element if it doesn't match our id */
>> +               if (uuid_compare(uio_res->uuid, dev->device_id))
>> +                       continue;
>> +
>> +               for (i = 0; i != uio_res->nb_maps; i++) {
>> +                       void *mapaddr;
>> +                       int fd;
>> +
>> +                       fd = open(uio_res->maps[i].path, O_RDWR);
>> +                       if (fd < 0) {
>> +                               RTE_LOG(ERR, EAL, "Cannot open %s: %s\n",
>> +                                       uio_res->maps[i].path, strerror(errno));
>> +                               return -1;
>> +                       }
>> +
>> +                       mapaddr = vmbus_map_resource(uio_res->maps[i].addr, fd,
>> +                                                    uio_res->maps[i].offset,
>> +                                                    uio_res->maps[i].size, 0);
>> +                       /* fd is not needed in slave process, close it */
>> +                       close(fd);
>> +
>> +                       if (mapaddr == uio_res->maps[i].addr)
>> +                               continue;
>> +
>> +                       RTE_LOG(ERR, EAL,
>> +                               "Cannot mmap device resource file %s to address: %p\n",
>> +                               uio_res->maps[i].path,
>> +                               uio_res->maps[i].addr);
>> +
>> +                       /* unmap addrs correctly mapped */
>> +                       while (i != 0) {
>> +                               --i;
>> +                               vmbus_unmap_resource(uio_res->maps[i].addr,
>> +                                                    uio_res->maps[i].size);
>> +                       }
>> +                       return -1;
>> +
>> +               }
>> +               return 0;
>> +       }
>> +
>> +       RTE_LOG(ERR, EAL, "Cannot find resource for device\n");
>> +       return 1;
>> +}
>> +
>> +/* map the resources of a vmbus device in virtual memory */
>> +int
>> +rte_eal_vmbus_map_device(struct rte_vmbus_device *dev)
>> +{
>> +       struct mapped_vmbus_resource *uio_res;
>> +       struct mapped_vmbus_res_list *uio_res_list =
>> +               RTE_TAILQ_CAST(rte_vmbus_uio_tailq.head, mapped_vmbus_res_list);
>> +       int i, ret, map_idx = 0;
>> +
>> +       dev->intr_handle.fd = -1;
>> +       dev->intr_handle.uio_cfg_fd = -1;
>> +       dev->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN;
>> +
>> +       /* secondary processes - use already recorded details */
>> +       if (rte_eal_process_type() != RTE_PROC_PRIMARY)
>> +               return vmbus_uio_map_secondary(dev);
>> +
>> +       /* allocate uio resource */
>> +       uio_res = vmbus_uio_alloc_resource(dev);
>> +       if (uio_res == NULL)
>> +               return -1;
>> +
>> +       /* Map all BARs */
>> +       for (i = 0; i != VMBUS_MAX_RESOURCE; i++) {
>> +               uint64_t phaddr;
>> +
>> +               /* skip empty BAR */
>> +               phaddr = dev->mem_resource[i].phys_addr;
>> +               if (phaddr == 0)
>> +                       continue;
>> +
>> +               ret = vmbus_uio_map_resource_by_index(dev, i,
>> +                                                     uio_res, map_idx);
>> +               if (ret)
>> +                       goto error;
>> +
>> +               map_idx++;
>> +       }
>> +
>> +       uio_res->nb_maps = map_idx;
>> +
>> +       TAILQ_INSERT_TAIL(uio_res_list, uio_res, next);
>> +
>> +       return 0;
>> +error:
>> +       for (i = 0; i < map_idx; i++) {
>> +               vmbus_unmap_resource(uio_res->maps[i].addr,
>> +                                    uio_res->maps[i].size);
>> +               rte_free(uio_res->maps[i].path);
>> +       }
>> +       vmbus_uio_free_resource(dev, uio_res);
>> +       return -1;
>> +}
>> +
>> +/* Scan one vmbus sysfs entry, and fill the devices list from it. */
>> +static int
>> +vmbus_scan_one(const char *name)
>> +{
>> +       struct rte_vmbus_device *dev, *dev2;
>> +       char filename[PATH_MAX];
>> +       char dirname[PATH_MAX];
>> +       unsigned long tmp;
>> +
>> +       dev = malloc(sizeof(*dev) + strlen(name) + 1);
>> +       if (dev == NULL)
>> +               return -1;
>> +
>> +       memset(dev, 0, sizeof(*dev));
>> +       strcpy(dev->sysfs_name, name);
>> +       if (dev->sysfs_name == NULL)
>> +               goto error;
>> +
>> +       /* sysfs base directory
>> +        *   /sys/bus/vmbus/devices/7a08391f-f5a0-4ac0-9802-d13fd964f8df
>> +        * or on older kernel
>> +        *   /sys/bus/vmbus/devices/vmbus_1
>> +        */
>> +       snprintf(dirname, sizeof(dirname), "%s/%s",
>> +                SYSFS_VMBUS_DEVICES, name);
>> +
>> +       /* get device id */
>> +       snprintf(filename, sizeof(filename), "%s/device_id", dirname);
>> +       if (vmbus_get_sysfs_uuid(filename, dev->device_id) < 0)
>> +               goto error;
>> +
>> +       /* get device class  */
>> +       snprintf(filename, sizeof(filename), "%s/class_id", dirname);
>> +       if (vmbus_get_sysfs_uuid(filename, dev->class_id) < 0)
>> +               goto error;
>> +
>> +       /* get relid */
>> +       snprintf(filename, sizeof(filename), "%s/id", dirname);
>> +       if (eal_parse_sysfs_value(filename, &tmp) < 0)
>> +               goto error;
>> +       dev->relid = tmp;
>> +
>> +       /* get monitor id */
>> +       snprintf(filename, sizeof(filename), "%s/monitor_id", dirname);
>> +       if (eal_parse_sysfs_value(filename, &tmp) < 0)
>> +               goto error;
>> +       dev->monitor_id = tmp;
>> +
>> +       /* get numa node */
>> +       snprintf(filename, sizeof(filename), "%s/numa_node",
>> +                dirname);
>> +       if (eal_parse_sysfs_value(filename, &tmp) < 0)
>> +               /* if no NUMA support, set default to 0 */
>> +               dev->device.numa_node = 0;
>> +       else
>> +               dev->device.numa_node = tmp;
>> +
>> +       /* device is valid, add in list (sorted) */
>> +       RTE_LOG(DEBUG, EAL, "Adding vmbus device %s\n", name);
>> +
>> +       TAILQ_FOREACH(dev2, &vmbus_device_list, next) {
>> +               int ret;
>> +
>> +               ret = uuid_compare(dev->device_id, dev2->device_id);
>> +               if (ret > 0)
>> +                       continue;
>> +
>> +               if (ret < 0) {
>> +                       TAILQ_INSERT_BEFORE(dev2, dev, next);
>> +                       rte_eal_device_insert(&dev->device);
>> +               } else { /* already registered */
>> +                       memmove(dev2->mem_resource, dev->mem_resource,
>> +                               sizeof(dev->mem_resource));
>> +                       free(dev);
>> +               }
>> +               return 0;
>> +       }
>> +
>> +       rte_eal_device_insert(&dev->device);
>> +       TAILQ_INSERT_TAIL(&vmbus_device_list, dev, next);
>> +
>> +       return 0;
>> +error:
>> +       free(dev);
>> +       return -1;
>> +}
>> +
>> +/*
>> + * Scan the content of the vmbus, and add the devices to the devices list
>> + */
>> +static int
>> +vmbus_scan(void)
>> +{
>> +       struct dirent *e;
>> +       DIR *dir;
>> +
>> +       dir = opendir(SYSFS_VMBUS_DEVICES);
>> +       if (dir == NULL) {
>> +               if (errno == ENOENT)
>> +                       return 0;
>> +
>> +               RTE_LOG(ERR, EAL, "%s(): opendir failed: %s\n",
>> +                       __func__, strerror(errno));
>> +               return -1;
>> +       }
>> +
>> +       while ((e = readdir(dir)) != NULL) {
>> +               if (e->d_name[0] == '.')
>> +                       continue;
>> +
>> +               if (vmbus_scan_one(e->d_name) < 0)
>> +                       goto error;
>> +       }
>> +       closedir(dir);
>> +       return 0;
>> +
>> +error:
>> +       closedir(dir);
>> +       return -1;
>> +}
>> +
>> +/* Init the VMBUS EAL subsystem */
>> +int rte_eal_vmbus_init(void)
>> +{
>> +       /* VMBUS can be disabled */
>> +       if (internal_config.no_vmbus)
>> +               return 0;
>> +
>> +       if (vmbus_scan() < 0) {
>> +               RTE_LOG(ERR, EAL, "%s(): Cannot scan vmbus\n", __func__);
>> +               return -1;
>> +       }
>> +       return 0;
>> +}
>> +
>> +/* Below is PROBE part of eal_vmbus library */
>> +
>> +/*
>> + * If the device class ID matches, call the probe() function of the driver.
>> + */
>> +static int
>> +rte_eal_vmbus_probe_one_driver(struct rte_vmbus_driver *dr,
>> +                              struct rte_vmbus_device *dev)
>> +{
>> +       const uuid_t *id_table;
>> +
>> +       RTE_LOG(DEBUG, EAL, "  probe driver: %s\n", dr->driver.name);
>> +
>> +       for (id_table = dr->id_table; !uuid_is_null(*id_table); ++id_table) {
>> +               struct rte_devargs *args;
>> +               char guid[UUID_BUF_SZ];
>> +               int ret;
>> +
>> +               /* skip devices not associated with this device class */
>> +               if (uuid_compare(*id_table, dev->class_id) != 0)
>> +                       continue;
>> +
>> +               uuid_unparse(dev->device_id, guid);
>> +               RTE_LOG(INFO, EAL, "VMBUS device %s on NUMA socket %i\n",
>> +                       guid, dev->device.numa_node);
>> +
>> +               /* no initialization when blacklisted, return without error */
>> +               args = dev->device.devargs;
>> +               if (args && args->type == RTE_DEVTYPE_BLACKLISTED_VMBUS) {
>> +                       RTE_LOG(INFO, EAL, "  Device is blacklisted, not initializing\n");
>> +                       return 1;
>> +               }
>> +
>> +               RTE_LOG(INFO, EAL, "  probe driver: %s\n", dr->driver.name);
>> +
>> +               /* map resources for device */
>> +               ret = rte_eal_vmbus_map_device(dev);
>> +               if (ret != 0)
>> +                       return ret;
>> +
>> +               /* reference driver structure */
>> +               dev->driver = dr;
>> +
>> +               /* call the driver probe() function */
>> +               ret = dr->probe(dr, dev);
>> +               if (ret)
>> +                       dev->driver = NULL;
>> +
>> +               return ret;
>> +       }
>> +
>> +       /* return positive value if driver doesn't support this device */
>> +       return 1;
>> +}
>> +
>> +
>> +/*
>> + * If the device class ID matches, call the remove() function of the
>> + * driver.
>> + */
>> +static int
>> +vmbus_detach_dev(struct rte_vmbus_driver *dr,
>> +                struct rte_vmbus_device *dev)
>> +{
>> +       const uuid_t *id_table;
>> +
>> +       for (id_table = dr->id_table; !uuid_is_null(*id_table); ++id_table) {
>> +               char guid[UUID_BUF_SZ];
>> +
>> +               /* skip devices not associated with this device class */
>> +               if (uuid_compare(*id_table, dev->class_id) != 0)
>> +                       continue;
>> +
>> +               uuid_unparse(dev->device_id, guid);
>> +               RTE_LOG(INFO, EAL, "VMBUS device %s on NUMA socket %i\n",
>> +                       guid, dev->device.numa_node);
>> +
>> +               RTE_LOG(DEBUG, EAL, "  remove driver: %s\n", dr->driver.name);
>> +
>> +               if (dr->remove && (dr->remove(dev) < 0))
>> +                       return -1;      /* negative value is an error */
>> +
>> +               /* clear driver structure */
>> +               dev->driver = NULL;
>> +
>> +               vmbus_uio_unmap_resource(dev);
>> +               return 0;
>> +       }
>> +
>> +       /* return positive value if driver doesn't support this device */
>> +       return 1;
>> +}
>> +
>> +/*
>> + * Call the probe() function of all
>> + * registered drivers for the vmbus device. Return -1 if a driver fails
>> + * to probe, 1 if no driver is found for this class of vmbus device.
>> + * The present assumption is that we have drivers only for vmbus network
>> + * devices. That's why we don't check driver's id_table now.
>> + */
>> +static int
>> +vmbus_probe_all_drivers(struct rte_vmbus_device *dev)
>> +{
>> +       struct rte_vmbus_driver *dr = NULL;
>> +       int ret;
>> +
>> +       TAILQ_FOREACH(dr, &vmbus_driver_list, next) {
>> +               ret = rte_eal_vmbus_probe_one_driver(dr, dev);
>> +               if (ret < 0) {
>> +                       /* negative value is an error */
>> +                       RTE_LOG(ERR, EAL, "Failed to probe driver %s\n",
>> +                               dr->driver.name);
>> +                       return -1;
>> +               }
>> +               /* positive value means driver doesn't support it */
>> +               if (ret > 0)
>> +                       continue;
>> +
>> +               return 0;
>> +       }
>> +
>> +       return 1;
>> +}
>> +
>> +
>> +/*
>> + * If the device class ID matches, call the remove() function of all
>> + * registered drivers for the given device. Return -1 if removal
>> + * failed, return 1 if no driver is found for this device.
>> + */
>> +static int
>> +vmbus_detach_all_drivers(struct rte_vmbus_device *dev)
>> +{
>> +       struct rte_vmbus_driver *dr;
>> +       int rc = 0;
>> +
>> +       if (dev == NULL)
>> +               return -1;
>> +
>> +       TAILQ_FOREACH(dr, &vmbus_driver_list, next) {
>> +               rc = vmbus_detach_dev(dr, dev);
>> +               if (rc < 0)
>> +                       /* negative value is an error */
>> +                       return -1;
>> +               if (rc > 0)
>> +                       /* positive value means driver doesn't support it */
>> +                       continue;
>> +               return 0;
>> +       }
>> +       return 1;
>> +}
>> +
>> +/* Detach device specified by its VMBUS id */
>> +int
>> +rte_eal_vmbus_detach(uuid_t device_id)
>> +{
>> +       struct rte_vmbus_device *dev;
>> +       char ubuf[UUID_BUF_SZ];
>> +
>> +       TAILQ_FOREACH(dev, &vmbus_device_list, next) {
>> +               if (uuid_compare(dev->device_id, device_id) != 0)
>> +                       continue;
>> +
>> +               if (vmbus_detach_all_drivers(dev) < 0)
>> +                       goto err_return;
>> +
>> +               TAILQ_REMOVE(&vmbus_device_list, dev, next);
>> +               free(dev);
>> +               return 0;
>> +       }
>> +       return -1;
>> +
>> +err_return:
>> +       uuid_unparse(device_id, ubuf);
>> +       RTE_LOG(WARNING, EAL, "Requested device %s cannot be used\n",
>> +               ubuf);
>> +       return -1;
>> +}
>> +
>> +/*
>> + * Scan the vmbus, and call the probe() function for
>> + * all registered drivers that have a matching entry in their id_table
>> + * for discovered devices.
>> + */
>> +int
>> +rte_eal_vmbus_probe(void)
>> +{
>> +       struct rte_vmbus_device *dev = NULL;
>> +
>> +       TAILQ_FOREACH(dev, &vmbus_device_list, next) {
>> +               char ubuf[UUID_BUF_SZ];
>> +
>> +               uuid_unparse(dev->device_id, ubuf);
>> +
>> +               RTE_LOG(DEBUG, EAL, "Probing driver for device %s ...\n",
>> +                       ubuf);
>> +               vmbus_probe_all_drivers(dev);
>> +       }
>> +       return 0;
>> +}
>> +
>> +/* register vmbus driver */
>> +void
>> +rte_eal_vmbus_register(struct rte_vmbus_driver *driver)
>> +{
>> +       TAILQ_INSERT_TAIL(&vmbus_driver_list, driver, next);
>> +}
>> +
>> +/* unregister vmbus driver */
>> +void
>> +rte_eal_vmbus_unregister(struct rte_vmbus_driver *driver)
>> +{
>> +       TAILQ_REMOVE(&vmbus_driver_list, driver, next);
>> +}
>> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
>> index 7c212096..b69af0f0 100644
>> --- a/lib/librte_ether/rte_ethdev.c
>> +++ b/lib/librte_ether/rte_ethdev.c
>> @@ -3334,3 +3334,93 @@ rte_eth_dev_l2_tunnel_offload_set(uint8_t port_id,
>>                                 -ENOTSUP);
>>         return (*dev->dev_ops->l2_tunnel_offload_set)(dev, l2_tunnel, mask, en);
>>  }
>> +
>> +
>> +#ifdef RTE_LIBRTE_HV_PMD
>> +int
>> +rte_eth_dev_vmbus_probe(struct rte_vmbus_driver *vmbus_drv,
>> +                       struct rte_vmbus_device *vmbus_dev)
>> +{
>> +       struct eth_driver  *eth_drv = (struct eth_driver *)vmbus_drv;
>> +       struct rte_eth_dev *eth_dev;
>> +       char ustr[UUID_BUF_SZ];
>> +       int diag;
>> +
>> +       uuid_unparse(vmbus_dev->device_id, ustr);
>> +
>> +       eth_dev = rte_eth_dev_allocate(ustr);
>> +       if (eth_dev == NULL)
>> +               return -ENOMEM;
>> +
>> +       if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
>> +               eth_dev->data->dev_private = rte_zmalloc("ethdev private structure",
>> +                                 eth_drv->dev_private_size,
>> +                                 RTE_CACHE_LINE_SIZE);
>> +               if (eth_dev->data->dev_private == NULL)
>> +                       rte_panic("Cannot allocate memzone for private port data\n");
>> +       }
>> +
>> +       eth_dev->device = &vmbus_dev->device;
>> +       eth_dev->driver = eth_drv;
>> +       eth_dev->data->rx_mbuf_alloc_failed = 0;
>> +
>> +       /* init user callbacks */
>> +       TAILQ_INIT(&(eth_dev->link_intr_cbs));
>> +
>> +       /*
>> +        * Set the default maximum frame size.
>> +        */
>> +       eth_dev->data->mtu = ETHER_MTU;
>> +
>> +       /* Invoke PMD device initialization function */
>> +       diag = (*eth_drv->eth_dev_init)(eth_dev);
>> +       if (diag == 0)
>> +               return 0;
>> +
>> +       RTE_PMD_DEBUG_TRACE("driver %s: eth_dev_init(%s) failed\n",
>> +                           vmbus_drv->driver.name, ustr);
>> +
>> +       if (rte_eal_process_type() == RTE_PROC_PRIMARY)
>> +               rte_free(eth_dev->data->dev_private);
>> +
>> +       return diag;
>> +}
>> +
>> +int
>> +rte_eth_dev_vmbus_remove(struct rte_vmbus_device *vmbus_dev)
>> +{
>> +       const struct eth_driver *eth_drv;
>> +       struct rte_eth_dev *eth_dev;
>> +       char ustr[UUID_BUF_SZ];
>> +       int ret;
>> +
>> +       if (vmbus_dev == NULL)
>> +               return -EINVAL;
>> +
>> +       uuid_unparse(vmbus_dev->device_id, ustr);
>> +       eth_dev = rte_eth_dev_allocated(ustr);
>> +       if (eth_dev == NULL)
>> +               return -ENODEV;
>> +
>> +       eth_drv = (const struct eth_driver *)vmbus_dev->driver;
>> +
>> +       /* Invoke PMD device uninit function */
>> +       if (*eth_drv->eth_dev_uninit) {
>> +               ret = (*eth_drv->eth_dev_uninit)(eth_dev);
>> +               if (ret)
>> +                       return ret;
>> +       }
>> +
>> +       /* free ether device */
>> +       rte_eth_dev_release_port(eth_dev);
>> +
>> +       if (rte_eal_process_type() == RTE_PROC_PRIMARY)
>> +               rte_free(eth_dev->data->dev_private);
>> +
>> +       eth_dev->device = NULL;
>> +       eth_dev->driver = NULL;
>> +       eth_dev->data = NULL;
>> +
>> +       return 0;
>> +}
>> +#endif
>> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
>> index 1a62a322..2a8c1eed 100644
>> --- a/lib/librte_ether/rte_ethdev.h
>> +++ b/lib/librte_ether/rte_ethdev.h
>> @@ -180,6 +180,9 @@ extern "C" {
>>  #include <rte_log.h>
>>  #include <rte_interrupts.h>
>>  #include <rte_pci.h>
>> +#ifdef RTE_LIBRTE_HV_PMD
>> +#include <rte_vmbus.h>
>> +#endif
>>  #include <rte_dev.h>
>>  #include <rte_devargs.h>
>>  #include <rte_errno.h>
>> @@ -1908,6 +1911,17 @@ struct rte_pci_eth_driver {
>>         struct eth_driver       eth_drv;        /**< Ethernet driver. */
>>  };
>>
>> +#ifdef RTE_LIBRTE_HV_PMD
>> +/**
>> + * @internal
>> + * The structure associated with a PMD VMBUS Ethernet driver.
>> + */
>> +struct rte_vmbus_eth_driver {
>> +       struct rte_vmbus_driver vmbus_drv;      /**< Underlying VMBUS driver. */
>> +       struct eth_driver       eth_drv;        /**< Ethernet driver. */
>> +};
>> +#endif
>> +
>>  /**
>>   * Convert a numerical speed in Mbps to a bitmap flag that can be used in
>>   * the bitmap link_speeds of the struct rte_eth_conf
>> @@ -4543,6 +4557,23 @@ int rte_eth_dev_pci_probe(struct rte_pci_driver *pci_drv,
>>   */
>>  int rte_eth_dev_pci_remove(struct rte_pci_device *pci_dev);
>>
>> +#ifdef RTE_LIBRTE_HV_PMD
>> +/**
>> + * @internal
>> + * Wrapper for use by vmbus drivers as a .probe function to attach to an ethdev
>> + * interface.
>> + */
>> +int rte_eth_dev_vmbus_probe(struct rte_vmbus_driver *vmbus_drv,
>> +                         struct rte_vmbus_device *vmbus_dev);
>> +
>> +/**
>> + * @internal
>> + * Wrapper for use by vmbus drivers as a .remove function to detach an ethdev
>> + * interface.
>> + */
>> +int rte_eth_dev_vmbus_remove(struct rte_vmbus_device *vmbus_dev);
>> +#endif
>> +
>>  #ifdef __cplusplus
>>  }
>>  #endif
>> diff --git a/mk/rte.app.mk b/mk/rte.app.mk
>> index f75f0e24..6b304084 100644
>> --- a/mk/rte.app.mk
>> +++ b/mk/rte.app.mk
>> @@ -130,6 +130,7 @@ ifeq ($(CONFIG_RTE_LIBRTE_VHOST),y)
>>  _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_VHOST)      += -lrte_pmd_vhost
>>  endif # $(CONFIG_RTE_LIBRTE_VHOST)
>>  _LDLIBS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD)    += -lrte_pmd_vmxnet3_uio
>> +_LDLIBS-$(CONFIG_RTE_LIBRTE_HV_PMD)        += -luuid
>>
>>  ifeq ($(CONFIG_RTE_LIBRTE_CRYPTODEV),y)
>>  _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_AESNI_MB)    += -lrte_pmd_aesni_mb
>> --
>> 2.11.0
>>
- * Re: [dpdk-dev] [PATCH 8/8] eal: VMBUS infrastructure
  2017-01-11 21:13     ` Jan Blunck
@ 2017-01-12  1:20       ` Stephen Hemminger
  0 siblings, 0 replies; 30+ messages in thread
From: Stephen Hemminger @ 2017-01-12  1:20 UTC (permalink / raw)
  To: Jan Blunck; +Cc: dev, Stephen Hemminger
On Wed, 11 Jan 2017 22:13:32 +0100
Jan Blunck <jblunck@infradead.org> wrote:
> >> +static void *vmbus_map_addr;
> >> +
> >> +static struct rte_tailq_elem rte_vmbus_uio_tailq = {
> >> +       .name = "UIO_RESOURCE_LIST",  
> 
> This should be VMBUS_UIO_RESOURCE_LIST to not collide with rte_uio_tailq.

Ok, please trim review comments. Trying to find a comment in the middle of
a patch is a nuisance.
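
To illustrate the collision Jan points out, here is a minimal sketch of a
name-keyed registry in which a second element reusing an existing name is
rejected. This is not DPDK's tailq code: struct named_elem,
registry_lookup() and registry_register() are hypothetical names made up
for the example, and only the two list names come from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/queue.h>

/* Hypothetical stand-in for a named list head; not DPDK's rte_tailq_elem. */
struct named_elem {
	const char *name;
	TAILQ_ENTRY(named_elem) next;
};

TAILQ_HEAD(named_list, named_elem);

static struct named_list registry = TAILQ_HEAD_INITIALIZER(registry);

/* Return the element registered under name, or NULL if there is none. */
static struct named_elem *
registry_lookup(const char *name)
{
	struct named_elem *e;

	assert(name != NULL);
	TAILQ_FOREACH(e, &registry, next) {
		if (strcmp(e->name, name) == 0)
			return e;
	}
	return NULL;
}

/* Register elem; fail with -1 if its name is already taken. */
static int
registry_register(struct named_elem *elem)
{
	if (registry_lookup(elem->name) != NULL)
		return -1;
	TAILQ_INSERT_TAIL(&registry, elem, next);
	return 0;
}
```

With a registry like this, a second element named "UIO_RESOURCE_LIST"
cannot be registered once the first one exists, while one named
"VMBUS_UIO_RESOURCE_LIST" registers and looks up independently.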