DPDK patches and discussions
* [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA device drivers
@ 2019-12-25 15:19 Matan Azrad
  2019-12-25 15:19 ` [dpdk-dev] [PATCH v1 1/3] drivers: introduce vDPA class Matan Azrad
                   ` (4 more replies)
  0 siblings, 5 replies; 50+ messages in thread
From: Matan Azrad @ 2019-12-25 15:19 UTC (permalink / raw)
  To: Maxime Coquelin, Tiwei Bie, Zhihong Wang, Xiao Wang
  Cc: Ferruh Yigit, dev, Thomas Monjalon

As discussed, and as described in the RFC "[RFC] net: new vdpa PMD for Mellanox devices",
a new vDPA driver is going to be added for Mellanox devices - vDPA mlx5, and more will follow.

The only vDPA driver today is the IFC driver, which is located in the net directory.

The IFC driver and the new vDPA mlx5 driver provide the vDPA ops introduced in librte_vhost, not the eth-dev ops.
All the other drivers in the net class provide the eth-dev ops.
The set of supported features is also different.

Create a new class for vDPA drivers and move IFC to this class.
Later, all the new drivers that implement the vDPA ops will be added to the vDPA class.
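
To make the split concrete, here is a minimal sketch (not part of this series) of what
"implementing the vDPA ops" means for a driver, assuming the rte_vdpa.h registration API
of this period; only a few of the callbacks are shown and all my_* names are hypothetical
placeholders:

/*
 * Minimal sketch (not part of this series) of a driver providing the
 * vDPA ops, assuming the rte_vdpa.h registration API of this period.
 * All my_* names are hypothetical placeholders.
 */
#include <stdint.h>

#include <rte_common.h>
#include <rte_pci.h>
#include <rte_vdpa.h>

static int
my_get_queue_num(int did __rte_unused, uint32_t *queue_num)
{
	*queue_num = 1;		/* report how many queue pairs the HW offers */
	return 0;
}

static int
my_get_features(int did __rte_unused, uint64_t *features)
{
	*features = 0;		/* report the virtio features the HW supports */
	return 0;
}

static int
my_dev_conf(int vid)
{
	RTE_SET_USED(vid);	/* program the HW datapath for vhost device "vid" */
	return 0;
}

static int
my_dev_close(int vid)
{
	RTE_SET_USED(vid);	/* revoke the setup done in my_dev_conf() */
	return 0;
}

static struct rte_vdpa_dev_ops my_vdpa_ops = {
	.get_queue_num = my_get_queue_num,
	.get_features  = my_get_features,
	.dev_conf      = my_dev_conf,
	.dev_close     = my_dev_close,
};

/* Typically called from the driver's PCI probe callback. */
static int
my_vdpa_register(struct rte_pci_device *pci_dev)
{
	struct rte_vdpa_dev_addr addr;

	addr.type = PCI_ADDR;
	addr.pci_addr = pci_dev->addr;
	/* returns a vDPA device id (did), or -1 on failure */
	return rte_vdpa_register_device(&addr, &my_vdpa_ops);
}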

Also, a vDPA device driver feature list was added to the vDPA documentation.

Please review the features list and the series.

Later on, I'm going to send the vDPA mlx5 driver.

Thanks.


Matan Azrad (3):
  drivers: introduce vDPA class
  doc: add vDPA feature table
  drivers: move ifc driver to the vDPA class

 MAINTAINERS                               |    6 +-
 doc/guides/conf.py                        |    5 +
 doc/guides/index.rst                      |    1 +
 doc/guides/nics/features/ifcvf.ini        |    8 -
 doc/guides/nics/ifc.rst                   |  106 ---
 doc/guides/nics/index.rst                 |    1 -
 doc/guides/vdpadevs/features/default.ini  |   55 ++
 doc/guides/vdpadevs/features/ifcvf.ini    |    8 +
 doc/guides/vdpadevs/features_overview.rst |   65 ++
 doc/guides/vdpadevs/ifc.rst               |  106 +++
 doc/guides/vdpadevs/index.rst             |   15 +
 drivers/Makefile                          |    2 +
 drivers/meson.build                       |    1 +
 drivers/net/Makefile                      |    3 -
 drivers/net/ifc/Makefile                  |   34 -
 drivers/net/ifc/base/ifcvf.c              |  329 --------
 drivers/net/ifc/base/ifcvf.h              |  162 ----
 drivers/net/ifc/base/ifcvf_osdep.h        |   52 --
 drivers/net/ifc/ifcvf_vdpa.c              | 1280 -----------------------------
 drivers/net/ifc/meson.build               |    9 -
 drivers/net/ifc/rte_pmd_ifc_version.map   |    3 -
 drivers/net/meson.build                   |    1 -
 drivers/vdpa/Makefile                     |   14 +
 drivers/vdpa/ifc/Makefile                 |   34 +
 drivers/vdpa/ifc/base/ifcvf.c             |  329 ++++++++
 drivers/vdpa/ifc/base/ifcvf.h             |  162 ++++
 drivers/vdpa/ifc/base/ifcvf_osdep.h       |   52 ++
 drivers/vdpa/ifc/ifcvf_vdpa.c             | 1280 +++++++++++++++++++++++++++++
 drivers/vdpa/ifc/meson.build              |    9 +
 drivers/vdpa/ifc/rte_pmd_ifc_version.map  |    3 +
 drivers/vdpa/meson.build                  |    8 +
 31 files changed, 2152 insertions(+), 1991 deletions(-)
 delete mode 100644 doc/guides/nics/features/ifcvf.ini
 delete mode 100644 doc/guides/nics/ifc.rst
 create mode 100644 doc/guides/vdpadevs/features/default.ini
 create mode 100644 doc/guides/vdpadevs/features/ifcvf.ini
 create mode 100644 doc/guides/vdpadevs/features_overview.rst
 create mode 100644 doc/guides/vdpadevs/ifc.rst
 create mode 100644 doc/guides/vdpadevs/index.rst
 delete mode 100644 drivers/net/ifc/Makefile
 delete mode 100644 drivers/net/ifc/base/ifcvf.c
 delete mode 100644 drivers/net/ifc/base/ifcvf.h
 delete mode 100644 drivers/net/ifc/base/ifcvf_osdep.h
 delete mode 100644 drivers/net/ifc/ifcvf_vdpa.c
 delete mode 100644 drivers/net/ifc/meson.build
 delete mode 100644 drivers/net/ifc/rte_pmd_ifc_version.map
 create mode 100644 drivers/vdpa/Makefile
 create mode 100644 drivers/vdpa/ifc/Makefile
 create mode 100644 drivers/vdpa/ifc/base/ifcvf.c
 create mode 100644 drivers/vdpa/ifc/base/ifcvf.h
 create mode 100644 drivers/vdpa/ifc/base/ifcvf_osdep.h
 create mode 100644 drivers/vdpa/ifc/ifcvf_vdpa.c
 create mode 100644 drivers/vdpa/ifc/meson.build
 create mode 100644 drivers/vdpa/ifc/rte_pmd_ifc_version.map
 create mode 100644 drivers/vdpa/meson.build

-- 
1.8.3.1



* [dpdk-dev] [PATCH v1 1/3] drivers: introduce vDPA class
  2019-12-25 15:19 [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA device drivers Matan Azrad
@ 2019-12-25 15:19 ` Matan Azrad
  2020-01-07 17:32   ` Maxime Coquelin
  2019-12-25 15:19 ` [dpdk-dev] [PATCH v1 2/3] doc: add vDPA feature table Matan Azrad
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 50+ messages in thread
From: Matan Azrad @ 2019-12-25 15:19 UTC (permalink / raw)
  To: Maxime Coquelin, Tiwei Bie, Zhihong Wang, Xiao Wang
  Cc: Ferruh Yigit, dev, Thomas Monjalon

The vDPA (vhost data path acceleration) drivers provide support for
the vDPA operations introduced by the rte_vhost library.

Any driver which provides the vDPA operations should be moved or added to
the vdpa class under drivers/vdpa/.

Create the general files for the vDPA class in drivers and in the documentation.

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 doc/guides/index.rst          |  1 +
 doc/guides/vdpadevs/index.rst | 13 +++++++++++++
 drivers/Makefile              |  2 ++
 drivers/meson.build           |  1 +
 drivers/vdpa/Makefile         |  8 ++++++++
 drivers/vdpa/meson.build      |  8 ++++++++
 6 files changed, 33 insertions(+)
 create mode 100644 doc/guides/vdpadevs/index.rst
 create mode 100644 drivers/vdpa/Makefile
 create mode 100644 drivers/vdpa/meson.build

diff --git a/doc/guides/index.rst b/doc/guides/index.rst
index 8a1601b..988c6ea 100644
--- a/doc/guides/index.rst
+++ b/doc/guides/index.rst
@@ -19,6 +19,7 @@ DPDK documentation
    bbdevs/index
    cryptodevs/index
    compressdevs/index
+   vdpadevs/index
    eventdevs/index
    rawdevs/index
    mempool/index
diff --git a/doc/guides/vdpadevs/index.rst b/doc/guides/vdpadevs/index.rst
new file mode 100644
index 0000000..d69dc91
--- /dev/null
+++ b/doc/guides/vdpadevs/index.rst
@@ -0,0 +1,13 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright 2019 Mellanox Technologies, Ltd
+
+vDPA Device Drivers
+===================
+
+The following is a list of vDPA (vhost data path acceleration) device drivers,
+which can be used from an application through vhost API.
+
+.. toctree::
+    :maxdepth: 2
+    :numbered:
+
diff --git a/drivers/Makefile b/drivers/Makefile
index 7d5da5d..46374ca 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -18,6 +18,8 @@ DIRS-$(CONFIG_RTE_LIBRTE_PMD_QAT) += common/qat
 DEPDIRS-common/qat := bus mempool
 DIRS-$(CONFIG_RTE_LIBRTE_COMPRESSDEV) += compress
 DEPDIRS-compress := bus mempool
+DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += vdpa
+DEPDIRS-vdpa := common bus mempool
 DIRS-$(CONFIG_RTE_LIBRTE_EVENTDEV) += event
 DEPDIRS-event := common bus mempool net
 DIRS-$(CONFIG_RTE_LIBRTE_RAWDEV) += raw
diff --git a/drivers/meson.build b/drivers/meson.build
index 32d68aa..d271667 100644
--- a/drivers/meson.build
+++ b/drivers/meson.build
@@ -13,6 +13,7 @@ dpdk_driver_classes = ['common',
 	       'raw',     # depends on common, bus and net.
 	       'crypto',  # depends on common, bus and mempool (net in future).
 	       'compress', # depends on common, bus, mempool.
+	       'vdpa',    # depends on common, bus and mempool.
 	       'event',   # depends on common, bus, mempool and net.
 	       'baseband'] # depends on common and bus.
 
diff --git a/drivers/vdpa/Makefile b/drivers/vdpa/Makefile
new file mode 100644
index 0000000..82a2b70
--- /dev/null
+++ b/drivers/vdpa/Makefile
@@ -0,0 +1,8 @@
+#   SPDX-License-Identifier: BSD-3-Clause
+#   Copyright 2019 Mellanox Technologies, Ltd
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# DIRS-$(<configuration>) += <directory>
+
+include $(RTE_SDK)/mk/rte.subdir.mk
diff --git a/drivers/vdpa/meson.build b/drivers/vdpa/meson.build
new file mode 100644
index 0000000..a839ff5
--- /dev/null
+++ b/drivers/vdpa/meson.build
@@ -0,0 +1,8 @@
+#   SPDX-License-Identifier: BSD-3-Clause
+#   Copyright 2019 Mellanox Technologies, Ltd
+
+drivers = []
+std_deps = ['bus_pci', 'kvargs']
+std_deps += ['vhost']
+config_flag_fmt = 'RTE_LIBRTE_@0@_PMD'
+driver_name_fmt = 'rte_pmd_@0@'
-- 
1.8.3.1



* [dpdk-dev] [PATCH v1 2/3] doc: add vDPA feature table
  2019-12-25 15:19 [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA device drivers Matan Azrad
  2019-12-25 15:19 ` [dpdk-dev] [PATCH v1 1/3] drivers: introduce vDPA class Matan Azrad
@ 2019-12-25 15:19 ` Matan Azrad
  2020-01-07 17:39   ` Maxime Coquelin
  2019-12-25 15:19 ` [dpdk-dev] [PATCH v1 3/3] drivers: move ifc driver to the vDPA class Matan Azrad
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 50+ messages in thread
From: Matan Azrad @ 2019-12-25 15:19 UTC (permalink / raw)
  To: Maxime Coquelin, Tiwei Bie, Zhihong Wang, Xiao Wang
  Cc: Ferruh Yigit, dev, Thomas Monjalon

Add a vDPA device features table and explanation.

Any vDPA driver can add its own supported features by adding a new ini
file to the features directory in doc/guides/vdpadevs/features.
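
For illustration, a hypothetical driver "foovf" (not part of this series) would add
doc/guides/vdpadevs/features/foovf.ini along these lines, using only keys that appear
in default.ini:

;
; Supported features of the 'foovf' vDPA driver (illustrative example only).
;
; Refer to default.ini for the full list of available driver features.
;
[Features]
csum                 = Y
guest csum           = Y
Linux VFIO           = Y
x86-64               = Y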

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 doc/guides/conf.py                        |  5 +++
 doc/guides/vdpadevs/features/default.ini  | 55 ++++++++++++++++++++++++++
 doc/guides/vdpadevs/features_overview.rst | 65 +++++++++++++++++++++++++++++++
 doc/guides/vdpadevs/index.rst             |  1 +
 4 files changed, 126 insertions(+)
 create mode 100644 doc/guides/vdpadevs/features/default.ini
 create mode 100644 doc/guides/vdpadevs/features_overview.rst

diff --git a/doc/guides/conf.py b/doc/guides/conf.py
index 0892c06..c368fa5 100644
--- a/doc/guides/conf.py
+++ b/doc/guides/conf.py
@@ -401,6 +401,11 @@ def setup(app):
                             'Features',
                             'Features availability in compression drivers',
                             'Feature')
+    table_file = dirname(__file__) + '/vdpadevs/overview_feature_table.txt'
+    generate_overview_table(table_file, 1,
+                            'Features',
+                            'Features availability in vDPA drivers',
+                            'Feature')
 
     if LooseVersion(sphinx_version) < LooseVersion('1.3.1'):
         print('Upgrade sphinx to version >= 1.3.1 for '
diff --git a/doc/guides/vdpadevs/features/default.ini b/doc/guides/vdpadevs/features/default.ini
new file mode 100644
index 0000000..a3e0bc7
--- /dev/null
+++ b/doc/guides/vdpadevs/features/default.ini
@@ -0,0 +1,55 @@
+;
+; Features of a default vDPA driver.
+;
+; This file defines the features that are valid for inclusion in
+; the other driver files and also the order that they appear in
+; the features table in the documentation. The feature description
+; string should not exceed feature_str_len defined in conf.py.
+;
+[Features]
+csum                 =
+guest csum           =
+mac                  =
+gso                  =
+guest tso4           =
+guest tso6           =
+ecn                  =
+ufo                  =
+host tso4            =
+host tso6            =
+mrg rxbuf            =
+ctrl vq              =
+ctrl rx              =
+any layout           =
+guest announce       =
+mq                   =
+version 1            =
+log all              =
+protocol features    =
+indirect desc        =
+event idx            =
+mtu                  =
+in_order             =
+IOMMU platform       =
+packed               =
+proto mq             =
+proto log shmfd      =
+proto rarp           =
+proto reply ack      =
+proto slave req      =
+proto crypto session =
+proto host notifier  =
+proto pagefault      =
+Multiprocess aware   =
+BSD nic_uio          =
+Linux UIO            =
+Linux VFIO           =
+Other kdrv           =
+ARMv7                =
+ARMv8                =
+Power8               =
+x86-32               =
+x86-64               =
+Usage doc            =
+Design doc           =
+Perf doc             =
\ No newline at end of file
diff --git a/doc/guides/vdpadevs/features_overview.rst b/doc/guides/vdpadevs/features_overview.rst
new file mode 100644
index 0000000..c7745b7
--- /dev/null
+++ b/doc/guides/vdpadevs/features_overview.rst
@@ -0,0 +1,65 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright 2019 Mellanox Technologies, Ltd
+
+Overview of vDPA drivers features
+=================================
+
+This section explains the supported features that are listed in the table below.
+
+  * csum - Device can handle packets with partial checksum.
+  * guest csum - Guest can handle packets with partial checksum.
+  * mac - Device has given MAC address.
+  * gso - Device can handle packets with any GSO type.
+  * guest tso4 - Guest can receive TSOv4.
+  * guest tso6 - Guest can receive TSOv6.
+  * ecn - Device can receive TSO with ECN.
+  * ufo - Device can receive UFO.
+  * host tso4 - Device can receive TSOv4.
+  * host tso6 - Device can receive TSOv6.
+  * mrg rxbuf - Guest can merge receive buffers.
+  * ctrl vq - Control channel is available.
+  * ctrl rx - Control channel RX mode support.
+  * any layout - Device can handle any descriptor layout.
+  * guest announce - Guest can send gratuitous packets.
+  * mq - Device supports Receive Flow Steering.
+  * version 1 - v1.0 compliant.
+  * log all - Device can log all write descriptors (live migration).
+  * protocol features - Protocol features negotiation support.
+  * indirect desc - Indirect buffer descriptors support.
+  * event idx - Support for avail_idx and used_idx fields.
+  * mtu - Host can advise the guest with its maximum supported MTU.
+  * in_order - Device can use descriptors in ring order.
+  * IOMMU platform - Device supports IOMMU addresses.
+  * packed - Device supports packed virtio queues.
+  * proto mq - Support for querying the number of queues.
+  * proto log shmfd - Guest supports setting the log base.
+  * proto rarp - Host can broadcast a fake RARP after live migration.
+  * proto reply ack - Host supports requested operation status ack.
+  * proto slave req - Allow the slave to make requests to the master.
+  * proto crypto session - Support crypto session creation.
+  * proto host notifier - Host can register memory region based host notifiers.
+  * proto pagefault - Slave exposes the page-fault FD for the migration process.
+  * Multiprocess aware - Driver can be used for primary-secondary process model.
+  * BSD nic_uio - BSD ``nic_uio`` module supported.
+  * Linux UIO - Works with ``igb_uio`` kernel module.
+  * Linux VFIO - Works with ``vfio-pci`` kernel module.
+  * Other kdrv - Kernel module other than above ones supported.
+  * ARMv7 - Support armv7 architecture.
+  * ARMv8 - Support armv8a (64bit) architecture.
+  * Power8 - Support PowerPC architecture.
+  * x86-32 - Support 32-bit x86 architecture.
+  * x86-64 - Support 64-bit x86 architecture.
+  * Usage doc - Documentation describes usage, in ``doc/guides/vdpadevs/``.
+  * Design doc - Documentation describes design, in ``doc/guides/vdpadevs/``.
+  * Perf doc - Documentation describes performance values, in ``doc/perf/``.
+
+
+
+.. _table_vdpa_pmd_features:
+
+.. include:: overview_feature_table.txt
+
+.. Note::
+
+   Features marked with "P" are partially supported. Refer to the appropriate
+   driver guide in the following sections for details.
diff --git a/doc/guides/vdpadevs/index.rst b/doc/guides/vdpadevs/index.rst
index d69dc91..89e2b03 100644
--- a/doc/guides/vdpadevs/index.rst
+++ b/doc/guides/vdpadevs/index.rst
@@ -11,3 +11,4 @@ which can be used from an application through vhost API.
     :maxdepth: 2
     :numbered:
 
+    features_overview
-- 
1.8.3.1



* [dpdk-dev] [PATCH v1 3/3] drivers: move ifc driver to the vDPA class
  2019-12-25 15:19 [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA device drivers Matan Azrad
  2019-12-25 15:19 ` [dpdk-dev] [PATCH v1 1/3] drivers: introduce vDPA class Matan Azrad
  2019-12-25 15:19 ` [dpdk-dev] [PATCH v1 2/3] doc: add vDPA feature table Matan Azrad
@ 2019-12-25 15:19 ` Matan Azrad
  2020-01-07 18:17   ` Maxime Coquelin
  2020-01-07  7:57 ` [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA device drivers Matan Azrad
  2020-01-09 11:00 ` [dpdk-dev] [PATCH v2 " Matan Azrad
  4 siblings, 1 reply; 50+ messages in thread
From: Matan Azrad @ 2019-12-25 15:19 UTC (permalink / raw)
  To: Maxime Coquelin, Tiwei Bie, Zhihong Wang, Xiao Wang
  Cc: Ferruh Yigit, dev, Thomas Monjalon

A new vDPA class was recently introduced.

The IFC driver implements the vDPA operations, hence it should be moved to
the vDPA class.

Move it.

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 MAINTAINERS                              |    6 +-
 doc/guides/nics/features/ifcvf.ini       |    8 -
 doc/guides/nics/ifc.rst                  |  106 ---
 doc/guides/nics/index.rst                |    1 -
 doc/guides/vdpadevs/features/ifcvf.ini   |    8 +
 doc/guides/vdpadevs/ifc.rst              |  106 +++
 doc/guides/vdpadevs/index.rst            |    1 +
 drivers/net/Makefile                     |    3 -
 drivers/net/ifc/Makefile                 |   34 -
 drivers/net/ifc/base/ifcvf.c             |  329 --------
 drivers/net/ifc/base/ifcvf.h             |  162 ----
 drivers/net/ifc/base/ifcvf_osdep.h       |   52 --
 drivers/net/ifc/ifcvf_vdpa.c             | 1280 ------------------------------
 drivers/net/ifc/meson.build              |    9 -
 drivers/net/ifc/rte_pmd_ifc_version.map  |    3 -
 drivers/net/meson.build                  |    1 -
 drivers/vdpa/Makefile                    |    6 +
 drivers/vdpa/ifc/Makefile                |   34 +
 drivers/vdpa/ifc/base/ifcvf.c            |  329 ++++++++
 drivers/vdpa/ifc/base/ifcvf.h            |  162 ++++
 drivers/vdpa/ifc/base/ifcvf_osdep.h      |   52 ++
 drivers/vdpa/ifc/ifcvf_vdpa.c            | 1280 ++++++++++++++++++++++++++++++
 drivers/vdpa/ifc/meson.build             |    9 +
 drivers/vdpa/ifc/rte_pmd_ifc_version.map |    3 +
 drivers/vdpa/meson.build                 |    2 +-
 25 files changed, 1994 insertions(+), 1992 deletions(-)
 delete mode 100644 doc/guides/nics/features/ifcvf.ini
 delete mode 100644 doc/guides/nics/ifc.rst
 create mode 100644 doc/guides/vdpadevs/features/ifcvf.ini
 create mode 100644 doc/guides/vdpadevs/ifc.rst
 delete mode 100644 drivers/net/ifc/Makefile
 delete mode 100644 drivers/net/ifc/base/ifcvf.c
 delete mode 100644 drivers/net/ifc/base/ifcvf.h
 delete mode 100644 drivers/net/ifc/base/ifcvf_osdep.h
 delete mode 100644 drivers/net/ifc/ifcvf_vdpa.c
 delete mode 100644 drivers/net/ifc/meson.build
 delete mode 100644 drivers/net/ifc/rte_pmd_ifc_version.map
 create mode 100644 drivers/vdpa/ifc/Makefile
 create mode 100644 drivers/vdpa/ifc/base/ifcvf.c
 create mode 100644 drivers/vdpa/ifc/base/ifcvf.h
 create mode 100644 drivers/vdpa/ifc/base/ifcvf_osdep.h
 create mode 100644 drivers/vdpa/ifc/ifcvf_vdpa.c
 create mode 100644 drivers/vdpa/ifc/meson.build
 create mode 100644 drivers/vdpa/ifc/rte_pmd_ifc_version.map

diff --git a/MAINTAINERS b/MAINTAINERS
index 9b5c80f..87abf60 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -682,9 +682,9 @@ F: doc/guides/nics/features/iavf*.ini
 Intel ifc
 M: Xiao Wang <xiao.w.wang@intel.com>
 T: git://dpdk.org/next/dpdk-next-net-intel
-F: drivers/net/ifc/
-F: doc/guides/nics/ifc.rst
-F: doc/guides/nics/features/ifc*.ini
+F: drivers/vdpa/ifc/
+F: doc/guides/vdpadevs/ifc.rst
+F: doc/guides/vdpadevs/features/ifcvf.ini
 
 Intel ice
 M: Qiming Yang <qiming.yang@intel.com>
diff --git a/doc/guides/nics/features/ifcvf.ini b/doc/guides/nics/features/ifcvf.ini
deleted file mode 100644
index ef1fc47..0000000
--- a/doc/guides/nics/features/ifcvf.ini
+++ /dev/null
@@ -1,8 +0,0 @@
-;
-; Supported features of the 'ifcvf' vDPA driver.
-;
-; Refer to default.ini for the full list of available PMD features.
-;
-[Features]
-x86-32               = Y
-x86-64               = Y
diff --git a/doc/guides/nics/ifc.rst b/doc/guides/nics/ifc.rst
deleted file mode 100644
index 12a2a34..0000000
--- a/doc/guides/nics/ifc.rst
+++ /dev/null
@@ -1,106 +0,0 @@
-..  SPDX-License-Identifier: BSD-3-Clause
-    Copyright(c) 2018 Intel Corporation.
-
-IFCVF vDPA driver
-=================
-
-The IFCVF vDPA (vhost data path acceleration) driver provides support for the
-Intel FPGA 100G VF (IFCVF). IFCVF's datapath is virtio ring compatible, it
-works as a HW vhost backend which can send/receive packets to/from virtio
-directly by DMA. Besides, it supports dirty page logging and device state
-report/restore, this driver enables its vDPA functionality.
-
-
-Pre-Installation Configuration
-------------------------------
-
-Config File Options
-~~~~~~~~~~~~~~~~~~~
-
-The following option can be modified in the ``config`` file.
-
-- ``CONFIG_RTE_LIBRTE_IFC_PMD`` (default ``y`` for linux)
-
-  Toggle compilation of the ``librte_pmd_ifc`` driver.
-
-
-IFCVF vDPA Implementation
--------------------------
-
-IFCVF's vendor ID and device ID are same as that of virtio net pci device,
-with its specific subsystem vendor ID and device ID. To let the device be
-probed by IFCVF driver, adding "vdpa=1" parameter helps to specify that this
-device is to be used in vDPA mode, rather than polling mode, virtio pmd will
-skip when it detects this message. If no this parameter specified, device
-will not be used as a vDPA device, and it will be driven by virtio pmd.
-
-Different VF devices serve different virtio frontends which are in different
-VMs, so each VF needs to have its own DMA address translation service. During
-the driver probe a new container is created for this device, with this
-container vDPA driver can program DMA remapping table with the VM's memory
-region information.
-
-The device argument "sw-live-migration=1" will configure the driver into SW
-assisted live migration mode. In this mode, the driver will set up a SW relay
-thread when LM happens, this thread will help device to log dirty pages. Thus
-this mode does not require HW to implement a dirty page logging function block,
-but will consume some percentage of CPU resource depending on the network
-throughput. If no this parameter specified, driver will rely on device's logging
-capability.
-
-Key IFCVF vDPA driver ops
-~~~~~~~~~~~~~~~~~~~~~~~~~
-
-- ifcvf_dev_config:
-  Enable VF data path with virtio information provided by vhost lib, including
-  IOMMU programming to enable VF DMA to VM's memory, VFIO interrupt setup to
-  route HW interrupt to virtio driver, create notify relay thread to translate
-  virtio driver's kick to a MMIO write onto HW, HW queues configuration.
-
-  This function gets called to set up HW data path backend when virtio driver
-  in VM gets ready.
-
-- ifcvf_dev_close:
-  Revoke all the setup in ifcvf_dev_config.
-
-  This function gets called when virtio driver stops device in VM.
-
-To create a vhost port with IFC VF
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-- Create a vhost socket and assign a VF's device ID to this socket via
-  vhost API. When QEMU vhost connection gets ready, the assigned VF will
-  get configured automatically.
-
-
-Features
---------
-
-Features of the IFCVF driver are:
-
-- Compatibility with virtio 0.95 and 1.0.
-- SW assisted vDPA live migration.
-
-
-Prerequisites
--------------
-
-- Platform with IOMMU feature. IFC VF needs address translation service to
-  Rx/Tx directly with virtio driver in VM.
-
-
-Limitations
------------
-
-Dependency on vfio-pci
-~~~~~~~~~~~~~~~~~~~~~~
-
-vDPA driver needs to setup VF MSIX interrupts, each queue's interrupt vector
-is mapped to a callfd associated with a virtio ring. Currently only vfio-pci
-allows multiple interrupts, so the IFCVF driver is dependent on vfio-pci.
-
-Live Migration with VIRTIO_NET_F_GUEST_ANNOUNCE
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-IFC VF doesn't support RARP packet generation, virtio frontend supporting
-VIRTIO_NET_F_GUEST_ANNOUNCE feature can help to do that.
diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst
index d61c27f..8c540c0 100644
--- a/doc/guides/nics/index.rst
+++ b/doc/guides/nics/index.rst
@@ -31,7 +31,6 @@ Network Interface Controller Drivers
     hns3
     i40e
     ice
-    ifc
     igb
     ipn3ke
     ixgbe
diff --git a/doc/guides/vdpadevs/features/ifcvf.ini b/doc/guides/vdpadevs/features/ifcvf.ini
new file mode 100644
index 0000000..ef1fc47
--- /dev/null
+++ b/doc/guides/vdpadevs/features/ifcvf.ini
@@ -0,0 +1,8 @@
+;
+; Supported features of the 'ifcvf' vDPA driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+;
+[Features]
+x86-32               = Y
+x86-64               = Y
diff --git a/doc/guides/vdpadevs/ifc.rst b/doc/guides/vdpadevs/ifc.rst
new file mode 100644
index 0000000..12a2a34
--- /dev/null
+++ b/doc/guides/vdpadevs/ifc.rst
@@ -0,0 +1,106 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2018 Intel Corporation.
+
+IFCVF vDPA driver
+=================
+
+The IFCVF vDPA (vhost data path acceleration) driver provides support for the
+Intel FPGA 100G VF (IFCVF). IFCVF's datapath is virtio ring compatible, it
+works as a HW vhost backend which can send/receive packets to/from virtio
+directly by DMA. Besides, it supports dirty page logging and device state
+report/restore, this driver enables its vDPA functionality.
+
+
+Pre-Installation Configuration
+------------------------------
+
+Config File Options
+~~~~~~~~~~~~~~~~~~~
+
+The following option can be modified in the ``config`` file.
+
+- ``CONFIG_RTE_LIBRTE_IFC_PMD`` (default ``y`` for linux)
+
+  Toggle compilation of the ``librte_pmd_ifc`` driver.
+
+
+IFCVF vDPA Implementation
+-------------------------
+
+IFCVF's vendor ID and device ID are same as that of virtio net pci device,
+with its specific subsystem vendor ID and device ID. To let the device be
+probed by IFCVF driver, adding "vdpa=1" parameter helps to specify that this
+device is to be used in vDPA mode, rather than polling mode, virtio pmd will
+skip when it detects this message. If no this parameter specified, device
+will not be used as a vDPA device, and it will be driven by virtio pmd.
+
+Different VF devices serve different virtio frontends which are in different
+VMs, so each VF needs to have its own DMA address translation service. During
+the driver probe a new container is created for this device, with this
+container vDPA driver can program DMA remapping table with the VM's memory
+region information.
+
+The device argument "sw-live-migration=1" will configure the driver into SW
+assisted live migration mode. In this mode, the driver will set up a SW relay
+thread when LM happens, this thread will help device to log dirty pages. Thus
+this mode does not require HW to implement a dirty page logging function block,
+but will consume some percentage of CPU resource depending on the network
+throughput. If no this parameter specified, driver will rely on device's logging
+capability.
+
+Key IFCVF vDPA driver ops
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+- ifcvf_dev_config:
+  Enable VF data path with virtio information provided by vhost lib, including
+  IOMMU programming to enable VF DMA to VM's memory, VFIO interrupt setup to
+  route HW interrupt to virtio driver, create notify relay thread to translate
+  virtio driver's kick to a MMIO write onto HW, HW queues configuration.
+
+  This function gets called to set up HW data path backend when virtio driver
+  in VM gets ready.
+
+- ifcvf_dev_close:
+  Revoke all the setup in ifcvf_dev_config.
+
+  This function gets called when virtio driver stops device in VM.
+
+To create a vhost port with IFC VF
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+- Create a vhost socket and assign a VF's device ID to this socket via
+  vhost API. When QEMU vhost connection gets ready, the assigned VF will
+  get configured automatically.
+
+
+Features
+--------
+
+Features of the IFCVF driver are:
+
+- Compatibility with virtio 0.95 and 1.0.
+- SW assisted vDPA live migration.
+
+
+Prerequisites
+-------------
+
+- Platform with IOMMU feature. IFC VF needs address translation service to
+  Rx/Tx directly with virtio driver in VM.
+
+
+Limitations
+-----------
+
+Dependency on vfio-pci
+~~~~~~~~~~~~~~~~~~~~~~
+
+vDPA driver needs to setup VF MSIX interrupts, each queue's interrupt vector
+is mapped to a callfd associated with a virtio ring. Currently only vfio-pci
+allows multiple interrupts, so the IFCVF driver is dependent on vfio-pci.
+
+Live Migration with VIRTIO_NET_F_GUEST_ANNOUNCE
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+IFC VF doesn't support RARP packet generation, virtio frontend supporting
+VIRTIO_NET_F_GUEST_ANNOUNCE feature can help to do that.
diff --git a/doc/guides/vdpadevs/index.rst b/doc/guides/vdpadevs/index.rst
index 89e2b03..6cf0827 100644
--- a/doc/guides/vdpadevs/index.rst
+++ b/doc/guides/vdpadevs/index.rst
@@ -12,3 +12,4 @@ which can be used from an application through vhost API.
     :numbered:
 
     features_overview
+    ifc
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index cee3036..cca3c44 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -71,9 +71,6 @@ endif # $(CONFIG_RTE_LIBRTE_SCHED)
 
 ifeq ($(CONFIG_RTE_LIBRTE_VHOST),y)
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_VHOST) += vhost
-ifeq ($(CONFIG_RTE_EAL_VFIO),y)
-DIRS-$(CONFIG_RTE_LIBRTE_IFC_PMD) += ifc
-endif
 endif # $(CONFIG_RTE_LIBRTE_VHOST)
 
 ifeq ($(CONFIG_RTE_LIBRTE_MVPP2_PMD),y)
diff --git a/drivers/net/ifc/Makefile b/drivers/net/ifc/Makefile
deleted file mode 100644
index fe227b8..0000000
--- a/drivers/net/ifc/Makefile
+++ /dev/null
@@ -1,34 +0,0 @@
-# SPDX-License-Identifier: BSD-3-Clause
-# Copyright(c) 2018 Intel Corporation
-
-include $(RTE_SDK)/mk/rte.vars.mk
-
-#
-# library name
-#
-LIB = librte_pmd_ifc.a
-
-LDLIBS += -lpthread
-LDLIBS += -lrte_eal -lrte_pci -lrte_vhost -lrte_bus_pci
-LDLIBS += -lrte_kvargs
-
-CFLAGS += -O3
-CFLAGS += $(WERROR_FLAGS)
-CFLAGS += -DALLOW_EXPERIMENTAL_API
-
-#
-# Add extra flags for base driver source files to disable warnings in them
-#
-BASE_DRIVER_OBJS=$(sort $(patsubst %.c,%.o,$(notdir $(wildcard $(SRCDIR)/base/*.c))))
-
-VPATH += $(SRCDIR)/base
-
-EXPORT_MAP := rte_pmd_ifc_version.map
-
-#
-# all source are stored in SRCS-y
-#
-SRCS-$(CONFIG_RTE_LIBRTE_IFC_PMD) += ifcvf_vdpa.c
-SRCS-$(CONFIG_RTE_LIBRTE_IFC_PMD) += ifcvf.c
-
-include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/ifc/base/ifcvf.c b/drivers/net/ifc/base/ifcvf.c
deleted file mode 100644
index 3c0b2df..0000000
--- a/drivers/net/ifc/base/ifcvf.c
+++ /dev/null
@@ -1,329 +0,0 @@
-/* SPDX-License-Identifier: BSD-3-Clause
- * Copyright(c) 2018 Intel Corporation
- */
-
-#include "ifcvf.h"
-#include "ifcvf_osdep.h"
-
-STATIC void *
-get_cap_addr(struct ifcvf_hw *hw, struct ifcvf_pci_cap *cap)
-{
-	u8 bar = cap->bar;
-	u32 length = cap->length;
-	u32 offset = cap->offset;
-
-	if (bar > IFCVF_PCI_MAX_RESOURCE - 1) {
-		DEBUGOUT("invalid bar: %u\n", bar);
-		return NULL;
-	}
-
-	if (offset + length < offset) {
-		DEBUGOUT("offset(%u) + length(%u) overflows\n",
-			offset, length);
-		return NULL;
-	}
-
-	if (offset + length > hw->mem_resource[cap->bar].len) {
-		DEBUGOUT("offset(%u) + length(%u) overflows bar length(%u)",
-			offset, length, (u32)hw->mem_resource[cap->bar].len);
-		return NULL;
-	}
-
-	return hw->mem_resource[bar].addr + offset;
-}
-
-int
-ifcvf_init_hw(struct ifcvf_hw *hw, PCI_DEV *dev)
-{
-	int ret;
-	u8 pos;
-	struct ifcvf_pci_cap cap;
-
-	ret = PCI_READ_CONFIG_BYTE(dev, &pos, PCI_CAPABILITY_LIST);
-	if (ret < 0) {
-		DEBUGOUT("failed to read pci capability list\n");
-		return -1;
-	}
-
-	while (pos) {
-		ret = PCI_READ_CONFIG_RANGE(dev, (u32 *)&cap,
-				sizeof(cap), pos);
-		if (ret < 0) {
-			DEBUGOUT("failed to read cap at pos: %x", pos);
-			break;
-		}
-
-		if (cap.cap_vndr != PCI_CAP_ID_VNDR)
-			goto next;
-
-		DEBUGOUT("cfg type: %u, bar: %u, offset: %u, "
-				"len: %u\n", cap.cfg_type, cap.bar,
-				cap.offset, cap.length);
-
-		switch (cap.cfg_type) {
-		case IFCVF_PCI_CAP_COMMON_CFG:
-			hw->common_cfg = get_cap_addr(hw, &cap);
-			break;
-		case IFCVF_PCI_CAP_NOTIFY_CFG:
-			PCI_READ_CONFIG_DWORD(dev, &hw->notify_off_multiplier,
-					pos + sizeof(cap));
-			hw->notify_base = get_cap_addr(hw, &cap);
-			hw->notify_region = cap.bar;
-			break;
-		case IFCVF_PCI_CAP_ISR_CFG:
-			hw->isr = get_cap_addr(hw, &cap);
-			break;
-		case IFCVF_PCI_CAP_DEVICE_CFG:
-			hw->dev_cfg = get_cap_addr(hw, &cap);
-			break;
-		}
-next:
-		pos = cap.cap_next;
-	}
-
-	hw->lm_cfg = hw->mem_resource[4].addr;
-
-	if (hw->common_cfg == NULL || hw->notify_base == NULL ||
-			hw->isr == NULL || hw->dev_cfg == NULL) {
-		DEBUGOUT("capability incomplete\n");
-		return -1;
-	}
-
-	DEBUGOUT("capability mapping:\ncommon cfg: %p\n"
-			"notify base: %p\nisr cfg: %p\ndevice cfg: %p\n"
-			"multiplier: %u\n",
-			hw->common_cfg, hw->dev_cfg,
-			hw->isr, hw->notify_base,
-			hw->notify_off_multiplier);
-
-	return 0;
-}
-
-STATIC u8
-ifcvf_get_status(struct ifcvf_hw *hw)
-{
-	return IFCVF_READ_REG8(&hw->common_cfg->device_status);
-}
-
-STATIC void
-ifcvf_set_status(struct ifcvf_hw *hw, u8 status)
-{
-	IFCVF_WRITE_REG8(status, &hw->common_cfg->device_status);
-}
-
-STATIC void
-ifcvf_reset(struct ifcvf_hw *hw)
-{
-	ifcvf_set_status(hw, 0);
-
-	/* flush status write */
-	while (ifcvf_get_status(hw))
-		msec_delay(1);
-}
-
-STATIC void
-ifcvf_add_status(struct ifcvf_hw *hw, u8 status)
-{
-	if (status != 0)
-		status |= ifcvf_get_status(hw);
-
-	ifcvf_set_status(hw, status);
-	ifcvf_get_status(hw);
-}
-
-u64
-ifcvf_get_features(struct ifcvf_hw *hw)
-{
-	u32 features_lo, features_hi;
-	struct ifcvf_pci_common_cfg *cfg = hw->common_cfg;
-
-	IFCVF_WRITE_REG32(0, &cfg->device_feature_select);
-	features_lo = IFCVF_READ_REG32(&cfg->device_feature);
-
-	IFCVF_WRITE_REG32(1, &cfg->device_feature_select);
-	features_hi = IFCVF_READ_REG32(&cfg->device_feature);
-
-	return ((u64)features_hi << 32) | features_lo;
-}
-
-STATIC void
-ifcvf_set_features(struct ifcvf_hw *hw, u64 features)
-{
-	struct ifcvf_pci_common_cfg *cfg = hw->common_cfg;
-
-	IFCVF_WRITE_REG32(0, &cfg->guest_feature_select);
-	IFCVF_WRITE_REG32(features & ((1ULL << 32) - 1), &cfg->guest_feature);
-
-	IFCVF_WRITE_REG32(1, &cfg->guest_feature_select);
-	IFCVF_WRITE_REG32(features >> 32, &cfg->guest_feature);
-}
-
-STATIC int
-ifcvf_config_features(struct ifcvf_hw *hw)
-{
-	u64 host_features;
-
-	host_features = ifcvf_get_features(hw);
-	hw->req_features &= host_features;
-
-	ifcvf_set_features(hw, hw->req_features);
-	ifcvf_add_status(hw, IFCVF_CONFIG_STATUS_FEATURES_OK);
-
-	if (!(ifcvf_get_status(hw) & IFCVF_CONFIG_STATUS_FEATURES_OK)) {
-		DEBUGOUT("failed to set FEATURES_OK status\n");
-		return -1;
-	}
-
-	return 0;
-}
-
-STATIC void
-io_write64_twopart(u64 val, u32 *lo, u32 *hi)
-{
-	IFCVF_WRITE_REG32(val & ((1ULL << 32) - 1), lo);
-	IFCVF_WRITE_REG32(val >> 32, hi);
-}
-
-STATIC int
-ifcvf_hw_enable(struct ifcvf_hw *hw)
-{
-	struct ifcvf_pci_common_cfg *cfg;
-	u8 *lm_cfg;
-	u32 i;
-	u16 notify_off;
-
-	cfg = hw->common_cfg;
-	lm_cfg = hw->lm_cfg;
-
-	IFCVF_WRITE_REG16(0, &cfg->msix_config);
-	if (IFCVF_READ_REG16(&cfg->msix_config) == IFCVF_MSI_NO_VECTOR) {
-		DEBUGOUT("msix vec alloc failed for device config\n");
-		return -1;
-	}
-
-	for (i = 0; i < hw->nr_vring; i++) {
-		IFCVF_WRITE_REG16(i, &cfg->queue_select);
-		io_write64_twopart(hw->vring[i].desc, &cfg->queue_desc_lo,
-				&cfg->queue_desc_hi);
-		io_write64_twopart(hw->vring[i].avail, &cfg->queue_avail_lo,
-				&cfg->queue_avail_hi);
-		io_write64_twopart(hw->vring[i].used, &cfg->queue_used_lo,
-				&cfg->queue_used_hi);
-		IFCVF_WRITE_REG16(hw->vring[i].size, &cfg->queue_size);
-
-		*(u32 *)(lm_cfg + IFCVF_LM_RING_STATE_OFFSET +
-				(i / 2) * IFCVF_LM_CFG_SIZE + (i % 2) * 4) =
-			(u32)hw->vring[i].last_avail_idx |
-			((u32)hw->vring[i].last_used_idx << 16);
-
-		IFCVF_WRITE_REG16(i + 1, &cfg->queue_msix_vector);
-		if (IFCVF_READ_REG16(&cfg->queue_msix_vector) ==
-				IFCVF_MSI_NO_VECTOR) {
-			DEBUGOUT("queue %u, msix vec alloc failed\n",
-					i);
-			return -1;
-		}
-
-		notify_off = IFCVF_READ_REG16(&cfg->queue_notify_off);
-		hw->notify_addr[i] = (void *)((u8 *)hw->notify_base +
-				notify_off * hw->notify_off_multiplier);
-		IFCVF_WRITE_REG16(1, &cfg->queue_enable);
-	}
-
-	return 0;
-}
-
-STATIC void
-ifcvf_hw_disable(struct ifcvf_hw *hw)
-{
-	u32 i;
-	struct ifcvf_pci_common_cfg *cfg;
-	u32 ring_state;
-
-	cfg = hw->common_cfg;
-
-	IFCVF_WRITE_REG16(IFCVF_MSI_NO_VECTOR, &cfg->msix_config);
-	for (i = 0; i < hw->nr_vring; i++) {
-		IFCVF_WRITE_REG16(i, &cfg->queue_select);
-		IFCVF_WRITE_REG16(0, &cfg->queue_enable);
-		IFCVF_WRITE_REG16(IFCVF_MSI_NO_VECTOR, &cfg->queue_msix_vector);
-		ring_state = *(u32 *)(hw->lm_cfg + IFCVF_LM_RING_STATE_OFFSET +
-				(i / 2) * IFCVF_LM_CFG_SIZE + (i % 2) * 4);
-		hw->vring[i].last_avail_idx = (u16)(ring_state >> 16);
-		hw->vring[i].last_used_idx = (u16)(ring_state >> 16);
-	}
-}
-
-int
-ifcvf_start_hw(struct ifcvf_hw *hw)
-{
-	ifcvf_reset(hw);
-	ifcvf_add_status(hw, IFCVF_CONFIG_STATUS_ACK);
-	ifcvf_add_status(hw, IFCVF_CONFIG_STATUS_DRIVER);
-
-	if (ifcvf_config_features(hw) < 0)
-		return -1;
-
-	if (ifcvf_hw_enable(hw) < 0)
-		return -1;
-
-	ifcvf_add_status(hw, IFCVF_CONFIG_STATUS_DRIVER_OK);
-	return 0;
-}
-
-void
-ifcvf_stop_hw(struct ifcvf_hw *hw)
-{
-	ifcvf_hw_disable(hw);
-	ifcvf_reset(hw);
-}
-
-void
-ifcvf_enable_logging(struct ifcvf_hw *hw, u64 log_base, u64 log_size)
-{
-	u8 *lm_cfg;
-
-	lm_cfg = hw->lm_cfg;
-
-	*(u32 *)(lm_cfg + IFCVF_LM_BASE_ADDR_LOW) =
-		log_base & IFCVF_32_BIT_MASK;
-
-	*(u32 *)(lm_cfg + IFCVF_LM_BASE_ADDR_HIGH) =
-		(log_base >> 32) & IFCVF_32_BIT_MASK;
-
-	*(u32 *)(lm_cfg + IFCVF_LM_END_ADDR_LOW) =
-		(log_base + log_size) & IFCVF_32_BIT_MASK;
-
-	*(u32 *)(lm_cfg + IFCVF_LM_END_ADDR_HIGH) =
-		((log_base + log_size) >> 32) & IFCVF_32_BIT_MASK;
-
-	*(u32 *)(lm_cfg + IFCVF_LM_LOGGING_CTRL) = IFCVF_LM_ENABLE_VF;
-}
-
-void
-ifcvf_disable_logging(struct ifcvf_hw *hw)
-{
-	u8 *lm_cfg;
-
-	lm_cfg = hw->lm_cfg;
-	*(u32 *)(lm_cfg + IFCVF_LM_LOGGING_CTRL) = IFCVF_LM_DISABLE;
-}
-
-void
-ifcvf_notify_queue(struct ifcvf_hw *hw, u16 qid)
-{
-	IFCVF_WRITE_REG16(qid, hw->notify_addr[qid]);
-}
-
-u8
-ifcvf_get_notify_region(struct ifcvf_hw *hw)
-{
-	return hw->notify_region;
-}
-
-u64
-ifcvf_get_queue_notify_off(struct ifcvf_hw *hw, int qid)
-{
-	return (u8 *)hw->notify_addr[qid] -
-		(u8 *)hw->mem_resource[hw->notify_region].addr;
-}
diff --git a/drivers/net/ifc/base/ifcvf.h b/drivers/net/ifc/base/ifcvf.h
deleted file mode 100644
index 9be2770..0000000
--- a/drivers/net/ifc/base/ifcvf.h
+++ /dev/null
@@ -1,162 +0,0 @@
-/* SPDX-License-Identifier: BSD-3-Clause
- * Copyright(c) 2018 Intel Corporation
- */
-
-#ifndef _IFCVF_H_
-#define _IFCVF_H_
-
-#include "ifcvf_osdep.h"
-
-#define IFCVF_VENDOR_ID		0x1AF4
-#define IFCVF_DEVICE_ID		0x1041
-#define IFCVF_SUBSYS_VENDOR_ID	0x8086
-#define IFCVF_SUBSYS_DEVICE_ID	0x001A
-
-#define IFCVF_MAX_QUEUES		1
-#define VIRTIO_F_IOMMU_PLATFORM		33
-
-/* Common configuration */
-#define IFCVF_PCI_CAP_COMMON_CFG	1
-/* Notifications */
-#define IFCVF_PCI_CAP_NOTIFY_CFG	2
-/* ISR Status */
-#define IFCVF_PCI_CAP_ISR_CFG		3
-/* Device specific configuration */
-#define IFCVF_PCI_CAP_DEVICE_CFG	4
-/* PCI configuration access */
-#define IFCVF_PCI_CAP_PCI_CFG		5
-
-#define IFCVF_CONFIG_STATUS_RESET     0x00
-#define IFCVF_CONFIG_STATUS_ACK       0x01
-#define IFCVF_CONFIG_STATUS_DRIVER    0x02
-#define IFCVF_CONFIG_STATUS_DRIVER_OK 0x04
-#define IFCVF_CONFIG_STATUS_FEATURES_OK 0x08
-#define IFCVF_CONFIG_STATUS_FAILED    0x80
-
-#define IFCVF_MSI_NO_VECTOR	0xffff
-#define IFCVF_PCI_MAX_RESOURCE	6
-
-#define IFCVF_LM_CFG_SIZE		0x40
-#define IFCVF_LM_RING_STATE_OFFSET	0x20
-
-#define IFCVF_LM_LOGGING_CTRL		0x0
-
-#define IFCVF_LM_BASE_ADDR_LOW		0x10
-#define IFCVF_LM_BASE_ADDR_HIGH		0x14
-#define IFCVF_LM_END_ADDR_LOW		0x18
-#define IFCVF_LM_END_ADDR_HIGH		0x1c
-
-#define IFCVF_LM_DISABLE		0x0
-#define IFCVF_LM_ENABLE_VF		0x1
-#define IFCVF_LM_ENABLE_PF		0x3
-#define IFCVF_LOG_BASE			0x100000000000
-#define IFCVF_MEDIATED_VRING		0x200000000000
-
-#define IFCVF_32_BIT_MASK		0xffffffff
-
-
-struct ifcvf_pci_cap {
-	u8 cap_vndr;            /* Generic PCI field: PCI_CAP_ID_VNDR */
-	u8 cap_next;            /* Generic PCI field: next ptr. */
-	u8 cap_len;             /* Generic PCI field: capability length */
-	u8 cfg_type;            /* Identifies the structure. */
-	u8 bar;                 /* Where to find it. */
-	u8 padding[3];          /* Pad to full dword. */
-	u32 offset;             /* Offset within bar. */
-	u32 length;             /* Length of the structure, in bytes. */
-};
-
-struct ifcvf_pci_notify_cap {
-	struct ifcvf_pci_cap cap;
-	u32 notify_off_multiplier;  /* Multiplier for queue_notify_off. */
-};
-
-struct ifcvf_pci_common_cfg {
-	/* About the whole device. */
-	u32 device_feature_select;
-	u32 device_feature;
-	u32 guest_feature_select;
-	u32 guest_feature;
-	u16 msix_config;
-	u16 num_queues;
-	u8 device_status;
-	u8 config_generation;
-
-	/* About a specific virtqueue. */
-	u16 queue_select;
-	u16 queue_size;
-	u16 queue_msix_vector;
-	u16 queue_enable;
-	u16 queue_notify_off;
-	u32 queue_desc_lo;
-	u32 queue_desc_hi;
-	u32 queue_avail_lo;
-	u32 queue_avail_hi;
-	u32 queue_used_lo;
-	u32 queue_used_hi;
-};
-
-struct ifcvf_net_config {
-	u8    mac[6];
-	u16   status;
-	u16   max_virtqueue_pairs;
-} __attribute__((packed));
-
-struct ifcvf_pci_mem_resource {
-	u64      phys_addr; /**< Physical address, 0 if not resource. */
-	u64      len;       /**< Length of the resource. */
-	u8       *addr;     /**< Virtual address, NULL when not mapped. */
-};
-
-struct vring_info {
-	u64 desc;
-	u64 avail;
-	u64 used;
-	u16 size;
-	u16 last_avail_idx;
-	u16 last_used_idx;
-};
-
-struct ifcvf_hw {
-	u64    req_features;
-	u8     notify_region;
-	u32    notify_off_multiplier;
-	struct ifcvf_pci_common_cfg *common_cfg;
-	struct ifcvf_net_config *dev_cfg;
-	u8     *isr;
-	u16    *notify_base;
-	u16    *notify_addr[IFCVF_MAX_QUEUES * 2];
-	u8     *lm_cfg;
-	struct vring_info vring[IFCVF_MAX_QUEUES * 2];
-	u8 nr_vring;
-	struct ifcvf_pci_mem_resource mem_resource[IFCVF_PCI_MAX_RESOURCE];
-};
-
-int
-ifcvf_init_hw(struct ifcvf_hw *hw, PCI_DEV *dev);
-
-u64
-ifcvf_get_features(struct ifcvf_hw *hw);
-
-int
-ifcvf_start_hw(struct ifcvf_hw *hw);
-
-void
-ifcvf_stop_hw(struct ifcvf_hw *hw);
-
-void
-ifcvf_enable_logging(struct ifcvf_hw *hw, u64 log_base, u64 log_size);
-
-void
-ifcvf_disable_logging(struct ifcvf_hw *hw);
-
-void
-ifcvf_notify_queue(struct ifcvf_hw *hw, u16 qid);
-
-u8
-ifcvf_get_notify_region(struct ifcvf_hw *hw);
-
-u64
-ifcvf_get_queue_notify_off(struct ifcvf_hw *hw, int qid);
-
-#endif /* _IFCVF_H_ */
diff --git a/drivers/net/ifc/base/ifcvf_osdep.h b/drivers/net/ifc/base/ifcvf_osdep.h
deleted file mode 100644
index 6aef25e..0000000
--- a/drivers/net/ifc/base/ifcvf_osdep.h
+++ /dev/null
@@ -1,52 +0,0 @@
-/* SPDX-License-Identifier: BSD-3-Clause
- * Copyright(c) 2018 Intel Corporation
- */
-
-#ifndef _IFCVF_OSDEP_H_
-#define _IFCVF_OSDEP_H_
-
-#include <stdint.h>
-#include <linux/pci_regs.h>
-
-#include <rte_cycles.h>
-#include <rte_pci.h>
-#include <rte_bus_pci.h>
-#include <rte_log.h>
-#include <rte_io.h>
-
-#define DEBUGOUT(S, args...)    RTE_LOG(DEBUG, PMD, S, ##args)
-#define STATIC                  static
-
-#define msec_delay(x)	rte_delay_us_sleep(1000 * (x))
-
-#define IFCVF_READ_REG8(reg)		rte_read8(reg)
-#define IFCVF_WRITE_REG8(val, reg)	rte_write8((val), (reg))
-#define IFCVF_READ_REG16(reg)		rte_read16(reg)
-#define IFCVF_WRITE_REG16(val, reg)	rte_write16((val), (reg))
-#define IFCVF_READ_REG32(reg)		rte_read32(reg)
-#define IFCVF_WRITE_REG32(val, reg)	rte_write32((val), (reg))
-
-typedef struct rte_pci_device PCI_DEV;
-
-#define PCI_READ_CONFIG_BYTE(dev, val, where) \
-	rte_pci_read_config(dev, val, 1, where)
-
-#define PCI_READ_CONFIG_DWORD(dev, val, where) \
-	rte_pci_read_config(dev, val, 4, where)
-
-typedef uint8_t    u8;
-typedef int8_t     s8;
-typedef uint16_t   u16;
-typedef int16_t    s16;
-typedef uint32_t   u32;
-typedef int32_t    s32;
-typedef int64_t    s64;
-typedef uint64_t   u64;
-
-static inline int
-PCI_READ_CONFIG_RANGE(PCI_DEV *dev, uint32_t *val, int size, int where)
-{
-	return rte_pci_read_config(dev, val, size, where);
-}
-
-#endif /* _IFCVF_OSDEP_H_ */
diff --git a/drivers/net/ifc/ifcvf_vdpa.c b/drivers/net/ifc/ifcvf_vdpa.c
deleted file mode 100644
index da4667b..0000000
--- a/drivers/net/ifc/ifcvf_vdpa.c
+++ /dev/null
@@ -1,1280 +0,0 @@
-/* SPDX-License-Identifier: BSD-3-Clause
- * Copyright(c) 2018 Intel Corporation
- */
-
-#include <unistd.h>
-#include <pthread.h>
-#include <fcntl.h>
-#include <string.h>
-#include <sys/ioctl.h>
-#include <sys/epoll.h>
-#include <linux/virtio_net.h>
-#include <stdbool.h>
-
-#include <rte_malloc.h>
-#include <rte_memory.h>
-#include <rte_bus_pci.h>
-#include <rte_vhost.h>
-#include <rte_vdpa.h>
-#include <rte_vfio.h>
-#include <rte_spinlock.h>
-#include <rte_log.h>
-#include <rte_kvargs.h>
-#include <rte_devargs.h>
-
-#include "base/ifcvf.h"
-
-#define DRV_LOG(level, fmt, args...) \
-	rte_log(RTE_LOG_ ## level, ifcvf_vdpa_logtype, \
-		"IFCVF %s(): " fmt "\n", __func__, ##args)
-
-#ifndef PAGE_SIZE
-#define PAGE_SIZE 4096
-#endif
-
-#define IFCVF_USED_RING_LEN(size) \
-	((size) * sizeof(struct vring_used_elem) + sizeof(uint16_t) * 3)
-
-#define IFCVF_VDPA_MODE		"vdpa"
-#define IFCVF_SW_FALLBACK_LM	"sw-live-migration"
-
-static const char * const ifcvf_valid_arguments[] = {
-	IFCVF_VDPA_MODE,
-	IFCVF_SW_FALLBACK_LM,
-	NULL
-};
-
-static int ifcvf_vdpa_logtype;
-
-struct ifcvf_internal {
-	struct rte_vdpa_dev_addr dev_addr;
-	struct rte_pci_device *pdev;
-	struct ifcvf_hw hw;
-	int vfio_container_fd;
-	int vfio_group_fd;
-	int vfio_dev_fd;
-	pthread_t tid;	/* thread for notify relay */
-	int epfd;
-	int vid;
-	int did;
-	uint16_t max_queues;
-	uint64_t features;
-	rte_atomic32_t started;
-	rte_atomic32_t dev_attached;
-	rte_atomic32_t running;
-	rte_spinlock_t lock;
-	bool sw_lm;
-	bool sw_fallback_running;
-	/* mediated vring for sw fallback */
-	struct vring m_vring[IFCVF_MAX_QUEUES * 2];
-	/* eventfd for used ring interrupt */
-	int intr_fd[IFCVF_MAX_QUEUES * 2];
-};
-
-struct internal_list {
-	TAILQ_ENTRY(internal_list) next;
-	struct ifcvf_internal *internal;
-};
-
-TAILQ_HEAD(internal_list_head, internal_list);
-static struct internal_list_head internal_list =
-	TAILQ_HEAD_INITIALIZER(internal_list);
-
-static pthread_mutex_t internal_list_lock = PTHREAD_MUTEX_INITIALIZER;
-
-static void update_used_ring(struct ifcvf_internal *internal, uint16_t qid);
-
-static struct internal_list *
-find_internal_resource_by_did(int did)
-{
-	int found = 0;
-	struct internal_list *list;
-
-	pthread_mutex_lock(&internal_list_lock);
-
-	TAILQ_FOREACH(list, &internal_list, next) {
-		if (did == list->internal->did) {
-			found = 1;
-			break;
-		}
-	}
-
-	pthread_mutex_unlock(&internal_list_lock);
-
-	if (!found)
-		return NULL;
-
-	return list;
-}
-
-static struct internal_list *
-find_internal_resource_by_dev(struct rte_pci_device *pdev)
-{
-	int found = 0;
-	struct internal_list *list;
-
-	pthread_mutex_lock(&internal_list_lock);
-
-	TAILQ_FOREACH(list, &internal_list, next) {
-		if (pdev == list->internal->pdev) {
-			found = 1;
-			break;
-		}
-	}
-
-	pthread_mutex_unlock(&internal_list_lock);
-
-	if (!found)
-		return NULL;
-
-	return list;
-}
-
-static int
-ifcvf_vfio_setup(struct ifcvf_internal *internal)
-{
-	struct rte_pci_device *dev = internal->pdev;
-	char devname[RTE_DEV_NAME_MAX_LEN] = {0};
-	int iommu_group_num;
-	int i, ret;
-
-	internal->vfio_dev_fd = -1;
-	internal->vfio_group_fd = -1;
-	internal->vfio_container_fd = -1;
-
-	rte_pci_device_name(&dev->addr, devname, RTE_DEV_NAME_MAX_LEN);
-	ret = rte_vfio_get_group_num(rte_pci_get_sysfs_path(), devname,
-			&iommu_group_num);
-	if (ret <= 0) {
-		DRV_LOG(ERR, "%s failed to get IOMMU group", devname);
-		return -1;
-	}
-
-	internal->vfio_container_fd = rte_vfio_container_create();
-	if (internal->vfio_container_fd < 0)
-		return -1;
-
-	internal->vfio_group_fd = rte_vfio_container_group_bind(
-			internal->vfio_container_fd, iommu_group_num);
-	if (internal->vfio_group_fd < 0)
-		goto err;
-
-	if (rte_pci_map_device(dev))
-		goto err;
-
-	internal->vfio_dev_fd = dev->intr_handle.vfio_dev_fd;
-
-	for (i = 0; i < RTE_MIN(PCI_MAX_RESOURCE, IFCVF_PCI_MAX_RESOURCE);
-			i++) {
-		internal->hw.mem_resource[i].addr =
-			internal->pdev->mem_resource[i].addr;
-		internal->hw.mem_resource[i].phys_addr =
-			internal->pdev->mem_resource[i].phys_addr;
-		internal->hw.mem_resource[i].len =
-			internal->pdev->mem_resource[i].len;
-	}
-
-	return 0;
-
-err:
-	rte_vfio_container_destroy(internal->vfio_container_fd);
-	return -1;
-}
-
-static int
-ifcvf_dma_map(struct ifcvf_internal *internal, int do_map)
-{
-	uint32_t i;
-	int ret;
-	struct rte_vhost_memory *mem = NULL;
-	int vfio_container_fd;
-
-	ret = rte_vhost_get_mem_table(internal->vid, &mem);
-	if (ret < 0) {
-		DRV_LOG(ERR, "failed to get VM memory layout.");
-		goto exit;
-	}
-
-	vfio_container_fd = internal->vfio_container_fd;
-
-	for (i = 0; i < mem->nregions; i++) {
-		struct rte_vhost_mem_region *reg;
-
-		reg = &mem->regions[i];
-		DRV_LOG(INFO, "%s, region %u: HVA 0x%" PRIx64 ", "
-			"GPA 0x%" PRIx64 ", size 0x%" PRIx64 ".",
-			do_map ? "DMA map" : "DMA unmap", i,
-			reg->host_user_addr, reg->guest_phys_addr, reg->size);
-
-		if (do_map) {
-			ret = rte_vfio_container_dma_map(vfio_container_fd,
-				reg->host_user_addr, reg->guest_phys_addr,
-				reg->size);
-			if (ret < 0) {
-				DRV_LOG(ERR, "DMA map failed.");
-				goto exit;
-			}
-		} else {
-			ret = rte_vfio_container_dma_unmap(vfio_container_fd,
-				reg->host_user_addr, reg->guest_phys_addr,
-				reg->size);
-			if (ret < 0) {
-				DRV_LOG(ERR, "DMA unmap failed.");
-				goto exit;
-			}
-		}
-	}
-
-exit:
-	if (mem)
-		free(mem);
-	return ret;
-}
-
-static uint64_t
-hva_to_gpa(int vid, uint64_t hva)
-{
-	struct rte_vhost_memory *mem = NULL;
-	struct rte_vhost_mem_region *reg;
-	uint32_t i;
-	uint64_t gpa = 0;
-
-	if (rte_vhost_get_mem_table(vid, &mem) < 0)
-		goto exit;
-
-	for (i = 0; i < mem->nregions; i++) {
-		reg = &mem->regions[i];
-
-		if (hva >= reg->host_user_addr &&
-				hva < reg->host_user_addr + reg->size) {
-			gpa = hva - reg->host_user_addr + reg->guest_phys_addr;
-			break;
-		}
-	}
-
-exit:
-	if (mem)
-		free(mem);
-	return gpa;
-}
-
-static int
-vdpa_ifcvf_start(struct ifcvf_internal *internal)
-{
-	struct ifcvf_hw *hw = &internal->hw;
-	int i, nr_vring;
-	int vid;
-	struct rte_vhost_vring vq;
-	uint64_t gpa;
-
-	vid = internal->vid;
-	nr_vring = rte_vhost_get_vring_num(vid);
-	rte_vhost_get_negotiated_features(vid, &hw->req_features);
-
-	for (i = 0; i < nr_vring; i++) {
-		rte_vhost_get_vhost_vring(vid, i, &vq);
-		gpa = hva_to_gpa(vid, (uint64_t)(uintptr_t)vq.desc);
-		if (gpa == 0) {
-			DRV_LOG(ERR, "Fail to get GPA for descriptor ring.");
-			return -1;
-		}
-		hw->vring[i].desc = gpa;
-
-		gpa = hva_to_gpa(vid, (uint64_t)(uintptr_t)vq.avail);
-		if (gpa == 0) {
-			DRV_LOG(ERR, "Fail to get GPA for available ring.");
-			return -1;
-		}
-		hw->vring[i].avail = gpa;
-
-		gpa = hva_to_gpa(vid, (uint64_t)(uintptr_t)vq.used);
-		if (gpa == 0) {
-			DRV_LOG(ERR, "Fail to get GPA for used ring.");
-			return -1;
-		}
-		hw->vring[i].used = gpa;
-
-		hw->vring[i].size = vq.size;
-		rte_vhost_get_vring_base(vid, i, &hw->vring[i].last_avail_idx,
-				&hw->vring[i].last_used_idx);
-	}
-	hw->nr_vring = i;
-
-	return ifcvf_start_hw(&internal->hw);
-}
-
-static void
-vdpa_ifcvf_stop(struct ifcvf_internal *internal)
-{
-	struct ifcvf_hw *hw = &internal->hw;
-	uint32_t i;
-	int vid;
-	uint64_t features = 0;
-	uint64_t log_base = 0, log_size = 0;
-	uint64_t len;
-
-	vid = internal->vid;
-	ifcvf_stop_hw(hw);
-
-	for (i = 0; i < hw->nr_vring; i++)
-		rte_vhost_set_vring_base(vid, i, hw->vring[i].last_avail_idx,
-				hw->vring[i].last_used_idx);
-
-	if (internal->sw_lm)
-		return;
-
-	rte_vhost_get_negotiated_features(vid, &features);
-	if (RTE_VHOST_NEED_LOG(features)) {
-		ifcvf_disable_logging(hw);
-		rte_vhost_get_log_base(internal->vid, &log_base, &log_size);
-		rte_vfio_container_dma_unmap(internal->vfio_container_fd,
-				log_base, IFCVF_LOG_BASE, log_size);
-		/*
-		 * IFCVF marks dirty memory pages for only packet buffer,
-		 * SW helps to mark the used ring as dirty after device stops.
-		 */
-		for (i = 0; i < hw->nr_vring; i++) {
-			len = IFCVF_USED_RING_LEN(hw->vring[i].size);
-			rte_vhost_log_used_vring(vid, i, 0, len);
-		}
-	}
-}
-
-#define MSIX_IRQ_SET_BUF_LEN (sizeof(struct vfio_irq_set) + \
-		sizeof(int) * (IFCVF_MAX_QUEUES * 2 + 1))
-static int
-vdpa_enable_vfio_intr(struct ifcvf_internal *internal, bool m_rx)
-{
-	int ret;
-	uint32_t i, nr_vring;
-	char irq_set_buf[MSIX_IRQ_SET_BUF_LEN];
-	struct vfio_irq_set *irq_set;
-	int *fd_ptr;
-	struct rte_vhost_vring vring;
-	int fd;
-
-	vring.callfd = -1;
-
-	nr_vring = rte_vhost_get_vring_num(internal->vid);
-
-	irq_set = (struct vfio_irq_set *)irq_set_buf;
-	irq_set->argsz = sizeof(irq_set_buf);
-	irq_set->count = nr_vring + 1;
-	irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD |
-			 VFIO_IRQ_SET_ACTION_TRIGGER;
-	irq_set->index = VFIO_PCI_MSIX_IRQ_INDEX;
-	irq_set->start = 0;
-	fd_ptr = (int *)&irq_set->data;
-	fd_ptr[RTE_INTR_VEC_ZERO_OFFSET] = internal->pdev->intr_handle.fd;
-
-	for (i = 0; i < nr_vring; i++)
-		internal->intr_fd[i] = -1;
-
-	for (i = 0; i < nr_vring; i++) {
-		rte_vhost_get_vhost_vring(internal->vid, i, &vring);
-		fd_ptr[RTE_INTR_VEC_RXTX_OFFSET + i] = vring.callfd;
-		if ((i & 1) == 0 && m_rx == true) {
-			fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
-			if (fd < 0) {
-				DRV_LOG(ERR, "can't setup eventfd: %s",
-					strerror(errno));
-				return -1;
-			}
-			internal->intr_fd[i] = fd;
-			fd_ptr[RTE_INTR_VEC_RXTX_OFFSET + i] = fd;
-		}
-	}
-
-	ret = ioctl(internal->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
-	if (ret) {
-		DRV_LOG(ERR, "Error enabling MSI-X interrupts: %s",
-				strerror(errno));
-		return -1;
-	}
-
-	return 0;
-}
-
-static int
-vdpa_disable_vfio_intr(struct ifcvf_internal *internal)
-{
-	int ret;
-	uint32_t i, nr_vring;
-	char irq_set_buf[MSIX_IRQ_SET_BUF_LEN];
-	struct vfio_irq_set *irq_set;
-
-	irq_set = (struct vfio_irq_set *)irq_set_buf;
-	irq_set->argsz = sizeof(irq_set_buf);
-	irq_set->count = 0;
-	irq_set->flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_TRIGGER;
-	irq_set->index = VFIO_PCI_MSIX_IRQ_INDEX;
-	irq_set->start = 0;
-
-	nr_vring = rte_vhost_get_vring_num(internal->vid);
-	for (i = 0; i < nr_vring; i++) {
-		if (internal->intr_fd[i] >= 0)
-			close(internal->intr_fd[i]);
-		internal->intr_fd[i] = -1;
-	}
-
-	ret = ioctl(internal->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
-	if (ret) {
-		DRV_LOG(ERR, "Error disabling MSI-X interrupts: %s",
-				strerror(errno));
-		return -1;
-	}
-
-	return 0;
-}
-
-static void *
-notify_relay(void *arg)
-{
-	int i, kickfd, epfd, nfds = 0;
-	uint32_t qid, q_num;
-	struct epoll_event events[IFCVF_MAX_QUEUES * 2];
-	struct epoll_event ev;
-	uint64_t buf;
-	int nbytes;
-	struct rte_vhost_vring vring;
-	struct ifcvf_internal *internal = (struct ifcvf_internal *)arg;
-	struct ifcvf_hw *hw = &internal->hw;
-
-	q_num = rte_vhost_get_vring_num(internal->vid);
-
-	epfd = epoll_create(IFCVF_MAX_QUEUES * 2);
-	if (epfd < 0) {
-		DRV_LOG(ERR, "failed to create epoll instance.");
-		return NULL;
-	}
-	internal->epfd = epfd;
-
-	vring.kickfd = -1;
-	for (qid = 0; qid < q_num; qid++) {
-		ev.events = EPOLLIN | EPOLLPRI;
-		rte_vhost_get_vhost_vring(internal->vid, qid, &vring);
-		ev.data.u64 = qid | (uint64_t)vring.kickfd << 32;
-		if (epoll_ctl(epfd, EPOLL_CTL_ADD, vring.kickfd, &ev) < 0) {
-			DRV_LOG(ERR, "epoll add error: %s", strerror(errno));
-			return NULL;
-		}
-	}
-
-	for (;;) {
-		nfds = epoll_wait(epfd, events, q_num, -1);
-		if (nfds < 0) {
-			if (errno == EINTR)
-				continue;
-			DRV_LOG(ERR, "epoll_wait return fail\n");
-			return NULL;
-		}
-
-		for (i = 0; i < nfds; i++) {
-			qid = events[i].data.u32;
-			kickfd = (uint32_t)(events[i].data.u64 >> 32);
-			do {
-				nbytes = read(kickfd, &buf, 8);
-				if (nbytes < 0) {
-					if (errno == EINTR ||
-					    errno == EWOULDBLOCK ||
-					    errno == EAGAIN)
-						continue;
-					DRV_LOG(INFO, "Error reading "
-						"kickfd: %s",
-						strerror(errno));
-				}
-				break;
-			} while (1);
-
-			ifcvf_notify_queue(hw, qid);
-		}
-	}
-
-	return NULL;
-}
-
-static int
-setup_notify_relay(struct ifcvf_internal *internal)
-{
-	int ret;
-
-	ret = pthread_create(&internal->tid, NULL, notify_relay,
-			(void *)internal);
-	if (ret) {
-		DRV_LOG(ERR, "failed to create notify relay pthread.");
-		return -1;
-	}
-	return 0;
-}
-
-static int
-unset_notify_relay(struct ifcvf_internal *internal)
-{
-	void *status;
-
-	if (internal->tid) {
-		pthread_cancel(internal->tid);
-		pthread_join(internal->tid, &status);
-	}
-	internal->tid = 0;
-
-	if (internal->epfd >= 0)
-		close(internal->epfd);
-	internal->epfd = -1;
-
-	return 0;
-}
-
-static int
-update_datapath(struct ifcvf_internal *internal)
-{
-	int ret;
-
-	rte_spinlock_lock(&internal->lock);
-
-	if (!rte_atomic32_read(&internal->running) &&
-	    (rte_atomic32_read(&internal->started) &&
-	     rte_atomic32_read(&internal->dev_attached))) {
-		ret = ifcvf_dma_map(internal, 1);
-		if (ret)
-			goto err;
-
-		ret = vdpa_enable_vfio_intr(internal, 0);
-		if (ret)
-			goto err;
-
-		ret = vdpa_ifcvf_start(internal);
-		if (ret)
-			goto err;
-
-		ret = setup_notify_relay(internal);
-		if (ret)
-			goto err;
-
-		rte_atomic32_set(&internal->running, 1);
-	} else if (rte_atomic32_read(&internal->running) &&
-		   (!rte_atomic32_read(&internal->started) ||
-		    !rte_atomic32_read(&internal->dev_attached))) {
-		ret = unset_notify_relay(internal);
-		if (ret)
-			goto err;
-
-		vdpa_ifcvf_stop(internal);
-
-		ret = vdpa_disable_vfio_intr(internal);
-		if (ret)
-			goto err;
-
-		ret = ifcvf_dma_map(internal, 0);
-		if (ret)
-			goto err;
-
-		rte_atomic32_set(&internal->running, 0);
-	}
-
-	rte_spinlock_unlock(&internal->lock);
-	return 0;
-err:
-	rte_spinlock_unlock(&internal->lock);
-	return ret;
-}
-
-static int
-m_ifcvf_start(struct ifcvf_internal *internal)
-{
-	struct ifcvf_hw *hw = &internal->hw;
-	uint32_t i, nr_vring;
-	int vid, ret;
-	struct rte_vhost_vring vq;
-	void *vring_buf;
-	uint64_t m_vring_iova = IFCVF_MEDIATED_VRING;
-	uint64_t size;
-	uint64_t gpa;
-
-	memset(&vq, 0, sizeof(vq));
-	vid = internal->vid;
-	nr_vring = rte_vhost_get_vring_num(vid);
-	rte_vhost_get_negotiated_features(vid, &hw->req_features);
-
-	for (i = 0; i < nr_vring; i++) {
-		rte_vhost_get_vhost_vring(vid, i, &vq);
-
-		size = RTE_ALIGN_CEIL(vring_size(vq.size, PAGE_SIZE),
-				PAGE_SIZE);
-		vring_buf = rte_zmalloc("ifcvf", size, PAGE_SIZE);
-		vring_init(&internal->m_vring[i], vq.size, vring_buf,
-				PAGE_SIZE);
-
-		ret = rte_vfio_container_dma_map(internal->vfio_container_fd,
-			(uint64_t)(uintptr_t)vring_buf, m_vring_iova, size);
-		if (ret < 0) {
-			DRV_LOG(ERR, "mediated vring DMA map failed.");
-			goto error;
-		}
-
-		gpa = hva_to_gpa(vid, (uint64_t)(uintptr_t)vq.desc);
-		if (gpa == 0) {
-			DRV_LOG(ERR, "Fail to get GPA for descriptor ring.");
-			return -1;
-		}
-		hw->vring[i].desc = gpa;
-
-		gpa = hva_to_gpa(vid, (uint64_t)(uintptr_t)vq.avail);
-		if (gpa == 0) {
-			DRV_LOG(ERR, "Fail to get GPA for available ring.");
-			return -1;
-		}
-		hw->vring[i].avail = gpa;
-
-		/* Direct I/O for Tx queue, relay for Rx queue */
-		if (i & 1) {
-			gpa = hva_to_gpa(vid, (uint64_t)(uintptr_t)vq.used);
-			if (gpa == 0) {
-				DRV_LOG(ERR, "Fail to get GPA for used ring.");
-				return -1;
-			}
-			hw->vring[i].used = gpa;
-		} else {
-			hw->vring[i].used = m_vring_iova +
-				(char *)internal->m_vring[i].used -
-				(char *)internal->m_vring[i].desc;
-		}
-
-		hw->vring[i].size = vq.size;
-
-		rte_vhost_get_vring_base(vid, i,
-				&internal->m_vring[i].avail->idx,
-				&internal->m_vring[i].used->idx);
-
-		rte_vhost_get_vring_base(vid, i, &hw->vring[i].last_avail_idx,
-				&hw->vring[i].last_used_idx);
-
-		m_vring_iova += size;
-	}
-	hw->nr_vring = nr_vring;
-
-	return ifcvf_start_hw(&internal->hw);
-
-error:
-	for (i = 0; i < nr_vring; i++)
-		if (internal->m_vring[i].desc)
-			rte_free(internal->m_vring[i].desc);
-
-	return -1;
-}
-
-static int
-m_ifcvf_stop(struct ifcvf_internal *internal)
-{
-	int vid;
-	uint32_t i;
-	struct rte_vhost_vring vq;
-	struct ifcvf_hw *hw = &internal->hw;
-	uint64_t m_vring_iova = IFCVF_MEDIATED_VRING;
-	uint64_t size, len;
-
-	vid = internal->vid;
-	ifcvf_stop_hw(hw);
-
-	for (i = 0; i < hw->nr_vring; i++) {
-		/* synchronize remaining new used entries if any */
-		if ((i & 1) == 0)
-			update_used_ring(internal, i);
-
-		rte_vhost_get_vhost_vring(vid, i, &vq);
-		len = IFCVF_USED_RING_LEN(vq.size);
-		rte_vhost_log_used_vring(vid, i, 0, len);
-
-		size = RTE_ALIGN_CEIL(vring_size(vq.size, PAGE_SIZE),
-				PAGE_SIZE);
-		rte_vfio_container_dma_unmap(internal->vfio_container_fd,
-			(uint64_t)(uintptr_t)internal->m_vring[i].desc,
-			m_vring_iova, size);
-
-		rte_vhost_set_vring_base(vid, i, hw->vring[i].last_avail_idx,
-				hw->vring[i].last_used_idx);
-		rte_free(internal->m_vring[i].desc);
-		m_vring_iova += size;
-	}
-
-	return 0;
-}
-
-static void
-update_used_ring(struct ifcvf_internal *internal, uint16_t qid)
-{
-	rte_vdpa_relay_vring_used(internal->vid, qid, &internal->m_vring[qid]);
-	rte_vhost_vring_call(internal->vid, qid);
-}
-
-static void *
-vring_relay(void *arg)
-{
-	int i, vid, epfd, fd, nfds;
-	struct ifcvf_internal *internal = (struct ifcvf_internal *)arg;
-	struct rte_vhost_vring vring;
-	uint16_t qid, q_num;
-	struct epoll_event events[IFCVF_MAX_QUEUES * 4];
-	struct epoll_event ev;
-	int nbytes;
-	uint64_t buf;
-
-	vid = internal->vid;
-	q_num = rte_vhost_get_vring_num(vid);
-
-	/* add notify fd and interrupt fd to epoll */
-	epfd = epoll_create(IFCVF_MAX_QUEUES * 2);
-	if (epfd < 0) {
-		DRV_LOG(ERR, "failed to create epoll instance.");
-		return NULL;
-	}
-	internal->epfd = epfd;
-
-	vring.kickfd = -1;
-	for (qid = 0; qid < q_num; qid++) {
-		ev.events = EPOLLIN | EPOLLPRI;
-		rte_vhost_get_vhost_vring(vid, qid, &vring);
-		ev.data.u64 = qid << 1 | (uint64_t)vring.kickfd << 32;
-		if (epoll_ctl(epfd, EPOLL_CTL_ADD, vring.kickfd, &ev) < 0) {
-			DRV_LOG(ERR, "epoll add error: %s", strerror(errno));
-			return NULL;
-		}
-	}
-
-	for (qid = 0; qid < q_num; qid += 2) {
-		ev.events = EPOLLIN | EPOLLPRI;
-		/* leave a flag to mark it's for interrupt */
-		ev.data.u64 = 1 | qid << 1 |
-			(uint64_t)internal->intr_fd[qid] << 32;
-		if (epoll_ctl(epfd, EPOLL_CTL_ADD, internal->intr_fd[qid], &ev)
-				< 0) {
-			DRV_LOG(ERR, "epoll add error: %s", strerror(errno));
-			return NULL;
-		}
-		update_used_ring(internal, qid);
-	}
-
-	/* start relay with a first kick */
-	for (qid = 0; qid < q_num; qid++)
-		ifcvf_notify_queue(&internal->hw, qid);
-
-	/* listen to the events and react accordingly */
-	for (;;) {
-		nfds = epoll_wait(epfd, events, q_num * 2, -1);
-		if (nfds < 0) {
-			if (errno == EINTR)
-				continue;
-			DRV_LOG(ERR, "epoll_wait return fail\n");
-			return NULL;
-		}
-
-		for (i = 0; i < nfds; i++) {
-			fd = (uint32_t)(events[i].data.u64 >> 32);
-			do {
-				nbytes = read(fd, &buf, 8);
-				if (nbytes < 0) {
-					if (errno == EINTR ||
-					    errno == EWOULDBLOCK ||
-					    errno == EAGAIN)
-						continue;
-					DRV_LOG(INFO, "Error reading "
-						"kickfd: %s",
-						strerror(errno));
-				}
-				break;
-			} while (1);
-
-			qid = events[i].data.u32 >> 1;
-
-			if (events[i].data.u32 & 1)
-				update_used_ring(internal, qid);
-			else
-				ifcvf_notify_queue(&internal->hw, qid);
-		}
-	}
-
-	return NULL;
-}
-
-static int
-setup_vring_relay(struct ifcvf_internal *internal)
-{
-	int ret;
-
-	ret = pthread_create(&internal->tid, NULL, vring_relay,
-			(void *)internal);
-	if (ret) {
-		DRV_LOG(ERR, "failed to create ring relay pthread.");
-		return -1;
-	}
-	return 0;
-}
-
-static int
-unset_vring_relay(struct ifcvf_internal *internal)
-{
-	void *status;
-
-	if (internal->tid) {
-		pthread_cancel(internal->tid);
-		pthread_join(internal->tid, &status);
-	}
-	internal->tid = 0;
-
-	if (internal->epfd >= 0)
-		close(internal->epfd);
-	internal->epfd = -1;
-
-	return 0;
-}
-
-static int
-ifcvf_sw_fallback_switchover(struct ifcvf_internal *internal)
-{
-	int ret;
-	int vid = internal->vid;
-
-	/* stop the direct IO data path */
-	unset_notify_relay(internal);
-	vdpa_ifcvf_stop(internal);
-	vdpa_disable_vfio_intr(internal);
-
-	ret = rte_vhost_host_notifier_ctrl(vid, false);
-	if (ret && ret != -ENOTSUP)
-		goto error;
-
-	/* set up interrupt for interrupt relay */
-	ret = vdpa_enable_vfio_intr(internal, 1);
-	if (ret)
-		goto unmap;
-
-	/* config the VF */
-	ret = m_ifcvf_start(internal);
-	if (ret)
-		goto unset_intr;
-
-	/* set up vring relay thread */
-	ret = setup_vring_relay(internal);
-	if (ret)
-		goto stop_vf;
-
-	rte_vhost_host_notifier_ctrl(vid, true);
-
-	internal->sw_fallback_running = true;
-
-	return 0;
-
-stop_vf:
-	m_ifcvf_stop(internal);
-unset_intr:
-	vdpa_disable_vfio_intr(internal);
-unmap:
-	ifcvf_dma_map(internal, 0);
-error:
-	return -1;
-}
-
-static int
-ifcvf_dev_config(int vid)
-{
-	int did;
-	struct internal_list *list;
-	struct ifcvf_internal *internal;
-
-	did = rte_vhost_get_vdpa_device_id(vid);
-	list = find_internal_resource_by_did(did);
-	if (list == NULL) {
-		DRV_LOG(ERR, "Invalid device id: %d", did);
-		return -1;
-	}
-
-	internal = list->internal;
-	internal->vid = vid;
-	rte_atomic32_set(&internal->dev_attached, 1);
-	update_datapath(internal);
-
-	if (rte_vhost_host_notifier_ctrl(vid, true) != 0)
-		DRV_LOG(NOTICE, "vDPA (%d): software relay is used.", did);
-
-	return 0;
-}
-
-static int
-ifcvf_dev_close(int vid)
-{
-	int did;
-	struct internal_list *list;
-	struct ifcvf_internal *internal;
-
-	did = rte_vhost_get_vdpa_device_id(vid);
-	list = find_internal_resource_by_did(did);
-	if (list == NULL) {
-		DRV_LOG(ERR, "Invalid device id: %d", did);
-		return -1;
-	}
-
-	internal = list->internal;
-
-	if (internal->sw_fallback_running) {
-		/* unset ring relay */
-		unset_vring_relay(internal);
-
-		/* reset VF */
-		m_ifcvf_stop(internal);
-
-		/* remove interrupt setting */
-		vdpa_disable_vfio_intr(internal);
-
-		/* unset DMA map for guest memory */
-		ifcvf_dma_map(internal, 0);
-
-		internal->sw_fallback_running = false;
-	} else {
-		rte_atomic32_set(&internal->dev_attached, 0);
-		update_datapath(internal);
-	}
-
-	return 0;
-}
-
-static int
-ifcvf_set_features(int vid)
-{
-	uint64_t features = 0;
-	int did;
-	struct internal_list *list;
-	struct ifcvf_internal *internal;
-	uint64_t log_base = 0, log_size = 0;
-
-	did = rte_vhost_get_vdpa_device_id(vid);
-	list = find_internal_resource_by_did(did);
-	if (list == NULL) {
-		DRV_LOG(ERR, "Invalid device id: %d", did);
-		return -1;
-	}
-
-	internal = list->internal;
-	rte_vhost_get_negotiated_features(vid, &features);
-
-	if (!RTE_VHOST_NEED_LOG(features))
-		return 0;
-
-	if (internal->sw_lm) {
-		ifcvf_sw_fallback_switchover(internal);
-	} else {
-		rte_vhost_get_log_base(vid, &log_base, &log_size);
-		rte_vfio_container_dma_map(internal->vfio_container_fd,
-				log_base, IFCVF_LOG_BASE, log_size);
-		ifcvf_enable_logging(&internal->hw, IFCVF_LOG_BASE, log_size);
-	}
-
-	return 0;
-}
-
-static int
-ifcvf_get_vfio_group_fd(int vid)
-{
-	int did;
-	struct internal_list *list;
-
-	did = rte_vhost_get_vdpa_device_id(vid);
-	list = find_internal_resource_by_did(did);
-	if (list == NULL) {
-		DRV_LOG(ERR, "Invalid device id: %d", did);
-		return -1;
-	}
-
-	return list->internal->vfio_group_fd;
-}
-
-static int
-ifcvf_get_vfio_device_fd(int vid)
-{
-	int did;
-	struct internal_list *list;
-
-	did = rte_vhost_get_vdpa_device_id(vid);
-	list = find_internal_resource_by_did(did);
-	if (list == NULL) {
-		DRV_LOG(ERR, "Invalid device id: %d", did);
-		return -1;
-	}
-
-	return list->internal->vfio_dev_fd;
-}
-
-static int
-ifcvf_get_notify_area(int vid, int qid, uint64_t *offset, uint64_t *size)
-{
-	int did;
-	struct internal_list *list;
-	struct ifcvf_internal *internal;
-	struct vfio_region_info reg = { .argsz = sizeof(reg) };
-	int ret;
-
-	did = rte_vhost_get_vdpa_device_id(vid);
-	list = find_internal_resource_by_did(did);
-	if (list == NULL) {
-		DRV_LOG(ERR, "Invalid device id: %d", did);
-		return -1;
-	}
-
-	internal = list->internal;
-
-	reg.index = ifcvf_get_notify_region(&internal->hw);
-	ret = ioctl(internal->vfio_dev_fd, VFIO_DEVICE_GET_REGION_INFO, &reg);
-	if (ret) {
-		DRV_LOG(ERR, "Get not get device region info: %s",
-				strerror(errno));
-		return -1;
-	}
-
-	*offset = ifcvf_get_queue_notify_off(&internal->hw, qid) + reg.offset;
-	*size = 0x1000;
-
-	return 0;
-}
-
-static int
-ifcvf_get_queue_num(int did, uint32_t *queue_num)
-{
-	struct internal_list *list;
-
-	list = find_internal_resource_by_did(did);
-	if (list == NULL) {
-		DRV_LOG(ERR, "Invalid device id: %d", did);
-		return -1;
-	}
-
-	*queue_num = list->internal->max_queues;
-
-	return 0;
-}
-
-static int
-ifcvf_get_vdpa_features(int did, uint64_t *features)
-{
-	struct internal_list *list;
-
-	list = find_internal_resource_by_did(did);
-	if (list == NULL) {
-		DRV_LOG(ERR, "Invalid device id: %d", did);
-		return -1;
-	}
-
-	*features = list->internal->features;
-
-	return 0;
-}
-
-#define VDPA_SUPPORTED_PROTOCOL_FEATURES \
-		(1ULL << VHOST_USER_PROTOCOL_F_REPLY_ACK | \
-		 1ULL << VHOST_USER_PROTOCOL_F_SLAVE_REQ | \
-		 1ULL << VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD | \
-		 1ULL << VHOST_USER_PROTOCOL_F_HOST_NOTIFIER | \
-		 1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD)
-static int
-ifcvf_get_protocol_features(int did __rte_unused, uint64_t *features)
-{
-	*features = VDPA_SUPPORTED_PROTOCOL_FEATURES;
-	return 0;
-}
-
-static struct rte_vdpa_dev_ops ifcvf_ops = {
-	.get_queue_num = ifcvf_get_queue_num,
-	.get_features = ifcvf_get_vdpa_features,
-	.get_protocol_features = ifcvf_get_protocol_features,
-	.dev_conf = ifcvf_dev_config,
-	.dev_close = ifcvf_dev_close,
-	.set_vring_state = NULL,
-	.set_features = ifcvf_set_features,
-	.migration_done = NULL,
-	.get_vfio_group_fd = ifcvf_get_vfio_group_fd,
-	.get_vfio_device_fd = ifcvf_get_vfio_device_fd,
-	.get_notify_area = ifcvf_get_notify_area,
-};
-
-static inline int
-open_int(const char *key __rte_unused, const char *value, void *extra_args)
-{
-	uint16_t *n = extra_args;
-
-	if (value == NULL || extra_args == NULL)
-		return -EINVAL;
-
-	*n = (uint16_t)strtoul(value, NULL, 0);
-	if (*n == USHRT_MAX && errno == ERANGE)
-		return -1;
-
-	return 0;
-}
-
-static int
-ifcvf_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
-		struct rte_pci_device *pci_dev)
-{
-	uint64_t features;
-	struct ifcvf_internal *internal = NULL;
-	struct internal_list *list = NULL;
-	int vdpa_mode = 0;
-	int sw_fallback_lm = 0;
-	struct rte_kvargs *kvlist = NULL;
-	int ret = 0;
-
-	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
-		return 0;
-
-	if (!pci_dev->device.devargs)
-		return 1;
-
-	kvlist = rte_kvargs_parse(pci_dev->device.devargs->args,
-			ifcvf_valid_arguments);
-	if (kvlist == NULL)
-		return 1;
-
-	/* probe only when vdpa mode is specified */
-	if (rte_kvargs_count(kvlist, IFCVF_VDPA_MODE) == 0) {
-		rte_kvargs_free(kvlist);
-		return 1;
-	}
-
-	ret = rte_kvargs_process(kvlist, IFCVF_VDPA_MODE, &open_int,
-			&vdpa_mode);
-	if (ret < 0 || vdpa_mode == 0) {
-		rte_kvargs_free(kvlist);
-		return 1;
-	}
-
-	list = rte_zmalloc("ifcvf", sizeof(*list), 0);
-	if (list == NULL)
-		goto error;
-
-	internal = rte_zmalloc("ifcvf", sizeof(*internal), 0);
-	if (internal == NULL)
-		goto error;
-
-	internal->pdev = pci_dev;
-	rte_spinlock_init(&internal->lock);
-
-	if (ifcvf_vfio_setup(internal) < 0) {
-		DRV_LOG(ERR, "failed to setup device %s", pci_dev->name);
-		goto error;
-	}
-
-	if (ifcvf_init_hw(&internal->hw, internal->pdev) < 0) {
-		DRV_LOG(ERR, "failed to init device %s", pci_dev->name);
-		goto error;
-	}
-
-	internal->max_queues = IFCVF_MAX_QUEUES;
-	features = ifcvf_get_features(&internal->hw);
-	internal->features = (features &
-		~(1ULL << VIRTIO_F_IOMMU_PLATFORM)) |
-		(1ULL << VIRTIO_NET_F_GUEST_ANNOUNCE) |
-		(1ULL << VIRTIO_NET_F_CTRL_VQ) |
-		(1ULL << VIRTIO_NET_F_STATUS) |
-		(1ULL << VHOST_USER_F_PROTOCOL_FEATURES) |
-		(1ULL << VHOST_F_LOG_ALL);
-
-	internal->dev_addr.pci_addr = pci_dev->addr;
-	internal->dev_addr.type = PCI_ADDR;
-	list->internal = internal;
-
-	if (rte_kvargs_count(kvlist, IFCVF_SW_FALLBACK_LM)) {
-		ret = rte_kvargs_process(kvlist, IFCVF_SW_FALLBACK_LM,
-				&open_int, &sw_fallback_lm);
-		if (ret < 0)
-			goto error;
-	}
-	internal->sw_lm = sw_fallback_lm;
-
-	internal->did = rte_vdpa_register_device(&internal->dev_addr,
-				&ifcvf_ops);
-	if (internal->did < 0) {
-		DRV_LOG(ERR, "failed to register device %s", pci_dev->name);
-		goto error;
-	}
-
-	pthread_mutex_lock(&internal_list_lock);
-	TAILQ_INSERT_TAIL(&internal_list, list, next);
-	pthread_mutex_unlock(&internal_list_lock);
-
-	rte_atomic32_set(&internal->started, 1);
-	update_datapath(internal);
-
-	rte_kvargs_free(kvlist);
-	return 0;
-
-error:
-	rte_kvargs_free(kvlist);
-	rte_free(list);
-	rte_free(internal);
-	return -1;
-}
-
-static int
-ifcvf_pci_remove(struct rte_pci_device *pci_dev)
-{
-	struct ifcvf_internal *internal;
-	struct internal_list *list;
-
-	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
-		return 0;
-
-	list = find_internal_resource_by_dev(pci_dev);
-	if (list == NULL) {
-		DRV_LOG(ERR, "Invalid device: %s", pci_dev->name);
-		return -1;
-	}
-
-	internal = list->internal;
-	rte_atomic32_set(&internal->started, 0);
-	update_datapath(internal);
-
-	rte_pci_unmap_device(internal->pdev);
-	rte_vfio_container_destroy(internal->vfio_container_fd);
-	rte_vdpa_unregister_device(internal->did);
-
-	pthread_mutex_lock(&internal_list_lock);
-	TAILQ_REMOVE(&internal_list, list, next);
-	pthread_mutex_unlock(&internal_list_lock);
-
-	rte_free(list);
-	rte_free(internal);
-
-	return 0;
-}
-
-/*
- * IFCVF has the same vendor ID and device ID as virtio net PCI
- * device, with its specific subsystem vendor ID and device ID.
- */
-static const struct rte_pci_id pci_id_ifcvf_map[] = {
-	{ .class_id = RTE_CLASS_ANY_ID,
-	  .vendor_id = IFCVF_VENDOR_ID,
-	  .device_id = IFCVF_DEVICE_ID,
-	  .subsystem_vendor_id = IFCVF_SUBSYS_VENDOR_ID,
-	  .subsystem_device_id = IFCVF_SUBSYS_DEVICE_ID,
-	},
-
-	{ .vendor_id = 0, /* sentinel */
-	},
-};
-
-static struct rte_pci_driver rte_ifcvf_vdpa = {
-	.id_table = pci_id_ifcvf_map,
-	.drv_flags = 0,
-	.probe = ifcvf_pci_probe,
-	.remove = ifcvf_pci_remove,
-};
-
-RTE_PMD_REGISTER_PCI(net_ifcvf, rte_ifcvf_vdpa);
-RTE_PMD_REGISTER_PCI_TABLE(net_ifcvf, pci_id_ifcvf_map);
-RTE_PMD_REGISTER_KMOD_DEP(net_ifcvf, "* vfio-pci");
-
-RTE_INIT(ifcvf_vdpa_init_log)
-{
-	ifcvf_vdpa_logtype = rte_log_register("pmd.net.ifcvf_vdpa");
-	if (ifcvf_vdpa_logtype >= 0)
-		rte_log_set_level(ifcvf_vdpa_logtype, RTE_LOG_NOTICE);
-}
diff --git a/drivers/net/ifc/meson.build b/drivers/net/ifc/meson.build
deleted file mode 100644
index adc9ed9..0000000
--- a/drivers/net/ifc/meson.build
+++ /dev/null
@@ -1,9 +0,0 @@
-# SPDX-License-Identifier: BSD-3-Clause
-# Copyright(c) 2018 Intel Corporation
-
-build = dpdk_conf.has('RTE_LIBRTE_VHOST')
-reason = 'missing dependency, DPDK vhost library'
-allow_experimental_apis = true
-sources = files('ifcvf_vdpa.c', 'base/ifcvf.c')
-includes += include_directories('base')
-deps += 'vhost'
diff --git a/drivers/net/ifc/rte_pmd_ifc_version.map b/drivers/net/ifc/rte_pmd_ifc_version.map
deleted file mode 100644
index f9f17e4..0000000
--- a/drivers/net/ifc/rte_pmd_ifc_version.map
+++ /dev/null
@@ -1,3 +0,0 @@
-DPDK_20.0 {
-	local: *;
-};
diff --git a/drivers/net/meson.build b/drivers/net/meson.build
index c300afb..b0ea8fe 100644
--- a/drivers/net/meson.build
+++ b/drivers/net/meson.build
@@ -21,7 +21,6 @@ drivers = ['af_packet',
 	'hns3',
 	'iavf',
 	'ice',
-	'ifc',
 	'ipn3ke',
 	'ixgbe',
 	'kni',
diff --git a/drivers/vdpa/Makefile b/drivers/vdpa/Makefile
index 82a2b70..27fec96 100644
--- a/drivers/vdpa/Makefile
+++ b/drivers/vdpa/Makefile
@@ -5,4 +5,10 @@ include $(RTE_SDK)/mk/rte.vars.mk
 
 # DIRS-$(<configuration>) += <directory>
 
+ifeq ($(CONFIG_RTE_LIBRTE_VHOST),y)
+ifeq ($(CONFIG_RTE_EAL_VFIO),y)
+DIRS-$(CONFIG_RTE_LIBRTE_IFC_PMD) += ifc
+endif
+endif # $(CONFIG_RTE_LIBRTE_VHOST)
+
 include $(RTE_SDK)/mk/rte.subdir.mk
diff --git a/drivers/vdpa/ifc/Makefile b/drivers/vdpa/ifc/Makefile
new file mode 100644
index 0000000..fe227b8
--- /dev/null
+++ b/drivers/vdpa/ifc/Makefile
@@ -0,0 +1,34 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2018 Intel Corporation
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_pmd_ifc.a
+
+LDLIBS += -lpthread
+LDLIBS += -lrte_eal -lrte_pci -lrte_vhost -lrte_bus_pci
+LDLIBS += -lrte_kvargs
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
+#
+# Add extra flags for base driver source files to disable warnings in them
+#
+BASE_DRIVER_OBJS=$(sort $(patsubst %.c,%.o,$(notdir $(wildcard $(SRCDIR)/base/*.c))))
+
+VPATH += $(SRCDIR)/base
+
+EXPORT_MAP := rte_pmd_ifc_version.map
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-$(CONFIG_RTE_LIBRTE_IFC_PMD) += ifcvf_vdpa.c
+SRCS-$(CONFIG_RTE_LIBRTE_IFC_PMD) += ifcvf.c
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/vdpa/ifc/base/ifcvf.c b/drivers/vdpa/ifc/base/ifcvf.c
new file mode 100644
index 0000000..3c0b2df
--- /dev/null
+++ b/drivers/vdpa/ifc/base/ifcvf.c
@@ -0,0 +1,329 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#include "ifcvf.h"
+#include "ifcvf_osdep.h"
+
+STATIC void *
+get_cap_addr(struct ifcvf_hw *hw, struct ifcvf_pci_cap *cap)
+{
+	u8 bar = cap->bar;
+	u32 length = cap->length;
+	u32 offset = cap->offset;
+
+	if (bar > IFCVF_PCI_MAX_RESOURCE - 1) {
+		DEBUGOUT("invalid bar: %u\n", bar);
+		return NULL;
+	}
+
+	if (offset + length < offset) {
+		DEBUGOUT("offset(%u) + length(%u) overflows\n",
+			offset, length);
+		return NULL;
+	}
+
+	if (offset + length > hw->mem_resource[cap->bar].len) {
+		DEBUGOUT("offset(%u) + length(%u) overflows bar length(%u)",
+			offset, length, (u32)hw->mem_resource[cap->bar].len);
+		return NULL;
+	}
+
+	return hw->mem_resource[bar].addr + offset;
+}
+
+int
+ifcvf_init_hw(struct ifcvf_hw *hw, PCI_DEV *dev)
+{
+	int ret;
+	u8 pos;
+	struct ifcvf_pci_cap cap;
+
+	ret = PCI_READ_CONFIG_BYTE(dev, &pos, PCI_CAPABILITY_LIST);
+	if (ret < 0) {
+		DEBUGOUT("failed to read pci capability list\n");
+		return -1;
+	}
+
+	while (pos) {
+		ret = PCI_READ_CONFIG_RANGE(dev, (u32 *)&cap,
+				sizeof(cap), pos);
+		if (ret < 0) {
+			DEBUGOUT("failed to read cap at pos: %x", pos);
+			break;
+		}
+
+		if (cap.cap_vndr != PCI_CAP_ID_VNDR)
+			goto next;
+
+		DEBUGOUT("cfg type: %u, bar: %u, offset: %u, "
+				"len: %u\n", cap.cfg_type, cap.bar,
+				cap.offset, cap.length);
+
+		switch (cap.cfg_type) {
+		case IFCVF_PCI_CAP_COMMON_CFG:
+			hw->common_cfg = get_cap_addr(hw, &cap);
+			break;
+		case IFCVF_PCI_CAP_NOTIFY_CFG:
+			PCI_READ_CONFIG_DWORD(dev, &hw->notify_off_multiplier,
+					pos + sizeof(cap));
+			hw->notify_base = get_cap_addr(hw, &cap);
+			hw->notify_region = cap.bar;
+			break;
+		case IFCVF_PCI_CAP_ISR_CFG:
+			hw->isr = get_cap_addr(hw, &cap);
+			break;
+		case IFCVF_PCI_CAP_DEVICE_CFG:
+			hw->dev_cfg = get_cap_addr(hw, &cap);
+			break;
+		}
+next:
+		pos = cap.cap_next;
+	}
+
+	hw->lm_cfg = hw->mem_resource[4].addr;
+
+	if (hw->common_cfg == NULL || hw->notify_base == NULL ||
+			hw->isr == NULL || hw->dev_cfg == NULL) {
+		DEBUGOUT("capability incomplete\n");
+		return -1;
+	}
+
+	DEBUGOUT("capability mapping:\ncommon cfg: %p\n"
+			"notify base: %p\nisr cfg: %p\ndevice cfg: %p\n"
+			"multiplier: %u\n",
+			hw->common_cfg, hw->notify_base,
+			hw->isr, hw->dev_cfg,
+			hw->notify_off_multiplier);
+
+	return 0;
+}
+
+STATIC u8
+ifcvf_get_status(struct ifcvf_hw *hw)
+{
+	return IFCVF_READ_REG8(&hw->common_cfg->device_status);
+}
+
+STATIC void
+ifcvf_set_status(struct ifcvf_hw *hw, u8 status)
+{
+	IFCVF_WRITE_REG8(status, &hw->common_cfg->device_status);
+}
+
+STATIC void
+ifcvf_reset(struct ifcvf_hw *hw)
+{
+	ifcvf_set_status(hw, 0);
+
+	/* flush status write */
+	while (ifcvf_get_status(hw))
+		msec_delay(1);
+}
+
+STATIC void
+ifcvf_add_status(struct ifcvf_hw *hw, u8 status)
+{
+	if (status != 0)
+		status |= ifcvf_get_status(hw);
+
+	ifcvf_set_status(hw, status);
+	ifcvf_get_status(hw);
+}
+
+u64
+ifcvf_get_features(struct ifcvf_hw *hw)
+{
+	u32 features_lo, features_hi;
+	struct ifcvf_pci_common_cfg *cfg = hw->common_cfg;
+
+	IFCVF_WRITE_REG32(0, &cfg->device_feature_select);
+	features_lo = IFCVF_READ_REG32(&cfg->device_feature);
+
+	IFCVF_WRITE_REG32(1, &cfg->device_feature_select);
+	features_hi = IFCVF_READ_REG32(&cfg->device_feature);
+
+	return ((u64)features_hi << 32) | features_lo;
+}
+
+STATIC void
+ifcvf_set_features(struct ifcvf_hw *hw, u64 features)
+{
+	struct ifcvf_pci_common_cfg *cfg = hw->common_cfg;
+
+	IFCVF_WRITE_REG32(0, &cfg->guest_feature_select);
+	IFCVF_WRITE_REG32(features & ((1ULL << 32) - 1), &cfg->guest_feature);
+
+	IFCVF_WRITE_REG32(1, &cfg->guest_feature_select);
+	IFCVF_WRITE_REG32(features >> 32, &cfg->guest_feature);
+}
+
+STATIC int
+ifcvf_config_features(struct ifcvf_hw *hw)
+{
+	u64 host_features;
+
+	host_features = ifcvf_get_features(hw);
+	hw->req_features &= host_features;
+
+	ifcvf_set_features(hw, hw->req_features);
+	ifcvf_add_status(hw, IFCVF_CONFIG_STATUS_FEATURES_OK);
+
+	if (!(ifcvf_get_status(hw) & IFCVF_CONFIG_STATUS_FEATURES_OK)) {
+		DEBUGOUT("failed to set FEATURES_OK status\n");
+		return -1;
+	}
+
+	return 0;
+}
+
+STATIC void
+io_write64_twopart(u64 val, u32 *lo, u32 *hi)
+{
+	IFCVF_WRITE_REG32(val & ((1ULL << 32) - 1), lo);
+	IFCVF_WRITE_REG32(val >> 32, hi);
+}
+
+STATIC int
+ifcvf_hw_enable(struct ifcvf_hw *hw)
+{
+	struct ifcvf_pci_common_cfg *cfg;
+	u8 *lm_cfg;
+	u32 i;
+	u16 notify_off;
+
+	cfg = hw->common_cfg;
+	lm_cfg = hw->lm_cfg;
+
+	IFCVF_WRITE_REG16(0, &cfg->msix_config);
+	if (IFCVF_READ_REG16(&cfg->msix_config) == IFCVF_MSI_NO_VECTOR) {
+		DEBUGOUT("msix vec alloc failed for device config\n");
+		return -1;
+	}
+
+	for (i = 0; i < hw->nr_vring; i++) {
+		IFCVF_WRITE_REG16(i, &cfg->queue_select);
+		io_write64_twopart(hw->vring[i].desc, &cfg->queue_desc_lo,
+				&cfg->queue_desc_hi);
+		io_write64_twopart(hw->vring[i].avail, &cfg->queue_avail_lo,
+				&cfg->queue_avail_hi);
+		io_write64_twopart(hw->vring[i].used, &cfg->queue_used_lo,
+				&cfg->queue_used_hi);
+		IFCVF_WRITE_REG16(hw->vring[i].size, &cfg->queue_size);
+
+		*(u32 *)(lm_cfg + IFCVF_LM_RING_STATE_OFFSET +
+				(i / 2) * IFCVF_LM_CFG_SIZE + (i % 2) * 4) =
+			(u32)hw->vring[i].last_avail_idx |
+			((u32)hw->vring[i].last_used_idx << 16);
+
+		IFCVF_WRITE_REG16(i + 1, &cfg->queue_msix_vector);
+		if (IFCVF_READ_REG16(&cfg->queue_msix_vector) ==
+				IFCVF_MSI_NO_VECTOR) {
+			DEBUGOUT("queue %u, msix vec alloc failed\n",
+					i);
+			return -1;
+		}
+
+		notify_off = IFCVF_READ_REG16(&cfg->queue_notify_off);
+		hw->notify_addr[i] = (void *)((u8 *)hw->notify_base +
+				notify_off * hw->notify_off_multiplier);
+		IFCVF_WRITE_REG16(1, &cfg->queue_enable);
+	}
+
+	return 0;
+}
+
+STATIC void
+ifcvf_hw_disable(struct ifcvf_hw *hw)
+{
+	u32 i;
+	struct ifcvf_pci_common_cfg *cfg;
+	u32 ring_state;
+
+	cfg = hw->common_cfg;
+
+	IFCVF_WRITE_REG16(IFCVF_MSI_NO_VECTOR, &cfg->msix_config);
+	for (i = 0; i < hw->nr_vring; i++) {
+		IFCVF_WRITE_REG16(i, &cfg->queue_select);
+		IFCVF_WRITE_REG16(0, &cfg->queue_enable);
+		IFCVF_WRITE_REG16(IFCVF_MSI_NO_VECTOR, &cfg->queue_msix_vector);
+		ring_state = *(u32 *)(hw->lm_cfg + IFCVF_LM_RING_STATE_OFFSET +
+				(i / 2) * IFCVF_LM_CFG_SIZE + (i % 2) * 4);
+		hw->vring[i].last_avail_idx = (u16)(ring_state >> 16);
+		hw->vring[i].last_used_idx = (u16)(ring_state >> 16);
+	}
+}
+
+int
+ifcvf_start_hw(struct ifcvf_hw *hw)
+{
+	ifcvf_reset(hw);
+	ifcvf_add_status(hw, IFCVF_CONFIG_STATUS_ACK);
+	ifcvf_add_status(hw, IFCVF_CONFIG_STATUS_DRIVER);
+
+	if (ifcvf_config_features(hw) < 0)
+		return -1;
+
+	if (ifcvf_hw_enable(hw) < 0)
+		return -1;
+
+	ifcvf_add_status(hw, IFCVF_CONFIG_STATUS_DRIVER_OK);
+	return 0;
+}
+
+void
+ifcvf_stop_hw(struct ifcvf_hw *hw)
+{
+	ifcvf_hw_disable(hw);
+	ifcvf_reset(hw);
+}
+
+void
+ifcvf_enable_logging(struct ifcvf_hw *hw, u64 log_base, u64 log_size)
+{
+	u8 *lm_cfg;
+
+	lm_cfg = hw->lm_cfg;
+
+	*(u32 *)(lm_cfg + IFCVF_LM_BASE_ADDR_LOW) =
+		log_base & IFCVF_32_BIT_MASK;
+
+	*(u32 *)(lm_cfg + IFCVF_LM_BASE_ADDR_HIGH) =
+		(log_base >> 32) & IFCVF_32_BIT_MASK;
+
+	*(u32 *)(lm_cfg + IFCVF_LM_END_ADDR_LOW) =
+		(log_base + log_size) & IFCVF_32_BIT_MASK;
+
+	*(u32 *)(lm_cfg + IFCVF_LM_END_ADDR_HIGH) =
+		((log_base + log_size) >> 32) & IFCVF_32_BIT_MASK;
+
+	*(u32 *)(lm_cfg + IFCVF_LM_LOGGING_CTRL) = IFCVF_LM_ENABLE_VF;
+}
+
+void
+ifcvf_disable_logging(struct ifcvf_hw *hw)
+{
+	u8 *lm_cfg;
+
+	lm_cfg = hw->lm_cfg;
+	*(u32 *)(lm_cfg + IFCVF_LM_LOGGING_CTRL) = IFCVF_LM_DISABLE;
+}
+
+void
+ifcvf_notify_queue(struct ifcvf_hw *hw, u16 qid)
+{
+	IFCVF_WRITE_REG16(qid, hw->notify_addr[qid]);
+}
+
+u8
+ifcvf_get_notify_region(struct ifcvf_hw *hw)
+{
+	return hw->notify_region;
+}
+
+u64
+ifcvf_get_queue_notify_off(struct ifcvf_hw *hw, int qid)
+{
+	return (u8 *)hw->notify_addr[qid] -
+		(u8 *)hw->mem_resource[hw->notify_region].addr;
+		(u8 *)hw->mem_resource[hw->notify_region].addr;
+}
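
A quick note for readers new to the modern virtio-pci layout used by base/ifcvf.c above: 64-bit quantities such as the feature bits and the queue desc/avail/used addresses are only reachable through 32-bit common-config registers, so the driver always handles them as a low half and a high half (see ifcvf_get_features() and io_write64_twopart()). Below is a minimal standalone sketch of that split/recombine pattern which compiles outside DPDK; plain variables stand in for the device registers and the helper names are illustrative only, not part of the driver.

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-ins for the device's two 32-bit register windows (illustrative only). */
static uint32_t reg_lo, reg_hi;

/* Same split as io_write64_twopart(): one 64-bit value, two 32-bit writes. */
static void write64_twopart(uint64_t val, uint32_t *lo, uint32_t *hi)
{
	*lo = (uint32_t)(val & 0xffffffffu);
	*hi = (uint32_t)(val >> 32);
}

/* Same recombination as ifcvf_get_features(): high word shifted back up. */
static uint64_t read64_twopart(const uint32_t *lo, const uint32_t *hi)
{
	return ((uint64_t)*hi << 32) | *lo;
}

int main(void)
{
	uint64_t desc_iova = 0x200000000000ULL | 0x1000;

	write64_twopart(desc_iova, &reg_lo, &reg_hi);
	printf("lo=0x%" PRIx32 " hi=0x%" PRIx32 " recombined=0x%" PRIx64 "\n",
	       reg_lo, reg_hi, read64_twopart(&reg_lo, &reg_hi));
	return 0;
}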
diff --git a/drivers/vdpa/ifc/base/ifcvf.h b/drivers/vdpa/ifc/base/ifcvf.h
new file mode 100644
index 0000000..9be2770
--- /dev/null
+++ b/drivers/vdpa/ifc/base/ifcvf.h
@@ -0,0 +1,162 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#ifndef _IFCVF_H_
+#define _IFCVF_H_
+
+#include "ifcvf_osdep.h"
+
+#define IFCVF_VENDOR_ID		0x1AF4
+#define IFCVF_DEVICE_ID		0x1041
+#define IFCVF_SUBSYS_VENDOR_ID	0x8086
+#define IFCVF_SUBSYS_DEVICE_ID	0x001A
+
+#define IFCVF_MAX_QUEUES		1
+#define VIRTIO_F_IOMMU_PLATFORM		33
+
+/* Common configuration */
+#define IFCVF_PCI_CAP_COMMON_CFG	1
+/* Notifications */
+#define IFCVF_PCI_CAP_NOTIFY_CFG	2
+/* ISR Status */
+#define IFCVF_PCI_CAP_ISR_CFG		3
+/* Device specific configuration */
+#define IFCVF_PCI_CAP_DEVICE_CFG	4
+/* PCI configuration access */
+#define IFCVF_PCI_CAP_PCI_CFG		5
+
+#define IFCVF_CONFIG_STATUS_RESET     0x00
+#define IFCVF_CONFIG_STATUS_ACK       0x01
+#define IFCVF_CONFIG_STATUS_DRIVER    0x02
+#define IFCVF_CONFIG_STATUS_DRIVER_OK 0x04
+#define IFCVF_CONFIG_STATUS_FEATURES_OK 0x08
+#define IFCVF_CONFIG_STATUS_FAILED    0x80
+
+#define IFCVF_MSI_NO_VECTOR	0xffff
+#define IFCVF_PCI_MAX_RESOURCE	6
+
+#define IFCVF_LM_CFG_SIZE		0x40
+#define IFCVF_LM_RING_STATE_OFFSET	0x20
+
+#define IFCVF_LM_LOGGING_CTRL		0x0
+
+#define IFCVF_LM_BASE_ADDR_LOW		0x10
+#define IFCVF_LM_BASE_ADDR_HIGH		0x14
+#define IFCVF_LM_END_ADDR_LOW		0x18
+#define IFCVF_LM_END_ADDR_HIGH		0x1c
+
+#define IFCVF_LM_DISABLE		0x0
+#define IFCVF_LM_ENABLE_VF		0x1
+#define IFCVF_LM_ENABLE_PF		0x3
+#define IFCVF_LOG_BASE			0x100000000000
+#define IFCVF_MEDIATED_VRING		0x200000000000
+
+#define IFCVF_32_BIT_MASK		0xffffffff
+
+
+struct ifcvf_pci_cap {
+	u8 cap_vndr;            /* Generic PCI field: PCI_CAP_ID_VNDR */
+	u8 cap_next;            /* Generic PCI field: next ptr. */
+	u8 cap_len;             /* Generic PCI field: capability length */
+	u8 cfg_type;            /* Identifies the structure. */
+	u8 bar;                 /* Where to find it. */
+	u8 padding[3];          /* Pad to full dword. */
+	u32 offset;             /* Offset within bar. */
+	u32 length;             /* Length of the structure, in bytes. */
+};
+
+struct ifcvf_pci_notify_cap {
+	struct ifcvf_pci_cap cap;
+	u32 notify_off_multiplier;  /* Multiplier for queue_notify_off. */
+};
+
+struct ifcvf_pci_common_cfg {
+	/* About the whole device. */
+	u32 device_feature_select;
+	u32 device_feature;
+	u32 guest_feature_select;
+	u32 guest_feature;
+	u16 msix_config;
+	u16 num_queues;
+	u8 device_status;
+	u8 config_generation;
+
+	/* About a specific virtqueue. */
+	u16 queue_select;
+	u16 queue_size;
+	u16 queue_msix_vector;
+	u16 queue_enable;
+	u16 queue_notify_off;
+	u32 queue_desc_lo;
+	u32 queue_desc_hi;
+	u32 queue_avail_lo;
+	u32 queue_avail_hi;
+	u32 queue_used_lo;
+	u32 queue_used_hi;
+};
+
+struct ifcvf_net_config {
+	u8    mac[6];
+	u16   status;
+	u16   max_virtqueue_pairs;
+} __attribute__((packed));
+
+struct ifcvf_pci_mem_resource {
+	u64      phys_addr; /**< Physical address, 0 if not resource. */
+	u64      len;       /**< Length of the resource. */
+	u8       *addr;     /**< Virtual address, NULL when not mapped. */
+};
+
+struct vring_info {
+	u64 desc;
+	u64 avail;
+	u64 used;
+	u16 size;
+	u16 last_avail_idx;
+	u16 last_used_idx;
+};
+
+struct ifcvf_hw {
+	u64    req_features;
+	u8     notify_region;
+	u32    notify_off_multiplier;
+	struct ifcvf_pci_common_cfg *common_cfg;
+	struct ifcvf_net_config *dev_cfg;
+	u8     *isr;
+	u16    *notify_base;
+	u16    *notify_addr[IFCVF_MAX_QUEUES * 2];
+	u8     *lm_cfg;
+	struct vring_info vring[IFCVF_MAX_QUEUES * 2];
+	u8 nr_vring;
+	struct ifcvf_pci_mem_resource mem_resource[IFCVF_PCI_MAX_RESOURCE];
+};
+
+int
+ifcvf_init_hw(struct ifcvf_hw *hw, PCI_DEV *dev);
+
+u64
+ifcvf_get_features(struct ifcvf_hw *hw);
+
+int
+ifcvf_start_hw(struct ifcvf_hw *hw);
+
+void
+ifcvf_stop_hw(struct ifcvf_hw *hw);
+
+void
+ifcvf_enable_logging(struct ifcvf_hw *hw, u64 log_base, u64 log_size);
+
+void
+ifcvf_disable_logging(struct ifcvf_hw *hw);
+
+void
+ifcvf_notify_queue(struct ifcvf_hw *hw, u16 qid);
+
+u8
+ifcvf_get_notify_region(struct ifcvf_hw *hw);
+
+u64
+ifcvf_get_queue_notify_off(struct ifcvf_hw *hw, int qid);
+
+#endif /* _IFCVF_H_ */
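
One detail of ifcvf.h worth spelling out is the live-migration (LM) region layout: queues are grouped in pairs, each pair occupying one IFCVF_LM_CFG_SIZE block starting at IFCVF_LM_RING_STATE_OFFSET, and each queue owns a packed 32-bit word carrying last_avail_idx in the low half and last_used_idx in the high half, which is how ifcvf_hw_enable()/ifcvf_hw_disable() write and read it back. A small self-contained sketch of that addressing and packing follows; the constants are copied here only for illustration and the helpers are not part of the driver.

#include <assert.h>
#include <stdint.h>

/* Illustrative copies of the ifcvf.h constants. */
#define LM_CFG_SIZE          0x40
#define LM_RING_STATE_OFFSET 0x20

/* Byte offset of queue i's 32-bit ring-state word inside the LM BAR:
 * two queues share one LM_CFG_SIZE block, 4 bytes per queue. */
static uint32_t ring_state_offset(uint32_t i)
{
	return LM_RING_STATE_OFFSET + (i / 2) * LM_CFG_SIZE + (i % 2) * 4;
}

/* Packing: low 16 bits carry last_avail_idx, high 16 bits last_used_idx. */
static uint32_t pack_state(uint16_t last_avail, uint16_t last_used)
{
	return (uint32_t)last_avail | ((uint32_t)last_used << 16);
}

int main(void)
{
	assert(ring_state_offset(0) == 0x20);
	assert(ring_state_offset(1) == 0x24);
	assert(ring_state_offset(2) == 0x60);
	assert(pack_state(10, 7) == ((7u << 16) | 10u));
	return 0;
}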
diff --git a/drivers/vdpa/ifc/base/ifcvf_osdep.h b/drivers/vdpa/ifc/base/ifcvf_osdep.h
new file mode 100644
index 0000000..6aef25e
--- /dev/null
+++ b/drivers/vdpa/ifc/base/ifcvf_osdep.h
@@ -0,0 +1,52 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#ifndef _IFCVF_OSDEP_H_
+#define _IFCVF_OSDEP_H_
+
+#include <stdint.h>
+#include <linux/pci_regs.h>
+
+#include <rte_cycles.h>
+#include <rte_pci.h>
+#include <rte_bus_pci.h>
+#include <rte_log.h>
+#include <rte_io.h>
+
+#define DEBUGOUT(S, args...)    RTE_LOG(DEBUG, PMD, S, ##args)
+#define STATIC                  static
+
+#define msec_delay(x)	rte_delay_us_sleep(1000 * (x))
+
+#define IFCVF_READ_REG8(reg)		rte_read8(reg)
+#define IFCVF_WRITE_REG8(val, reg)	rte_write8((val), (reg))
+#define IFCVF_READ_REG16(reg)		rte_read16(reg)
+#define IFCVF_WRITE_REG16(val, reg)	rte_write16((val), (reg))
+#define IFCVF_READ_REG32(reg)		rte_read32(reg)
+#define IFCVF_WRITE_REG32(val, reg)	rte_write32((val), (reg))
+
+typedef struct rte_pci_device PCI_DEV;
+
+#define PCI_READ_CONFIG_BYTE(dev, val, where) \
+	rte_pci_read_config(dev, val, 1, where)
+
+#define PCI_READ_CONFIG_DWORD(dev, val, where) \
+	rte_pci_read_config(dev, val, 4, where)
+
+typedef uint8_t    u8;
+typedef int8_t     s8;
+typedef uint16_t   u16;
+typedef int16_t    s16;
+typedef uint32_t   u32;
+typedef int32_t    s32;
+typedef int64_t    s64;
+typedef uint64_t   u64;
+
+static inline int
+PCI_READ_CONFIG_RANGE(PCI_DEV *dev, uint32_t *val, int size, int where)
+{
+	return rte_pci_read_config(dev, val, size, where);
+}
+
+#endif /* _IFCVF_OSDEP_H_ */
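
The relay threads in ifcvf_vdpa.c below wait on all vhost kick eventfds (plus, in software fallback mode, per-queue interrupt eventfds) with a single epoll instance, and they pack everything needed to service an event into the 64-bit epoll user data: queue id in the low 32 bits, fd in the high 32 bits, with vring_relay() additionally reserving bit 0 as an interrupt-fd flag. Here is a standalone sketch of just that encoding, with no vhost or DPDK dependencies; the helper names are illustrative only.

#include <stdint.h>
#include <stdio.h>

/* notify_relay() encoding: queue id in the low 32 bits, kick fd in the high 32. */
static uint64_t encode_kick(uint32_t qid, int fd)
{
	return (uint64_t)qid | ((uint64_t)(uint32_t)fd << 32);
}

/* vring_relay() encoding: bit 0 marks an interrupt fd, queue id starts at bit 1. */
static uint64_t encode_intr(uint32_t qid, int fd)
{
	return 1u | (qid << 1) | ((uint64_t)(uint32_t)fd << 32);
}

int main(void)
{
	uint64_t kick = encode_kick(3, 42);
	uint64_t intr = encode_intr(2, 55);

	printf("kick: qid=%u fd=%u\n", (uint32_t)kick, (uint32_t)(kick >> 32));
	printf("intr: flag=%u qid=%u fd=%u\n",
	       (uint32_t)intr & 1, ((uint32_t)intr) >> 1, (uint32_t)(intr >> 32));
	return 0;
}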
diff --git a/drivers/vdpa/ifc/ifcvf_vdpa.c b/drivers/vdpa/ifc/ifcvf_vdpa.c
new file mode 100644
index 0000000..da4667b
--- /dev/null
+++ b/drivers/vdpa/ifc/ifcvf_vdpa.c
@@ -0,0 +1,1280 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#include <unistd.h>
+#include <pthread.h>
+#include <fcntl.h>
+#include <string.h>
+#include <sys/ioctl.h>
+#include <sys/epoll.h>
+#include <linux/virtio_net.h>
+#include <stdbool.h>
+
+#include <rte_malloc.h>
+#include <rte_memory.h>
+#include <rte_bus_pci.h>
+#include <rte_vhost.h>
+#include <rte_vdpa.h>
+#include <rte_vfio.h>
+#include <rte_spinlock.h>
+#include <rte_log.h>
+#include <rte_kvargs.h>
+#include <rte_devargs.h>
+
+#include "base/ifcvf.h"
+
+#define DRV_LOG(level, fmt, args...) \
+	rte_log(RTE_LOG_ ## level, ifcvf_vdpa_logtype, \
+		"IFCVF %s(): " fmt "\n", __func__, ##args)
+
+#ifndef PAGE_SIZE
+#define PAGE_SIZE 4096
+#endif
+
+#define IFCVF_USED_RING_LEN(size) \
+	((size) * sizeof(struct vring_used_elem) + sizeof(uint16_t) * 3)
+
+#define IFCVF_VDPA_MODE		"vdpa"
+#define IFCVF_SW_FALLBACK_LM	"sw-live-migration"
+
+static const char * const ifcvf_valid_arguments[] = {
+	IFCVF_VDPA_MODE,
+	IFCVF_SW_FALLBACK_LM,
+	NULL
+};
+
+static int ifcvf_vdpa_logtype;
+
+struct ifcvf_internal {
+	struct rte_vdpa_dev_addr dev_addr;
+	struct rte_pci_device *pdev;
+	struct ifcvf_hw hw;
+	int vfio_container_fd;
+	int vfio_group_fd;
+	int vfio_dev_fd;
+	pthread_t tid;	/* thread for notify relay */
+	int epfd;
+	int vid;
+	int did;
+	uint16_t max_queues;
+	uint64_t features;
+	rte_atomic32_t started;
+	rte_atomic32_t dev_attached;
+	rte_atomic32_t running;
+	rte_spinlock_t lock;
+	bool sw_lm;
+	bool sw_fallback_running;
+	/* mediated vring for sw fallback */
+	struct vring m_vring[IFCVF_MAX_QUEUES * 2];
+	/* eventfd for used ring interrupt */
+	int intr_fd[IFCVF_MAX_QUEUES * 2];
+};
+
+struct internal_list {
+	TAILQ_ENTRY(internal_list) next;
+	struct ifcvf_internal *internal;
+};
+
+TAILQ_HEAD(internal_list_head, internal_list);
+static struct internal_list_head internal_list =
+	TAILQ_HEAD_INITIALIZER(internal_list);
+
+static pthread_mutex_t internal_list_lock = PTHREAD_MUTEX_INITIALIZER;
+
+static void update_used_ring(struct ifcvf_internal *internal, uint16_t qid);
+
+static struct internal_list *
+find_internal_resource_by_did(int did)
+{
+	int found = 0;
+	struct internal_list *list;
+
+	pthread_mutex_lock(&internal_list_lock);
+
+	TAILQ_FOREACH(list, &internal_list, next) {
+		if (did == list->internal->did) {
+			found = 1;
+			break;
+		}
+	}
+
+	pthread_mutex_unlock(&internal_list_lock);
+
+	if (!found)
+		return NULL;
+
+	return list;
+}
+
+static struct internal_list *
+find_internal_resource_by_dev(struct rte_pci_device *pdev)
+{
+	int found = 0;
+	struct internal_list *list;
+
+	pthread_mutex_lock(&internal_list_lock);
+
+	TAILQ_FOREACH(list, &internal_list, next) {
+		if (pdev == list->internal->pdev) {
+			found = 1;
+			break;
+		}
+	}
+
+	pthread_mutex_unlock(&internal_list_lock);
+
+	if (!found)
+		return NULL;
+
+	return list;
+}
+
+static int
+ifcvf_vfio_setup(struct ifcvf_internal *internal)
+{
+	struct rte_pci_device *dev = internal->pdev;
+	char devname[RTE_DEV_NAME_MAX_LEN] = {0};
+	int iommu_group_num;
+	int i, ret;
+
+	internal->vfio_dev_fd = -1;
+	internal->vfio_group_fd = -1;
+	internal->vfio_container_fd = -1;
+
+	rte_pci_device_name(&dev->addr, devname, RTE_DEV_NAME_MAX_LEN);
+	ret = rte_vfio_get_group_num(rte_pci_get_sysfs_path(), devname,
+			&iommu_group_num);
+	if (ret <= 0) {
+		DRV_LOG(ERR, "%s failed to get IOMMU group", devname);
+		return -1;
+	}
+
+	internal->vfio_container_fd = rte_vfio_container_create();
+	if (internal->vfio_container_fd < 0)
+		return -1;
+
+	internal->vfio_group_fd = rte_vfio_container_group_bind(
+			internal->vfio_container_fd, iommu_group_num);
+	if (internal->vfio_group_fd < 0)
+		goto err;
+
+	if (rte_pci_map_device(dev))
+		goto err;
+
+	internal->vfio_dev_fd = dev->intr_handle.vfio_dev_fd;
+
+	for (i = 0; i < RTE_MIN(PCI_MAX_RESOURCE, IFCVF_PCI_MAX_RESOURCE);
+			i++) {
+		internal->hw.mem_resource[i].addr =
+			internal->pdev->mem_resource[i].addr;
+		internal->hw.mem_resource[i].phys_addr =
+			internal->pdev->mem_resource[i].phys_addr;
+		internal->hw.mem_resource[i].len =
+			internal->pdev->mem_resource[i].len;
+	}
+
+	return 0;
+
+err:
+	rte_vfio_container_destroy(internal->vfio_container_fd);
+	return -1;
+}
+
+static int
+ifcvf_dma_map(struct ifcvf_internal *internal, int do_map)
+{
+	uint32_t i;
+	int ret;
+	struct rte_vhost_memory *mem = NULL;
+	int vfio_container_fd;
+
+	ret = rte_vhost_get_mem_table(internal->vid, &mem);
+	if (ret < 0) {
+		DRV_LOG(ERR, "failed to get VM memory layout.");
+		goto exit;
+	}
+
+	vfio_container_fd = internal->vfio_container_fd;
+
+	for (i = 0; i < mem->nregions; i++) {
+		struct rte_vhost_mem_region *reg;
+
+		reg = &mem->regions[i];
+		DRV_LOG(INFO, "%s, region %u: HVA 0x%" PRIx64 ", "
+			"GPA 0x%" PRIx64 ", size 0x%" PRIx64 ".",
+			do_map ? "DMA map" : "DMA unmap", i,
+			reg->host_user_addr, reg->guest_phys_addr, reg->size);
+
+		if (do_map) {
+			ret = rte_vfio_container_dma_map(vfio_container_fd,
+				reg->host_user_addr, reg->guest_phys_addr,
+				reg->size);
+			if (ret < 0) {
+				DRV_LOG(ERR, "DMA map failed.");
+				goto exit;
+			}
+		} else {
+			ret = rte_vfio_container_dma_unmap(vfio_container_fd,
+				reg->host_user_addr, reg->guest_phys_addr,
+				reg->size);
+			if (ret < 0) {
+				DRV_LOG(ERR, "DMA unmap failed.");
+				goto exit;
+			}
+		}
+	}
+
+exit:
+	if (mem)
+		free(mem);
+	return ret;
+}
+
+static uint64_t
+hva_to_gpa(int vid, uint64_t hva)
+{
+	struct rte_vhost_memory *mem = NULL;
+	struct rte_vhost_mem_region *reg;
+	uint32_t i;
+	uint64_t gpa = 0;
+
+	if (rte_vhost_get_mem_table(vid, &mem) < 0)
+		goto exit;
+
+	for (i = 0; i < mem->nregions; i++) {
+		reg = &mem->regions[i];
+
+		if (hva >= reg->host_user_addr &&
+				hva < reg->host_user_addr + reg->size) {
+			gpa = hva - reg->host_user_addr + reg->guest_phys_addr;
+			break;
+		}
+	}
+
+exit:
+	if (mem)
+		free(mem);
+	return gpa;
+}
+
+static int
+vdpa_ifcvf_start(struct ifcvf_internal *internal)
+{
+	struct ifcvf_hw *hw = &internal->hw;
+	int i, nr_vring;
+	int vid;
+	struct rte_vhost_vring vq;
+	uint64_t gpa;
+
+	vid = internal->vid;
+	nr_vring = rte_vhost_get_vring_num(vid);
+	rte_vhost_get_negotiated_features(vid, &hw->req_features);
+
+	for (i = 0; i < nr_vring; i++) {
+		rte_vhost_get_vhost_vring(vid, i, &vq);
+		gpa = hva_to_gpa(vid, (uint64_t)(uintptr_t)vq.desc);
+		if (gpa == 0) {
+			DRV_LOG(ERR, "Fail to get GPA for descriptor ring.");
+			return -1;
+		}
+		hw->vring[i].desc = gpa;
+
+		gpa = hva_to_gpa(vid, (uint64_t)(uintptr_t)vq.avail);
+		if (gpa == 0) {
+			DRV_LOG(ERR, "Fail to get GPA for available ring.");
+			return -1;
+		}
+		hw->vring[i].avail = gpa;
+
+		gpa = hva_to_gpa(vid, (uint64_t)(uintptr_t)vq.used);
+		if (gpa == 0) {
+			DRV_LOG(ERR, "Fail to get GPA for used ring.");
+			return -1;
+		}
+		hw->vring[i].used = gpa;
+
+		hw->vring[i].size = vq.size;
+		rte_vhost_get_vring_base(vid, i, &hw->vring[i].last_avail_idx,
+				&hw->vring[i].last_used_idx);
+	}
+	hw->nr_vring = i;
+
+	return ifcvf_start_hw(&internal->hw);
+}
+
+static void
+vdpa_ifcvf_stop(struct ifcvf_internal *internal)
+{
+	struct ifcvf_hw *hw = &internal->hw;
+	uint32_t i;
+	int vid;
+	uint64_t features = 0;
+	uint64_t log_base = 0, log_size = 0;
+	uint64_t len;
+
+	vid = internal->vid;
+	ifcvf_stop_hw(hw);
+
+	for (i = 0; i < hw->nr_vring; i++)
+		rte_vhost_set_vring_base(vid, i, hw->vring[i].last_avail_idx,
+				hw->vring[i].last_used_idx);
+
+	if (internal->sw_lm)
+		return;
+
+	rte_vhost_get_negotiated_features(vid, &features);
+	if (RTE_VHOST_NEED_LOG(features)) {
+		ifcvf_disable_logging(hw);
+		rte_vhost_get_log_base(internal->vid, &log_base, &log_size);
+		rte_vfio_container_dma_unmap(internal->vfio_container_fd,
+				log_base, IFCVF_LOG_BASE, log_size);
+		/*
+		 * IFCVF marks dirty memory pages for only packet buffer,
+		 * SW helps to mark the used ring as dirty after device stops.
+		 */
+		for (i = 0; i < hw->nr_vring; i++) {
+			len = IFCVF_USED_RING_LEN(hw->vring[i].size);
+			rte_vhost_log_used_vring(vid, i, 0, len);
+		}
+	}
+}
+
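The comment in vdpa_ifcvf_stop() above captures the key live-migration constraint: the device only marks pages dirty for packet buffers, so after the hardware stops the driver logs the used rings itself, sized by IFCVF_USED_RING_LEN(). A standalone sketch of that length calculation, assuming the standard split-ring used layout (the struct and helper below are illustrative stand-ins, not the driver's definitions):

#include <stdint.h>
#include <stdio.h>

/* Layout of one used-ring element in a split virtqueue. */
struct vring_used_elem_example {
	uint32_t id;
	uint32_t len;
};

/* Mirror of IFCVF_USED_RING_LEN(size): the used[] entries plus the three
 * 16-bit fields (flags, idx, trailing avail event). */
static uint64_t used_ring_len(uint16_t size)
{
	return (uint64_t)size * sizeof(struct vring_used_elem_example) +
		sizeof(uint16_t) * 3;
}

int main(void)
{
	/* A 256-entry queue logs 256 * 8 + 6 = 2054 bytes of used ring. */
	printf("%llu\n", (unsigned long long)used_ring_len(256));
	return 0;
}
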
+#define MSIX_IRQ_SET_BUF_LEN (sizeof(struct vfio_irq_set) + \
+		sizeof(int) * (IFCVF_MAX_QUEUES * 2 + 1))
+static int
+vdpa_enable_vfio_intr(struct ifcvf_internal *internal, bool m_rx)
+{
+	int ret;
+	uint32_t i, nr_vring;
+	char irq_set_buf[MSIX_IRQ_SET_BUF_LEN];
+	struct vfio_irq_set *irq_set;
+	int *fd_ptr;
+	struct rte_vhost_vring vring;
+	int fd;
+
+	vring.callfd = -1;
+
+	nr_vring = rte_vhost_get_vring_num(internal->vid);
+
+	irq_set = (struct vfio_irq_set *)irq_set_buf;
+	irq_set->argsz = sizeof(irq_set_buf);
+	irq_set->count = nr_vring + 1;
+	irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD |
+			 VFIO_IRQ_SET_ACTION_TRIGGER;
+	irq_set->index = VFIO_PCI_MSIX_IRQ_INDEX;
+	irq_set->start = 0;
+	fd_ptr = (int *)&irq_set->data;
+	fd_ptr[RTE_INTR_VEC_ZERO_OFFSET] = internal->pdev->intr_handle.fd;
+
+	for (i = 0; i < nr_vring; i++)
+		internal->intr_fd[i] = -1;
+
+	for (i = 0; i < nr_vring; i++) {
+		rte_vhost_get_vhost_vring(internal->vid, i, &vring);
+		fd_ptr[RTE_INTR_VEC_RXTX_OFFSET + i] = vring.callfd;
+		if ((i & 1) == 0 && m_rx == true) {
+			fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
+			if (fd < 0) {
+				DRV_LOG(ERR, "can't setup eventfd: %s",
+					strerror(errno));
+				return -1;
+			}
+			internal->intr_fd[i] = fd;
+			fd_ptr[RTE_INTR_VEC_RXTX_OFFSET + i] = fd;
+		}
+	}
+
+	ret = ioctl(internal->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
+	if (ret) {
+		DRV_LOG(ERR, "Error enabling MSI-X interrupts: %s",
+				strerror(errno));
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+vdpa_disable_vfio_intr(struct ifcvf_internal *internal)
+{
+	int ret;
+	uint32_t i, nr_vring;
+	char irq_set_buf[MSIX_IRQ_SET_BUF_LEN];
+	struct vfio_irq_set *irq_set;
+
+	irq_set = (struct vfio_irq_set *)irq_set_buf;
+	irq_set->argsz = sizeof(irq_set_buf);
+	irq_set->count = 0;
+	irq_set->flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_TRIGGER;
+	irq_set->index = VFIO_PCI_MSIX_IRQ_INDEX;
+	irq_set->start = 0;
+
+	nr_vring = rte_vhost_get_vring_num(internal->vid);
+	for (i = 0; i < nr_vring; i++) {
+		if (internal->intr_fd[i] >= 0)
+			close(internal->intr_fd[i]);
+		internal->intr_fd[i] = -1;
+	}
+
+	ret = ioctl(internal->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
+	if (ret) {
+		DRV_LOG(ERR, "Error disabling MSI-X interrupts: %s",
+				strerror(errno));
+		return -1;
+	}
+
+	return 0;
+}
+
+static void *
+notify_relay(void *arg)
+{
+	int i, kickfd, epfd, nfds = 0;
+	uint32_t qid, q_num;
+	struct epoll_event events[IFCVF_MAX_QUEUES * 2];
+	struct epoll_event ev;
+	uint64_t buf;
+	int nbytes;
+	struct rte_vhost_vring vring;
+	struct ifcvf_internal *internal = (struct ifcvf_internal *)arg;
+	struct ifcvf_hw *hw = &internal->hw;
+
+	q_num = rte_vhost_get_vring_num(internal->vid);
+
+	epfd = epoll_create(IFCVF_MAX_QUEUES * 2);
+	if (epfd < 0) {
+		DRV_LOG(ERR, "failed to create epoll instance.");
+		return NULL;
+	}
+	internal->epfd = epfd;
+
+	vring.kickfd = -1;
+	for (qid = 0; qid < q_num; qid++) {
+		ev.events = EPOLLIN | EPOLLPRI;
+		rte_vhost_get_vhost_vring(internal->vid, qid, &vring);
+		ev.data.u64 = qid | (uint64_t)vring.kickfd << 32;
+		if (epoll_ctl(epfd, EPOLL_CTL_ADD, vring.kickfd, &ev) < 0) {
+			DRV_LOG(ERR, "epoll add error: %s", strerror(errno));
+			return NULL;
+		}
+	}
+
+	for (;;) {
+		nfds = epoll_wait(epfd, events, q_num, -1);
+		if (nfds < 0) {
+			if (errno == EINTR)
+				continue;
+			DRV_LOG(ERR, "epoll_wait returned failure");
+			return NULL;
+		}
+
+		for (i = 0; i < nfds; i++) {
+			qid = events[i].data.u32;
+			kickfd = (uint32_t)(events[i].data.u64 >> 32);
+			do {
+				nbytes = read(kickfd, &buf, 8);
+				if (nbytes < 0) {
+					if (errno == EINTR ||
+					    errno == EWOULDBLOCK ||
+					    errno == EAGAIN)
+						continue;
+					DRV_LOG(INFO, "Error reading "
+						"kickfd: %s",
+						strerror(errno));
+				}
+				break;
+			} while (1);
+
+			ifcvf_notify_queue(hw, qid);
+		}
+	}
+
+	return NULL;
+}
+
+static int
+setup_notify_relay(struct ifcvf_internal *internal)
+{
+	int ret;
+
+	ret = pthread_create(&internal->tid, NULL, notify_relay,
+			(void *)internal);
+	if (ret) {
+		DRV_LOG(ERR, "failed to create notify relay pthread.");
+		return -1;
+	}
+	return 0;
+}
+
+static int
+unset_notify_relay(struct ifcvf_internal *internal)
+{
+	void *status;
+
+	if (internal->tid) {
+		pthread_cancel(internal->tid);
+		pthread_join(internal->tid, &status);
+	}
+	internal->tid = 0;
+
+	if (internal->epfd >= 0)
+		close(internal->epfd);
+	internal->epfd = -1;
+
+	return 0;
+}
+
+static int
+update_datapath(struct ifcvf_internal *internal)
+{
+	int ret;
+
+	rte_spinlock_lock(&internal->lock);
+
+	if (!rte_atomic32_read(&internal->running) &&
+	    (rte_atomic32_read(&internal->started) &&
+	     rte_atomic32_read(&internal->dev_attached))) {
+		ret = ifcvf_dma_map(internal, 1);
+		if (ret)
+			goto err;
+
+		ret = vdpa_enable_vfio_intr(internal, 0);
+		if (ret)
+			goto err;
+
+		ret = vdpa_ifcvf_start(internal);
+		if (ret)
+			goto err;
+
+		ret = setup_notify_relay(internal);
+		if (ret)
+			goto err;
+
+		rte_atomic32_set(&internal->running, 1);
+	} else if (rte_atomic32_read(&internal->running) &&
+		   (!rte_atomic32_read(&internal->started) ||
+		    !rte_atomic32_read(&internal->dev_attached))) {
+		ret = unset_notify_relay(internal);
+		if (ret)
+			goto err;
+
+		vdpa_ifcvf_stop(internal);
+
+		ret = vdpa_disable_vfio_intr(internal);
+		if (ret)
+			goto err;
+
+		ret = ifcvf_dma_map(internal, 0);
+		if (ret)
+			goto err;
+
+		rte_atomic32_set(&internal->running, 0);
+	}
+
+	rte_spinlock_unlock(&internal->lock);
+	return 0;
+err:
+	rte_spinlock_unlock(&internal->lock);
+	return ret;
+}
+
+static int
+m_ifcvf_start(struct ifcvf_internal *internal)
+{
+	struct ifcvf_hw *hw = &internal->hw;
+	uint32_t i, nr_vring;
+	int vid, ret;
+	struct rte_vhost_vring vq;
+	void *vring_buf;
+	uint64_t m_vring_iova = IFCVF_MEDIATED_VRING;
+	uint64_t size;
+	uint64_t gpa;
+
+	memset(&vq, 0, sizeof(vq));
+	vid = internal->vid;
+	nr_vring = rte_vhost_get_vring_num(vid);
+	rte_vhost_get_negotiated_features(vid, &hw->req_features);
+
+	for (i = 0; i < nr_vring; i++) {
+		rte_vhost_get_vhost_vring(vid, i, &vq);
+
+		size = RTE_ALIGN_CEIL(vring_size(vq.size, PAGE_SIZE),
+				PAGE_SIZE);
+		vring_buf = rte_zmalloc("ifcvf", size, PAGE_SIZE);
+		vring_init(&internal->m_vring[i], vq.size, vring_buf,
+				PAGE_SIZE);
+
+		ret = rte_vfio_container_dma_map(internal->vfio_container_fd,
+			(uint64_t)(uintptr_t)vring_buf, m_vring_iova, size);
+		if (ret < 0) {
+			DRV_LOG(ERR, "mediated vring DMA map failed.");
+			goto error;
+		}
+
+		gpa = hva_to_gpa(vid, (uint64_t)(uintptr_t)vq.desc);
+		if (gpa == 0) {
+			DRV_LOG(ERR, "Fail to get GPA for descriptor ring.");
+			return -1;
+		}
+		hw->vring[i].desc = gpa;
+
+		gpa = hva_to_gpa(vid, (uint64_t)(uintptr_t)vq.avail);
+		if (gpa == 0) {
+			DRV_LOG(ERR, "Fail to get GPA for available ring.");
+			return -1;
+		}
+		hw->vring[i].avail = gpa;
+
+		/* Direct I/O for Tx queue, relay for Rx queue */
+		if (i & 1) {
+			gpa = hva_to_gpa(vid, (uint64_t)(uintptr_t)vq.used);
+			if (gpa == 0) {
+				DRV_LOG(ERR, "Fail to get GPA for used ring.");
+				return -1;
+			}
+			hw->vring[i].used = gpa;
+		} else {
+			hw->vring[i].used = m_vring_iova +
+				(char *)internal->m_vring[i].used -
+				(char *)internal->m_vring[i].desc;
+		}
+
+		hw->vring[i].size = vq.size;
+
+		rte_vhost_get_vring_base(vid, i,
+				&internal->m_vring[i].avail->idx,
+				&internal->m_vring[i].used->idx);
+
+		rte_vhost_get_vring_base(vid, i, &hw->vring[i].last_avail_idx,
+				&hw->vring[i].last_used_idx);
+
+		m_vring_iova += size;
+	}
+	hw->nr_vring = nr_vring;
+
+	return ifcvf_start_hw(&internal->hw);
+
+error:
+	for (i = 0; i < nr_vring; i++)
+		if (internal->m_vring[i].desc)
+			rte_free(internal->m_vring[i].desc);
+
+	return -1;
+}
+
+static int
+m_ifcvf_stop(struct ifcvf_internal *internal)
+{
+	int vid;
+	uint32_t i;
+	struct rte_vhost_vring vq;
+	struct ifcvf_hw *hw = &internal->hw;
+	uint64_t m_vring_iova = IFCVF_MEDIATED_VRING;
+	uint64_t size, len;
+
+	vid = internal->vid;
+	ifcvf_stop_hw(hw);
+
+	for (i = 0; i < hw->nr_vring; i++) {
+		/* synchronize remaining new used entries if any */
+		if ((i & 1) == 0)
+			update_used_ring(internal, i);
+
+		rte_vhost_get_vhost_vring(vid, i, &vq);
+		len = IFCVF_USED_RING_LEN(vq.size);
+		rte_vhost_log_used_vring(vid, i, 0, len);
+
+		size = RTE_ALIGN_CEIL(vring_size(vq.size, PAGE_SIZE),
+				PAGE_SIZE);
+		rte_vfio_container_dma_unmap(internal->vfio_container_fd,
+			(uint64_t)(uintptr_t)internal->m_vring[i].desc,
+			m_vring_iova, size);
+
+		rte_vhost_set_vring_base(vid, i, hw->vring[i].last_avail_idx,
+				hw->vring[i].last_used_idx);
+		rte_free(internal->m_vring[i].desc);
+		m_vring_iova += size;
+	}
+
+	return 0;
+}
+
+static void
+update_used_ring(struct ifcvf_internal *internal, uint16_t qid)
+{
+	rte_vdpa_relay_vring_used(internal->vid, qid, &internal->m_vring[qid]);
+	rte_vhost_vring_call(internal->vid, qid);
+}
+
+static void *
+vring_relay(void *arg)
+{
+	int i, vid, epfd, fd, nfds;
+	struct ifcvf_internal *internal = (struct ifcvf_internal *)arg;
+	struct rte_vhost_vring vring;
+	uint16_t qid, q_num;
+	struct epoll_event events[IFCVF_MAX_QUEUES * 4];
+	struct epoll_event ev;
+	int nbytes;
+	uint64_t buf;
+
+	vid = internal->vid;
+	q_num = rte_vhost_get_vring_num(vid);
+
+	/* add notify fd and interrupt fd to epoll */
+	epfd = epoll_create(IFCVF_MAX_QUEUES * 2);
+	if (epfd < 0) {
+		DRV_LOG(ERR, "failed to create epoll instance.");
+		return NULL;
+	}
+	internal->epfd = epfd;
+
+	vring.kickfd = -1;
+	for (qid = 0; qid < q_num; qid++) {
+		ev.events = EPOLLIN | EPOLLPRI;
+		rte_vhost_get_vhost_vring(vid, qid, &vring);
+		ev.data.u64 = qid << 1 | (uint64_t)vring.kickfd << 32;
+		if (epoll_ctl(epfd, EPOLL_CTL_ADD, vring.kickfd, &ev) < 0) {
+			DRV_LOG(ERR, "epoll add error: %s", strerror(errno));
+			return NULL;
+		}
+	}
+
+	for (qid = 0; qid < q_num; qid += 2) {
+		ev.events = EPOLLIN | EPOLLPRI;
+		/* leave a flag to mark it's for interrupt */
+		ev.data.u64 = 1 | qid << 1 |
+			(uint64_t)internal->intr_fd[qid] << 32;
+		if (epoll_ctl(epfd, EPOLL_CTL_ADD, internal->intr_fd[qid], &ev)
+				< 0) {
+			DRV_LOG(ERR, "epoll add error: %s", strerror(errno));
+			return NULL;
+		}
+		update_used_ring(internal, qid);
+	}
+
+	/* start relay with a first kick */
+	for (qid = 0; qid < q_num; qid++)
+		ifcvf_notify_queue(&internal->hw, qid);
+
+	/* listen to the events and react accordingly */
+	for (;;) {
+		nfds = epoll_wait(epfd, events, q_num * 2, -1);
+		if (nfds < 0) {
+			if (errno == EINTR)
+				continue;
+			DRV_LOG(ERR, "epoll_wait returned failure");
+			return NULL;
+		}
+
+		for (i = 0; i < nfds; i++) {
+			fd = (uint32_t)(events[i].data.u64 >> 32);
+			do {
+				nbytes = read(fd, &buf, 8);
+				if (nbytes < 0) {
+					if (errno == EINTR ||
+					    errno == EWOULDBLOCK ||
+					    errno == EAGAIN)
+						continue;
+					DRV_LOG(INFO, "Error reading "
+						"kickfd: %s",
+						strerror(errno));
+				}
+				break;
+			} while (1);
+
+			qid = events[i].data.u32 >> 1;
+
+			if (events[i].data.u32 & 1)
+				update_used_ring(internal, qid);
+			else
+				ifcvf_notify_queue(&internal->hw, qid);
+		}
+	}
+
+	return NULL;
+}
+
+static int
+setup_vring_relay(struct ifcvf_internal *internal)
+{
+	int ret;
+
+	ret = pthread_create(&internal->tid, NULL, vring_relay,
+			(void *)internal);
+	if (ret) {
+		DRV_LOG(ERR, "failed to create ring relay pthread.");
+		return -1;
+	}
+	return 0;
+}
+
+static int
+unset_vring_relay(struct ifcvf_internal *internal)
+{
+	void *status;
+
+	if (internal->tid) {
+		pthread_cancel(internal->tid);
+		pthread_join(internal->tid, &status);
+	}
+	internal->tid = 0;
+
+	if (internal->epfd >= 0)
+		close(internal->epfd);
+	internal->epfd = -1;
+
+	return 0;
+}
+
+static int
+ifcvf_sw_fallback_switchover(struct ifcvf_internal *internal)
+{
+	int ret;
+	int vid = internal->vid;
+
+	/* stop the direct IO data path */
+	unset_notify_relay(internal);
+	vdpa_ifcvf_stop(internal);
+	vdpa_disable_vfio_intr(internal);
+
+	ret = rte_vhost_host_notifier_ctrl(vid, false);
+	if (ret && ret != -ENOTSUP)
+		goto error;
+
+	/* set up interrupt for interrupt relay */
+	ret = vdpa_enable_vfio_intr(internal, 1);
+	if (ret)
+		goto unmap;
+
+	/* config the VF */
+	ret = m_ifcvf_start(internal);
+	if (ret)
+		goto unset_intr;
+
+	/* set up vring relay thread */
+	ret = setup_vring_relay(internal);
+	if (ret)
+		goto stop_vf;
+
+	rte_vhost_host_notifier_ctrl(vid, true);
+
+	internal->sw_fallback_running = true;
+
+	return 0;
+
+stop_vf:
+	m_ifcvf_stop(internal);
+unset_intr:
+	vdpa_disable_vfio_intr(internal);
+unmap:
+	ifcvf_dma_map(internal, 0);
+error:
+	return -1;
+}
+
+static int
+ifcvf_dev_config(int vid)
+{
+	int did;
+	struct internal_list *list;
+	struct ifcvf_internal *internal;
+
+	did = rte_vhost_get_vdpa_device_id(vid);
+	list = find_internal_resource_by_did(did);
+	if (list == NULL) {
+		DRV_LOG(ERR, "Invalid device id: %d", did);
+		return -1;
+	}
+
+	internal = list->internal;
+	internal->vid = vid;
+	rte_atomic32_set(&internal->dev_attached, 1);
+	update_datapath(internal);
+
+	if (rte_vhost_host_notifier_ctrl(vid, true) != 0)
+		DRV_LOG(NOTICE, "vDPA (%d): software relay is used.", did);
+
+	return 0;
+}
+
+static int
+ifcvf_dev_close(int vid)
+{
+	int did;
+	struct internal_list *list;
+	struct ifcvf_internal *internal;
+
+	did = rte_vhost_get_vdpa_device_id(vid);
+	list = find_internal_resource_by_did(did);
+	if (list == NULL) {
+		DRV_LOG(ERR, "Invalid device id: %d", did);
+		return -1;
+	}
+
+	internal = list->internal;
+
+	if (internal->sw_fallback_running) {
+		/* unset ring relay */
+		unset_vring_relay(internal);
+
+		/* reset VF */
+		m_ifcvf_stop(internal);
+
+		/* remove interrupt setting */
+		vdpa_disable_vfio_intr(internal);
+
+		/* unset DMA map for guest memory */
+		ifcvf_dma_map(internal, 0);
+
+		internal->sw_fallback_running = false;
+	} else {
+		rte_atomic32_set(&internal->dev_attached, 0);
+		update_datapath(internal);
+	}
+
+	return 0;
+}
+
+static int
+ifcvf_set_features(int vid)
+{
+	uint64_t features = 0;
+	int did;
+	struct internal_list *list;
+	struct ifcvf_internal *internal;
+	uint64_t log_base = 0, log_size = 0;
+
+	did = rte_vhost_get_vdpa_device_id(vid);
+	list = find_internal_resource_by_did(did);
+	if (list == NULL) {
+		DRV_LOG(ERR, "Invalid device id: %d", did);
+		return -1;
+	}
+
+	internal = list->internal;
+	rte_vhost_get_negotiated_features(vid, &features);
+
+	if (!RTE_VHOST_NEED_LOG(features))
+		return 0;
+
+	if (internal->sw_lm) {
+		ifcvf_sw_fallback_switchover(internal);
+	} else {
+		rte_vhost_get_log_base(vid, &log_base, &log_size);
+		rte_vfio_container_dma_map(internal->vfio_container_fd,
+				log_base, IFCVF_LOG_BASE, log_size);
+		ifcvf_enable_logging(&internal->hw, IFCVF_LOG_BASE, log_size);
+	}
+
+	return 0;
+}
+
+static int
+ifcvf_get_vfio_group_fd(int vid)
+{
+	int did;
+	struct internal_list *list;
+
+	did = rte_vhost_get_vdpa_device_id(vid);
+	list = find_internal_resource_by_did(did);
+	if (list == NULL) {
+		DRV_LOG(ERR, "Invalid device id: %d", did);
+		return -1;
+	}
+
+	return list->internal->vfio_group_fd;
+}
+
+static int
+ifcvf_get_vfio_device_fd(int vid)
+{
+	int did;
+	struct internal_list *list;
+
+	did = rte_vhost_get_vdpa_device_id(vid);
+	list = find_internal_resource_by_did(did);
+	if (list == NULL) {
+		DRV_LOG(ERR, "Invalid device id: %d", did);
+		return -1;
+	}
+
+	return list->internal->vfio_dev_fd;
+}
+
+static int
+ifcvf_get_notify_area(int vid, int qid, uint64_t *offset, uint64_t *size)
+{
+	int did;
+	struct internal_list *list;
+	struct ifcvf_internal *internal;
+	struct vfio_region_info reg = { .argsz = sizeof(reg) };
+	int ret;
+
+	did = rte_vhost_get_vdpa_device_id(vid);
+	list = find_internal_resource_by_did(did);
+	if (list == NULL) {
+		DRV_LOG(ERR, "Invalid device id: %d", did);
+		return -1;
+	}
+
+	internal = list->internal;
+
+	reg.index = ifcvf_get_notify_region(&internal->hw);
+	ret = ioctl(internal->vfio_dev_fd, VFIO_DEVICE_GET_REGION_INFO, &reg);
+	if (ret) {
+		DRV_LOG(ERR, "Failed to get device region info: %s",
+				strerror(errno));
+		return -1;
+	}
+
+	*offset = ifcvf_get_queue_notify_off(&internal->hw, qid) + reg.offset;
+	*size = 0x1000;
+
+	return 0;
+}
+
+static int
+ifcvf_get_queue_num(int did, uint32_t *queue_num)
+{
+	struct internal_list *list;
+
+	list = find_internal_resource_by_did(did);
+	if (list == NULL) {
+		DRV_LOG(ERR, "Invalid device id: %d", did);
+		return -1;
+	}
+
+	*queue_num = list->internal->max_queues;
+
+	return 0;
+}
+
+static int
+ifcvf_get_vdpa_features(int did, uint64_t *features)
+{
+	struct internal_list *list;
+
+	list = find_internal_resource_by_did(did);
+	if (list == NULL) {
+		DRV_LOG(ERR, "Invalid device id: %d", did);
+		return -1;
+	}
+
+	*features = list->internal->features;
+
+	return 0;
+}
+
+#define VDPA_SUPPORTED_PROTOCOL_FEATURES \
+		(1ULL << VHOST_USER_PROTOCOL_F_REPLY_ACK | \
+		 1ULL << VHOST_USER_PROTOCOL_F_SLAVE_REQ | \
+		 1ULL << VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD | \
+		 1ULL << VHOST_USER_PROTOCOL_F_HOST_NOTIFIER | \
+		 1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD)
+static int
+ifcvf_get_protocol_features(int did __rte_unused, uint64_t *features)
+{
+	*features = VDPA_SUPPORTED_PROTOCOL_FEATURES;
+	return 0;
+}
+
+static struct rte_vdpa_dev_ops ifcvf_ops = {
+	.get_queue_num = ifcvf_get_queue_num,
+	.get_features = ifcvf_get_vdpa_features,
+	.get_protocol_features = ifcvf_get_protocol_features,
+	.dev_conf = ifcvf_dev_config,
+	.dev_close = ifcvf_dev_close,
+	.set_vring_state = NULL,
+	.set_features = ifcvf_set_features,
+	.migration_done = NULL,
+	.get_vfio_group_fd = ifcvf_get_vfio_group_fd,
+	.get_vfio_device_fd = ifcvf_get_vfio_device_fd,
+	.get_notify_area = ifcvf_get_notify_area,
+};
+
+static inline int
+open_int(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	uint16_t *n = extra_args;
+
+	if (value == NULL || extra_args == NULL)
+		return -EINVAL;
+
+	*n = (uint16_t)strtoul(value, NULL, 0);
+	if (*n == USHRT_MAX && errno == ERANGE)
+		return -1;
+
+	return 0;
+}
+
+static int
+ifcvf_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
+		struct rte_pci_device *pci_dev)
+{
+	uint64_t features;
+	struct ifcvf_internal *internal = NULL;
+	struct internal_list *list = NULL;
+	int vdpa_mode = 0;
+	int sw_fallback_lm = 0;
+	struct rte_kvargs *kvlist = NULL;
+	int ret = 0;
+
+	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+		return 0;
+
+	if (!pci_dev->device.devargs)
+		return 1;
+
+	kvlist = rte_kvargs_parse(pci_dev->device.devargs->args,
+			ifcvf_valid_arguments);
+	if (kvlist == NULL)
+		return 1;
+
+	/* probe only when vdpa mode is specified */
+	if (rte_kvargs_count(kvlist, IFCVF_VDPA_MODE) == 0) {
+		rte_kvargs_free(kvlist);
+		return 1;
+	}
+
+	ret = rte_kvargs_process(kvlist, IFCVF_VDPA_MODE, &open_int,
+			&vdpa_mode);
+	if (ret < 0 || vdpa_mode == 0) {
+		rte_kvargs_free(kvlist);
+		return 1;
+	}
+
+	list = rte_zmalloc("ifcvf", sizeof(*list), 0);
+	if (list == NULL)
+		goto error;
+
+	internal = rte_zmalloc("ifcvf", sizeof(*internal), 0);
+	if (internal == NULL)
+		goto error;
+
+	internal->pdev = pci_dev;
+	rte_spinlock_init(&internal->lock);
+
+	if (ifcvf_vfio_setup(internal) < 0) {
+		DRV_LOG(ERR, "failed to setup device %s", pci_dev->name);
+		goto error;
+	}
+
+	if (ifcvf_init_hw(&internal->hw, internal->pdev) < 0) {
+		DRV_LOG(ERR, "failed to init device %s", pci_dev->name);
+		goto error;
+	}
+
+	internal->max_queues = IFCVF_MAX_QUEUES;
+	features = ifcvf_get_features(&internal->hw);
+	internal->features = (features &
+		~(1ULL << VIRTIO_F_IOMMU_PLATFORM)) |
+		(1ULL << VIRTIO_NET_F_GUEST_ANNOUNCE) |
+		(1ULL << VIRTIO_NET_F_CTRL_VQ) |
+		(1ULL << VIRTIO_NET_F_STATUS) |
+		(1ULL << VHOST_USER_F_PROTOCOL_FEATURES) |
+		(1ULL << VHOST_F_LOG_ALL);
+
+	internal->dev_addr.pci_addr = pci_dev->addr;
+	internal->dev_addr.type = PCI_ADDR;
+	list->internal = internal;
+
+	if (rte_kvargs_count(kvlist, IFCVF_SW_FALLBACK_LM)) {
+		ret = rte_kvargs_process(kvlist, IFCVF_SW_FALLBACK_LM,
+				&open_int, &sw_fallback_lm);
+		if (ret < 0)
+			goto error;
+	}
+	internal->sw_lm = sw_fallback_lm;
+
+	internal->did = rte_vdpa_register_device(&internal->dev_addr,
+				&ifcvf_ops);
+	if (internal->did < 0) {
+		DRV_LOG(ERR, "failed to register device %s", pci_dev->name);
+		goto error;
+	}
+
+	pthread_mutex_lock(&internal_list_lock);
+	TAILQ_INSERT_TAIL(&internal_list, list, next);
+	pthread_mutex_unlock(&internal_list_lock);
+
+	rte_atomic32_set(&internal->started, 1);
+	update_datapath(internal);
+
+	rte_kvargs_free(kvlist);
+	return 0;
+
+error:
+	rte_kvargs_free(kvlist);
+	rte_free(list);
+	rte_free(internal);
+	return -1;
+}
+
+static int
+ifcvf_pci_remove(struct rte_pci_device *pci_dev)
+{
+	struct ifcvf_internal *internal;
+	struct internal_list *list;
+
+	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+		return 0;
+
+	list = find_internal_resource_by_dev(pci_dev);
+	if (list == NULL) {
+		DRV_LOG(ERR, "Invalid device: %s", pci_dev->name);
+		return -1;
+	}
+
+	internal = list->internal;
+	rte_atomic32_set(&internal->started, 0);
+	update_datapath(internal);
+
+	rte_pci_unmap_device(internal->pdev);
+	rte_vfio_container_destroy(internal->vfio_container_fd);
+	rte_vdpa_unregister_device(internal->did);
+
+	pthread_mutex_lock(&internal_list_lock);
+	TAILQ_REMOVE(&internal_list, list, next);
+	pthread_mutex_unlock(&internal_list_lock);
+
+	rte_free(list);
+	rte_free(internal);
+
+	return 0;
+}
+
+/*
+ * IFCVF has the same vendor ID and device ID as virtio net PCI
+ * device, with its specific subsystem vendor ID and device ID.
+ */
+static const struct rte_pci_id pci_id_ifcvf_map[] = {
+	{ .class_id = RTE_CLASS_ANY_ID,
+	  .vendor_id = IFCVF_VENDOR_ID,
+	  .device_id = IFCVF_DEVICE_ID,
+	  .subsystem_vendor_id = IFCVF_SUBSYS_VENDOR_ID,
+	  .subsystem_device_id = IFCVF_SUBSYS_DEVICE_ID,
+	},
+
+	{ .vendor_id = 0, /* sentinel */
+	},
+};
+
+static struct rte_pci_driver rte_ifcvf_vdpa = {
+	.id_table = pci_id_ifcvf_map,
+	.drv_flags = 0,
+	.probe = ifcvf_pci_probe,
+	.remove = ifcvf_pci_remove,
+};
+
+RTE_PMD_REGISTER_PCI(net_ifcvf, rte_ifcvf_vdpa);
+RTE_PMD_REGISTER_PCI_TABLE(net_ifcvf, pci_id_ifcvf_map);
+RTE_PMD_REGISTER_KMOD_DEP(net_ifcvf, "* vfio-pci");
+
+RTE_INIT(ifcvf_vdpa_init_log)
+{
+	ifcvf_vdpa_logtype = rte_log_register("pmd.net.ifcvf_vdpa");
+	if (ifcvf_vdpa_logtype >= 0)
+		rte_log_set_level(ifcvf_vdpa_logtype, RTE_LOG_NOTICE);
+}
diff --git a/drivers/vdpa/ifc/meson.build b/drivers/vdpa/ifc/meson.build
new file mode 100644
index 0000000..adc9ed9
--- /dev/null
+++ b/drivers/vdpa/ifc/meson.build
@@ -0,0 +1,9 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2018 Intel Corporation
+
+build = dpdk_conf.has('RTE_LIBRTE_VHOST')
+reason = 'missing dependency, DPDK vhost library'
+allow_experimental_apis = true
+sources = files('ifcvf_vdpa.c', 'base/ifcvf.c')
+includes += include_directories('base')
+deps += 'vhost'
diff --git a/drivers/vdpa/ifc/rte_pmd_ifc_version.map b/drivers/vdpa/ifc/rte_pmd_ifc_version.map
new file mode 100644
index 0000000..f9f17e4
--- /dev/null
+++ b/drivers/vdpa/ifc/rte_pmd_ifc_version.map
@@ -0,0 +1,3 @@
+DPDK_20.0 {
+	local: *;
+};
diff --git a/drivers/vdpa/meson.build b/drivers/vdpa/meson.build
index a839ff5..fd164d3 100644
--- a/drivers/vdpa/meson.build
+++ b/drivers/vdpa/meson.build
@@ -1,7 +1,7 @@
 #   SPDX-License-Identifier: BSD-3-Clause
 #   Copyright 2019 Mellanox Technologies, Ltd
 
-drivers = []
+drivers = ['ifc']
 std_deps = ['bus_pci', 'kvargs']
 std_deps += ['vhost']
 config_flag_fmt = 'RTE_LIBRTE_@0@_PMD'
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 50+ messages in thread
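
For readers new to the vDPA flow, here is a minimal sketch (not part of the patch) of how an
application consumes the device registered above through librte_vhost, assuming the
experimental rte_vdpa/rte_vhost API of this release. Per the probe logic above, the driver
only binds when the PCI device is whitelisted with the "vdpa=1" devarg. The helper name and
error handling below are placeholders, not existing code.

#include <rte_vdpa.h>
#include <rte_vhost.h>

static int
attach_vdpa_to_socket(const char *path, struct rte_vdpa_dev_addr *addr)
{
	int did;

	/* look up the device id registered by ifcvf_pci_probe() */
	did = rte_vdpa_find_device_id(addr);
	if (did < 0)
		return -1;

	/* create the vhost-user socket and bind it to the vDPA device */
	if (rte_vhost_driver_register(path, 0) != 0)
		return -1;
	if (rte_vhost_driver_attach_vdpa_device(path, did) != 0)
		return -1;

	/*
	 * Once a guest connects and the rings are ready, the vhost library
	 * invokes the .dev_conf op, i.e. ifcvf_dev_config() above.
	 */
	return rte_vhost_driver_start(path);
}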

* Re: [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA device drivers
  2019-12-25 15:19 [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA device drivers Matan Azrad
                   ` (2 preceding siblings ...)
  2019-12-25 15:19 ` [dpdk-dev] [PATCH v1 3/3] drivers: move ifc driver to the vDPA class Matan Azrad
@ 2020-01-07  7:57 ` Matan Azrad
  2020-01-08  5:44   ` Xu, Rosen
  2020-01-09 11:00 ` [dpdk-dev] [PATCH v2 " Matan Azrad
  4 siblings, 1 reply; 50+ messages in thread
From: Matan Azrad @ 2020-01-07  7:57 UTC (permalink / raw)
  To: Matan Azrad, Maxime Coquelin, Tiwei Bie, Zhihong Wang, Xiao Wang
  Cc: Ferruh Yigit, dev, Thomas Monjalon

Hi all

Any comments?

From: Matan Azrad
> As discussed and as described in RFC "[RFC] net: new vdpa PMD for Mellanox
> devices", new vDPA driver is going to be added for Mellanox devices - vDPA
> mlx5 and more.
> 
> The only vDPA driver now is the IFC driver that is located in net directory.
> 
> The IFC driver and the new vDPA mlx5 driver provide the vDPA ops
> introduced in librte_vhost and not the eth-dev ops.
> All the others drivers in net class provide the eth-dev ops.
> The set of features is also different.
> 
> Create a new class for vDPA drivers and move IFC to this class.
> Later, all the new drivers that implement the vDPA ops will be added to the
> vDPA class.
> 
> Also, a vDPA device driver features list was added to vDPA documentation.
> 
> Please review the features list and the series.
> 
> Later on, I'm going to send the vDPA mlx5 driver.
> 
> Thanks.
> 
> 
> Matan Azrad (3):
>   drivers: introduce vDPA class
>   doc: add vDPA feature table
>   drivers: move ifc driver to the vDPA class
> 
>  MAINTAINERS                               |    6 +-
>  doc/guides/conf.py                        |    5 +
>  doc/guides/index.rst                      |    1 +
>  doc/guides/nics/features/ifcvf.ini        |    8 -
>  doc/guides/nics/ifc.rst                   |  106 ---
>  doc/guides/nics/index.rst                 |    1 -
>  doc/guides/vdpadevs/features/default.ini  |   55 ++
>  doc/guides/vdpadevs/features/ifcvf.ini    |    8 +
>  doc/guides/vdpadevs/features_overview.rst |   65 ++
>  doc/guides/vdpadevs/ifc.rst               |  106 +++
>  doc/guides/vdpadevs/index.rst             |   15 +
>  drivers/Makefile                          |    2 +
>  drivers/meson.build                       |    1 +
>  drivers/net/Makefile                      |    3 -
>  drivers/net/ifc/Makefile                  |   34 -
>  drivers/net/ifc/base/ifcvf.c              |  329 --------
>  drivers/net/ifc/base/ifcvf.h              |  162 ----
>  drivers/net/ifc/base/ifcvf_osdep.h        |   52 --
>  drivers/net/ifc/ifcvf_vdpa.c              | 1280 -----------------------------
>  drivers/net/ifc/meson.build               |    9 -
>  drivers/net/ifc/rte_pmd_ifc_version.map   |    3 -
>  drivers/net/meson.build                   |    1 -
>  drivers/vdpa/Makefile                     |   14 +
>  drivers/vdpa/ifc/Makefile                 |   34 +
>  drivers/vdpa/ifc/base/ifcvf.c             |  329 ++++++++
>  drivers/vdpa/ifc/base/ifcvf.h             |  162 ++++
>  drivers/vdpa/ifc/base/ifcvf_osdep.h       |   52 ++
>  drivers/vdpa/ifc/ifcvf_vdpa.c             | 1280
> +++++++++++++++++++++++++++++
>  drivers/vdpa/ifc/meson.build              |    9 +
>  drivers/vdpa/ifc/rte_pmd_ifc_version.map  |    3 +
>  drivers/vdpa/meson.build                  |    8 +
>  31 files changed, 2152 insertions(+), 1991 deletions(-)  delete mode 100644
> doc/guides/nics/features/ifcvf.ini
>  delete mode 100644 doc/guides/nics/ifc.rst  create mode 100644
> doc/guides/vdpadevs/features/default.ini
>  create mode 100644 doc/guides/vdpadevs/features/ifcvf.ini
>  create mode 100644 doc/guides/vdpadevs/features_overview.rst
>  create mode 100644 doc/guides/vdpadevs/ifc.rst  create mode 100644
> doc/guides/vdpadevs/index.rst  delete mode 100644
> drivers/net/ifc/Makefile  delete mode 100644 drivers/net/ifc/base/ifcvf.c
> delete mode 100644 drivers/net/ifc/base/ifcvf.h  delete mode 100644
> drivers/net/ifc/base/ifcvf_osdep.h
>  delete mode 100644 drivers/net/ifc/ifcvf_vdpa.c  delete mode 100644
> drivers/net/ifc/meson.build  delete mode 100644
> drivers/net/ifc/rte_pmd_ifc_version.map
>  create mode 100644 drivers/vdpa/Makefile  create mode 100644
> drivers/vdpa/ifc/Makefile  create mode 100644 drivers/vdpa/ifc/base/ifcvf.c
> create mode 100644 drivers/vdpa/ifc/base/ifcvf.h  create mode 100644
> drivers/vdpa/ifc/base/ifcvf_osdep.h
>  create mode 100644 drivers/vdpa/ifc/ifcvf_vdpa.c  create mode 100644
> drivers/vdpa/ifc/meson.build  create mode 100644
> drivers/vdpa/ifc/rte_pmd_ifc_version.map
>  create mode 100644 drivers/vdpa/meson.build
> 
> --
> 1.8.3.1


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dpdk-dev] [PATCH v1 1/3] drivers: introduce vDPA class
  2019-12-25 15:19 ` [dpdk-dev] [PATCH v1 1/3] drivers: introduce vDPA class Matan Azrad
@ 2020-01-07 17:32   ` Maxime Coquelin
  2020-01-08 21:28     ` Thomas Monjalon
  0 siblings, 1 reply; 50+ messages in thread
From: Maxime Coquelin @ 2020-01-07 17:32 UTC (permalink / raw)
  To: Matan Azrad, Tiwei Bie, Zhihong Wang, Xiao Wang
  Cc: Ferruh Yigit, dev, Thomas Monjalon

Hi Matan,

On 12/25/19 4:19 PM, Matan Azrad wrote:
> The vDPA (vhost data path acceleration) drivers provide support for
> the vDPA operations introduced by the rte_vhost library.
> 
> Any driver which provides the vDPA operations should be moved\added to
> the vdpa class under drivers/vdpa/.
> 
> Create the general files for vDPA class in drivers and in documentation.
> 
> Signed-off-by: Matan Azrad <matan@mellanox.com>
> ---
>  doc/guides/index.rst          |  1 +
>  doc/guides/vdpadevs/index.rst | 13 +++++++++++++
>  drivers/Makefile              |  2 ++
>  drivers/meson.build           |  1 +
>  drivers/vdpa/Makefile         |  8 ++++++++
>  drivers/vdpa/meson.build      |  8 ++++++++
>  6 files changed, 33 insertions(+)
>  create mode 100644 doc/guides/vdpadevs/index.rst
>  create mode 100644 drivers/vdpa/Makefile
>  create mode 100644 drivers/vdpa/meson.build
> 

Looks good to me. Just wondering if we need a dedicated maintainer for
this new class of devices?

Other than that:
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>

Thanks,
Maxime


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dpdk-dev] [PATCH v1 2/3] doc: add vDPA feature table
  2019-12-25 15:19 ` [dpdk-dev] [PATCH v1 2/3] doc: add vDPA feature table Matan Azrad
@ 2020-01-07 17:39   ` Maxime Coquelin
  2020-01-08  5:28     ` Tiwei Bie
  0 siblings, 1 reply; 50+ messages in thread
From: Maxime Coquelin @ 2020-01-07 17:39 UTC (permalink / raw)
  To: Matan Azrad, Tiwei Bie, Zhihong Wang, Xiao Wang
  Cc: Ferruh Yigit, dev, Thomas Monjalon



On 12/25/19 4:19 PM, Matan Azrad wrote:
> Add vDPA devices features table and explanation.
> 
> Any vDPA driver can add its own supported features by ading a new ini
> file to the features directory in doc/guides/vdpadevs/features.
> 
> Signed-off-by: Matan Azrad <matan@mellanox.com>
> ---
>  doc/guides/conf.py                        |  5 +++
>  doc/guides/vdpadevs/features/default.ini  | 55 ++++++++++++++++++++++++++
>  doc/guides/vdpadevs/features_overview.rst | 65 +++++++++++++++++++++++++++++++
>  doc/guides/vdpadevs/index.rst             |  1 +
>  4 files changed, 126 insertions(+)
>  create mode 100644 doc/guides/vdpadevs/features/default.ini
>  create mode 100644 doc/guides/vdpadevs/features_overview.rst
> 
> diff --git a/doc/guides/conf.py b/doc/guides/conf.py
> index 0892c06..c368fa5 100644
> --- a/doc/guides/conf.py
> +++ b/doc/guides/conf.py
> @@ -401,6 +401,11 @@ def setup(app):
>                              'Features',
>                              'Features availability in compression drivers',
>                              'Feature')
> +    table_file = dirname(__file__) + '/vdpadevs/overview_feature_table.txt'
> +    generate_overview_table(table_file, 1,
> +                            'Features',
> +                            'Features availability in vDPA drivers',
> +                            'Feature')
>  
>      if LooseVersion(sphinx_version) < LooseVersion('1.3.1'):
>          print('Upgrade sphinx to version >= 1.3.1 for '
> diff --git a/doc/guides/vdpadevs/features/default.ini b/doc/guides/vdpadevs/features/default.ini
> new file mode 100644
> index 0000000..a3e0bc7
> --- /dev/null
> +++ b/doc/guides/vdpadevs/features/default.ini
> @@ -0,0 +1,55 @@
> +;
> +; Features of a default vDPA driver.
> +;
> +; This file defines the features that are valid for inclusion in
> +; the other driver files and also the order that they appear in
> +; the features table in the documentation. The feature description
> +; string should not exceed feature_str_len defined in conf.py.
> +;

I think some entries below could be removed for vDPA.

> +[Features]
> +csum                 =
> +guest csum           =
> +mac                  =
> +gso                  =
> +guest tso4           =
> +guest tso6           =
> +ecn                  =
> +ufo                  =
> +host tso4            =
> +host tso6            =
> +mrg rxbuf            =
> +ctrl vq              =
> +ctrl rx              =
> +any layout           =
> +guest announce       =
> +mq                   =
> +version 1            =
> +log all              =
> +protocol features    =
> +indirect desc        =
> +event idx            =
> +mtu                  =
> +in_order             =
> +IOMMU platform       =
> +packed               =
> +proto mq             =
> +proto log shmfd      =
> +proto rarp           =
> +proto reply ack      =
> +proto slave req      =
> +proto crypto session =
> +proto host notifier  =
> +proto pagefault      =
> +Multiprocess aware   =
> +BSD nic_uio          =
> +Linux UIO            =

E.g. UIO, which cannot be used since vDPA requires an IOMMU.

> +Linux VFIO           =
> +Other kdrv           =
> +ARMv7                =
> +ARMv8                =
> +Power8               =
> +x86-32               =
> +x86-64               =
> +Usage doc            =
> +Design doc           =
> +Perf doc             =
> \ No newline at end of file
> diff --git a/doc/guides/vdpadevs/features_overview.rst b/doc/guides/vdpadevs/features_overview.rst
> new file mode 100644
> index 0000000..c7745b7
> --- /dev/null
> +++ b/doc/guides/vdpadevs/features_overview.rst
> @@ -0,0 +1,65 @@
> +..  SPDX-License-Identifier: BSD-3-Clause
> +    Copyright 2019 Mellanox Technologies, Ltd
> +
> +Overview of vDPA drivers features
> +=================================
> +
> +This section explains the supported features that are listed in the table below.
> +
> +  * csum - Device can handle packets with partial checksum.
> +  * guest csum - Guest can handle packets with partial checksum.
> +  * mac - Device has given MAC address.
> +  * gso - Device can handle packets with any GSO type.
> +  * guest tso4 - Guest can receive TSOv4.
> +  * guest tso6 - Guest can receive TSOv6.
> +  * ecn - Device can receive TSO with ECN.
> +  * ufo - Device can receive UFO.
> +  * host tso4 - Device can receive TSOv4.
> +  * host tso6 - Device can receive TSOv6.
> +  * mrg rxbuf - Guest can merge receive buffers.
> +  * ctrl vq - Control channel is available.
> +  * ctrl rx - Control channel RX mode support.
> +  * any layout - Device can handle any descriptor layout.
> +  * guest announce - Guest can send gratuitous packets.
> +  * mq - Device supports Receive Flow Steering.
> +  * version 1 - v1.0 compliant.
> +  * log all - Device can log all write descriptors (live migration).
> +  * protocol features - Protocol features negotiation support.
> +  * indirect desc - Indirect buffer descriptors support.
> +  * event idx - Support for avail_idx and used_idx fields.
> +  * mtu - Host can advise the guest with its maximum supported MTU.
> +  * in_order - Device can use descriptors in ring order.
> +  * IOMMU platform - Device support IOMMU addresses.
> +  * packed - Device support packed virtio queues.
> +  * proto mq - Support the number of queues query.
> +  * proto log shmfd - Guest support setting log base.
> +  * proto rarp - Host can broadcast a fake RARP after live migration.
> +  * proto reply ack - Host support requested operation status ack. 
> +  * proto slave req - Allow the slave to make requests to the master.
> +  * proto crypto session - Support crypto session creation.
> +  * proto host notifier - Host can register memory region based host notifiers.
> +  * proto pagefault - Slave expose page-fault FD for migration process.
> +  * Multiprocess aware - Driver can be used for primary-secondary process model.
> +  * BSD nic_uio - BSD ``nic_uio`` module supported.
> +  * Linux UIO - Works with ``igb_uio`` kernel module.
> +  * Linux VFIO - Works with ``vfio-pci`` kernel module.
> +  * Other kdrv - Kernel module other than above ones supported.
> +  * ARMv7 - Support armv7 architecture.
> +  * ARMv8 - Support armv8a (64bit) architecture.
> +  * Power8 - Support PowerPC architecture.
> +  * x86-32 - Support 32bits x86 architecture.
> +  * x86-64 - Support 64bits x86 architecture.
> +  * Usage doc - Documentation describes usage, In ``doc/guides/vdpadevs/``.
> +  * Design doc - Documentation describes design. In ``doc/guides/vdpadevs/``.
> +  * Perf doc - Documentation describes performance values, In ``doc/perf/``.
> +
> +
> +
> +.. _table_vdpa_pmd_features:
> +
> +.. include:: overview_feature_table.txt
> +
> +.. Note::
> +
> +   Features marked with "P" are partially supported. Refer to the appropriate
> +   driver guide in the following sections for details.
> diff --git a/doc/guides/vdpadevs/index.rst b/doc/guides/vdpadevs/index.rst
> index d69dc91..89e2b03 100644
> --- a/doc/guides/vdpadevs/index.rst
> +++ b/doc/guides/vdpadevs/index.rst
> @@ -11,3 +11,4 @@ which can be used from an application through vhost API.
>      :maxdepth: 2
>      :numbered:
>  
> +    features_overview
> 


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dpdk-dev] [PATCH v1 3/3] drivers: move ifc driver to the vDPA class
  2019-12-25 15:19 ` [dpdk-dev] [PATCH v1 3/3] drivers: move ifc driver to the vDPA class Matan Azrad
@ 2020-01-07 18:17   ` Maxime Coquelin
  0 siblings, 0 replies; 50+ messages in thread
From: Maxime Coquelin @ 2020-01-07 18:17 UTC (permalink / raw)
  To: Matan Azrad, Tiwei Bie, Zhihong Wang, Xiao Wang
  Cc: Ferruh Yigit, dev, Thomas Monjalon



On 12/25/19 4:19 PM, Matan Azrad wrote:
> A new vDPA class was recently introduced.
> 
> IFC driver implements the vDPA operations, hence it should be moved to
> the vDPA class.
> 
> Move it.
> 
> Signed-off-by: Matan Azrad <matan@mellanox.com>
> ---
>  MAINTAINERS                              |    6 +-
>  doc/guides/nics/features/ifcvf.ini       |    8 -
>  doc/guides/nics/ifc.rst                  |  106 ---
>  doc/guides/nics/index.rst                |    1 -
>  doc/guides/vdpadevs/features/ifcvf.ini   |    8 +
>  doc/guides/vdpadevs/ifc.rst              |  106 +++
>  doc/guides/vdpadevs/index.rst            |    1 +
>  drivers/net/Makefile                     |    3 -
>  drivers/net/ifc/Makefile                 |   34 -
>  drivers/net/ifc/base/ifcvf.c             |  329 --------
>  drivers/net/ifc/base/ifcvf.h             |  162 ----
>  drivers/net/ifc/base/ifcvf_osdep.h       |   52 --
>  drivers/net/ifc/ifcvf_vdpa.c             | 1280 ------------------------------
>  drivers/net/ifc/meson.build              |    9 -
>  drivers/net/ifc/rte_pmd_ifc_version.map  |    3 -
>  drivers/net/meson.build                  |    1 -
>  drivers/vdpa/Makefile                    |    6 +
>  drivers/vdpa/ifc/Makefile                |   34 +
>  drivers/vdpa/ifc/base/ifcvf.c            |  329 ++++++++
>  drivers/vdpa/ifc/base/ifcvf.h            |  162 ++++
>  drivers/vdpa/ifc/base/ifcvf_osdep.h      |   52 ++
>  drivers/vdpa/ifc/ifcvf_vdpa.c            | 1280 ++++++++++++++++++++++++++++++
>  drivers/vdpa/ifc/meson.build             |    9 +
>  drivers/vdpa/ifc/rte_pmd_ifc_version.map |    3 +
>  drivers/vdpa/meson.build                 |    2 +-
>  25 files changed, 1994 insertions(+), 1992 deletions(-)
>  delete mode 100644 doc/guides/nics/features/ifcvf.ini
>  delete mode 100644 doc/guides/nics/ifc.rst
>  create mode 100644 doc/guides/vdpadevs/features/ifcvf.ini
>  create mode 100644 doc/guides/vdpadevs/ifc.rst
>  delete mode 100644 drivers/net/ifc/Makefile
>  delete mode 100644 drivers/net/ifc/base/ifcvf.c
>  delete mode 100644 drivers/net/ifc/base/ifcvf.h
>  delete mode 100644 drivers/net/ifc/base/ifcvf_osdep.h
>  delete mode 100644 drivers/net/ifc/ifcvf_vdpa.c
>  delete mode 100644 drivers/net/ifc/meson.build
>  delete mode 100644 drivers/net/ifc/rte_pmd_ifc_version.map
>  create mode 100644 drivers/vdpa/ifc/Makefile
>  create mode 100644 drivers/vdpa/ifc/base/ifcvf.c
>  create mode 100644 drivers/vdpa/ifc/base/ifcvf.h
>  create mode 100644 drivers/vdpa/ifc/base/ifcvf_osdep.h
>  create mode 100644 drivers/vdpa/ifc/ifcvf_vdpa.c
>  create mode 100644 drivers/vdpa/ifc/meson.build
>  create mode 100644 drivers/vdpa/ifc/rte_pmd_ifc_version.map
> 

...

> diff --git a/doc/guides/vdpadevs/features/ifcvf.ini b/doc/guides/vdpadevs/features/ifcvf.ini
> new file mode 100644
> index 0000000..ef1fc47
> --- /dev/null
> +++ b/doc/guides/vdpadevs/features/ifcvf.ini
> @@ -0,0 +1,8 @@
> +;
> +; Supported features of the 'ifcvf' vDPA driver.
> +;
> +; Refer to default.ini for the full list of available PMD features.
> +;
> +[Features]
> +x86-32               = Y
> +x86-64               = Y

Xiao or someone who knows the IFC well enough would need to file the feature
list in a separate patch.

Other than that:
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>

Thanks,
Maxime


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dpdk-dev] [PATCH v1 2/3] doc: add vDPA feature table
  2020-01-07 17:39   ` Maxime Coquelin
@ 2020-01-08  5:28     ` Tiwei Bie
  2020-01-08  7:20       ` Andrew Rybchenko
  0 siblings, 1 reply; 50+ messages in thread
From: Tiwei Bie @ 2020-01-08  5:28 UTC (permalink / raw)
  To: Maxime Coquelin, Matan Azrad
  Cc: Zhihong Wang, Xiao Wang, Ferruh Yigit, dev, Thomas Monjalon

On Tue, Jan 07, 2020 at 06:39:36PM +0100, Maxime Coquelin wrote:
> On 12/25/19 4:19 PM, Matan Azrad wrote:
> > Add vDPA devices features table and explanation.
> > 
> > Any vDPA driver can add its own supported features by adding a new ini
> > file to the features directory in doc/guides/vdpadevs/features.
> > 
> > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > ---
> >  doc/guides/conf.py                        |  5 +++
> >  doc/guides/vdpadevs/features/default.ini  | 55 ++++++++++++++++++++++++++
> >  doc/guides/vdpadevs/features_overview.rst | 65 +++++++++++++++++++++++++++++++
> >  doc/guides/vdpadevs/index.rst             |  1 +
> >  4 files changed, 126 insertions(+)
> >  create mode 100644 doc/guides/vdpadevs/features/default.ini
> >  create mode 100644 doc/guides/vdpadevs/features_overview.rst
> > 
> > diff --git a/doc/guides/conf.py b/doc/guides/conf.py
> > index 0892c06..c368fa5 100644
> > --- a/doc/guides/conf.py
> > +++ b/doc/guides/conf.py
> > @@ -401,6 +401,11 @@ def setup(app):
> >                              'Features',
> >                              'Features availability in compression drivers',
> >                              'Feature')
> > +    table_file = dirname(__file__) + '/vdpadevs/overview_feature_table.txt'
> > +    generate_overview_table(table_file, 1,
> > +                            'Features',
> > +                            'Features availability in vDPA drivers',
> > +                            'Feature')
> >  
> >      if LooseVersion(sphinx_version) < LooseVersion('1.3.1'):
> >          print('Upgrade sphinx to version >= 1.3.1 for '
> > diff --git a/doc/guides/vdpadevs/features/default.ini b/doc/guides/vdpadevs/features/default.ini
> > new file mode 100644
> > index 0000000..a3e0bc7
> > --- /dev/null
> > +++ b/doc/guides/vdpadevs/features/default.ini
> > @@ -0,0 +1,55 @@
> > +;
> > +; Features of a default vDPA driver.
> > +;
> > +; This file defines the features that are valid for inclusion in
> > +; the other driver files and also the order that they appear in
> > +; the features table in the documentation. The feature description
> > +; string should not exceed feature_str_len defined in conf.py.
> > +;
> 
> I think some entries below could be removed for vDPA.

+1

> 
> > +[Features]
> > +csum                 =
> > +guest csum           =
> > +mac                  =
> > +gso                  =
> > +guest tso4           =
> > +guest tso6           =
> > +ecn                  =
> > +ufo                  =
> > +host tso4            =
> > +host tso6            =
> > +mrg rxbuf            =
> > +ctrl vq              =
> > +ctrl rx              =
> > +any layout           =
> > +guest announce       =
> > +mq                   =
> > +version 1            =
> > +log all              =
> > +protocol features    =

We may not need to list this. The proto * would imply it.

> > +indirect desc        =
> > +event idx            =
> > +mtu                  =
> > +in_order             =
> > +IOMMU platform       =
> > +packed               =
> > +proto mq             =
> > +proto log shmfd      =
> > +proto rarp           =
> > +proto reply ack      =
> > +proto slave req      =

Ditto. This feature is to be used by other features.
Features like host notifier would imply it.

> > +proto crypto session =

We don't need to list this before we officially support
the crypto vDPA device.

> > +proto host notifier  =
> > +proto pagefault      =
> > +Multiprocess aware   =

There is no support for this in library currently.
To support it, we need to sync vhost fds and messages
among processes.
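
(As a rough illustration of what that fd synchronization would involve -- this is not
existing vhost-library code -- fds can be handed between the primary and secondary
processes over the EAL multi-process channel, e.g. the hypothetical helper below; the
message name "ifcvf_mp_fd" is made up.)

#include <string.h>
#include <rte_eal.h>
#include <rte_string_fns.h>

static int
share_vhost_fd(int vhost_fd)
{
	struct rte_mp_msg msg;

	memset(&msg, 0, sizeof(msg));
	rte_strlcpy(msg.name, "ifcvf_mp_fd", sizeof(msg.name));
	msg.num_fds = 1;
	msg.fds[0] = vhost_fd;	/* duplicated into the peer over SCM_RIGHTS */

	return rte_mp_sendmsg(&msg);
}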

> > +BSD nic_uio          =
> > +Linux UIO            =
> 
> E.g. UIO, which cannot be used since vDPA requires an IOMMU.
> 
> > +Linux VFIO           =
> > +Other kdrv           =
> > +ARMv7                =
> > +ARMv8                =
> > +Power8               =
> > +x86-32               =
> > +x86-64               =
> > +Usage doc            =
> > +Design doc           =
> > +Perf doc             =
> > \ No newline at end of file
> > diff --git a/doc/guides/vdpadevs/features_overview.rst b/doc/guides/vdpadevs/features_overview.rst
> > new file mode 100644
> > index 0000000..c7745b7
> > --- /dev/null
> > +++ b/doc/guides/vdpadevs/features_overview.rst
> > @@ -0,0 +1,65 @@
> > +..  SPDX-License-Identifier: BSD-3-Clause
> > +    Copyright 2019 Mellanox Technologies, Ltd
> > +
> > +Overview of vDPA drivers features
> > +=================================
> > +
> > +This section explains the supported features that are listed in the table below.
> > +
> > +  * csum - Device can handle packets with partial checksum.
> > +  * guest csum - Guest can handle packets with partial checksum.
> > +  * mac - Device has given MAC address.
> > +  * gso - Device can handle packets with any GSO type.
> > +  * guest tso4 - Guest can receive TSOv4.
> > +  * guest tso6 - Guest can receive TSOv6.
> > +  * ecn - Device can receive TSO with ECN.
> > +  * ufo - Device can receive UFO.
> > +  * host tso4 - Device can receive TSOv4.
> > +  * host tso6 - Device can receive TSOv6.
> > +  * mrg rxbuf - Guest can merge receive buffers.
> > +  * ctrl vq - Control channel is available.
> > +  * ctrl rx - Control channel RX mode support.
> > +  * any layout - Device can handle any descriptor layout.
> > +  * guest announce - Guest can send gratuitous packets.
> > +  * mq - Device supports Receive Flow Steering.
> > +  * version 1 - v1.0 compliant.
> > +  * log all - Device can log all write descriptors (live migration).
> > +  * protocol features - Protocol features negotiation support.
> > +  * indirect desc - Indirect buffer descriptors support.
> > +  * event idx - Support for avail_idx and used_idx fields.
> > +  * mtu - Host can advise the guest with its maximum supported MTU.
> > +  * in_order - Device can use descriptors in ring order.
> > +  * IOMMU platform - Device support IOMMU addresses.
> > +  * packed - Device support packed virtio queues.
> > +  * proto mq - Support the number of queues query.
> > +  * proto log shmfd - Guest support setting log base.
> > +  * proto rarp - Host can broadcast a fake RARP after live migration.
> > +  * proto reply ack - Host support requested operation status ack. 
> > +  * proto slave req - Allow the slave to make requests to the master.
> > +  * proto crypto session - Support crypto session creation.
> > +  * proto host notifier - Host can register memory region based host notifiers.
> > +  * proto pagefault - Slave expose page-fault FD for migration process.
> > +  * Multiprocess aware - Driver can be used for primary-secondary process model.
> > +  * BSD nic_uio - BSD ``nic_uio`` module supported.
> > +  * Linux UIO - Works with ``igb_uio`` kernel module.
> > +  * Linux VFIO - Works with ``vfio-pci`` kernel module.
> > +  * Other kdrv - Kernel module other than above ones supported.
> > +  * ARMv7 - Support armv7 architecture.
> > +  * ARMv8 - Support armv8a (64bit) architecture.
> > +  * Power8 - Support PowerPC architecture.
> > +  * x86-32 - Support 32bits x86 architecture.
> > +  * x86-64 - Support 64bits x86 architecture.
> > +  * Usage doc - Documentation describes usage, In ``doc/guides/vdpadevs/``.
> > +  * Design doc - Documentation describes design. In ``doc/guides/vdpadevs/``.
> > +  * Perf doc - Documentation describes performance values, In ``doc/perf/``.
> > +
> > +
> > +
> > +.. _table_vdpa_pmd_features:
> > +
> > +.. include:: overview_feature_table.txt
> > +
> > +.. Note::
> > +
> > +   Features marked with "P" are partially supported. Refer to the appropriate
> > +   driver guide in the following sections for details.
> > diff --git a/doc/guides/vdpadevs/index.rst b/doc/guides/vdpadevs/index.rst
> > index d69dc91..89e2b03 100644
> > --- a/doc/guides/vdpadevs/index.rst
> > +++ b/doc/guides/vdpadevs/index.rst
> > @@ -11,3 +11,4 @@ which can be used from an application through vhost API.
> >      :maxdepth: 2
> >      :numbered:
> >  
> > +    features_overview
> > 
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA device drivers
  2020-01-07  7:57 ` [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA device drivers Matan Azrad
@ 2020-01-08  5:44   ` Xu, Rosen
  2020-01-08 10:45     ` Matan Azrad
  0 siblings, 1 reply; 50+ messages in thread
From: Xu, Rosen @ 2020-01-08  5:44 UTC (permalink / raw)
  To: Matan Azrad, Maxime Coquelin, Bie, Tiwei, Wang, Zhihong, Wang, Xiao W
  Cc: Yigit, Ferruh, dev, Thomas Monjalon, Xu, Rosen, Pei, Andy

Hi Matan,

Did you think about OVS DPDK?
vDPA is a basic module for OVS; currently it still takes some exception-path packet processing
for OVS, so it still needs to integrate with eth_dev.

Thanks,
Rosen

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Matan Azrad
> Sent: Tuesday, January 07, 2020 15:57
> To: Matan Azrad <matan@mellanox.com>; Maxime Coquelin
> <maxime.coquelin@redhat.com>; Bie, Tiwei <tiwei.bie@intel.com>; Wang,
> Zhihong <zhihong.wang@intel.com>; Wang, Xiao W
> <xiao.w.wang@intel.com>
> Cc: Yigit, Ferruh <ferruh.yigit@intel.com>; dev@dpdk.org; Thomas Monjalon
> <thomas@monjalon.net>
> Subject: Re: [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA device
> drivers
> 
> Hi all
> 
> Any comments?
> 
> From: Matan Azrad
> > As discussed and as described in RFC "[RFC] net: new vdpa PMD for
> > Mellanox devices", new vDPA driver is going to be added for Mellanox
> > devices - vDPA
> > mlx5 and more.
> >
> > The only vDPA driver now is the IFC driver that is located in net directory.
> >
> > The IFC driver and the new vDPA mlx5 driver provide the vDPA ops
> > introduced in librte_vhost and not the eth-dev ops.
> > All the others drivers in net class provide the eth-dev ops.
> > The set of features is also different.
> >
> > Create a new class for vDPA drivers and move IFC to this class.
> > Later, all the new drivers that implement the vDPA ops will be added
> > to the vDPA class.
> >
> > Also, a vDPA device driver features list was added to vDPA documentation.
> >
> > Please review the features list and the series.
> >
> > Later on, I'm going to send the vDPA mlx5 driver.
> >
> > Thanks.
> >
> >
> > Matan Azrad (3):
> >   drivers: introduce vDPA class
> >   doc: add vDPA feature table
> >   drivers: move ifc driver to the vDPA class
> >
> >  MAINTAINERS                               |    6 +-
> >  doc/guides/conf.py                        |    5 +
> >  doc/guides/index.rst                      |    1 +
> >  doc/guides/nics/features/ifcvf.ini        |    8 -
> >  doc/guides/nics/ifc.rst                   |  106 ---
> >  doc/guides/nics/index.rst                 |    1 -
> >  doc/guides/vdpadevs/features/default.ini  |   55 ++
> >  doc/guides/vdpadevs/features/ifcvf.ini    |    8 +
> >  doc/guides/vdpadevs/features_overview.rst |   65 ++
> >  doc/guides/vdpadevs/ifc.rst               |  106 +++
> >  doc/guides/vdpadevs/index.rst             |   15 +
> >  drivers/Makefile                          |    2 +
> >  drivers/meson.build                       |    1 +
> >  drivers/net/Makefile                      |    3 -
> >  drivers/net/ifc/Makefile                  |   34 -
> >  drivers/net/ifc/base/ifcvf.c              |  329 --------
> >  drivers/net/ifc/base/ifcvf.h              |  162 ----
> >  drivers/net/ifc/base/ifcvf_osdep.h        |   52 --
> >  drivers/net/ifc/ifcvf_vdpa.c              | 1280 -----------------------------
> >  drivers/net/ifc/meson.build               |    9 -
> >  drivers/net/ifc/rte_pmd_ifc_version.map   |    3 -
> >  drivers/net/meson.build                   |    1 -
> >  drivers/vdpa/Makefile                     |   14 +
> >  drivers/vdpa/ifc/Makefile                 |   34 +
> >  drivers/vdpa/ifc/base/ifcvf.c             |  329 ++++++++
> >  drivers/vdpa/ifc/base/ifcvf.h             |  162 ++++
> >  drivers/vdpa/ifc/base/ifcvf_osdep.h       |   52 ++
> >  drivers/vdpa/ifc/ifcvf_vdpa.c             | 1280
> > +++++++++++++++++++++++++++++
> >  drivers/vdpa/ifc/meson.build              |    9 +
> >  drivers/vdpa/ifc/rte_pmd_ifc_version.map  |    3 +
> >  drivers/vdpa/meson.build                  |    8 +
> >  31 files changed, 2152 insertions(+), 1991 deletions(-)  delete mode
> > 100644 doc/guides/nics/features/ifcvf.ini
> >  delete mode 100644 doc/guides/nics/ifc.rst  create mode 100644
> > doc/guides/vdpadevs/features/default.ini
> >  create mode 100644 doc/guides/vdpadevs/features/ifcvf.ini
> >  create mode 100644 doc/guides/vdpadevs/features_overview.rst
> >  create mode 100644 doc/guides/vdpadevs/ifc.rst  create mode 100644
> > doc/guides/vdpadevs/index.rst  delete mode 100644
> > drivers/net/ifc/Makefile  delete mode 100644
> > drivers/net/ifc/base/ifcvf.c delete mode 100644
> > drivers/net/ifc/base/ifcvf.h  delete mode 100644
> > drivers/net/ifc/base/ifcvf_osdep.h
> >  delete mode 100644 drivers/net/ifc/ifcvf_vdpa.c  delete mode 100644
> > drivers/net/ifc/meson.build  delete mode 100644
> > drivers/net/ifc/rte_pmd_ifc_version.map
> >  create mode 100644 drivers/vdpa/Makefile  create mode 100644
> > drivers/vdpa/ifc/Makefile  create mode 100644
> > drivers/vdpa/ifc/base/ifcvf.c create mode 100644
> > drivers/vdpa/ifc/base/ifcvf.h  create mode 100644
> > drivers/vdpa/ifc/base/ifcvf_osdep.h
> >  create mode 100644 drivers/vdpa/ifc/ifcvf_vdpa.c  create mode 100644
> > drivers/vdpa/ifc/meson.build  create mode 100644
> > drivers/vdpa/ifc/rte_pmd_ifc_version.map
> >  create mode 100644 drivers/vdpa/meson.build
> >
> > --
> > 1.8.3.1


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dpdk-dev] [PATCH v1 2/3] doc: add vDPA feature table
  2020-01-08  5:28     ` Tiwei Bie
@ 2020-01-08  7:20       ` Andrew Rybchenko
  2020-01-08 10:42         ` Matan Azrad
  0 siblings, 1 reply; 50+ messages in thread
From: Andrew Rybchenko @ 2020-01-08  7:20 UTC (permalink / raw)
  To: Tiwei Bie, Maxime Coquelin, Matan Azrad
  Cc: Zhihong Wang, Xiao Wang, Ferruh Yigit, dev, Thomas Monjalon

On 1/8/20 8:28 AM, Tiwei Bie wrote:
> On Tue, Jan 07, 2020 at 06:39:36PM +0100, Maxime Coquelin wrote:
>> On 12/25/19 4:19 PM, Matan Azrad wrote:
>>> Add vDPA devices features table and explanation.
>>>
>>> Any vDPA driver can add its own supported features by adding a new ini
>>> file to the features directory in doc/guides/vdpadevs/features.
>>>
>>> Signed-off-by: Matan Azrad <matan@mellanox.com>
>>> ---
>>>  doc/guides/conf.py                        |  5 +++
>>>  doc/guides/vdpadevs/features/default.ini  | 55 ++++++++++++++++++++++++++
>>>  doc/guides/vdpadevs/features_overview.rst | 65 +++++++++++++++++++++++++++++++
>>>  doc/guides/vdpadevs/index.rst             |  1 +
>>>  4 files changed, 126 insertions(+)
>>>  create mode 100644 doc/guides/vdpadevs/features/default.ini
>>>  create mode 100644 doc/guides/vdpadevs/features_overview.rst
>>>
>>> diff --git a/doc/guides/conf.py b/doc/guides/conf.py
>>> index 0892c06..c368fa5 100644
>>> --- a/doc/guides/conf.py
>>> +++ b/doc/guides/conf.py
>>> @@ -401,6 +401,11 @@ def setup(app):
>>>                              'Features',
>>>                              'Features availability in compression drivers',
>>>                              'Feature')
>>> +    table_file = dirname(__file__) + '/vdpadevs/overview_feature_table.txt'
>>> +    generate_overview_table(table_file, 1,
>>> +                            'Features',
>>> +                            'Features availability in vDPA drivers',
>>> +                            'Feature')
>>>  
>>>      if LooseVersion(sphinx_version) < LooseVersion('1.3.1'):
>>>          print('Upgrade sphinx to version >= 1.3.1 for '
>>> diff --git a/doc/guides/vdpadevs/features/default.ini b/doc/guides/vdpadevs/features/default.ini
>>> new file mode 100644
>>> index 0000000..a3e0bc7
>>> --- /dev/null
>>> +++ b/doc/guides/vdpadevs/features/default.ini
>>> @@ -0,0 +1,55 @@
>>> +;
>>> +; Features of a default vDPA driver.
>>> +;
>>> +; This file defines the features that are valid for inclusion in
>>> +; the other driver files and also the order that they appear in
>>> +; the features table in the documentation. The feature description
>>> +; string should not exceed feature_str_len defined in conf.py.
>>> +;
>> I think some entries below could be removed for vDPA.
> +1
>
>>> +[Features]
>>> +csum                 =
>>> +guest csum           =
>>> +mac                  =
>>> +gso                  =
>>> +guest tso4           =
>>> +guest tso6           =
>>> +ecn                  =
>>> +ufo                  =
>>> +host tso4            =
>>> +host tso6            =
>>> +mrg rxbuf            =
>>> +ctrl vq              =
>>> +ctrl rx              =
>>> +any layout           =
>>> +guest announce       =
>>> +mq                   =
>>> +version 1            =
>>> +log all              =
>>> +protocol features    =
> We may not need to list this. The proto * would imply it.
>
>>> +indirect desc        =
>>> +event idx            =
>>> +mtu                  =
>>> +in_order             =
>>> +IOMMU platform       =
>>> +packed               =
>>> +proto mq             =
>>> +proto log shmfd      =
>>> +proto rarp           =
>>> +proto reply ack      =
>>> +proto slave req      =
> Ditto. This feature is to be used by other features.
> Features like host notifier would imply it.
>
>>> +proto crypto session =
> We don't need to list this before we officially support
> the crypto vDPA device.
>
>>> +proto host notifier  =
>>> +proto pagefault      =
>>> +Multiprocess aware   =
> There is no support for this in library currently.
> To support it, we need to sync vhost fds and messages
> among processes.
>
>>> +BSD nic_uio          =
>>> +Linux UIO            =
>> E.g. UIO, which cannot be used since vDPA requires an IOMMU.
>>
>>> +Linux VFIO           =
>>> +Other kdrv           =
>>> +ARMv7                =
>>> +ARMv8                =
>>> +Power8               =
>>> +x86-32               =
>>> +x86-64               =
>>> +Usage doc            =
>>> +Design doc           =
>>> +Perf doc             =
>>> \ No newline at end of file
>>> diff --git a/doc/guides/vdpadevs/features_overview.rst b/doc/guides/vdpadevs/features_overview.rst
>>> new file mode 100644
>>> index 0000000..c7745b7
>>> --- /dev/null
>>> +++ b/doc/guides/vdpadevs/features_overview.rst
>>> @@ -0,0 +1,65 @@
>>> +..  SPDX-License-Identifier: BSD-3-Clause
>>> +    Copyright 2019 Mellanox Technologies, Ltd
>>> +
>>> +Overview of vDPA drivers features
>>> +=================================
>>> +
>>> +This section explains the supported features that are listed in the table below.
>>> +
>>> +  * csum - Device can handle packets with partial checksum.
>>> +  * guest csum - Guest can handle packets with partial checksum.
>>> +  * mac - Device has given MAC address.
>>> +  * gso - Device can handle packets with any GSO type.
>>> +  * guest tso4 - Guest can receive TSOv4.
>>> +  * guest tso6 - Guest can receive TSOv6.
>>> +  * ecn - Device can receive TSO with ECN.
>>> +  * ufo - Device can receive UFO.
>>> +  * host tso4 - Device can receive TSOv4.
>>> +  * host tso6 - Device can receive TSOv6.
>>> +  * mrg rxbuf - Guest can merge receive buffers.
>>> +  * ctrl vq - Control channel is available.
>>> +  * ctrl rx - Control channel RX mode support.
>>> +  * any layout - Device can handle any descriptor layout.
>>> +  * guest announce - Guest can send gratuitous packets.
>>> +  * mq - Device supports Receive Flow Steering.
>>> +  * version 1 - v1.0 compliant.
>>> +  * log all - Device can log all write descriptors (live migration).
>>> +  * protocol features - Protocol features negotiation support.
>>> +  * indirect desc - Indirect buffer descriptors support.
>>> +  * event idx - Support for avail_idx and used_idx fields.
>>> +  * mtu - Host can advise the guest with its maximum supported MTU.
>>> +  * in_order - Device can use descriptors in ring order.
>>> +  * IOMMU platform - Device support IOMMU addresses.
>>> +  * packed - Device support packed virtio queues.
>>> +  * proto mq - Support the number of queues query.
>>> +  * proto log shmfd - Guest support setting log base.
>>> +  * proto rarp - Host can broadcast a fake RARP after live migration.
>>> +  * proto reply ack - Host support requested operation status ack. 
>>> +  * proto slave req - Allow the slave to make requests to the master.
>>> +  * proto crypto session - Support crypto session creation.
>>> +  * proto host notifier - Host can register memory region based host notifiers.
>>> +  * proto pagefault - Slave expose page-fault FD for migration process.
>>> +  * Multiprocess aware - Driver can be used for primary-secondary process model.
>>> +  * BSD nic_uio - BSD ``nic_uio`` module supported.
>>> +  * Linux UIO - Works with ``igb_uio`` kernel module.
>>> +  * Linux VFIO - Works with ``vfio-pci`` kernel module.
>>> +  * Other kdrv - Kernel module other than above ones supported.
>>> +  * ARMv7 - Support armv7 architecture.
>>> +  * ARMv8 - Support armv8a (64bit) architecture.
>>> +  * Power8 - Support PowerPC architecture.
>>> +  * x86-32 - Support 32bits x86 architecture.
>>> +  * x86-64 - Support 64bits x86 architecture.
>>> +  * Usage doc - Documentation describes usage, In ``doc/guides/vdpadevs/``.
>>> +  * Design doc - Documentation describes design. In ``doc/guides/vdpadevs/``.
>>> +  * Perf doc - Documentation describes performance values, In ``doc/perf/``.

Are you going to put a Y mark for all these features in the v20.02 release cycle?
Basically the question is: is it OK to have features that no driver
supports? "Dead" features do not look nice.
I would say yes for architecture support, since it is better to list all
architectures supported by DPDK and make it clear which architectures are
supported by a particular vDPA driver.
Maybe it is OK for features which directly correspond to virtio/vhost
features (of course, it is very useful to know spec compliance).
In this case I think it would be very helpful to add references to the spec
sections.

Also I like doc/guides/nics/features.rst and would like to know why
that practice is not used here. I'm talking about the features description
format.

[snip]


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dpdk-dev] [PATCH v1 2/3] doc: add vDPA feature table
  2020-01-08  7:20       ` Andrew Rybchenko
@ 2020-01-08 10:42         ` Matan Azrad
  2020-01-08 13:11           ` Andrew Rybchenko
  2020-01-09  2:15           ` Tiwei Bie
  0 siblings, 2 replies; 50+ messages in thread
From: Matan Azrad @ 2020-01-08 10:42 UTC (permalink / raw)
  To: Andrew Rybchenko, Tiwei Bie, Maxime Coquelin
  Cc: Zhihong Wang, Xiao Wang, Ferruh Yigit, dev, Thomas Monjalon

Hi all

Thanks very much for the review.
Please see below.

From: Andrew Rybchenko
> On 1/8/20 8:28 AM, Tiwei Bie wrote:
> > On Tue, Jan 07, 2020 at 06:39:36PM +0100, Maxime Coquelin wrote:
> >> On 12/25/19 4:19 PM, Matan Azrad wrote:
> >>> Add vDPA devices features table and explanation.
> >>>
> >>> Any vDPA driver can add its own supported features by adding a new
> >>> ini file to the features directory in doc/guides/vdpadevs/features.
> >>>
> >>> Signed-off-by: Matan Azrad <matan@mellanox.com>
> >>> ---
> >>>  doc/guides/conf.py                        |  5 +++
> >>>  doc/guides/vdpadevs/features/default.ini  | 55
> >>> ++++++++++++++++++++++++++
> doc/guides/vdpadevs/features_overview.rst | 65
> +++++++++++++++++++++++++++++++
> >>>  doc/guides/vdpadevs/index.rst             |  1 +
> >>>  4 files changed, 126 insertions(+)
> >>>  create mode 100644 doc/guides/vdpadevs/features/default.ini
> >>>  create mode 100644 doc/guides/vdpadevs/features_overview.rst
> >>>
> >>> diff --git a/doc/guides/conf.py b/doc/guides/conf.py index
> >>> 0892c06..c368fa5 100644
> >>> --- a/doc/guides/conf.py
> >>> +++ b/doc/guides/conf.py
> >>> @@ -401,6 +401,11 @@ def setup(app):
> >>>                              'Features',
> >>>                              'Features availability in compression drivers',
> >>>                              'Feature')
> >>> +    table_file = dirname(__file__) +
> '/vdpadevs/overview_feature_table.txt'
> >>> +    generate_overview_table(table_file, 1,
> >>> +                            'Features',
> >>> +                            'Features availability in vDPA drivers',
> >>> +                            'Feature')
> >>>
> >>>      if LooseVersion(sphinx_version) < LooseVersion('1.3.1'):
> >>>          print('Upgrade sphinx to version >= 1.3.1 for '
> >>> diff --git a/doc/guides/vdpadevs/features/default.ini
> >>> b/doc/guides/vdpadevs/features/default.ini
> >>> new file mode 100644
> >>> index 0000000..a3e0bc7
> >>> --- /dev/null
> >>> +++ b/doc/guides/vdpadevs/features/default.ini
> >>> @@ -0,0 +1,55 @@
> >>> +;
> >>> +; Features of a default vDPA driver.
> >>> +;
> >>> +; This file defines the features that are valid for inclusion in ;
> >>> +the other driver files and also the order that they appear in ; the
> >>> +features table in the documentation. The feature description ;
> >>> +string should not exceed feature_str_len defined in conf.py.
> >>> +;
> >> I think some entries below could be removed for vDPA.
> > +1
> >
> >>> +[Features]
> >>> +csum                 =
> >>> +guest csum           =
> >>> +mac                  =
> >>> +gso                  =
> >>> +guest tso4           =
> >>> +guest tso6           =
> >>> +ecn                  =
> >>> +ufo                  =
> >>> +host tso4            =
> >>> +host tso6            =
> >>> +mrg rxbuf            =
> >>> +ctrl vq              =
> >>> +ctrl rx              =
> >>> +any layout           =
> >>> +guest announce       =
> >>> +mq                   =
> >>> +version 1            =
> >>> +log all              =
> >>> +protocol features    =
> > We may not need to list this. The proto * would imply it.

So can you explain why this flag is exposed by the vhost features?

> >>> +indirect desc        =
> >>> +event idx            =
> >>> +mtu                  =
> >>> +in_order             =
> >>> +IOMMU platform       =
> >>> +packed               =
> >>> +proto mq             =
> >>> +proto log shmfd      =
> >>> +proto rarp           =
> >>> +proto reply ack      =
> >>> +proto slave req      =
> > Ditto. This feature is to be used by other features.
> > Features like host notifier would imply it.

So can you explain why this flag is exposed by the vhost protocol features?

> >>> +proto crypto session =
> > We don't need to list this before we officially support the crypto
> > vDPA device.
> >

Ok, will remove.

> >>> +proto host notifier  =
> >>> +proto pagefault      =
> >>> +Multiprocess aware   =
> > There is no support for this in library currently.
> > To support it, we need to sync vhost fds and messages among processes.
> >

Ok, will remove. 

> >>> +BSD nic_uio          =
> >>> +Linux UIO            =
> >> E.g. UIO, which cannot be used since vDPA requires an IOMMU.

Ok, will remove.

> >>> +Linux VFIO           =
> >>> +Other kdrv           =
> >>> +ARMv7                =
> >>> +ARMv8                =
> >>> +Power8               =
> >>> +x86-32               =
> >>> +x86-64               =
> >>> +Usage doc            =
> >>> +Design doc           =
> >>> +Perf doc             =
> >>> \ No newline at end of file
> >>> diff --git a/doc/guides/vdpadevs/features_overview.rst
> >>> b/doc/guides/vdpadevs/features_overview.rst
> >>> new file mode 100644
> >>> index 0000000..c7745b7
> >>> --- /dev/null
> >>> +++ b/doc/guides/vdpadevs/features_overview.rst
> >>> @@ -0,0 +1,65 @@
> >>> +..  SPDX-License-Identifier: BSD-3-Clause
> >>> +    Copyright 2019 Mellanox Technologies, Ltd
> >>> +
> >>> +Overview of vDPA drivers features
> >>> +=================================
> >>> +
> >>> +This section explains the supported features that are listed in the table
> below.
> >>> +
> >>> +  * csum - Device can handle packets with partial checksum.
> >>> +  * guest csum - Guest can handle packets with partial checksum.
> >>> +  * mac - Device has given MAC address.
> >>> +  * gso - Device can handle packets with any GSO type.
> >>> +  * guest tso4 - Guest can receive TSOv4.
> >>> +  * guest tso6 - Guest can receive TSOv6.
> >>> +  * ecn - Device can receive TSO with ECN.
> >>> +  * ufo - Device can receive UFO.
> >>> +  * host tso4 - Device can receive TSOv4.
> >>> +  * host tso6 - Device can receive TSOv6.
> >>> +  * mrg rxbuf - Guest can merge receive buffers.
> >>> +  * ctrl vq - Control channel is available.
> >>> +  * ctrl rx - Control channel RX mode support.
> >>> +  * any layout - Device can handle any descriptor layout.
> >>> +  * guest announce - Guest can send gratuitous packets.
> >>> +  * mq - Device supports Receive Flow Steering.
> >>> +  * version 1 - v1.0 compliant.
> >>> +  * log all - Device can log all write descriptors (live migration).
> >>> +  * protocol features - Protocol features negotiation support.
> >>> +  * indirect desc - Indirect buffer descriptors support.
> >>> +  * event idx - Support for avail_idx and used_idx fields.
> >>> +  * mtu - Host can advise the guest with its maximum supported MTU.
> >>> +  * in_order - Device can use descriptors in ring order.
> >>> +  * IOMMU platform - Device support IOMMU addresses.
> >>> +  * packed - Device support packed virtio queues.
> >>> +  * proto mq - Support the number of queues query.
> >>> +  * proto log shmfd - Guest support setting log base.
> >>> +  * proto rarp - Host can broadcast a fake RARP after live migration.
> >>> +  * proto reply ack - Host support requested operation status ack.
> >>> +  * proto slave req - Allow the slave to make requests to the master.
> >>> +  * proto crypto session - Support crypto session creation.
> >>> +  * proto host notifier - Host can register memory region based host
> notifiers.
> >>> +  * proto pagefault - Slave expose page-fault FD for migration process.
> >>> +  * Multiprocess aware - Driver can be used for primary-secondary
> process model.
> >>> +  * BSD nic_uio - BSD ``nic_uio`` module supported.
> >>> +  * Linux UIO - Works with ``igb_uio`` kernel module.
> >>> +  * Linux VFIO - Works with ``vfio-pci`` kernel module.
> >>> +  * Other kdrv - Kernel module other than above ones supported.
> >>> +  * ARMv7 - Support armv7 architecture.
> >>> +  * ARMv8 - Support armv8a (64bit) architecture.
> >>> +  * Power8 - Support PowerPC architecture.
> >>> +  * x86-32 - Support 32bits x86 architecture.
> >>> +  * x86-64 - Support 64bits x86 architecture.
> >>> +  * Usage doc - Documentation describes usage, In
> ``doc/guides/vdpadevs/``.
> >>> +  * Design doc - Documentation describes design. In
> ``doc/guides/vdpadevs/``.
> >>> +  * Perf doc - Documentation describes performance values, In
> ``doc/perf/``.
> 
> Are you going to put Y mark for all these features in v20.02 release cycle?

No.

> Basically the question is: is it OK to have features that no driver supports?
> "Dead" features do not look nice.
> I would say yes for architecture support since it is better to list all
> architectures supported by DPDK and make it clear which architectures are
> supported by particular vDPA driver.
> May be it is OK for features which are directly correspond to virtio/vhost
> features (of course, it is very useful to know spec compliance).
> In this case I think it would be very helpful to add references to spec
> sections.

These features are supported in the vhost library, so I think it makes sense to add them.
I think, especially in the vDPA case, it is good for vhost users to see what is and what is not supported by the vDPA driver.

Which spec are you talking about?

> Also I like doc/guides/nics/features.rst and would like to know why the
> practice is not used here. I'm talking about features description format.

Yes, but except for the nic class, no other class uses it.
Maybe that is because there are not many different ways to expose the configuration/capability.
Here, for example, most of them are just bits in a feature bitmap.
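
For reference, a minimal sketch (not part of this series) of what "just bits in a feature bitmap" means here; the bit values follow the virtio specification and the helper name is hypothetical:

#include <stdint.h>

/* Each table row such as "csum", "mq" or "packed" maps to a single bit in
 * the virtio/vhost feature bitmap reported by the device. */
#define VIRTIO_NET_F_CSUM        0   /* "csum" row */
#define VIRTIO_NET_F_MQ         22   /* "mq" row */
#define VIRTIO_F_RING_PACKED    34   /* "packed" row */

static int
feature_supported(uint64_t device_features, unsigned int bit)
{
	/* The capability is simply "bit set or not". */
	return (device_features & (1ULL << bit)) != 0;
}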
 
> [snip]


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA device drivers
  2020-01-08  5:44   ` Xu, Rosen
@ 2020-01-08 10:45     ` Matan Azrad
  2020-01-08 12:39       ` Xu, Rosen
  0 siblings, 1 reply; 50+ messages in thread
From: Matan Azrad @ 2020-01-08 10:45 UTC (permalink / raw)
  To: Xu, Rosen, Maxime Coquelin, Bie, Tiwei, Wang, Zhihong, Wang, Xiao W
  Cc: Yigit, Ferruh, dev, Thomas Monjalon, Pei, Andy

Hi Xu

From: Xu, Rosen
> Hi Matan,
> 
> Did you think about OVS DPDK?
> vDPA is a basic module for OVS, currently it will take some exception path
> packet processing for OVS, so it still needs to integrate eth_dev.

I don't understand your question.

What do you mean by "integrate eth_dev"?

> Thanks,
> Rosen
> 
> > -----Original Message-----
> > From: dev <dev-bounces@dpdk.org> On Behalf Of Matan Azrad
> > Sent: Tuesday, January 07, 2020 15:57
> > To: Matan Azrad <matan@mellanox.com>; Maxime Coquelin
> > <maxime.coquelin@redhat.com>; Bie, Tiwei <tiwei.bie@intel.com>; Wang,
> > Zhihong <zhihong.wang@intel.com>; Wang, Xiao W
> <xiao.w.wang@intel.com>
> > Cc: Yigit, Ferruh <ferruh.yigit@intel.com>; dev@dpdk.org; Thomas
> > Monjalon <thomas@monjalon.net>
> > Subject: Re: [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA
> > device drivers
> >
> > Hi all
> >
> > Any comments?
> >
> > From: Matan Azrad
> > > As discussed and as described in RFC "[RFC] net: new vdpa PMD for
> > > Mellanox devices", new vDPA driver is going to be added for Mellanox
> > > devices - vDPA
> > > mlx5 and more.
> > >
> > > The only vDPA driver now is the IFC driver that is located in net directory.
> > >
> > > The IFC driver and the new vDPA mlx5 driver provide the vDPA ops
> > > introduced in librte_vhost and not the eth-dev ops.
> > > All the others drivers in net class provide the eth-dev ops.
> > > The set of features is also different.
> > >
> > > Create a new class for vDPA drivers and move IFC to this class.
> > > Later, all the new drivers that implement the vDPA ops will be added
> > > to the vDPA class.
> > >
> > > Also, a vDPA device driver features list was added to vDPA
> documentation.
> > >
> > > Please review the features list and the series.
> > >
> > > Later on, I'm going to send the vDPA mlx5 driver.
> > >
> > > Thanks.
> > >
> > >
> > > Matan Azrad (3):
> > >   drivers: introduce vDPA class
> > >   doc: add vDPA feature table
> > >   drivers: move ifc driver to the vDPA class
> > >
> > >  MAINTAINERS                               |    6 +-
> > >  doc/guides/conf.py                        |    5 +
> > >  doc/guides/index.rst                      |    1 +
> > >  doc/guides/nics/features/ifcvf.ini        |    8 -
> > >  doc/guides/nics/ifc.rst                   |  106 ---
> > >  doc/guides/nics/index.rst                 |    1 -
> > >  doc/guides/vdpadevs/features/default.ini  |   55 ++
> > >  doc/guides/vdpadevs/features/ifcvf.ini    |    8 +
> > >  doc/guides/vdpadevs/features_overview.rst |   65 ++
> > >  doc/guides/vdpadevs/ifc.rst               |  106 +++
> > >  doc/guides/vdpadevs/index.rst             |   15 +
> > >  drivers/Makefile                          |    2 +
> > >  drivers/meson.build                       |    1 +
> > >  drivers/net/Makefile                      |    3 -
> > >  drivers/net/ifc/Makefile                  |   34 -
> > >  drivers/net/ifc/base/ifcvf.c              |  329 --------
> > >  drivers/net/ifc/base/ifcvf.h              |  162 ----
> > >  drivers/net/ifc/base/ifcvf_osdep.h        |   52 --
> > >  drivers/net/ifc/ifcvf_vdpa.c              | 1280 -----------------------------
> > >  drivers/net/ifc/meson.build               |    9 -
> > >  drivers/net/ifc/rte_pmd_ifc_version.map   |    3 -
> > >  drivers/net/meson.build                   |    1 -
> > >  drivers/vdpa/Makefile                     |   14 +
> > >  drivers/vdpa/ifc/Makefile                 |   34 +
> > >  drivers/vdpa/ifc/base/ifcvf.c             |  329 ++++++++
> > >  drivers/vdpa/ifc/base/ifcvf.h             |  162 ++++
> > >  drivers/vdpa/ifc/base/ifcvf_osdep.h       |   52 ++
> > >  drivers/vdpa/ifc/ifcvf_vdpa.c             | 1280
> > > +++++++++++++++++++++++++++++
> > >  drivers/vdpa/ifc/meson.build              |    9 +
> > >  drivers/vdpa/ifc/rte_pmd_ifc_version.map  |    3 +
> > >  drivers/vdpa/meson.build                  |    8 +
> > >  31 files changed, 2152 insertions(+), 1991 deletions(-)  delete
> > > mode
> > > 100644 doc/guides/nics/features/ifcvf.ini
> > >  delete mode 100644 doc/guides/nics/ifc.rst  create mode 100644
> > > doc/guides/vdpadevs/features/default.ini
> > >  create mode 100644 doc/guides/vdpadevs/features/ifcvf.ini
> > >  create mode 100644 doc/guides/vdpadevs/features_overview.rst
> > >  create mode 100644 doc/guides/vdpadevs/ifc.rst  create mode 100644
> > > doc/guides/vdpadevs/index.rst  delete mode 100644
> > > drivers/net/ifc/Makefile  delete mode 100644
> > > drivers/net/ifc/base/ifcvf.c delete mode 100644
> > > drivers/net/ifc/base/ifcvf.h  delete mode 100644
> > > drivers/net/ifc/base/ifcvf_osdep.h
> > >  delete mode 100644 drivers/net/ifc/ifcvf_vdpa.c  delete mode 100644
> > > drivers/net/ifc/meson.build  delete mode 100644
> > > drivers/net/ifc/rte_pmd_ifc_version.map
> > >  create mode 100644 drivers/vdpa/Makefile  create mode 100644
> > > drivers/vdpa/ifc/Makefile  create mode 100644
> > > drivers/vdpa/ifc/base/ifcvf.c create mode 100644
> > > drivers/vdpa/ifc/base/ifcvf.h  create mode 100644
> > > drivers/vdpa/ifc/base/ifcvf_osdep.h
> > >  create mode 100644 drivers/vdpa/ifc/ifcvf_vdpa.c  create mode
> > > 100644 drivers/vdpa/ifc/meson.build  create mode 100644
> > > drivers/vdpa/ifc/rte_pmd_ifc_version.map
> > >  create mode 100644 drivers/vdpa/meson.build
> > >
> > > --
> > > 1.8.3.1


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA device drivers
  2020-01-08 10:45     ` Matan Azrad
@ 2020-01-08 12:39       ` Xu, Rosen
  2020-01-08 12:58         ` Thomas Monjalon
  0 siblings, 1 reply; 50+ messages in thread
From: Xu, Rosen @ 2020-01-08 12:39 UTC (permalink / raw)
  To: Matan Azrad, Maxime Coquelin, Bie, Tiwei, Wang, Zhihong, Wang, Xiao W
  Cc: Yigit, Ferruh, dev, Thomas Monjalon, Pei, Andy

Hi Matan,

> -----Original Message-----
> From: Matan Azrad <matan@mellanox.com>
> Sent: Wednesday, January 08, 2020 18:46
> To: Xu, Rosen <rosen.xu@intel.com>; Maxime Coquelin
> <maxime.coquelin@redhat.com>; Bie, Tiwei <tiwei.bie@intel.com>; Wang,
> Zhihong <zhihong.wang@intel.com>; Wang, Xiao W
> <xiao.w.wang@intel.com>
> Cc: Yigit, Ferruh <ferruh.yigit@intel.com>; dev@dpdk.org; Thomas Monjalon
> <thomas@monjalon.net>; Pei, Andy <andy.pei@intel.com>
> Subject: RE: [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA device
> drivers
> 
> Hi Xu
> 
> From: Xu, Rosen
> > Hi Matan,
> >
> > Did you think about OVS DPDK?
> > vDPA is a basic module for OVS, currently it will take some exception
> > path packet processing for OVS, so it still needs to integrate eth_dev.
> 
> I don't understand your question.
> 
> What do you mean by "integrate eth_dev"?

My question is: in the OVS DPDK scenario the vDPA device implements eth_dev ops,
so creating a new class and moving the ifc code to this new class is not OK.

> > Thanks,
> > Rosen
> >
> > > -----Original Message-----
> > > From: dev <dev-bounces@dpdk.org> On Behalf Of Matan Azrad
> > > Sent: Tuesday, January 07, 2020 15:57
> > > To: Matan Azrad <matan@mellanox.com>; Maxime Coquelin
> > > <maxime.coquelin@redhat.com>; Bie, Tiwei <tiwei.bie@intel.com>;
> > > Wang, Zhihong <zhihong.wang@intel.com>; Wang, Xiao W
> > <xiao.w.wang@intel.com>
> > > Cc: Yigit, Ferruh <ferruh.yigit@intel.com>; dev@dpdk.org; Thomas
> > > Monjalon <thomas@monjalon.net>
> > > Subject: Re: [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA
> > > device drivers
> > >
> > > Hi all
> > >
> > > Any comments?
> > >
> > > From: Matan Azrad
> > > > As discussed and as described in RFC "[RFC] net: new vdpa PMD for
> > > > Mellanox devices", new vDPA driver is going to be added for
> > > > Mellanox devices - vDPA
> > > > mlx5 and more.
> > > >
> > > > The only vDPA driver now is the IFC driver that is located in net
> directory.
> > > >
> > > > The IFC driver and the new vDPA mlx5 driver provide the vDPA ops
> > > > introduced in librte_vhost and not the eth-dev ops.
> > > > All the others drivers in net class provide the eth-dev ops.
> > > > The set of features is also different.
> > > >
> > > > Create a new class for vDPA drivers and move IFC to this class.
> > > > Later, all the new drivers that implement the vDPA ops will be
> > > > added to the vDPA class.
> > > >
> > > > Also, a vDPA device driver features list was added to vDPA
> > documentation.
> > > >
> > > > Please review the features list and the series.
> > > >
> > > > Later on, I'm going to send the vDPA mlx5 driver.
> > > >
> > > > Thanks.
> > > >
> > > >
> > > > Matan Azrad (3):
> > > >   drivers: introduce vDPA class
> > > >   doc: add vDPA feature table
> > > >   drivers: move ifc driver to the vDPA class
> > > >
> > > >  MAINTAINERS                               |    6 +-
> > > >  doc/guides/conf.py                        |    5 +
> > > >  doc/guides/index.rst                      |    1 +
> > > >  doc/guides/nics/features/ifcvf.ini        |    8 -
> > > >  doc/guides/nics/ifc.rst                   |  106 ---
> > > >  doc/guides/nics/index.rst                 |    1 -
> > > >  doc/guides/vdpadevs/features/default.ini  |   55 ++
> > > >  doc/guides/vdpadevs/features/ifcvf.ini    |    8 +
> > > >  doc/guides/vdpadevs/features_overview.rst |   65 ++
> > > >  doc/guides/vdpadevs/ifc.rst               |  106 +++
> > > >  doc/guides/vdpadevs/index.rst             |   15 +
> > > >  drivers/Makefile                          |    2 +
> > > >  drivers/meson.build                       |    1 +
> > > >  drivers/net/Makefile                      |    3 -
> > > >  drivers/net/ifc/Makefile                  |   34 -
> > > >  drivers/net/ifc/base/ifcvf.c              |  329 --------
> > > >  drivers/net/ifc/base/ifcvf.h              |  162 ----
> > > >  drivers/net/ifc/base/ifcvf_osdep.h        |   52 --
> > > >  drivers/net/ifc/ifcvf_vdpa.c              | 1280 -----------------------------
> > > >  drivers/net/ifc/meson.build               |    9 -
> > > >  drivers/net/ifc/rte_pmd_ifc_version.map   |    3 -
> > > >  drivers/net/meson.build                   |    1 -
> > > >  drivers/vdpa/Makefile                     |   14 +
> > > >  drivers/vdpa/ifc/Makefile                 |   34 +
> > > >  drivers/vdpa/ifc/base/ifcvf.c             |  329 ++++++++
> > > >  drivers/vdpa/ifc/base/ifcvf.h             |  162 ++++
> > > >  drivers/vdpa/ifc/base/ifcvf_osdep.h       |   52 ++
> > > >  drivers/vdpa/ifc/ifcvf_vdpa.c             | 1280
> > > > +++++++++++++++++++++++++++++
> > > >  drivers/vdpa/ifc/meson.build              |    9 +
> > > >  drivers/vdpa/ifc/rte_pmd_ifc_version.map  |    3 +
> > > >  drivers/vdpa/meson.build                  |    8 +
> > > >  31 files changed, 2152 insertions(+), 1991 deletions(-)  delete
> > > > mode
> > > > 100644 doc/guides/nics/features/ifcvf.ini
> > > >  delete mode 100644 doc/guides/nics/ifc.rst  create mode 100644
> > > > doc/guides/vdpadevs/features/default.ini
> > > >  create mode 100644 doc/guides/vdpadevs/features/ifcvf.ini
> > > >  create mode 100644 doc/guides/vdpadevs/features_overview.rst
> > > >  create mode 100644 doc/guides/vdpadevs/ifc.rst  create mode
> > > > 100644 doc/guides/vdpadevs/index.rst  delete mode 100644
> > > > drivers/net/ifc/Makefile  delete mode 100644
> > > > drivers/net/ifc/base/ifcvf.c delete mode 100644
> > > > drivers/net/ifc/base/ifcvf.h  delete mode 100644
> > > > drivers/net/ifc/base/ifcvf_osdep.h
> > > >  delete mode 100644 drivers/net/ifc/ifcvf_vdpa.c  delete mode
> > > > 100644 drivers/net/ifc/meson.build  delete mode 100644
> > > > drivers/net/ifc/rte_pmd_ifc_version.map
> > > >  create mode 100644 drivers/vdpa/Makefile  create mode 100644
> > > > drivers/vdpa/ifc/Makefile  create mode 100644
> > > > drivers/vdpa/ifc/base/ifcvf.c create mode 100644
> > > > drivers/vdpa/ifc/base/ifcvf.h  create mode 100644
> > > > drivers/vdpa/ifc/base/ifcvf_osdep.h
> > > >  create mode 100644 drivers/vdpa/ifc/ifcvf_vdpa.c  create mode
> > > > 100644 drivers/vdpa/ifc/meson.build  create mode 100644
> > > > drivers/vdpa/ifc/rte_pmd_ifc_version.map
> > > >  create mode 100644 drivers/vdpa/meson.build
> > > >
> > > > --
> > > > 1.8.3.1


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA device drivers
  2020-01-08 12:39       ` Xu, Rosen
@ 2020-01-08 12:58         ` Thomas Monjalon
  2020-01-09  2:27           ` Xu, Rosen
  0 siblings, 1 reply; 50+ messages in thread
From: Thomas Monjalon @ 2020-01-08 12:58 UTC (permalink / raw)
  To: Matan Azrad, Maxime Coquelin, Bie, Tiwei, Wang, Zhihong, Wang,
	Xiao W, Xu, Rosen
  Cc: Yigit, Ferruh, dev, Pei, Andy

08/01/2020 13:39, Xu, Rosen:
> From: Matan Azrad <matan@mellanox.com>
> > From: Xu, Rosen
> > > Did you think about OVS DPDK?
> > > vDPA is a basic module for OVS, currently it will take some exception
> > > path packet processing for OVS, so it still needs to integrate eth_dev.
> > 
> > I don't understand your question.
> > 
> > What do you mean by "integrate eth_dev"?
> 
> My questions is in OVS DPDK scenario vDPA device implements eth_dev ops,
> so create a new class and move ifc code to this new class is not ok.

1/ I don't understand the relation with OVS.

2/ no, vDPA device implements vDPA ops.
If it implements ethdev ops, it is an ethdev device.

Please show an example of what you claim.
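
For reference, a minimal sketch (not from this series) of the split described above, assuming the rte_vdpa.h API of this period; the my_* names are hypothetical. A vDPA driver fills struct rte_vdpa_dev_ops and registers it through rte_vhost (e.g. with rte_vdpa_register_device()); it does not provide struct eth_dev_ops, so it is not an ethdev device:

#include <stdint.h>
#include <rte_vdpa.h>

static int
my_vdpa_get_features(int did, uint64_t *features)
{
	/* Report the virtio feature bitmap supported by vDPA device 'did'. */
	*features = 0; /* a real driver fills this from HW capabilities */
	return 0;
}

static int
my_vdpa_dev_conf(int vid)
{
	/* Bind the vhost device 'vid' to the HW datapath. */
	return 0;
}

/* Only vDPA ops here -- no eth_dev_ops, no rx/tx burst callbacks. */
static struct rte_vdpa_dev_ops my_vdpa_ops = {
	.get_features = my_vdpa_get_features,
	.dev_conf     = my_vdpa_dev_conf,
};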




^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dpdk-dev] [PATCH v1 2/3] doc: add vDPA feature table
  2020-01-08 10:42         ` Matan Azrad
@ 2020-01-08 13:11           ` Andrew Rybchenko
  2020-01-08 17:01             ` Matan Azrad
  2020-01-09  2:15           ` Tiwei Bie
  1 sibling, 1 reply; 50+ messages in thread
From: Andrew Rybchenko @ 2020-01-08 13:11 UTC (permalink / raw)
  To: Matan Azrad, Tiwei Bie, Maxime Coquelin
  Cc: Zhihong Wang, Xiao Wang, Ferruh Yigit, dev, Thomas Monjalon

On 1/8/20 1:42 PM, Matan Azrad wrote:
> Hi all
> 
> Thanks very much for the review.
> Please see below.
> 
> From: Andrew Rybchenko
>> On 1/8/20 8:28 AM, Tiwei Bie wrote:
>>> On Tue, Jan 07, 2020 at 06:39:36PM +0100, Maxime Coquelin wrote:
>>>> On 12/25/19 4:19 PM, Matan Azrad wrote:
>>>>> Add vDPA devices features table and explanation.
>>>>>
>>>>> Any vDPA driver can add its own supported features by ading a new
>>>>> ini file to the features directory in doc/guides/vdpadevs/features.
>>>>>
>>>>> Signed-off-by: Matan Azrad <matan@mellanox.com>
>>>>> ---
>>>>>  doc/guides/conf.py                        |  5 +++
>>>>>  doc/guides/vdpadevs/features/default.ini  | 55
>>>>> ++++++++++++++++++++++++++
>> doc/guides/vdpadevs/features_overview.rst | 65
>> +++++++++++++++++++++++++++++++
>>>>>  doc/guides/vdpadevs/index.rst             |  1 +
>>>>>  4 files changed, 126 insertions(+)
>>>>>  create mode 100644 doc/guides/vdpadevs/features/default.ini
>>>>>  create mode 100644 doc/guides/vdpadevs/features_overview.rst
>>>>>
>>>>> diff --git a/doc/guides/conf.py b/doc/guides/conf.py index
>>>>> 0892c06..c368fa5 100644
>>>>> --- a/doc/guides/conf.py
>>>>> +++ b/doc/guides/conf.py
>>>>> @@ -401,6 +401,11 @@ def setup(app):
>>>>>                              'Features',
>>>>>                              'Features availability in compression drivers',
>>>>>                              'Feature')
>>>>> +    table_file = dirname(__file__) +
>> '/vdpadevs/overview_feature_table.txt'
>>>>> +    generate_overview_table(table_file, 1,
>>>>> +                            'Features',
>>>>> +                            'Features availability in vDPA drivers',
>>>>> +                            'Feature')
>>>>>
>>>>>      if LooseVersion(sphinx_version) < LooseVersion('1.3.1'):
>>>>>          print('Upgrade sphinx to version >= 1.3.1 for '
>>>>> diff --git a/doc/guides/vdpadevs/features/default.ini
>>>>> b/doc/guides/vdpadevs/features/default.ini
>>>>> new file mode 100644
>>>>> index 0000000..a3e0bc7
>>>>> --- /dev/null
>>>>> +++ b/doc/guides/vdpadevs/features/default.ini
>>>>> @@ -0,0 +1,55 @@
>>>>> +;
>>>>> +; Features of a default vDPA driver.
>>>>> +;
>>>>> +; This file defines the features that are valid for inclusion in ;
>>>>> +the other driver files and also the order that they appear in ; the
>>>>> +features table in the documentation. The feature description ;
>>>>> +string should not exceed feature_str_len defined in conf.py.
>>>>> +;
>>>> I think some entries below could be removed for vDPA.
>>> +1
>>>
>>>>> +[Features]
>>>>> +csum                 =
>>>>> +guest csum           =
>>>>> +mac                  =
>>>>> +gso                  =
>>>>> +guest tso4           =
>>>>> +guest tso6           =
>>>>> +ecn                  =
>>>>> +ufo                  =
>>>>> +host tso4            =
>>>>> +host tso6            =
>>>>> +mrg rxbuf            =
>>>>> +ctrl vq              =
>>>>> +ctrl rx              =
>>>>> +any layout           =
>>>>> +guest announce       =
>>>>> +mq                   =
>>>>> +version 1            =
>>>>> +log all              =
>>>>> +protocol features    =
>>> We may not need to list this. The proto * would imply it.
> 
> So can you explain why this flag is exposed by the vhost features?
> 
>>>>> +indirect desc        =
>>>>> +event idx            =
>>>>> +mtu                  =
>>>>> +in_order             =
>>>>> +IOMMU platform       =
>>>>> +packed               =
>>>>> +proto mq             =
>>>>> +proto log shmfd      =
>>>>> +proto rarp           =
>>>>> +proto reply ack      =
>>>>> +proto slave req      =
>>> Ditto. This feature is to be used by other features.
>>> Features like host notifier would imply it.
> 
> So can you explain why this flag is exposed by the vhost protocol features?
> 
>>>>> +proto crypto session =
>>> We don't need to list this before we officially support the crypto
>>> vDPA device.
>>>
> 
> Ok, will remove.
> 
>>>>> +proto host notifier  =
>>>>> +proto pagefault      =
>>>>> +Multiprocess aware   =
>>> There is no support for this in library currently.
>>> To support it, we need to sync vhost fds and messages among processes.
>>>
> 
> Ok, will remove. 
> 
>>>>> +BSD nic_uio          =
>>>>> +Linux UIO            =
>>>> E.g. UIO, which cannot be used since vDPA requires an IOMMU.
> 
> Ok, will remove.
> 
>>>>> +Linux VFIO           =
>>>>> +Other kdrv           =
>>>>> +ARMv7                =
>>>>> +ARMv8                =
>>>>> +Power8               =
>>>>> +x86-32               =
>>>>> +x86-64               =
>>>>> +Usage doc            =
>>>>> +Design doc           =
>>>>> +Perf doc             =
>>>>> \ No newline at end of file
>>>>> diff --git a/doc/guides/vdpadevs/features_overview.rst
>>>>> b/doc/guides/vdpadevs/features_overview.rst
>>>>> new file mode 100644
>>>>> index 0000000..c7745b7
>>>>> --- /dev/null
>>>>> +++ b/doc/guides/vdpadevs/features_overview.rst
>>>>> @@ -0,0 +1,65 @@
>>>>> +..  SPDX-License-Identifier: BSD-3-Clause
>>>>> +    Copyright 2019 Mellanox Technologies, Ltd
>>>>> +
>>>>> +Overview of vDPA drivers features
>>>>> +=================================
>>>>> +
>>>>> +This section explains the supported features that are listed in the table
>> below.
>>>>> +
>>>>> +  * csum - Device can handle packets with partial checksum.
>>>>> +  * guest csum - Guest can handle packets with partial checksum.
>>>>> +  * mac - Device has given MAC address.
>>>>> +  * gso - Device can handle packets with any GSO type.
>>>>> +  * guest tso4 - Guest can receive TSOv4.
>>>>> +  * guest tso6 - Guest can receive TSOv6.
>>>>> +  * ecn - Device can receive TSO with ECN.
>>>>> +  * ufo - Device can receive UFO.
>>>>> +  * host tso4 - Device can receive TSOv4.
>>>>> +  * host tso6 - Device can receive TSOv6.
>>>>> +  * mrg rxbuf - Guest can merge receive buffers.
>>>>> +  * ctrl vq - Control channel is available.
>>>>> +  * ctrl rx - Control channel RX mode support.
>>>>> +  * any layout - Device can handle any descriptor layout.
>>>>> +  * guest announce - Guest can send gratuitous packets.
>>>>> +  * mq - Device supports Receive Flow Steering.
>>>>> +  * version 1 - v1.0 compliant.
>>>>> +  * log all - Device can log all write descriptors (live migration).
>>>>> +  * protocol features - Protocol features negotiation support.
>>>>> +  * indirect desc - Indirect buffer descriptors support.
>>>>> +  * event idx - Support for avail_idx and used_idx fields.
>>>>> +  * mtu - Host can advise the guest with its maximum supported MTU.
>>>>> +  * in_order - Device can use descriptors in ring order.
>>>>> +  * IOMMU platform - Device support IOMMU addresses.
>>>>> +  * packed - Device support packed virtio queues.
>>>>> +  * proto mq - Support the number of queues query.
>>>>> +  * proto log shmfd - Guest support setting log base.
>>>>> +  * proto rarp - Host can broadcast a fake RARP after live migration.
>>>>> +  * proto reply ack - Host support requested operation status ack.
>>>>> +  * proto slave req - Allow the slave to make requests to the master.
>>>>> +  * proto crypto session - Support crypto session creation.
>>>>> +  * proto host notifier - Host can register memory region based host
>> notifiers.
>>>>> +  * proto pagefault - Slave expose page-fault FD for migration process.
>>>>> +  * Multiprocess aware - Driver can be used for primary-secondary
>> process model.
>>>>> +  * BSD nic_uio - BSD ``nic_uio`` module supported.
>>>>> +  * Linux UIO - Works with ``igb_uio`` kernel module.
>>>>> +  * Linux VFIO - Works with ``vfio-pci`` kernel module.
>>>>> +  * Other kdrv - Kernel module other than above ones supported.
>>>>> +  * ARMv7 - Support armv7 architecture.
>>>>> +  * ARMv8 - Support armv8a (64bit) architecture.
>>>>> +  * Power8 - Support PowerPC architecture.
>>>>> +  * x86-32 - Support 32bits x86 architecture.
>>>>> +  * x86-64 - Support 64bits x86 architecture.
>>>>> +  * Usage doc - Documentation describes usage, In
>> ``doc/guides/vdpadevs/``.
>>>>> +  * Design doc - Documentation describes design. In
>> ``doc/guides/vdpadevs/``.
>>>>> +  * Perf doc - Documentation describes performance values, In
>> ``doc/perf/``.
>>
>> Are you going to put Y mark for all these features in v20.02 release cycle?
> 
> No.
> 
>> Basically the question is: is it OK to have features that no driver supports?
>> "Dead" features do not look nice.
>> I would say yes for architecture support since it is better to list all
>> architectures supported by DPDK and make it clear which architectures are
>> supported by particular vDPA driver.
>> May be it is OK for features which are directly correspond to virtio/vhost
>> features (of course, it is very useful to know spec compliance).
>> In this case I think it would be very helpful to add references to spec
>> sections.
> 
> These features are supported in vhost library. So I think it may sense to add them.
> I think, especially in vDPA case, It is good for the vhost users to see what is supported and what is not supported by the vDPA driver.
> 
> Which spec are you talking about?

Vhost user protocol specification and virtio specification
(since some features directly correspond to virtio features).

[1]
https://docs.oasis-open.org/virtio/virtio/v1.1/csprd01/virtio-v1.1-csprd01.html
[2] https://qemu.weilnetz.de/doc/interop/vhost-user.html

>> Also I like doc/guides/nics/features.rst and would like to know why the
>> practice is not used here. I'm talking about features description format.
> 
> Yes, but except nic class, no class uses it.

IMO it is one of the best practices in DPDK and should be used
by other classes as well.

> Maybe it is because there are not a lot of different ways to add the configuration\capability.
> Here, for example, most of them are just in feature bitmap.

A more formal description simplifies searching and helps driver
developers and maintainers a lot.

>> [snip]
> 


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dpdk-dev] [PATCH v1 2/3] doc: add vDPA feature table
  2020-01-08 13:11           ` Andrew Rybchenko
@ 2020-01-08 17:01             ` Matan Azrad
  0 siblings, 0 replies; 50+ messages in thread
From: Matan Azrad @ 2020-01-08 17:01 UTC (permalink / raw)
  To: Andrew Rybchenko, Tiwei Bie, Maxime Coquelin
  Cc: Zhihong Wang, Xiao Wang, Ferruh Yigit, dev, Thomas Monjalon

Hi Andrew

From: Andrew Rybchenko
> Sent: Wednesday, January 8, 2020 3:11 PM
> To: Matan Azrad <matan@mellanox.com>; Tiwei Bie <tiwei.bie@intel.com>;
> Maxime Coquelin <maxime.coquelin@redhat.com>
> Cc: Zhihong Wang <zhihong.wang@intel.com>; Xiao Wang
> <xiao.w.wang@intel.com>; Ferruh Yigit <ferruh.yigit@intel.com>;
> dev@dpdk.org; Thomas Monjalon <thomas@monjalon.net>
> Subject: Re: [dpdk-dev] [PATCH v1 2/3] doc: add vDPA feature table
> 
> On 1/8/20 1:42 PM, Matan Azrad wrote:
> > Hi all
> >
> > Thanks very much for the review.
> > Please see below.
> >
> > From: Andrew Rybchenko
> >> On 1/8/20 8:28 AM, Tiwei Bie wrote:
> >>> On Tue, Jan 07, 2020 at 06:39:36PM +0100, Maxime Coquelin wrote:
> >>>> On 12/25/19 4:19 PM, Matan Azrad wrote:
> >>>>> Add vDPA devices features table and explanation.
> >>>>>
> >>>>> Any vDPA driver can add its own supported features by ading a new
> >>>>> ini file to the features directory in doc/guides/vdpadevs/features.
> >>>>>
> >>>>> Signed-off-by: Matan Azrad <matan@mellanox.com>
> >>>>> ---
> >>>>>  doc/guides/conf.py                        |  5 +++
> >>>>>  doc/guides/vdpadevs/features/default.ini  | 55
> >>>>> ++++++++++++++++++++++++++
> >> doc/guides/vdpadevs/features_overview.rst | 65
> >> +++++++++++++++++++++++++++++++
> >>>>>  doc/guides/vdpadevs/index.rst             |  1 +
> >>>>>  4 files changed, 126 insertions(+)  create mode 100644
> >>>>> doc/guides/vdpadevs/features/default.ini
> >>>>>  create mode 100644 doc/guides/vdpadevs/features_overview.rst
> >>>>>
> >>>>> diff --git a/doc/guides/conf.py b/doc/guides/conf.py index
> >>>>> 0892c06..c368fa5 100644
> >>>>> --- a/doc/guides/conf.py
> >>>>> +++ b/doc/guides/conf.py
> >>>>> @@ -401,6 +401,11 @@ def setup(app):
> >>>>>                              'Features',
> >>>>>                              'Features availability in compression drivers',
> >>>>>                              'Feature')
> >>>>> +    table_file = dirname(__file__) +
> >> '/vdpadevs/overview_feature_table.txt'
> >>>>> +    generate_overview_table(table_file, 1,
> >>>>> +                            'Features',
> >>>>> +                            'Features availability in vDPA drivers',
> >>>>> +                            'Feature')
> >>>>>
> >>>>>      if LooseVersion(sphinx_version) < LooseVersion('1.3.1'):
> >>>>>          print('Upgrade sphinx to version >= 1.3.1 for '
> >>>>> diff --git a/doc/guides/vdpadevs/features/default.ini
> >>>>> b/doc/guides/vdpadevs/features/default.ini
> >>>>> new file mode 100644
> >>>>> index 0000000..a3e0bc7
> >>>>> --- /dev/null
> >>>>> +++ b/doc/guides/vdpadevs/features/default.ini
> >>>>> @@ -0,0 +1,55 @@
> >>>>> +;
> >>>>> +; Features of a default vDPA driver.
> >>>>> +;
> >>>>> +; This file defines the features that are valid for inclusion in
> >>>>> +; the other driver files and also the order that they appear in ;
> >>>>> +the features table in the documentation. The feature description
> >>>>> +; string should not exceed feature_str_len defined in conf.py.
> >>>>> +;
> >>>> I think some entries below could be removed for vDPA.
> >>> +1
> >>>
> >>>>> +[Features]
> >>>>> +csum                 =
> >>>>> +guest csum           =
> >>>>> +mac                  =
> >>>>> +gso                  =
> >>>>> +guest tso4           =
> >>>>> +guest tso6           =
> >>>>> +ecn                  =
> >>>>> +ufo                  =
> >>>>> +host tso4            =
> >>>>> +host tso6            =
> >>>>> +mrg rxbuf            =
> >>>>> +ctrl vq              =
> >>>>> +ctrl rx              =
> >>>>> +any layout           =
> >>>>> +guest announce       =
> >>>>> +mq                   =
> >>>>> +version 1            =
> >>>>> +log all              =
> >>>>> +protocol features    =
> >>> We may not need to list this. The proto * would imply it.
> >
> > So can you explain why this flag is exposed by the vhost features?
> >
> >>>>> +indirect desc        =
> >>>>> +event idx            =
> >>>>> +mtu                  =
> >>>>> +in_order             =
> >>>>> +IOMMU platform       =
> >>>>> +packed               =
> >>>>> +proto mq             =
> >>>>> +proto log shmfd      =
> >>>>> +proto rarp           =
> >>>>> +proto reply ack      =
> >>>>> +proto slave req      =
> >>> Ditto. This feature is to be used by other features.
> >>> Features like host notifier would imply it.
> >
> > So can you explain why this flag is exposed by the vhost protocol features?
> >
> >>>>> +proto crypto session =
> >>> We don't need to list this before we officially support the crypto
> >>> vDPA device.
> >>>
> >
> > Ok, will remove.
> >
> >>>>> +proto host notifier  =
> >>>>> +proto pagefault      =
> >>>>> +Multiprocess aware   =
> >>> There is no support for this in library currently.
> >>> To support it, we need to sync vhost fds and messages among
> processes.
> >>>
> >
> > Ok, will remove.
> >
> >>>>> +BSD nic_uio          =
> >>>>> +Linux UIO            =
> >>>> E.g. UIO, which cannot be used since vDPA requires an IOMMU.
> >
> > Ok, will remove.
> >
> >>>>> +Linux VFIO           =
> >>>>> +Other kdrv           =
> >>>>> +ARMv7                =
> >>>>> +ARMv8                =
> >>>>> +Power8               =
> >>>>> +x86-32               =
> >>>>> +x86-64               =
> >>>>> +Usage doc            =
> >>>>> +Design doc           =
> >>>>> +Perf doc             =
> >>>>> \ No newline at end of file
> >>>>> diff --git a/doc/guides/vdpadevs/features_overview.rst
> >>>>> b/doc/guides/vdpadevs/features_overview.rst
> >>>>> new file mode 100644
> >>>>> index 0000000..c7745b7
> >>>>> --- /dev/null
> >>>>> +++ b/doc/guides/vdpadevs/features_overview.rst
> >>>>> @@ -0,0 +1,65 @@
> >>>>> +..  SPDX-License-Identifier: BSD-3-Clause
> >>>>> +    Copyright 2019 Mellanox Technologies, Ltd
> >>>>> +
> >>>>> +Overview of vDPA drivers features
> >>>>> +=================================
> >>>>> +
> >>>>> +This section explains the supported features that are listed in
> >>>>> +the table
> >> below.
> >>>>> +
> >>>>> +  * csum - Device can handle packets with partial checksum.
> >>>>> +  * guest csum - Guest can handle packets with partial checksum.
> >>>>> +  * mac - Device has given MAC address.
> >>>>> +  * gso - Device can handle packets with any GSO type.
> >>>>> +  * guest tso4 - Guest can receive TSOv4.
> >>>>> +  * guest tso6 - Guest can receive TSOv6.
> >>>>> +  * ecn - Device can receive TSO with ECN.
> >>>>> +  * ufo - Device can receive UFO.
> >>>>> +  * host tso4 - Device can receive TSOv4.
> >>>>> +  * host tso6 - Device can receive TSOv6.
> >>>>> +  * mrg rxbuf - Guest can merge receive buffers.
> >>>>> +  * ctrl vq - Control channel is available.
> >>>>> +  * ctrl rx - Control channel RX mode support.
> >>>>> +  * any layout - Device can handle any descriptor layout.
> >>>>> +  * guest announce - Guest can send gratuitous packets.
> >>>>> +  * mq - Device supports Receive Flow Steering.
> >>>>> +  * version 1 - v1.0 compliant.
> >>>>> +  * log all - Device can log all write descriptors (live migration).
> >>>>> +  * protocol features - Protocol features negotiation support.
> >>>>> +  * indirect desc - Indirect buffer descriptors support.
> >>>>> +  * event idx - Support for avail_idx and used_idx fields.
> >>>>> +  * mtu - Host can advise the guest with its maximum supported
> MTU.
> >>>>> +  * in_order - Device can use descriptors in ring order.
> >>>>> +  * IOMMU platform - Device support IOMMU addresses.
> >>>>> +  * packed - Device support packed virtio queues.
> >>>>> +  * proto mq - Support the number of queues query.
> >>>>> +  * proto log shmfd - Guest support setting log base.
> >>>>> +  * proto rarp - Host can broadcast a fake RARP after live migration.
> >>>>> +  * proto reply ack - Host support requested operation status ack.
> >>>>> +  * proto slave req - Allow the slave to make requests to the master.
> >>>>> +  * proto crypto session - Support crypto session creation.
> >>>>> +  * proto host notifier - Host can register memory region based
> >>>>> + host
> >> notifiers.
> >>>>> +  * proto pagefault - Slave expose page-fault FD for migration
> process.
> >>>>> +  * Multiprocess aware - Driver can be used for primary-secondary
> >> process model.
> >>>>> +  * BSD nic_uio - BSD ``nic_uio`` module supported.
> >>>>> +  * Linux UIO - Works with ``igb_uio`` kernel module.
> >>>>> +  * Linux VFIO - Works with ``vfio-pci`` kernel module.
> >>>>> +  * Other kdrv - Kernel module other than above ones supported.
> >>>>> +  * ARMv7 - Support armv7 architecture.
> >>>>> +  * ARMv8 - Support armv8a (64bit) architecture.
> >>>>> +  * Power8 - Support PowerPC architecture.
> >>>>> +  * x86-32 - Support 32bits x86 architecture.
> >>>>> +  * x86-64 - Support 64bits x86 architecture.
> >>>>> +  * Usage doc - Documentation describes usage, In
> >> ``doc/guides/vdpadevs/``.
> >>>>> +  * Design doc - Documentation describes design. In
> >> ``doc/guides/vdpadevs/``.
> >>>>> +  * Perf doc - Documentation describes performance values, In
> >> ``doc/perf/``.
> >>
> >> Are you going to put Y mark for all these features in v20.02 release cycle?
> >
> > No.
> >
> >> Basically the question is: is it OK to have features that no driver supports?
> >> "Dead" features do not look nice.
> >> I would say yes for architecture support since it is better to list
> >> all architectures supported by DPDK and make it clear which
> >> architectures are supported by particular vDPA driver.
> >> May be it is OK for features which are directly correspond to
> >> virtio/vhost features (of course, it is very useful to know spec
> compliance).
> >> In this case I think it would be very helpful to add references to
> >> spec sections.
> >
> > These features are supported in vhost library. So I think it may sense to add
> them.
> > I think, especially in vDPA case, It is good for the vhost users to see what is
> supported and what is not supported by the vDPA driver.
> >
> > Which spec are you talking about?
> 
> Vhost user protocol specification and virtio specification (since some features
> directly correspond to virtio features).
> 
> [1]
> https://docs.oasis-open.org/virtio/virtio/v1.1/csprd01/virtio-v1.1-csprd01.html
> [2] https://qemu.weilnetz.de/doc/interop/vhost-user.html
> 

Thanks, will add these useful links.

> >> Also I like doc/guides/nics/features.rst and would like to know why
> >> the practice is not used here. I'm talking about features description
> format.
> >
> > Yes, but except nic class, no class uses it.
> 
> IMO it is one of best practices in DPDK which should be used by other classes
> as well.
> > Maybe it is because there are not a lot of different ways to add the
> configuration\capability.
> > Here, for example, most of them are just in feature bitmap.
> 
> More formal description simplifies search and helps driver developers and
> maintainers a lot.


I can add more description of how the capability is exposed for the features.
All of them are really the same, so IMO we should not rewrite the same text again and again.

What do you think?

> >> [snip]
> >


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dpdk-dev] [PATCH v1 1/3] drivers: introduce vDPA class
  2020-01-07 17:32   ` Maxime Coquelin
@ 2020-01-08 21:28     ` Thomas Monjalon
  2020-01-09  8:00       ` Maxime Coquelin
  0 siblings, 1 reply; 50+ messages in thread
From: Thomas Monjalon @ 2020-01-08 21:28 UTC (permalink / raw)
  To: Maxime Coquelin
  Cc: Matan Azrad, Tiwei Bie, Zhihong Wang, Xiao Wang, dev, Ferruh Yigit, dev

07/01/2020 18:32, Maxime Coquelin:
> Hi Matan,
> 
> On 12/25/19 4:19 PM, Matan Azrad wrote:
> > The vDPA (vhost data path acceleration) drivers provide support for
> > the vDPA operations introduced by the rte_vhost library.
> > 
> > Any driver which provides the vDPA operations should be moved\added to
> > the vdpa class under drivers/vdpa/.
> > 
> > Create the general files for vDPA class in drivers and in documentation.
> > 
> > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > ---
> >  doc/guides/index.rst          |  1 +
> >  doc/guides/vdpadevs/index.rst | 13 +++++++++++++
> >  drivers/Makefile              |  2 ++
> >  drivers/meson.build           |  1 +
> >  drivers/vdpa/Makefile         |  8 ++++++++
> >  drivers/vdpa/meson.build      |  8 ++++++++
> >  6 files changed, 33 insertions(+)
> >  create mode 100644 doc/guides/vdpadevs/index.rst
> >  create mode 100644 drivers/vdpa/Makefile
> >  create mode 100644 drivers/vdpa/meson.build
> > 
> 
> Looks good to me. Just wondering if we need a dedicated maintainer for
> this new class of devices?

We must create a new section in MAINTAINERS file for vDPA drivers.
Maxime, are you OK to merge those drivers in dpdk-next-virtio tree?



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dpdk-dev] [PATCH v1 2/3] doc: add vDPA feature table
  2020-01-08 10:42         ` Matan Azrad
  2020-01-08 13:11           ` Andrew Rybchenko
@ 2020-01-09  2:15           ` Tiwei Bie
  2020-01-09  8:08             ` Matan Azrad
  1 sibling, 1 reply; 50+ messages in thread
From: Tiwei Bie @ 2020-01-09  2:15 UTC (permalink / raw)
  To: Matan Azrad
  Cc: Andrew Rybchenko, Maxime Coquelin, Zhihong Wang, Xiao Wang,
	Ferruh Yigit, dev, Thomas Monjalon

On Wed, Jan 08, 2020 at 10:42:48AM +0000, Matan Azrad wrote:
> Hi all
> 
> Thanks very much for the review.
> Please see below.
> 
> From: Andrew Rybchenko
> > On 1/8/20 8:28 AM, Tiwei Bie wrote:
> > > On Tue, Jan 07, 2020 at 06:39:36PM +0100, Maxime Coquelin wrote:
> > >> On 12/25/19 4:19 PM, Matan Azrad wrote:
> > >>> Add vDPA devices features table and explanation.
> > >>>
> > >>> Any vDPA driver can add its own supported features by ading a new
> > >>> ini file to the features directory in doc/guides/vdpadevs/features.
> > >>>
> > >>> Signed-off-by: Matan Azrad <matan@mellanox.com>
> > >>> ---
> > >>>  doc/guides/conf.py                        |  5 +++
> > >>>  doc/guides/vdpadevs/features/default.ini  | 55
> > >>> ++++++++++++++++++++++++++
> > doc/guides/vdpadevs/features_overview.rst | 65
> > +++++++++++++++++++++++++++++++
> > >>>  doc/guides/vdpadevs/index.rst             |  1 +
> > >>>  4 files changed, 126 insertions(+)
> > >>>  create mode 100644 doc/guides/vdpadevs/features/default.ini
> > >>>  create mode 100644 doc/guides/vdpadevs/features_overview.rst
> > >>>
> > >>> diff --git a/doc/guides/conf.py b/doc/guides/conf.py index
> > >>> 0892c06..c368fa5 100644
> > >>> --- a/doc/guides/conf.py
> > >>> +++ b/doc/guides/conf.py
> > >>> @@ -401,6 +401,11 @@ def setup(app):
> > >>>                              'Features',
> > >>>                              'Features availability in compression drivers',
> > >>>                              'Feature')
> > >>> +    table_file = dirname(__file__) +
> > '/vdpadevs/overview_feature_table.txt'
> > >>> +    generate_overview_table(table_file, 1,
> > >>> +                            'Features',
> > >>> +                            'Features availability in vDPA drivers',
> > >>> +                            'Feature')
> > >>>
> > >>>      if LooseVersion(sphinx_version) < LooseVersion('1.3.1'):
> > >>>          print('Upgrade sphinx to version >= 1.3.1 for '
> > >>> diff --git a/doc/guides/vdpadevs/features/default.ini
> > >>> b/doc/guides/vdpadevs/features/default.ini
> > >>> new file mode 100644
> > >>> index 0000000..a3e0bc7
> > >>> --- /dev/null
> > >>> +++ b/doc/guides/vdpadevs/features/default.ini
> > >>> @@ -0,0 +1,55 @@
> > >>> +;
> > >>> +; Features of a default vDPA driver.
> > >>> +;
> > >>> +; This file defines the features that are valid for inclusion in ;
> > >>> +the other driver files and also the order that they appear in ; the
> > >>> +features table in the documentation. The feature description ;
> > >>> +string should not exceed feature_str_len defined in conf.py.
> > >>> +;
> > >> I think some entries below could be removed for vDPA.
> > > +1
> > >
> > >>> +[Features]
> > >>> +csum                 =
> > >>> +guest csum           =
> > >>> +mac                  =
> > >>> +gso                  =
> > >>> +guest tso4           =
> > >>> +guest tso6           =
> > >>> +ecn                  =
> > >>> +ufo                  =
> > >>> +host tso4            =
> > >>> +host tso6            =
> > >>> +mrg rxbuf            =
> > >>> +ctrl vq              =
> > >>> +ctrl rx              =
> > >>> +any layout           =
> > >>> +guest announce       =
> > >>> +mq                   =
> > >>> +version 1            =
> > >>> +log all              =
> > >>> +protocol features    =
> > > We may not need to list this. The proto * would imply it.
> 
> So can you explain why this flag is exposed by the vhost features?

This feature is needed in vhost-user to allow master and
slave to negotiate protocol features in a backward compatible
way. Support for any proto features would imply the support
of this feature. If we want to shorten this list, it can be
a good candidate for removal.
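
For reference, a tiny sketch (not from this series) of the implication described above; the bit values follow the vhost-user specification and the helper name is hypothetical:

#include <stdint.h>

#define VHOST_USER_F_PROTOCOL_FEATURES	30	/* "protocol features" row */
#define VHOST_USER_PROTOCOL_F_MQ	 0	/* one of the "proto *" rows */

static uint64_t
advertised_features(uint64_t features, uint64_t protocol_features)
{
	/* Offering any protocol feature only works if protocol feature
	 * negotiation itself is offered, so the "proto *" rows imply
	 * the "protocol features" row. */
	if (protocol_features != 0)
		features |= 1ULL << VHOST_USER_F_PROTOCOL_FEATURES;
	return features;
}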

> 
> > >>> +indirect desc        =
> > >>> +event idx            =
> > >>> +mtu                  =
> > >>> +in_order             =
> > >>> +IOMMU platform       =
> > >>> +packed               =
> > >>> +proto mq             =
> > >>> +proto log shmfd      =
> > >>> +proto rarp           =
> > >>> +proto reply ack      =
> > >>> +proto slave req      =
> > > Ditto. This feature is to be used by other features.
> > > Features like host notifier would imply it.
> 
> So can you explain why this flag is exposed by the vhost protocol features?

This feature allows master and slave to set up a slave channel
in a backward compatible way. Having a slave channel between
master and slave without any other features using it isn't very
useful, i.e. this feature is supposed to be used by features
like pagefault and host notifier, and support of those features
would imply the support of this feature as well. So it can be a
good candidate for removal to shorten this list.

> 
> > >>> +proto crypto session =
> > > We don't need to list this before we officially support the crypto
> > > vDPA device.
> > >
> 
> Ok, will remove.
> 
> > >>> +proto host notifier  =
> > >>> +proto pagefault      =
> > >>> +Multiprocess aware   =
> > > There is no support for this in library currently.
> > > To support it, we need to sync vhost fds and messages among processes.
> > >
> 
> Ok, will remove. 
> 
> > >>> +BSD nic_uio          =
> > >>> +Linux UIO            =
> > >> E.g. UIO, which cannot be used since vDPA requires an IOMMU.
> 
> Ok, will remove.
> 
> > >>> +Linux VFIO           =
> > >>> +Other kdrv           =
> > >>> +ARMv7                =
> > >>> +ARMv8                =
> > >>> +Power8               =
> > >>> +x86-32               =
> > >>> +x86-64               =
> > >>> +Usage doc            =
> > >>> +Design doc           =
> > >>> +Perf doc             =
> > >>> \ No newline at end of file
> > >>> diff --git a/doc/guides/vdpadevs/features_overview.rst
> > >>> b/doc/guides/vdpadevs/features_overview.rst
> > >>> new file mode 100644
> > >>> index 0000000..c7745b7
> > >>> --- /dev/null
> > >>> +++ b/doc/guides/vdpadevs/features_overview.rst
> > >>> @@ -0,0 +1,65 @@
> > >>> +..  SPDX-License-Identifier: BSD-3-Clause
> > >>> +    Copyright 2019 Mellanox Technologies, Ltd
> > >>> +
> > >>> +Overview of vDPA drivers features
> > >>> +=================================
> > >>> +
> > >>> +This section explains the supported features that are listed in the table
> > below.
> > >>> +
> > >>> +  * csum - Device can handle packets with partial checksum.
> > >>> +  * guest csum - Guest can handle packets with partial checksum.
> > >>> +  * mac - Device has given MAC address.
> > >>> +  * gso - Device can handle packets with any GSO type.
> > >>> +  * guest tso4 - Guest can receive TSOv4.
> > >>> +  * guest tso6 - Guest can receive TSOv6.
> > >>> +  * ecn - Device can receive TSO with ECN.
> > >>> +  * ufo - Device can receive UFO.
> > >>> +  * host tso4 - Device can receive TSOv4.
> > >>> +  * host tso6 - Device can receive TSOv6.
> > >>> +  * mrg rxbuf - Guest can merge receive buffers.
> > >>> +  * ctrl vq - Control channel is available.
> > >>> +  * ctrl rx - Control channel RX mode support.
> > >>> +  * any layout - Device can handle any descriptor layout.
> > >>> +  * guest announce - Guest can send gratuitous packets.
> > >>> +  * mq - Device supports Receive Flow Steering.
> > >>> +  * version 1 - v1.0 compliant.
> > >>> +  * log all - Device can log all write descriptors (live migration).
> > >>> +  * protocol features - Protocol features negotiation support.
> > >>> +  * indirect desc - Indirect buffer descriptors support.
> > >>> +  * event idx - Support for avail_idx and used_idx fields.
> > >>> +  * mtu - Host can advise the guest with its maximum supported MTU.
> > >>> +  * in_order - Device can use descriptors in ring order.
> > >>> +  * IOMMU platform - Device support IOMMU addresses.
> > >>> +  * packed - Device support packed virtio queues.
> > >>> +  * proto mq - Support the number of queues query.
> > >>> +  * proto log shmfd - Guest support setting log base.
> > >>> +  * proto rarp - Host can broadcast a fake RARP after live migration.
> > >>> +  * proto reply ack - Host support requested operation status ack.
> > >>> +  * proto slave req - Allow the slave to make requests to the master.
> > >>> +  * proto crypto session - Support crypto session creation.
> > >>> +  * proto host notifier - Host can register memory region based host
> > notifiers.
> > >>> +  * proto pagefault - Slave expose page-fault FD for migration process.
> > >>> +  * Multiprocess aware - Driver can be used for primary-secondary
> > process model.
> > >>> +  * BSD nic_uio - BSD ``nic_uio`` module supported.
> > >>> +  * Linux UIO - Works with ``igb_uio`` kernel module.
> > >>> +  * Linux VFIO - Works with ``vfio-pci`` kernel module.
> > >>> +  * Other kdrv - Kernel module other than above ones supported.
> > >>> +  * ARMv7 - Support armv7 architecture.
> > >>> +  * ARMv8 - Support armv8a (64bit) architecture.
> > >>> +  * Power8 - Support PowerPC architecture.
> > >>> +  * x86-32 - Support 32bits x86 architecture.
> > >>> +  * x86-64 - Support 64bits x86 architecture.
> > >>> +  * Usage doc - Documentation describes usage, In
> > ``doc/guides/vdpadevs/``.
> > >>> +  * Design doc - Documentation describes design. In
> > ``doc/guides/vdpadevs/``.
> > >>> +  * Perf doc - Documentation describes performance values, In
> > ``doc/perf/``.
> > 
> > Are you going to put Y mark for all these features in v20.02 release cycle?
> 
> No.
> 
> > Basically the question is: is it OK to have features that no driver supports?
> > "Dead" features do not look nice.
> > I would say yes for architecture support since it is better to list all
> > architectures supported by DPDK and make it clear which architectures are
> > supported by particular vDPA driver.
> > May be it is OK for features which are directly correspond to virtio/vhost
> > features (of course, it is very useful to know spec compliance).
> > In this case I think it would be very helpful to add references to spec
> > sections.
> 
> These features are supported in vhost library. So I think it may sense to add them.
> I think, especially in vDPA case, It is good for the vhost users to see what is supported and what is not supported by the vDPA driver.
> 
> Which spec are you talking about?
> 
> > Also I like doc/guides/nics/features.rst and would like to know why the
> > practice is not used here. I'm talking about features description format.
> 
> Yes, but except nic class, no class uses it.
> Maybe it is because there are not a lot of different ways to add the configuration\capability.
> Here, for example, most of them are just in feature bitmap.
>  
> > [snip]
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA device drivers
  2020-01-08 12:58         ` Thomas Monjalon
@ 2020-01-09  2:27           ` Xu, Rosen
  2020-01-09  8:41             ` Thomas Monjalon
  0 siblings, 1 reply; 50+ messages in thread
From: Xu, Rosen @ 2020-01-09  2:27 UTC (permalink / raw)
  To: Thomas Monjalon, Matan Azrad, Maxime Coquelin, Bie, Tiwei, Wang,
	Zhihong, Wang, Xiao W
  Cc: Yigit, Ferruh, dev, Pei, Andy

Hi,

> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Wednesday, January 08, 2020 20:59
> To: Matan Azrad <matan@mellanox.com>; Maxime Coquelin
> <maxime.coquelin@redhat.com>; Bie, Tiwei <tiwei.bie@intel.com>; Wang,
> Zhihong <zhihong.wang@intel.com>; Wang, Xiao W
> <xiao.w.wang@intel.com>; Xu, Rosen <rosen.xu@intel.com>
> Cc: Yigit, Ferruh <ferruh.yigit@intel.com>; dev@dpdk.org; Pei, Andy
> <andy.pei@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA device
> drivers
> 
> 08/01/2020 13:39, Xu, Rosen:
> > From: Matan Azrad <matan@mellanox.com>
> > > From: Xu, Rosen
> > > > Did you think about OVS DPDK?
> > > > vDPA is a basic module for OVS, currently it will take some
> > > > exception path packet processing for OVS, so it still needs to integrate
> eth_dev.
> > >
> > > I don't understand your question.
> > >
> > > What do you mean by "integrate eth_dev"?
> >
> > My questions is in OVS DPDK scenario vDPA device implements eth_dev
> > ops, so create a new class and move ifc code to this new class is not ok.
> 
> 1/ I don't understand the relation with OVS.
>
> 2/ no, vDPA device implements vDPA ops.
> If it implements ethdev ops, it is an ethdev device.
> 
> Please show an example of what you claim.

Answers to 1 and 2.

In OVS DPDK, each DPDK network device (such as a NIC, vHost, etc.) needs to be implemented
as an rte_eth_dev and provide eth_dev_ops such as packet TX/RX for OVS.

Take vHost (the Virtio back end) for example; OVS starts up a vHost interface like this:
ovs-vsctl add-port br0 vhost-user-1 -- set Interface vhost-user-1 type=dpdkvhostuser
drivers/net/vhost implements vHost as an rte_eth_dev and is integrated in OVS.
OVS can send/receive packets to/from the VM with rte_eth_tx_burst() and rte_eth_rx_burst(),
which call the eth_dev_ops implementation of drivers/net/vhost.

vDPA is also a Virtio back end and works like vHost; same as vHost, it would be implemented as an rte_eth_dev and
also be integrated into OVS.

So, it's not OK to move the ifc code out of drivers/net.
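
For reference, a minimal sketch (not from this series) of the burst path described above, where the vhost port is a regular ethdev; the port/queue ids and function name are hypothetical:

#include <rte_ethdev.h>
#include <rte_mbuf.h>

#define BURST_SIZE 32

static void
exception_path_poll(uint16_t vhost_port, uint16_t queue_id)
{
	struct rte_mbuf *pkts[BURST_SIZE];
	uint16_t nb_rx, nb_tx;

	/* Receive packets coming from the guest via the vhost ethdev port. */
	nb_rx = rte_eth_rx_burst(vhost_port, queue_id, pkts, BURST_SIZE);
	if (nb_rx == 0)
		return;

	/* After processing, send them back out through the same port
	 * (echoed here only to keep the sketch short). */
	nb_tx = rte_eth_tx_burst(vhost_port, queue_id, pkts, nb_rx);

	/* Free whatever the TX path did not consume. */
	while (nb_tx < nb_rx)
		rte_pktmbuf_free(pkts[nb_tx++]);
}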

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dpdk-dev] [PATCH v1 1/3] drivers: introduce vDPA class
  2020-01-08 21:28     ` Thomas Monjalon
@ 2020-01-09  8:00       ` Maxime Coquelin
  0 siblings, 0 replies; 50+ messages in thread
From: Maxime Coquelin @ 2020-01-09  8:00 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Matan Azrad, Tiwei Bie, Zhihong Wang, Xiao Wang, dev, Ferruh Yigit



On 1/8/20 10:28 PM, Thomas Monjalon wrote:
> 07/01/2020 18:32, Maxime Coquelin:
>> Hi Matan,
>>
>> On 12/25/19 4:19 PM, Matan Azrad wrote:
>>> The vDPA (vhost data path acceleration) drivers provide support for
>>> the vDPA operations introduced by the rte_vhost library.
>>>
>>> Any driver which provides the vDPA operations should be moved\added to
>>> the vdpa class under drivers/vdpa/.
>>>
>>> Create the general files for vDPA class in drivers and in documentation.
>>>
>>> Signed-off-by: Matan Azrad <matan@mellanox.com>
>>> ---
>>>  doc/guides/index.rst          |  1 +
>>>  doc/guides/vdpadevs/index.rst | 13 +++++++++++++
>>>  drivers/Makefile              |  2 ++
>>>  drivers/meson.build           |  1 +
>>>  drivers/vdpa/Makefile         |  8 ++++++++
>>>  drivers/vdpa/meson.build      |  8 ++++++++
>>>  6 files changed, 33 insertions(+)
>>>  create mode 100644 doc/guides/vdpadevs/index.rst
>>>  create mode 100644 drivers/vdpa/Makefile
>>>  create mode 100644 drivers/vdpa/meson.build
>>>
>>
>> Looks good to me. Just wondering if we need a dedicated maintainer for
>> this new class of devices?
> 
> We must create a new section in MAINTAINERS file for vDPA drivers.
> Maxime, are you OK to merge those drivers in dpdk-next-virtio tree?

Sure, I can do that.

Maxime


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dpdk-dev] [PATCH v1 2/3] doc: add vDPA feature table
  2020-01-09  2:15           ` Tiwei Bie
@ 2020-01-09  8:08             ` Matan Azrad
  0 siblings, 0 replies; 50+ messages in thread
From: Matan Azrad @ 2020-01-09  8:08 UTC (permalink / raw)
  To: Tiwei Bie
  Cc: Andrew Rybchenko, Maxime Coquelin, Zhihong Wang, Xiao Wang,
	Ferruh Yigit, dev, Thomas Monjalon



From: Tiwei Bie
> On Wed, Jan 08, 2020 at 10:42:48AM +0000, Matan Azrad wrote:
> > Hi all
> >
> > Thanks very much for the review.
> > Please see below.
> >
> > From: Andrew Rybchenko
> > > On 1/8/20 8:28 AM, Tiwei Bie wrote:
> > > > On Tue, Jan 07, 2020 at 06:39:36PM +0100, Maxime Coquelin wrote:
> > > >> On 12/25/19 4:19 PM, Matan Azrad wrote:
> > > >>> Add vDPA devices features table and explanation.
> > > >>>
> > > >>> Any vDPA driver can add its own supported features by ading a
> > > >>> new ini file to the features directory in
> doc/guides/vdpadevs/features.
> > > >>>
> > > >>> Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > >>> ---
> > > >>>  doc/guides/conf.py                        |  5 +++
> > > >>>  doc/guides/vdpadevs/features/default.ini  | 55
> > > >>> ++++++++++++++++++++++++++
> > > doc/guides/vdpadevs/features_overview.rst | 65
> > > +++++++++++++++++++++++++++++++
> > > >>>  doc/guides/vdpadevs/index.rst             |  1 +
> > > >>>  4 files changed, 126 insertions(+)  create mode 100644
> > > >>> doc/guides/vdpadevs/features/default.ini
> > > >>>  create mode 100644 doc/guides/vdpadevs/features_overview.rst
> > > >>>
> > > >>> diff --git a/doc/guides/conf.py b/doc/guides/conf.py index
> > > >>> 0892c06..c368fa5 100644
> > > >>> --- a/doc/guides/conf.py
> > > >>> +++ b/doc/guides/conf.py
> > > >>> @@ -401,6 +401,11 @@ def setup(app):
> > > >>>                              'Features',
> > > >>>                              'Features availability in compression drivers',
> > > >>>                              'Feature')
> > > >>> +    table_file = dirname(__file__) +
> > > '/vdpadevs/overview_feature_table.txt'
> > > >>> +    generate_overview_table(table_file, 1,
> > > >>> +                            'Features',
> > > >>> +                            'Features availability in vDPA drivers',
> > > >>> +                            'Feature')
> > > >>>
> > > >>>      if LooseVersion(sphinx_version) < LooseVersion('1.3.1'):
> > > >>>          print('Upgrade sphinx to version >= 1.3.1 for '
> > > >>> diff --git a/doc/guides/vdpadevs/features/default.ini
> > > >>> b/doc/guides/vdpadevs/features/default.ini
> > > >>> new file mode 100644
> > > >>> index 0000000..a3e0bc7
> > > >>> --- /dev/null
> > > >>> +++ b/doc/guides/vdpadevs/features/default.ini
> > > >>> @@ -0,0 +1,55 @@
> > > >>> +;
> > > >>> +; Features of a default vDPA driver.
> > > >>> +;
> > > >>> +; This file defines the features that are valid for inclusion
> > > >>> +in ; the other driver files and also the order that they appear
> > > >>> +in ; the features table in the documentation. The feature
> > > >>> +description ; string should not exceed feature_str_len defined in
> conf.py.
> > > >>> +;
> > > >> I think some entries below could be removed for vDPA.
> > > > +1
> > > >
> > > >>> +[Features]
> > > >>> +csum                 =
> > > >>> +guest csum           =
> > > >>> +mac                  =
> > > >>> +gso                  =
> > > >>> +guest tso4           =
> > > >>> +guest tso6           =
> > > >>> +ecn                  =
> > > >>> +ufo                  =
> > > >>> +host tso4            =
> > > >>> +host tso6            =
> > > >>> +mrg rxbuf            =
> > > >>> +ctrl vq              =
> > > >>> +ctrl rx              =
> > > >>> +any layout           =
> > > >>> +guest announce       =
> > > >>> +mq                   =
> > > >>> +version 1            =
> > > >>> +log all              =
> > > >>> +protocol features    =
> > > > We may not need to list this. The proto * would imply it.
> >
> > So can you explain why this flag is exposed by the vhost features?
> 
> This feature is needed in vhost-user to allow master and slave to negotiate
> protocol features in a backward compatible way. Supports of any proto
> features would imply the support of this feature. If we want to shorten this
> list, it can be a good candidate for removal.
> 

Ok, will remove.

> > > >>> +indirect desc        =
> > > >>> +event idx            =
> > > >>> +mtu                  =
> > > >>> +in_order             =
> > > >>> +IOMMU platform       =
> > > >>> +packed               =
> > > >>> +proto mq             =
> > > >>> +proto log shmfd      =
> > > >>> +proto rarp           =
> > > >>> +proto reply ack      =
> > > >>> +proto slave req      =
> > > > Ditto. This feature is to be used by other features.
> > > > Features like host notifier would imply it.
> >
> > So can you explain why this flag is exposed by the vhost protocol features?
> 
> This feature allows master and slave to setup a slave channel in a backward
> compatible way. Having a slave channel between master and slave without
> any other features using it isn't very useful. I.e. this feature is supposed to be
> used by the features like pagefault, host notifier. And supports of these
> features would imply the support of this feature as well. So it can be a good
> candidate for removal to shorten this list.


Ok, will remove.
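
For reference, after these removals the remaining proto * rows correspond to
protocol feature bits that a driver reports through its get_protocol_features
op, roughly like this sketch (bit names assumed from rte_vhost.h, op signature
assumed from rte_vdpa.h):

#include <stdint.h>
#include <rte_vhost.h>

/*
 * A sketch of the protocol feature bits a vDPA driver might report; the
 * real set depends on the device. SLAVE_REQ is advertised here only
 * because host notifier support relies on the slave channel.
 */
#define SKETCH_PROTOCOL_FEATURES \
	((1ULL << VHOST_USER_PROTOCOL_F_MQ) | \
	 (1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD) | \
	 (1ULL << VHOST_USER_PROTOCOL_F_REPLY_ACK) | \
	 (1ULL << VHOST_USER_PROTOCOL_F_SLAVE_REQ) | \
	 (1ULL << VHOST_USER_PROTOCOL_F_HOST_NOTIFIER))

static int
sketch_get_protocol_features(int did, uint64_t *features)
{
	(void)did;
	*features = SKETCH_PROTOCOL_FEATURES;
	return 0;
}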

Thanks.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA device drivers
  2020-01-09  2:27           ` Xu, Rosen
@ 2020-01-09  8:41             ` Thomas Monjalon
  2020-01-09  9:23               ` Maxime Coquelin
  2020-01-09 10:53               ` Xu, Rosen
  0 siblings, 2 replies; 50+ messages in thread
From: Thomas Monjalon @ 2020-01-09  8:41 UTC (permalink / raw)
  To: Xu, Rosen
  Cc: Matan Azrad, Maxime Coquelin, Bie, Tiwei, Wang, Zhihong, Wang,
	Xiao W, Yigit, Ferruh, dev, Pei, Andy

09/01/2020 03:27, Xu, Rosen:
> Hi,
> 
> From: Thomas Monjalon <thomas@monjalon.net>
> > 08/01/2020 13:39, Xu, Rosen:
> > > From: Matan Azrad <matan@mellanox.com>
> > > > From: Xu, Rosen
> > > > > Did you think about OVS DPDK?
> > > > > vDPA is a basic module for OVS, currently it will take some
> > > > > exception path packet processing for OVS, so it still needs to integrate
> > eth_dev.
> > > >
> > > > I don't understand your question.
> > > >
> > > > What do you mean by "integrate eth_dev"?
> > >
> > > My questions is in OVS DPDK scenario vDPA device implements eth_dev
> > > ops, so create a new class and move ifc code to this new class is not ok.
> > 
> > 1/ I don't understand the relation with OVS.
> >
> > 2/ no, vDPA device implements vDPA ops.
> > If it implements ethdev ops, it is an ethdev device.
> > 
> > Please show an example of what you claim.
> 
> Answers of 1 and 2.
> 
> In OVS DPDK, each network device(such as NIC, vHost etc) of DPDK needs to be implemented
> as rte_eth_dev and provides eth_dev_ops such as packet TX/RX for OVS.

No, OVS is also using the vhost API for vhost port.

> Take vHost(Virtio back end) for example, OVS startups vHost interface like this:
> ovs-vsctl add-port br0 vhost-user-1 -- set Interface vhost-user-1 type=dpdkvhostuser
> drivers/net/vhost implements vHost as rte_eth_dev and integrated in OVS.
> OVS can send/receive packets to/from VM with rte_eth_tx_burst()  rte_eth_rx_burst()
> which call eth_dev_ops implementation of drivers/net/vhost.

No, it is using rte_vhost_dequeue_burst() and rte_vhost_enqueue_burst()
which are not in ethdev.
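
Roughly, the vhost port datapath looks like this (just a sketch with the
public librte_vhost calls, not actual OVS code):

#include <rte_mbuf.h>
#include <rte_vhost.h>

#define BURST_SZ 32

/* Poll one vhost-user queue pair: vid comes from the new_device callback,
 * mp is an mbuf pool for received packets (sketch only). */
static void
poll_vhost_port(int vid, uint16_t qpair, struct rte_mempool *mp,
		struct rte_mbuf **tx_pkts, uint16_t nb_tx)
{
	struct rte_mbuf *rx_pkts[BURST_SZ];
	uint16_t nb_rx, i;

	/* Guest TX queue has virtqueue index 2 * qpair + 1. */
	nb_rx = rte_vhost_dequeue_burst(vid, qpair * 2 + 1, mp, rx_pkts, BURST_SZ);

	/* A real switch would forward these; here they are just dropped. */
	for (i = 0; i < nb_rx; i++)
		rte_pktmbuf_free(rx_pkts[i]);

	/* Guest RX queue has virtqueue index 2 * qpair. */
	rte_vhost_enqueue_burst(vid, qpair * 2, tx_pkts, nb_tx);
}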

> vDPA is also Virtio back end and works like vHost, same as vHost,
> it will be implemented as rte_eth_dev and also be integrated into OVS.

No, vDPA is not "implemented as rte_eth_dev".

> So, it's not ok to move ifc code from drivers/net.

drivers/net/ifc has no ethdev implementation at all.


Rosen, I'm sorry, these arguments look irrelevant,
so I won't consider them as blocking the integration of this patch.



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA device drivers
  2020-01-09  8:41             ` Thomas Monjalon
@ 2020-01-09  9:23               ` Maxime Coquelin
  2020-01-09  9:49                 ` Xu, Rosen
  2020-01-09 10:53               ` Xu, Rosen
  1 sibling, 1 reply; 50+ messages in thread
From: Maxime Coquelin @ 2020-01-09  9:23 UTC (permalink / raw)
  To: Thomas Monjalon, Xu, Rosen
  Cc: Matan Azrad, Bie, Tiwei, Wang, Zhihong, Wang, Xiao W, Yigit,
	Ferruh, dev, Pei, Andy



On 1/9/20 9:41 AM, Thomas Monjalon wrote:
> 09/01/2020 03:27, Xu, Rosen:
>> Hi,
>>
>> From: Thomas Monjalon <thomas@monjalon.net>
>>> 08/01/2020 13:39, Xu, Rosen:
>>>> From: Matan Azrad <matan@mellanox.com>
>>>>> From: Xu, Rosen
>>>>>> Did you think about OVS DPDK?
>>>>>> vDPA is a basic module for OVS, currently it will take some
>>>>>> exception path packet processing for OVS, so it still needs to integrate
>>> eth_dev.
>>>>>
>>>>> I don't understand your question.
>>>>>
>>>>> What do you mean by "integrate eth_dev"?
>>>>
>>>> My questions is in OVS DPDK scenario vDPA device implements eth_dev
>>>> ops, so create a new class and move ifc code to this new class is not ok.
>>>
>>> 1/ I don't understand the relation with OVS.
>>>
>>> 2/ no, vDPA device implements vDPA ops.
>>> If it implements ethdev ops, it is an ethdev device.
>>>
>>> Please show an example of what you claim.
>>
>> Answers of 1 and 2.
>>
>> In OVS DPDK, each network device(such as NIC, vHost etc) of DPDK needs to be implemented
>> as rte_eth_dev and provides eth_dev_ops such as packet TX/RX for OVS.
> 
> No, OVS is also using the vhost API for vhost port.
> 
>> Take vHost(Virtio back end) for example, OVS startups vHost interface like this:
>> ovs-vsctl add-port br0 vhost-user-1 -- set Interface vhost-user-1 type=dpdkvhostuser
>> drivers/net/vhost implements vHost as rte_eth_dev and integrated in OVS.
>> OVS can send/receive packets to/from VM with rte_eth_tx_burst()  rte_eth_rx_burst()
>> which call eth_dev_ops implementation of drivers/net/vhost.
> 
> No, it is using rte_vhost_dequeue_burst() and rte_vhost_enqueue_burst()
> which are not in ethdev.
> 
>> vDPA is also Virtio back end and works like vHost, same as vHost,
>> it will be implemented as rte_eth_dev and also be integrated into OVS.
> 
> No, vDPA is not "implemented as rte_eth_dev".
> 
>> So, it's not ok to move ifc code from drivers/net.
> 
> drivers/net/ifc has no ethdev implementation at all.
> 
> 
> Rosen, I'm sorry, these arguments look irrelevant,
> so I won't consider them as blocking the integration of this patch.
> 
> 

I agree with Thomas: the vDPA drivers do not implement the ethdev ops.
And OVS does not use the Vhost PMD for the Vhost-user ports, but
directly calls the librte_vhost APIs.
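
To make the distinction concrete, a vDPA driver fills in rte_vdpa_dev_ops and
registers it with the vhost library; something like this minimal sketch (only
a few ops shown, stubbed out; names taken from rte_vdpa.h):

#include <stdint.h>
#include <rte_vdpa.h>

/* Stub callbacks; a real driver programs the HW in dev_conf/dev_close. */
static int sketch_get_queue_num(int did, uint32_t *num) { (void)did; *num = 1; return 0; }
static int sketch_get_features(int did, uint64_t *f) { (void)did; *f = 0; return 0; }
static int sketch_get_protocol_features(int did, uint64_t *f) { (void)did; *f = 0; return 0; }
static int sketch_dev_conf(int vid) { (void)vid; return 0; }   /* virtio driver ready */
static int sketch_dev_close(int vid) { (void)vid; return 0; }  /* virtio driver stopped */

static struct rte_vdpa_dev_ops sketch_vdpa_ops = {
	.get_queue_num = sketch_get_queue_num,
	.get_features = sketch_get_features,
	.get_protocol_features = sketch_get_protocol_features,
	.dev_conf = sketch_dev_conf,
	.dev_close = sketch_dev_close,
};

/* At PCI probe time the driver calls rte_vdpa_register_device() with its
 * device address and &sketch_vdpa_ops; no eth_dev_ops are involved. */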

Regards,
Maxime


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA device drivers
  2020-01-09  9:23               ` Maxime Coquelin
@ 2020-01-09  9:49                 ` Xu, Rosen
  2020-01-09 10:42                   ` Maxime Coquelin
  2020-01-09 10:42                   ` Maxime Coquelin
  0 siblings, 2 replies; 50+ messages in thread
From: Xu, Rosen @ 2020-01-09  9:49 UTC (permalink / raw)
  To: Maxime Coquelin, Thomas Monjalon
  Cc: Matan Azrad, Bie, Tiwei, Wang, Zhihong, Wang, Xiao W, Yigit,
	Ferruh, dev, Pei, Andy



> -----Original Message-----
> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> Sent: Thursday, January 09, 2020 17:24
> To: Thomas Monjalon <thomas@monjalon.net>; Xu, Rosen
> <rosen.xu@intel.com>
> Cc: Matan Azrad <matan@mellanox.com>; Bie, Tiwei <tiwei.bie@intel.com>;
> Wang, Zhihong <zhihong.wang@intel.com>; Wang, Xiao W
> <xiao.w.wang@intel.com>; Yigit, Ferruh <ferruh.yigit@intel.com>;
> dev@dpdk.org; Pei, Andy <andy.pei@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA device
> drivers
> 
> 
> 
> On 1/9/20 9:41 AM, Thomas Monjalon wrote:
> > 09/01/2020 03:27, Xu, Rosen:
> >> Hi,
> >>
> >> From: Thomas Monjalon <thomas@monjalon.net>
> >>> 08/01/2020 13:39, Xu, Rosen:
> >>>> From: Matan Azrad <matan@mellanox.com>
> >>>>> From: Xu, Rosen
> >>>>>> Did you think about OVS DPDK?
> >>>>>> vDPA is a basic module for OVS, currently it will take some
> >>>>>> exception path packet processing for OVS, so it still needs to
> >>>>>> integrate
> >>> eth_dev.
> >>>>>
> >>>>> I don't understand your question.
> >>>>>
> >>>>> What do you mean by "integrate eth_dev"?
> >>>>
> >>>> My questions is in OVS DPDK scenario vDPA device implements eth_dev
> >>>> ops, so create a new class and move ifc code to this new class is not ok.
> >>>
> >>> 1/ I don't understand the relation with OVS.
> >>>
> >>> 2/ no, vDPA device implements vDPA ops.
> >>> If it implements ethdev ops, it is an ethdev device.
> >>>
> >>> Please show an example of what you claim.
> >>
> >> Answers of 1 and 2.
> >>
> >> In OVS DPDK, each network device(such as NIC, vHost etc) of DPDK
> >> needs to be implemented as rte_eth_dev and provides eth_dev_ops such
> as packet TX/RX for OVS.
> >
> > No, OVS is also using the vhost API for vhost port.
> >
> >> Take vHost(Virtio back end) for example, OVS startups vHost interface like
> this:
> >> ovs-vsctl add-port br0 vhost-user-1 -- set Interface vhost-user-1
> >> type=dpdkvhostuser drivers/net/vhost implements vHost as rte_eth_dev
> and integrated in OVS.
> >> OVS can send/receive packets to/from VM with rte_eth_tx_burst()
> >> rte_eth_rx_burst() which call eth_dev_ops implementation of
> drivers/net/vhost.
> >
> > No, it is using rte_vhost_dequeue_burst() and
> > rte_vhost_enqueue_burst() which are not in ethdev.
> >
> >> vDPA is also Virtio back end and works like vHost, same as vHost, it
> >> will be implemented as rte_eth_dev and also be integrated into OVS.
> >
> > No, vDPA is not "implemented as rte_eth_dev".
> >
> >> So, it's not ok to move ifc code from drivers/net.
> >
> > drivers/net/ifc has no ethdev implementation at all.
> >
> >
> > Rosen, I'm sorry, these arguments look irrelevant, so I won't consider
> > them as blocking the integration of this patch.
> >
> >
> 
> I agree with Thomas, the vDPA drivers do not implement the ethdev ops.

Since OVS hasn't integrated vDPA yet, it doesn't implement ethdev ops, but there are
many discussions in the OVS community about vDPA; it seems vDPA will be supported in
OVS in the near future.

> And OVS does not use the Vhost PMD for the Vhost-user ports, but directly
> call the librte_vhost APIs.

I'm afraid you are wrong; please read these documents, which introduce how to use the vHost-user PMD in OVS:
http://docs.openvswitch.org/en/latest/topics/dpdk/vhost-user/
http://docs.openvswitch.org/en/latest/topics/dpdk/pmd/

> Regards,
> Maxime


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA device drivers
  2020-01-09  9:49                 ` Xu, Rosen
@ 2020-01-09 10:42                   ` Maxime Coquelin
  2020-01-10  2:40                     ` Xu, Rosen
  2020-01-09 10:42                   ` Maxime Coquelin
  1 sibling, 1 reply; 50+ messages in thread
From: Maxime Coquelin @ 2020-01-09 10:42 UTC (permalink / raw)
  To: Xu, Rosen, Thomas Monjalon
  Cc: Matan Azrad, Bie, Tiwei, Wang, Zhihong, Wang, Xiao W, Yigit,
	Ferruh, dev, Pei, Andy



On 1/9/20 10:49 AM, Xu, Rosen wrote:
> 
> 
>> -----Original Message-----
>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>> Sent: Thursday, January 09, 2020 17:24
>> To: Thomas Monjalon <thomas@monjalon.net>; Xu, Rosen
>> <rosen.xu@intel.com>
>> Cc: Matan Azrad <matan@mellanox.com>; Bie, Tiwei <tiwei.bie@intel.com>;
>> Wang, Zhihong <zhihong.wang@intel.com>; Wang, Xiao W
>> <xiao.w.wang@intel.com>; Yigit, Ferruh <ferruh.yigit@intel.com>;
>> dev@dpdk.org; Pei, Andy <andy.pei@intel.com>
>> Subject: Re: [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA device
>> drivers
>>
>>
>>
>> On 1/9/20 9:41 AM, Thomas Monjalon wrote:
>>> 09/01/2020 03:27, Xu, Rosen:
>>>> Hi,
>>>>
>>>> From: Thomas Monjalon <thomas@monjalon.net>
>>>>> 08/01/2020 13:39, Xu, Rosen:
>>>>>> From: Matan Azrad <matan@mellanox.com>
>>>>>>> From: Xu, Rosen
>>>>>>>> Did you think about OVS DPDK?
>>>>>>>> vDPA is a basic module for OVS, currently it will take some
>>>>>>>> exception path packet processing for OVS, so it still needs to
>>>>>>>> integrate
>>>>> eth_dev.
>>>>>>>
>>>>>>> I don't understand your question.
>>>>>>>
>>>>>>> What do you mean by "integrate eth_dev"?
>>>>>>
>>>>>> My questions is in OVS DPDK scenario vDPA device implements eth_dev
>>>>>> ops, so create a new class and move ifc code to this new class is not ok.
>>>>>
>>>>> 1/ I don't understand the relation with OVS.
>>>>>
>>>>> 2/ no, vDPA device implements vDPA ops.
>>>>> If it implements ethdev ops, it is an ethdev device.
>>>>>
>>>>> Please show an example of what you claim.
>>>>
>>>> Answers of 1 and 2.
>>>>
>>>> In OVS DPDK, each network device(such as NIC, vHost etc) of DPDK
>>>> needs to be implemented as rte_eth_dev and provides eth_dev_ops such
>> as packet TX/RX for OVS.
>>>
>>> No, OVS is also using the vhost API for vhost port.
>>>
>>>> Take vHost(Virtio back end) for example, OVS startups vHost interface like
>> this:
>>>> ovs-vsctl add-port br0 vhost-user-1 -- set Interface vhost-user-1
>>>> type=dpdkvhostuser drivers/net/vhost implements vHost as rte_eth_dev
>> and integrated in OVS.
>>>> OVS can send/receive packets to/from VM with rte_eth_tx_burst()
>>>> rte_eth_rx_burst() which call eth_dev_ops implementation of
>> drivers/net/vhost.
>>>
>>> No, it is using rte_vhost_dequeue_burst() and
>>> rte_vhost_enqueue_burst() which are not in ethdev.
>>>
>>>> vDPA is also Virtio back end and works like vHost, same as vHost, it
>>>> will be implemented as rte_eth_dev and also be integrated into OVS.
>>>
>>> No, vDPA is not "implemented as rte_eth_dev".
>>>
>>>> So, it's not ok to move ifc code from drivers/net.
>>>
>>> drivers/net/ifc has no ethdev implementation at all.
>>>
>>>
>>> Rosen, I'm sorry, these arguments look irrelevant, so I won't consider
>>> them as blocking the integration of this patch.
>>>
>>>
>>
>> I agree with Thomas, the vDPA drivers do not implement the ethdev ops.
> 
> For OVS hasn't integrated vDPA, it doesn't implement ethdev ops, but there are many
> discussions in OVS community about vDPA, it seems vDPA will be supported in OVS in
> the near feature.

I agree with this statement, but if you look at the Mellanox series being
reviewed, it defines a new type of port and does not use the regular DPDK
port type.

>> And OVS does not use the Vhost PMD for the Vhost-user ports, but directly
>> call the librte_vhost APIs.
> 
> I'm afraid you are wrong, pls read these documents which introduce how to use vHost-user PMD in OVS:
> http://docs.openvswitch.org/en/latest/topics/dpdk/vhost-user/
> http://docs.openvswitch.org/en/latest/topics/dpdk/pmd/

I can confirm that the command below to add ports is not using the Vhost PMD but
directly the librte_vhost API:

$ ovs-vsctl add-port br0 dpdkvhostclient0 \
    -- set Interface dpdkvhostclient0 type=dpdkvhostuserclient \
       options:vhost-server-path=/tmp/dpdkvhostclient0

Please check the OVS source code.

It is possible to use the Vhost PMD as a regular DPDK port, but not with the
above command, and it is not the recommended way.
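
For completeness, attaching the Vhost PMD as a regular DPDK port would look
roughly like this (just a sketch; devargs names taken from the vhost PMD
guide, so double-check them):

#include <rte_dev.h>
#include <rte_ethdev.h>

/* Hot-plug a vhost PMD vdev; afterwards it is polled like any ethdev port. */
static int
attach_vhost_pmd_port(void)
{
	int ret;
	uint16_t port_id;

	ret = rte_eal_hotplug_add("vdev", "net_vhost0",
				  "iface=/tmp/dpdkvhostclient0,client=1");
	if (ret < 0)
		return ret;

	return rte_eth_dev_get_port_by_name("net_vhost0", &port_id);
}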

>> Regards,
>> Maxime
> 



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA device drivers
  2020-01-09  9:49                 ` Xu, Rosen
  2020-01-09 10:42                   ` Maxime Coquelin
@ 2020-01-09 10:42                   ` Maxime Coquelin
  1 sibling, 0 replies; 50+ messages in thread
From: Maxime Coquelin @ 2020-01-09 10:42 UTC (permalink / raw)
  To: Xu, Rosen, Thomas Monjalon
  Cc: Matan Azrad, Bie, Tiwei, Wang, Zhihong, Wang, Xiao W, Yigit,
	Ferruh, dev, Pei, Andy



On 1/9/20 10:49 AM, Xu, Rosen wrote:
> 
> 
>> -----Original Message-----
>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>> Sent: Thursday, January 09, 2020 17:24
>> To: Thomas Monjalon <thomas@monjalon.net>; Xu, Rosen
>> <rosen.xu@intel.com>
>> Cc: Matan Azrad <matan@mellanox.com>; Bie, Tiwei <tiwei.bie@intel.com>;
>> Wang, Zhihong <zhihong.wang@intel.com>; Wang, Xiao W
>> <xiao.w.wang@intel.com>; Yigit, Ferruh <ferruh.yigit@intel.com>;
>> dev@dpdk.org; Pei, Andy <andy.pei@intel.com>
>> Subject: Re: [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA device
>> drivers
>>
>>
>>
>> On 1/9/20 9:41 AM, Thomas Monjalon wrote:
>>> 09/01/2020 03:27, Xu, Rosen:
>>>> Hi,
>>>>
>>>> From: Thomas Monjalon <thomas@monjalon.net>
>>>>> 08/01/2020 13:39, Xu, Rosen:
>>>>>> From: Matan Azrad <matan@mellanox.com>
>>>>>>> From: Xu, Rosen
>>>>>>>> Did you think about OVS DPDK?
>>>>>>>> vDPA is a basic module for OVS, currently it will take some
>>>>>>>> exception path packet processing for OVS, so it still needs to
>>>>>>>> integrate
>>>>> eth_dev.
>>>>>>>
>>>>>>> I don't understand your question.
>>>>>>>
>>>>>>> What do you mean by "integrate eth_dev"?
>>>>>>
>>>>>> My questions is in OVS DPDK scenario vDPA device implements eth_dev
>>>>>> ops, so create a new class and move ifc code to this new class is not ok.
>>>>>
>>>>> 1/ I don't understand the relation with OVS.
>>>>>
>>>>> 2/ no, vDPA device implements vDPA ops.
>>>>> If it implements ethdev ops, it is an ethdev device.
>>>>>
>>>>> Please show an example of what you claim.
>>>>
>>>> Answers of 1 and 2.
>>>>
>>>> In OVS DPDK, each network device(such as NIC, vHost etc) of DPDK
>>>> needs to be implemented as rte_eth_dev and provides eth_dev_ops such
>> as packet TX/RX for OVS.
>>>
>>> No, OVS is also using the vhost API for vhost port.
>>>
>>>> Take vHost(Virtio back end) for example, OVS startups vHost interface like
>> this:
>>>> ovs-vsctl add-port br0 vhost-user-1 -- set Interface vhost-user-1
>>>> type=dpdkvhostuser drivers/net/vhost implements vHost as rte_eth_dev
>> and integrated in OVS.
>>>> OVS can send/receive packets to/from VM with rte_eth_tx_burst()
>>>> rte_eth_rx_burst() which call eth_dev_ops implementation of
>> drivers/net/vhost.
>>>
>>> No, it is using rte_vhost_dequeue_burst() and
>>> rte_vhost_enqueue_burst() which are not in ethdev.
>>>
>>>> vDPA is also Virtio back end and works like vHost, same as vHost, it
>>>> will be implemented as rte_eth_dev and also be integrated into OVS.
>>>
>>> No, vDPA is not "implemented as rte_eth_dev".
>>>
>>>> So, it's not ok to move ifc code from drivers/net.
>>>
>>> drivers/net/ifc has no ethdev implementation at all.
>>>
>>>
>>> Rosen, I'm sorry, these arguments look irrelevant, so I won't consider
>>> them as blocking the integration of this patch.
>>>
>>>
>>
>> I agree with Thomas, the vDPA drivers do not implement the ethdev ops.
> 
> For OVS hasn't integrated vDPA, it doesn't implement ethdev ops, but there are many
> discussions in OVS community about vDPA, it seems vDPA will be supported in OVS in
> the near feature.

I agree with this statement, but if you look at the Mellanox series being
reviewed, it defines a new type of port and does not use the regular DPDK
port type.

>> And OVS does not use the Vhost PMD for the Vhost-user ports, but directly
>> call the librte_vhost APIs.
> 
> I'm afraid you are wrong, pls read these documents which introduce how to use vHost-user PMD in OVS:
> http://docs.openvswitch.org/en/latest/topics/dpdk/vhost-user/
> http://docs.openvswitch.org/en/latest/topics/dpdk/pmd/

I can confirm that the command below to add ports is not using the Vhost PMD but
directly the librte_vhost API:

$ ovs-vsctl add-port br0 dpdkvhostclient0 \
    -- set Interface dpdkvhostclient0 type=dpdkvhostuserclient \
       options:vhost-server-path=/tmp/dpdkvhostclient0

Please check the OVS source code.

It is possible to use the Vhost PMD as a regular DPDK port, but not with the
above command, and it is not the recommended way.

>> Regards,
>> Maxime
> 



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA device drivers
  2020-01-09  8:41             ` Thomas Monjalon
  2020-01-09  9:23               ` Maxime Coquelin
@ 2020-01-09 10:53               ` Xu, Rosen
  2020-01-09 11:34                 ` Matan Azrad
  1 sibling, 1 reply; 50+ messages in thread
From: Xu, Rosen @ 2020-01-09 10:53 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Matan Azrad, Maxime Coquelin, Bie, Tiwei, Wang, Zhihong, Wang,
	Xiao W, Yigit, Ferruh, dev, Pei, Andy



> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Thursday, January 09, 2020 16:41
> To: Xu, Rosen <rosen.xu@intel.com>
> Cc: Matan Azrad <matan@mellanox.com>; Maxime Coquelin
> <maxime.coquelin@redhat.com>; Bie, Tiwei <tiwei.bie@intel.com>; Wang,
> Zhihong <zhihong.wang@intel.com>; Wang, Xiao W
> <xiao.w.wang@intel.com>; Yigit, Ferruh <ferruh.yigit@intel.com>;
> dev@dpdk.org; Pei, Andy <andy.pei@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA device
> drivers
> 
> 09/01/2020 03:27, Xu, Rosen:
> > Hi,
> >
> > From: Thomas Monjalon <thomas@monjalon.net>
> > > 08/01/2020 13:39, Xu, Rosen:
> > > > From: Matan Azrad <matan@mellanox.com>
> > > > > From: Xu, Rosen
> > > > > > Did you think about OVS DPDK?
> > > > > > vDPA is a basic module for OVS, currently it will take some
> > > > > > exception path packet processing for OVS, so it still needs to
> > > > > > integrate
> > > eth_dev.
> > > > >
> > > > > I don't understand your question.
> > > > >
> > > > > What do you mean by "integrate eth_dev"?
> > > >
> > > > My questions is in OVS DPDK scenario vDPA device implements
> > > > eth_dev ops, so create a new class and move ifc code to this new class
> is not ok.
> > >
> > > 1/ I don't understand the relation with OVS.
> > >
> > > 2/ no, vDPA device implements vDPA ops.
> > > If it implements ethdev ops, it is an ethdev device.
> > >
> > > Please show an example of what you claim.
> >
> > Answers of 1 and 2.
> >
> > In OVS DPDK, each network device(such as NIC, vHost etc) of DPDK needs
> > to be implemented as rte_eth_dev and provides eth_dev_ops such as
> packet TX/RX for OVS.
> 
> No, OVS is also using the vhost API for vhost port.

Yes, vhost pmd is not a good example.

> > Take vHost(Virtio back end) for example, OVS startups vHost interface like
> this:
> > ovs-vsctl add-port br0 vhost-user-1 -- set Interface vhost-user-1
> > type=dpdkvhostuser drivers/net/vhost implements vHost as rte_eth_dev
> and integrated in OVS.
> > OVS can send/receive packets to/from VM with rte_eth_tx_burst()
> > rte_eth_rx_burst() which call eth_dev_ops implementation of
> drivers/net/vhost.
>
> No, it is using rte_vhost_dequeue_burst() and rte_vhost_enqueue_burst()
> which are not in ethdev.
>
> > vDPA is also Virtio back end and works like vHost, same as vHost, it
> > will be implemented as rte_eth_dev and also be integrated into OVS.
> 
> No, vDPA is not "implemented as rte_eth_dev".

Currently, vDPA isn't integrated with OVS.
 
> > So, it's not ok to move ifc code from drivers/net.
> 
> drivers/net/ifc has no ethdev implementation at all.

Since OVS hasn't integrated vDPA yet, it doesn't implement rte_eth_dev,
but there are many discussions in the OVS community about vDPA,
some from Mellanox; it seems a vDPA port will be implemented
as an rte_eth_dev port in OVS in the near future.
https://patchwork.ozlabs.org/patch/1178474/

Matan,
Could you clarify how OVS integrates vDPA in the Mellanox patch?

> 
> Rosen, I'm sorry, these arguments look irrelevant, so I won't consider them
> as blocking the integration of this patch.

What I mentioned is not meant to block the integration of this patch; I just want to
get clarification from Matan on how to integrate a vDPA port in OVS.




^ permalink raw reply	[flat|nested] 50+ messages in thread

* [dpdk-dev] [PATCH v2 0/3] Introduce new class for vDPA device drivers
  2019-12-25 15:19 [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA device drivers Matan Azrad
                   ` (3 preceding siblings ...)
  2020-01-07  7:57 ` [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA device drivers Matan Azrad
@ 2020-01-09 11:00 ` Matan Azrad
  2020-01-09 11:00   ` [dpdk-dev] [PATCH v2 1/3] drivers: introduce vDPA class Matan Azrad
                     ` (3 more replies)
  4 siblings, 4 replies; 50+ messages in thread
From: Matan Azrad @ 2020-01-09 11:00 UTC (permalink / raw)
  To: Maxime Coquelin, Tiwei Bie, Zhihong Wang, Xiao Wang
  Cc: Ferruh Yigit, dev, Thomas Monjalon, Andrew Rybchenko

As discussed and as described in RFC "[RFC] net: new vdpa PMD for Mellanox devices", new vDPA driver is going to be added for Mellanox devices - vDPA mlx5 and more.

The only vDPA driver now is the IFC driver that is located in net directory.

The IFC driver and the new vDPA mlx5 driver provide the vDPA ops introduced in librte_vhost and not the eth-dev ops.
All the other drivers in the net class provide the eth-dev ops.
The set of features is also different.

Create a new class for vDPA drivers and move IFC to this class.
Later, all the new drivers that implement the vDPA ops will be added to the vDPA class.

Also, a vDPA device driver features list was added to vDPA documentation.

Please review the features list and the series.

Later on, I'm going to send the vDPA mlx5 driver.

Thanks.

v2:
Apply comments from Maxime Coquelin, Andrew Rybchenko and Tiwei Bie.


Matan Azrad (3):
  drivers: introduce vDPA class
  doc: add vDPA feature table
  drivers: move ifc driver to the vDPA class

 MAINTAINERS                               |   19 +-
 doc/guides/conf.py                        |    5 +
 doc/guides/index.rst                      |    1 +
 doc/guides/nics/features/ifcvf.ini        |    8 -
 doc/guides/nics/ifc.rst                   |  106 ---
 doc/guides/nics/index.rst                 |    1 -
 doc/guides/vdpadevs/features/default.ini  |   50 ++
 doc/guides/vdpadevs/features/ifcvf.ini    |    8 +
 doc/guides/vdpadevs/features_overview.rst |   74 ++
 doc/guides/vdpadevs/ifc.rst               |  106 +++
 doc/guides/vdpadevs/index.rst             |   15 +
 drivers/Makefile                          |    2 +
 drivers/meson.build                       |    1 +
 drivers/net/Makefile                      |    3 -
 drivers/net/ifc/Makefile                  |   34 -
 drivers/net/ifc/base/ifcvf.c              |  329 --------
 drivers/net/ifc/base/ifcvf.h              |  162 ----
 drivers/net/ifc/base/ifcvf_osdep.h        |   52 --
 drivers/net/ifc/ifcvf_vdpa.c              | 1280 -----------------------------
 drivers/net/ifc/meson.build               |    9 -
 drivers/net/ifc/rte_pmd_ifc_version.map   |    3 -
 drivers/net/meson.build                   |    1 -
 drivers/vdpa/Makefile                     |   14 +
 drivers/vdpa/ifc/Makefile                 |   34 +
 drivers/vdpa/ifc/base/ifcvf.c             |  329 ++++++++
 drivers/vdpa/ifc/base/ifcvf.h             |  162 ++++
 drivers/vdpa/ifc/base/ifcvf_osdep.h       |   52 ++
 drivers/vdpa/ifc/ifcvf_vdpa.c             | 1280 +++++++++++++++++++++++++++++
 drivers/vdpa/ifc/meson.build              |    9 +
 drivers/vdpa/ifc/rte_pmd_ifc_version.map  |    3 +
 drivers/vdpa/meson.build                  |    8 +
 31 files changed, 2164 insertions(+), 1996 deletions(-)
 delete mode 100644 doc/guides/nics/features/ifcvf.ini
 delete mode 100644 doc/guides/nics/ifc.rst
 create mode 100644 doc/guides/vdpadevs/features/default.ini
 create mode 100644 doc/guides/vdpadevs/features/ifcvf.ini
 create mode 100644 doc/guides/vdpadevs/features_overview.rst
 create mode 100644 doc/guides/vdpadevs/ifc.rst
 create mode 100644 doc/guides/vdpadevs/index.rst
 delete mode 100644 drivers/net/ifc/Makefile
 delete mode 100644 drivers/net/ifc/base/ifcvf.c
 delete mode 100644 drivers/net/ifc/base/ifcvf.h
 delete mode 100644 drivers/net/ifc/base/ifcvf_osdep.h
 delete mode 100644 drivers/net/ifc/ifcvf_vdpa.c
 delete mode 100644 drivers/net/ifc/meson.build
 delete mode 100644 drivers/net/ifc/rte_pmd_ifc_version.map
 create mode 100644 drivers/vdpa/Makefile
 create mode 100644 drivers/vdpa/ifc/Makefile
 create mode 100644 drivers/vdpa/ifc/base/ifcvf.c
 create mode 100644 drivers/vdpa/ifc/base/ifcvf.h
 create mode 100644 drivers/vdpa/ifc/base/ifcvf_osdep.h
 create mode 100644 drivers/vdpa/ifc/ifcvf_vdpa.c
 create mode 100644 drivers/vdpa/ifc/meson.build
 create mode 100644 drivers/vdpa/ifc/rte_pmd_ifc_version.map
 create mode 100644 drivers/vdpa/meson.build

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [dpdk-dev] [PATCH v2 1/3] drivers: introduce vDPA class
  2020-01-09 11:00 ` [dpdk-dev] [PATCH v2 " Matan Azrad
@ 2020-01-09 11:00   ` Matan Azrad
  2020-01-09 11:00   ` [dpdk-dev] [PATCH v2 2/3] doc: add vDPA feature table Matan Azrad
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 50+ messages in thread
From: Matan Azrad @ 2020-01-09 11:00 UTC (permalink / raw)
  To: Maxime Coquelin, Tiwei Bie, Zhihong Wang, Xiao Wang
  Cc: Ferruh Yigit, dev, Thomas Monjalon, Andrew Rybchenko

The vDPA (vhost data path acceleration) drivers provide support for
the vDPA operations introduced by the rte_vhost library.

Any driver which provides the vDPA operations should be moved\added to
the vdpa class under drivers/vdpa/.

Create the general files for vDPA class in drivers and in documentation.

The management tree for vDPA drivers is
git://dpdk.org/next/dpdk-next-virtio.

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 MAINTAINERS                   |  5 +++++
 doc/guides/index.rst          |  1 +
 doc/guides/vdpadevs/index.rst | 13 +++++++++++++
 drivers/Makefile              |  2 ++
 drivers/meson.build           |  1 +
 drivers/vdpa/Makefile         |  8 ++++++++
 drivers/vdpa/meson.build      |  8 ++++++++
 7 files changed, 38 insertions(+)
 create mode 100644 doc/guides/vdpadevs/index.rst
 create mode 100644 drivers/vdpa/Makefile
 create mode 100644 drivers/vdpa/meson.build

diff --git a/MAINTAINERS b/MAINTAINERS
index 9b5c80f..17c2df7 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1089,6 +1089,11 @@ F: doc/guides/compressdevs/zlib.rst
 F: doc/guides/compressdevs/features/zlib.ini
 
 
+vDPA Drivers
+------------
+T: git://dpdk.org/next/dpdk-next-virtio
+
+
 Eventdev Drivers
 ----------------
 M: Jerin Jacob <jerinj@marvell.com>
diff --git a/doc/guides/index.rst b/doc/guides/index.rst
index 8a1601b..988c6ea 100644
--- a/doc/guides/index.rst
+++ b/doc/guides/index.rst
@@ -19,6 +19,7 @@ DPDK documentation
    bbdevs/index
    cryptodevs/index
    compressdevs/index
+   vdpadevs/index
    eventdevs/index
    rawdevs/index
    mempool/index
diff --git a/doc/guides/vdpadevs/index.rst b/doc/guides/vdpadevs/index.rst
new file mode 100644
index 0000000..d69dc91
--- /dev/null
+++ b/doc/guides/vdpadevs/index.rst
@@ -0,0 +1,13 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright 2019 Mellanox Technologies, Ltd
+
+vDPA Device Drivers
+===================
+
+The following are a list of vDPA(vhost data path acceleration) device drivers,
+which can be used from an application through vhost API.
+
+.. toctree::
+    :maxdepth: 2
+    :numbered:
+
diff --git a/drivers/Makefile b/drivers/Makefile
index 7d5da5d..46374ca 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -18,6 +18,8 @@ DIRS-$(CONFIG_RTE_LIBRTE_PMD_QAT) += common/qat
 DEPDIRS-common/qat := bus mempool
 DIRS-$(CONFIG_RTE_LIBRTE_COMPRESSDEV) += compress
 DEPDIRS-compress := bus mempool
+DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += vdpa
+DEPDIRS-vdpa := common bus mempool
 DIRS-$(CONFIG_RTE_LIBRTE_EVENTDEV) += event
 DEPDIRS-event := common bus mempool net
 DIRS-$(CONFIG_RTE_LIBRTE_RAWDEV) += raw
diff --git a/drivers/meson.build b/drivers/meson.build
index 2850d0f..29708cc 100644
--- a/drivers/meson.build
+++ b/drivers/meson.build
@@ -13,6 +13,7 @@ dpdk_driver_classes = ['common',
 	       'raw',     # depends on common, bus and net.
 	       'crypto',  # depends on common, bus and mempool (net in future).
 	       'compress', # depends on common, bus, mempool.
+	       'vdpa',    # depends on common, bus and mempool.
 	       'event',   # depends on common, bus, mempool and net.
 	       'baseband'] # depends on common and bus.
 
diff --git a/drivers/vdpa/Makefile b/drivers/vdpa/Makefile
new file mode 100644
index 0000000..82a2b70
--- /dev/null
+++ b/drivers/vdpa/Makefile
@@ -0,0 +1,8 @@
+#   SPDX-License-Identifier: BSD-3-Clause
+#   Copyright 2019 Mellanox Technologies, Ltd
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# DIRS-$(<configuration>) += <directory>
+
+include $(RTE_SDK)/mk/rte.subdir.mk
diff --git a/drivers/vdpa/meson.build b/drivers/vdpa/meson.build
new file mode 100644
index 0000000..a839ff5
--- /dev/null
+++ b/drivers/vdpa/meson.build
@@ -0,0 +1,8 @@
+#   SPDX-License-Identifier: BSD-3-Clause
+#   Copyright 2019 Mellanox Technologies, Ltd
+
+drivers = []
+std_deps = ['bus_pci', 'kvargs']
+std_deps += ['vhost']
+config_flag_fmt = 'RTE_LIBRTE_@0@_PMD'
+driver_name_fmt = 'rte_pmd_@0@'
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [dpdk-dev] [PATCH v2 2/3] doc: add vDPA feature table
  2020-01-09 11:00 ` [dpdk-dev] [PATCH v2 " Matan Azrad
  2020-01-09 11:00   ` [dpdk-dev] [PATCH v2 1/3] drivers: introduce vDPA class Matan Azrad
@ 2020-01-09 11:00   ` Matan Azrad
  2020-01-10 18:26     ` Thomas Monjalon
  2020-01-13 22:40     ` Thomas Monjalon
  2020-01-09 11:00   ` [dpdk-dev] [PATCH v2 3/3] drivers: move ifc driver to the vDPA class Matan Azrad
  2020-01-13 23:08   ` [dpdk-dev] [PATCH v2 0/3] Introduce new class for vDPA device drivers Thomas Monjalon
  3 siblings, 2 replies; 50+ messages in thread
From: Matan Azrad @ 2020-01-09 11:00 UTC (permalink / raw)
  To: Maxime Coquelin, Tiwei Bie, Zhihong Wang, Xiao Wang
  Cc: Ferruh Yigit, dev, Thomas Monjalon, Andrew Rybchenko

Add vDPA devices features table and explanation.

Any vDPA driver can add its own supported features by adding a new ini
file to the features directory in doc/guides/vdpadevs/features.

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 doc/guides/conf.py                        |  5 +++
 doc/guides/vdpadevs/features/default.ini  | 50 +++++++++++++++++++++
 doc/guides/vdpadevs/features_overview.rst | 74 +++++++++++++++++++++++++++++++
 doc/guides/vdpadevs/index.rst             |  1 +
 4 files changed, 130 insertions(+)
 create mode 100644 doc/guides/vdpadevs/features/default.ini
 create mode 100644 doc/guides/vdpadevs/features_overview.rst

diff --git a/doc/guides/conf.py b/doc/guides/conf.py
index 0892c06..c368fa5 100644
--- a/doc/guides/conf.py
+++ b/doc/guides/conf.py
@@ -401,6 +401,11 @@ def setup(app):
                             'Features',
                             'Features availability in compression drivers',
                             'Feature')
+    table_file = dirname(__file__) + '/vdpadevs/overview_feature_table.txt'
+    generate_overview_table(table_file, 1,
+                            'Features',
+                            'Features availability in vDPA drivers',
+                            'Feature')
 
     if LooseVersion(sphinx_version) < LooseVersion('1.3.1'):
         print('Upgrade sphinx to version >= 1.3.1 for '
diff --git a/doc/guides/vdpadevs/features/default.ini b/doc/guides/vdpadevs/features/default.ini
new file mode 100644
index 0000000..518e4f1
--- /dev/null
+++ b/doc/guides/vdpadevs/features/default.ini
@@ -0,0 +1,50 @@
+;
+; Features of a default vDPA driver.
+;
+; This file defines the features that are valid for inclusion in
+; the other driver files and also the order that they appear in
+; the features table in the documentation. The feature description
+; string should not exceed feature_str_len defined in conf.py.
+;
+[Features]
+csum                 =
+guest csum           =
+mac                  =
+gso                  =
+guest tso4           =
+guest tso6           =
+ecn                  =
+ufo                  =
+host tso4            =
+host tso6            =
+mrg rxbuf            =
+ctrl vq              =
+ctrl rx              =
+any layout           =
+guest announce       =
+mq                   =
+version 1            =
+log all              =
+indirect desc        =
+event idx            =
+mtu                  =
+in_order             =
+IOMMU platform       =
+packed               =
+proto mq             =
+proto log shmfd      =
+proto rarp           =
+proto reply ack      =
+proto host notifier  =
+proto pagefault      =
+BSD nic_uio          =
+Linux VFIO           =
+Other kdrv           =
+ARMv7                =
+ARMv8                =
+Power8               =
+x86-32               =
+x86-64               =
+Usage doc            =
+Design doc           =
+Perf doc             =
\ No newline at end of file
diff --git a/doc/guides/vdpadevs/features_overview.rst b/doc/guides/vdpadevs/features_overview.rst
new file mode 100644
index 0000000..3ce1db1
--- /dev/null
+++ b/doc/guides/vdpadevs/features_overview.rst
@@ -0,0 +1,74 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright 2019 Mellanox Technologies, Ltd
+
+Overview of vDPA drivers features
+=================================
+
+This section explains the supported features that are listed in the table below.
+
+  * csum - Device can handle packets with partial checksum.
+  * guest csum - Guest can handle packets with partial checksum.
+  * mac - Device has given MAC address.
+  * gso - Device can handle packets with any GSO type.
+  * guest tso4 - Guest can receive TSOv4.
+  * guest tso6 - Guest can receive TSOv6.
+  * ecn - Device can receive TSO with ECN.
+  * ufo - Device can receive UFO.
+  * host tso4 - Device can receive TSOv4.
+  * host tso6 - Device can receive TSOv6.
+  * mrg rxbuf - Guest can merge receive buffers.
+  * ctrl vq - Control channel is available.
+  * ctrl rx - Control channel RX mode support.
+  * any layout - Device can handle any descriptor layout.
+  * guest announce - Guest can send gratuitous packets.
+  * mq - Device supports Receive Flow Steering.
+  * version 1 - v1.0 compliant.
+  * log all - Device can log all write descriptors (live migration).
+  * indirect desc - Indirect buffer descriptors support.
+  * event idx - Support for avail_idx and used_idx fields.
+  * mtu - Host can advise the guest with its maximum supported MTU.
+  * in_order - Device can use descriptors in ring order.
+  * IOMMU platform - Device support IOMMU addresses.
+  * packed - Device support packed virtio queues.
+  * proto mq - Support the number of queues query.
+  * proto log shmfd - Guest support setting log base.
+  * proto rarp - Host can broadcast a fake RARP after live migration.
+  * proto reply ack - Host support requested operation status ack.
+  * proto host notifier - Host can register memory region based host notifiers.
+  * proto pagefault - Slave expose page-fault FD for migration process.
+  * BSD nic_uio - BSD ``nic_uio`` module supported.
+  * Linux VFIO - Works with ``vfio-pci`` kernel module.
+  * Other kdrv - Kernel module other than above ones supported.
+  * ARMv7 - Support armv7 architecture.
+  * ARMv8 - Support armv8a (64bit) architecture.
+  * Power8 - Support PowerPC architecture.
+  * x86-32 - Support 32bits x86 architecture.
+  * x86-64 - Support 64bits x86 architecture.
+  * Usage doc - Documentation describes usage, In ``doc/guides/vdpadevs/``.
+  * Design doc - Documentation describes design. In ``doc/guides/vdpadevs/``.
+  * Perf doc - Documentation describes performance values, In ``doc/perf/``.
+
+.. note::
+
+   Most of the features capabilities should be provided by the drivers via the
+   next vDPA operations: ``get_features`` and ``get_protocol_features``.
+
+
+Useful links
+============
+
+  * `OASIS: Virtual I/O Device (VIRTIO) Version 1.1 <https://docs.oasis-open.org/virtio/virtio/v1.1/csprd01/virtio-v1.1-csprd01>`_.
+  * `QEMU: Vhost-user Protocol <https://qemu.weilnetz.de/doc/interop/vhost-user>`_.
+
+
+Features table
+==============
+
+.. _table_vdpa_pmd_features:
+
+.. include:: overview_feature_table.txt
+
+.. Note::
+
+   Features marked with "P" are partially supported. Refer to the appropriate
+   driver guide in the following sections for details.
diff --git a/doc/guides/vdpadevs/index.rst b/doc/guides/vdpadevs/index.rst
index d69dc91..89e2b03 100644
--- a/doc/guides/vdpadevs/index.rst
+++ b/doc/guides/vdpadevs/index.rst
@@ -11,3 +11,4 @@ which can be used from an application through vhost API.
     :maxdepth: 2
     :numbered:
 
+    features_overview
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [dpdk-dev] [PATCH v2 3/3] drivers: move ifc driver to the vDPA class
  2020-01-09 11:00 ` [dpdk-dev] [PATCH v2 " Matan Azrad
  2020-01-09 11:00   ` [dpdk-dev] [PATCH v2 1/3] drivers: introduce vDPA class Matan Azrad
  2020-01-09 11:00   ` [dpdk-dev] [PATCH v2 2/3] doc: add vDPA feature table Matan Azrad
@ 2020-01-09 11:00   ` Matan Azrad
  2020-01-09 17:25     ` Matan Azrad
  2020-01-13 22:57     ` Thomas Monjalon
  2020-01-13 23:08   ` [dpdk-dev] [PATCH v2 0/3] Introduce new class for vDPA device drivers Thomas Monjalon
  3 siblings, 2 replies; 50+ messages in thread
From: Matan Azrad @ 2020-01-09 11:00 UTC (permalink / raw)
  To: Maxime Coquelin, Tiwei Bie, Zhihong Wang, Xiao Wang
  Cc: Ferruh Yigit, dev, Thomas Monjalon, Andrew Rybchenko

A new vDPA class was recently introduced.

IFC driver implements the vDPA operations, hence it should be moved to
the vDPA class.

Move it.

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 MAINTAINERS                              |   14 +-
 doc/guides/nics/features/ifcvf.ini       |    8 -
 doc/guides/nics/ifc.rst                  |  106 ---
 doc/guides/nics/index.rst                |    1 -
 doc/guides/vdpadevs/features/ifcvf.ini   |    8 +
 doc/guides/vdpadevs/ifc.rst              |  106 +++
 doc/guides/vdpadevs/index.rst            |    1 +
 drivers/net/Makefile                     |    3 -
 drivers/net/ifc/Makefile                 |   34 -
 drivers/net/ifc/base/ifcvf.c             |  329 --------
 drivers/net/ifc/base/ifcvf.h             |  162 ----
 drivers/net/ifc/base/ifcvf_osdep.h       |   52 --
 drivers/net/ifc/ifcvf_vdpa.c             | 1280 ------------------------------
 drivers/net/ifc/meson.build              |    9 -
 drivers/net/ifc/rte_pmd_ifc_version.map  |    3 -
 drivers/net/meson.build                  |    1 -
 drivers/vdpa/Makefile                    |    6 +
 drivers/vdpa/ifc/Makefile                |   34 +
 drivers/vdpa/ifc/base/ifcvf.c            |  329 ++++++++
 drivers/vdpa/ifc/base/ifcvf.h            |  162 ++++
 drivers/vdpa/ifc/base/ifcvf_osdep.h      |   52 ++
 drivers/vdpa/ifc/ifcvf_vdpa.c            | 1280 ++++++++++++++++++++++++++++++
 drivers/vdpa/ifc/meson.build             |    9 +
 drivers/vdpa/ifc/rte_pmd_ifc_version.map |    3 +
 drivers/vdpa/meson.build                 |    2 +-
 25 files changed, 1997 insertions(+), 1997 deletions(-)
 delete mode 100644 doc/guides/nics/features/ifcvf.ini
 delete mode 100644 doc/guides/nics/ifc.rst
 create mode 100644 doc/guides/vdpadevs/features/ifcvf.ini
 create mode 100644 doc/guides/vdpadevs/ifc.rst
 delete mode 100644 drivers/net/ifc/Makefile
 delete mode 100644 drivers/net/ifc/base/ifcvf.c
 delete mode 100644 drivers/net/ifc/base/ifcvf.h
 delete mode 100644 drivers/net/ifc/base/ifcvf_osdep.h
 delete mode 100644 drivers/net/ifc/ifcvf_vdpa.c
 delete mode 100644 drivers/net/ifc/meson.build
 delete mode 100644 drivers/net/ifc/rte_pmd_ifc_version.map
 create mode 100644 drivers/vdpa/ifc/Makefile
 create mode 100644 drivers/vdpa/ifc/base/ifcvf.c
 create mode 100644 drivers/vdpa/ifc/base/ifcvf.h
 create mode 100644 drivers/vdpa/ifc/base/ifcvf_osdep.h
 create mode 100644 drivers/vdpa/ifc/ifcvf_vdpa.c
 create mode 100644 drivers/vdpa/ifc/meson.build
 create mode 100644 drivers/vdpa/ifc/rte_pmd_ifc_version.map

diff --git a/MAINTAINERS b/MAINTAINERS
index 17c2df7..16facba 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -679,14 +679,6 @@ T: git://dpdk.org/next/dpdk-next-net-intel
 F: drivers/net/iavf/
 F: doc/guides/nics/features/iavf*.ini
 
-Intel ifc
-M: Xiao Wang <xiao.w.wang@intel.com>
-T: git://dpdk.org/next/dpdk-next-net-intel
-F: drivers/net/ifc/
-F: doc/guides/nics/ifc.rst
-F: doc/guides/nics/features/ifc*.ini
-
-Intel ice
 M: Qiming Yang <qiming.yang@intel.com>
 M: Wenzhuo Lu <wenzhuo.lu@intel.com>
 T: git://dpdk.org/next/dpdk-next-net-intel
@@ -1093,6 +1085,12 @@ vDPA Drivers
 ------------
 T: git://dpdk.org/next/dpdk-next-virtio
 
+Intel ifc
+M: Xiao Wang <xiao.w.wang@intel.com>
+F: drivers/vdpa/ifc/
+F: doc/guides/vdpadevs/ifc.rst
+F: doc/guides/vdpadevs/features/ifcvf.ini
+
 
 Eventdev Drivers
 ----------------
diff --git a/doc/guides/nics/features/ifcvf.ini b/doc/guides/nics/features/ifcvf.ini
deleted file mode 100644
index ef1fc47..0000000
--- a/doc/guides/nics/features/ifcvf.ini
+++ /dev/null
@@ -1,8 +0,0 @@
-;
-; Supported features of the 'ifcvf' vDPA driver.
-;
-; Refer to default.ini for the full list of available PMD features.
-;
-[Features]
-x86-32               = Y
-x86-64               = Y
diff --git a/doc/guides/nics/ifc.rst b/doc/guides/nics/ifc.rst
deleted file mode 100644
index 12a2a34..0000000
--- a/doc/guides/nics/ifc.rst
+++ /dev/null
@@ -1,106 +0,0 @@
-..  SPDX-License-Identifier: BSD-3-Clause
-    Copyright(c) 2018 Intel Corporation.
-
-IFCVF vDPA driver
-=================
-
-The IFCVF vDPA (vhost data path acceleration) driver provides support for the
-Intel FPGA 100G VF (IFCVF). IFCVF's datapath is virtio ring compatible, it
-works as a HW vhost backend which can send/receive packets to/from virtio
-directly by DMA. Besides, it supports dirty page logging and device state
-report/restore, this driver enables its vDPA functionality.
-
-
-Pre-Installation Configuration
-------------------------------
-
-Config File Options
-~~~~~~~~~~~~~~~~~~~
-
-The following option can be modified in the ``config`` file.
-
-- ``CONFIG_RTE_LIBRTE_IFC_PMD`` (default ``y`` for linux)
-
-  Toggle compilation of the ``librte_pmd_ifc`` driver.
-
-
-IFCVF vDPA Implementation
--------------------------
-
-IFCVF's vendor ID and device ID are same as that of virtio net pci device,
-with its specific subsystem vendor ID and device ID. To let the device be
-probed by IFCVF driver, adding "vdpa=1" parameter helps to specify that this
-device is to be used in vDPA mode, rather than polling mode, virtio pmd will
-skip when it detects this message. If no this parameter specified, device
-will not be used as a vDPA device, and it will be driven by virtio pmd.
-
-Different VF devices serve different virtio frontends which are in different
-VMs, so each VF needs to have its own DMA address translation service. During
-the driver probe a new container is created for this device, with this
-container vDPA driver can program DMA remapping table with the VM's memory
-region information.
-
-The device argument "sw-live-migration=1" will configure the driver into SW
-assisted live migration mode. In this mode, the driver will set up a SW relay
-thread when LM happens, this thread will help device to log dirty pages. Thus
-this mode does not require HW to implement a dirty page logging function block,
-but will consume some percentage of CPU resource depending on the network
-throughput. If no this parameter specified, driver will rely on device's logging
-capability.
-
-Key IFCVF vDPA driver ops
-~~~~~~~~~~~~~~~~~~~~~~~~~
-
-- ifcvf_dev_config:
-  Enable VF data path with virtio information provided by vhost lib, including
-  IOMMU programming to enable VF DMA to VM's memory, VFIO interrupt setup to
-  route HW interrupt to virtio driver, create notify relay thread to translate
-  virtio driver's kick to a MMIO write onto HW, HW queues configuration.
-
-  This function gets called to set up HW data path backend when virtio driver
-  in VM gets ready.
-
-- ifcvf_dev_close:
-  Revoke all the setup in ifcvf_dev_config.
-
-  This function gets called when virtio driver stops device in VM.
-
-To create a vhost port with IFC VF
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-- Create a vhost socket and assign a VF's device ID to this socket via
-  vhost API. When QEMU vhost connection gets ready, the assigned VF will
-  get configured automatically.
-
-
-Features
---------
-
-Features of the IFCVF driver are:
-
-- Compatibility with virtio 0.95 and 1.0.
-- SW assisted vDPA live migration.
-
-
-Prerequisites
--------------
-
-- Platform with IOMMU feature. IFC VF needs address translation service to
-  Rx/Tx directly with virtio driver in VM.
-
-
-Limitations
------------
-
-Dependency on vfio-pci
-~~~~~~~~~~~~~~~~~~~~~~
-
-vDPA driver needs to setup VF MSIX interrupts, each queue's interrupt vector
-is mapped to a callfd associated with a virtio ring. Currently only vfio-pci
-allows multiple interrupts, so the IFCVF driver is dependent on vfio-pci.
-
-Live Migration with VIRTIO_NET_F_GUEST_ANNOUNCE
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-IFC VF doesn't support RARP packet generation, virtio frontend supporting
-VIRTIO_NET_F_GUEST_ANNOUNCE feature can help to do that.
diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst
index d61c27f..8c540c0 100644
--- a/doc/guides/nics/index.rst
+++ b/doc/guides/nics/index.rst
@@ -31,7 +31,6 @@ Network Interface Controller Drivers
     hns3
     i40e
     ice
-    ifc
     igb
     ipn3ke
     ixgbe
diff --git a/doc/guides/vdpadevs/features/ifcvf.ini b/doc/guides/vdpadevs/features/ifcvf.ini
new file mode 100644
index 0000000..ef1fc47
--- /dev/null
+++ b/doc/guides/vdpadevs/features/ifcvf.ini
@@ -0,0 +1,8 @@
+;
+; Supported features of the 'ifcvf' vDPA driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+;
+[Features]
+x86-32               = Y
+x86-64               = Y
diff --git a/doc/guides/vdpadevs/ifc.rst b/doc/guides/vdpadevs/ifc.rst
new file mode 100644
index 0000000..12a2a34
--- /dev/null
+++ b/doc/guides/vdpadevs/ifc.rst
@@ -0,0 +1,106 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2018 Intel Corporation.
+
+IFCVF vDPA driver
+=================
+
+The IFCVF vDPA (vhost data path acceleration) driver provides support for the
+Intel FPGA 100G VF (IFCVF). IFCVF's datapath is virtio ring compatible, it
+works as a HW vhost backend which can send/receive packets to/from virtio
+directly by DMA. Besides, it supports dirty page logging and device state
+report/restore, this driver enables its vDPA functionality.
+
+
+Pre-Installation Configuration
+------------------------------
+
+Config File Options
+~~~~~~~~~~~~~~~~~~~
+
+The following option can be modified in the ``config`` file.
+
+- ``CONFIG_RTE_LIBRTE_IFC_PMD`` (default ``y`` for linux)
+
+  Toggle compilation of the ``librte_pmd_ifc`` driver.
+
+
+IFCVF vDPA Implementation
+-------------------------
+
+IFCVF's vendor ID and device ID are same as that of virtio net pci device,
+with its specific subsystem vendor ID and device ID. To let the device be
+probed by IFCVF driver, adding "vdpa=1" parameter helps to specify that this
+device is to be used in vDPA mode, rather than polling mode, virtio pmd will
+skip when it detects this message. If no this parameter specified, device
+will not be used as a vDPA device, and it will be driven by virtio pmd.
+
+Different VF devices serve different virtio frontends which are in different
+VMs, so each VF needs to have its own DMA address translation service. During
+the driver probe a new container is created for this device, with this
+container vDPA driver can program DMA remapping table with the VM's memory
+region information.
+
+The device argument "sw-live-migration=1" will configure the driver into SW
+assisted live migration mode. In this mode, the driver will set up a SW relay
+thread when LM happens, this thread will help device to log dirty pages. Thus
+this mode does not require HW to implement a dirty page logging function block,
+but will consume some percentage of CPU resource depending on the network
+throughput. If no this parameter specified, driver will rely on device's logging
+capability.
+
+Key IFCVF vDPA driver ops
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+- ifcvf_dev_config:
+  Enable VF data path with virtio information provided by vhost lib, including
+  IOMMU programming to enable VF DMA to VM's memory, VFIO interrupt setup to
+  route HW interrupt to virtio driver, create notify relay thread to translate
+  virtio driver's kick to a MMIO write onto HW, HW queues configuration.
+
+  This function gets called to set up HW data path backend when virtio driver
+  in VM gets ready.
+
+- ifcvf_dev_close:
+  Revoke all the setup in ifcvf_dev_config.
+
+  This function gets called when virtio driver stops device in VM.
+
+To create a vhost port with IFC VF
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+- Create a vhost socket and assign a VF's device ID to this socket via
+  vhost API. When QEMU vhost connection gets ready, the assigned VF will
+  get configured automatically.
+
+
+Features
+--------
+
+Features of the IFCVF driver are:
+
+- Compatibility with virtio 0.95 and 1.0.
+- SW assisted vDPA live migration.
+
+
+Prerequisites
+-------------
+
+- Platform with IOMMU feature. IFC VF needs address translation service to
+  Rx/Tx directly with virtio driver in VM.
+
+
+Limitations
+-----------
+
+Dependency on vfio-pci
+~~~~~~~~~~~~~~~~~~~~~~
+
+vDPA driver needs to setup VF MSIX interrupts, each queue's interrupt vector
+is mapped to a callfd associated with a virtio ring. Currently only vfio-pci
+allows multiple interrupts, so the IFCVF driver is dependent on vfio-pci.
+
+Live Migration with VIRTIO_NET_F_GUEST_ANNOUNCE
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The IFC VF does not support RARP packet generation; a virtio frontend that
+supports the VIRTIO_NET_F_GUEST_ANNOUNCE feature can help to do that.
diff --git a/doc/guides/vdpadevs/index.rst b/doc/guides/vdpadevs/index.rst
index 89e2b03..6cf0827 100644
--- a/doc/guides/vdpadevs/index.rst
+++ b/doc/guides/vdpadevs/index.rst
@@ -12,3 +12,4 @@ which can be used from an application through vhost API.
     :numbered:
 
     features_overview
+    ifc
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index cee3036..cca3c44 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -71,9 +71,6 @@ endif # $(CONFIG_RTE_LIBRTE_SCHED)
 
 ifeq ($(CONFIG_RTE_LIBRTE_VHOST),y)
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_VHOST) += vhost
-ifeq ($(CONFIG_RTE_EAL_VFIO),y)
-DIRS-$(CONFIG_RTE_LIBRTE_IFC_PMD) += ifc
-endif
 endif # $(CONFIG_RTE_LIBRTE_VHOST)
 
 ifeq ($(CONFIG_RTE_LIBRTE_MVPP2_PMD),y)
diff --git a/drivers/net/ifc/Makefile b/drivers/net/ifc/Makefile
deleted file mode 100644
index fe227b8..0000000
--- a/drivers/net/ifc/Makefile
+++ /dev/null
@@ -1,34 +0,0 @@
-# SPDX-License-Identifier: BSD-3-Clause
-# Copyright(c) 2018 Intel Corporation
-
-include $(RTE_SDK)/mk/rte.vars.mk
-
-#
-# library name
-#
-LIB = librte_pmd_ifc.a
-
-LDLIBS += -lpthread
-LDLIBS += -lrte_eal -lrte_pci -lrte_vhost -lrte_bus_pci
-LDLIBS += -lrte_kvargs
-
-CFLAGS += -O3
-CFLAGS += $(WERROR_FLAGS)
-CFLAGS += -DALLOW_EXPERIMENTAL_API
-
-#
-# Add extra flags for base driver source files to disable warnings in them
-#
-BASE_DRIVER_OBJS=$(sort $(patsubst %.c,%.o,$(notdir $(wildcard $(SRCDIR)/base/*.c))))
-
-VPATH += $(SRCDIR)/base
-
-EXPORT_MAP := rte_pmd_ifc_version.map
-
-#
-# all source are stored in SRCS-y
-#
-SRCS-$(CONFIG_RTE_LIBRTE_IFC_PMD) += ifcvf_vdpa.c
-SRCS-$(CONFIG_RTE_LIBRTE_IFC_PMD) += ifcvf.c
-
-include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/ifc/base/ifcvf.c b/drivers/net/ifc/base/ifcvf.c
deleted file mode 100644
index 3c0b2df..0000000
--- a/drivers/net/ifc/base/ifcvf.c
+++ /dev/null
@@ -1,329 +0,0 @@
-/* SPDX-License-Identifier: BSD-3-Clause
- * Copyright(c) 2018 Intel Corporation
- */
-
-#include "ifcvf.h"
-#include "ifcvf_osdep.h"
-
-STATIC void *
-get_cap_addr(struct ifcvf_hw *hw, struct ifcvf_pci_cap *cap)
-{
-	u8 bar = cap->bar;
-	u32 length = cap->length;
-	u32 offset = cap->offset;
-
-	if (bar > IFCVF_PCI_MAX_RESOURCE - 1) {
-		DEBUGOUT("invalid bar: %u\n", bar);
-		return NULL;
-	}
-
-	if (offset + length < offset) {
-		DEBUGOUT("offset(%u) + length(%u) overflows\n",
-			offset, length);
-		return NULL;
-	}
-
-	if (offset + length > hw->mem_resource[cap->bar].len) {
-		DEBUGOUT("offset(%u) + length(%u) overflows bar length(%u)",
-			offset, length, (u32)hw->mem_resource[cap->bar].len);
-		return NULL;
-	}
-
-	return hw->mem_resource[bar].addr + offset;
-}
-
-int
-ifcvf_init_hw(struct ifcvf_hw *hw, PCI_DEV *dev)
-{
-	int ret;
-	u8 pos;
-	struct ifcvf_pci_cap cap;
-
-	ret = PCI_READ_CONFIG_BYTE(dev, &pos, PCI_CAPABILITY_LIST);
-	if (ret < 0) {
-		DEBUGOUT("failed to read pci capability list\n");
-		return -1;
-	}
-
-	while (pos) {
-		ret = PCI_READ_CONFIG_RANGE(dev, (u32 *)&cap,
-				sizeof(cap), pos);
-		if (ret < 0) {
-			DEBUGOUT("failed to read cap at pos: %x", pos);
-			break;
-		}
-
-		if (cap.cap_vndr != PCI_CAP_ID_VNDR)
-			goto next;
-
-		DEBUGOUT("cfg type: %u, bar: %u, offset: %u, "
-				"len: %u\n", cap.cfg_type, cap.bar,
-				cap.offset, cap.length);
-
-		switch (cap.cfg_type) {
-		case IFCVF_PCI_CAP_COMMON_CFG:
-			hw->common_cfg = get_cap_addr(hw, &cap);
-			break;
-		case IFCVF_PCI_CAP_NOTIFY_CFG:
-			PCI_READ_CONFIG_DWORD(dev, &hw->notify_off_multiplier,
-					pos + sizeof(cap));
-			hw->notify_base = get_cap_addr(hw, &cap);
-			hw->notify_region = cap.bar;
-			break;
-		case IFCVF_PCI_CAP_ISR_CFG:
-			hw->isr = get_cap_addr(hw, &cap);
-			break;
-		case IFCVF_PCI_CAP_DEVICE_CFG:
-			hw->dev_cfg = get_cap_addr(hw, &cap);
-			break;
-		}
-next:
-		pos = cap.cap_next;
-	}
-
-	hw->lm_cfg = hw->mem_resource[4].addr;
-
-	if (hw->common_cfg == NULL || hw->notify_base == NULL ||
-			hw->isr == NULL || hw->dev_cfg == NULL) {
-		DEBUGOUT("capability incomplete\n");
-		return -1;
-	}
-
-	DEBUGOUT("capability mapping:\ncommon cfg: %p\n"
-			"notify base: %p\nisr cfg: %p\ndevice cfg: %p\n"
-			"multiplier: %u\n",
-			hw->common_cfg, hw->dev_cfg,
-			hw->isr, hw->notify_base,
-			hw->notify_off_multiplier);
-
-	return 0;
-}
-
-STATIC u8
-ifcvf_get_status(struct ifcvf_hw *hw)
-{
-	return IFCVF_READ_REG8(&hw->common_cfg->device_status);
-}
-
-STATIC void
-ifcvf_set_status(struct ifcvf_hw *hw, u8 status)
-{
-	IFCVF_WRITE_REG8(status, &hw->common_cfg->device_status);
-}
-
-STATIC void
-ifcvf_reset(struct ifcvf_hw *hw)
-{
-	ifcvf_set_status(hw, 0);
-
-	/* flush status write */
-	while (ifcvf_get_status(hw))
-		msec_delay(1);
-}
-
-STATIC void
-ifcvf_add_status(struct ifcvf_hw *hw, u8 status)
-{
-	if (status != 0)
-		status |= ifcvf_get_status(hw);
-
-	ifcvf_set_status(hw, status);
-	ifcvf_get_status(hw);
-}
-
-u64
-ifcvf_get_features(struct ifcvf_hw *hw)
-{
-	u32 features_lo, features_hi;
-	struct ifcvf_pci_common_cfg *cfg = hw->common_cfg;
-
-	IFCVF_WRITE_REG32(0, &cfg->device_feature_select);
-	features_lo = IFCVF_READ_REG32(&cfg->device_feature);
-
-	IFCVF_WRITE_REG32(1, &cfg->device_feature_select);
-	features_hi = IFCVF_READ_REG32(&cfg->device_feature);
-
-	return ((u64)features_hi << 32) | features_lo;
-}
-
-STATIC void
-ifcvf_set_features(struct ifcvf_hw *hw, u64 features)
-{
-	struct ifcvf_pci_common_cfg *cfg = hw->common_cfg;
-
-	IFCVF_WRITE_REG32(0, &cfg->guest_feature_select);
-	IFCVF_WRITE_REG32(features & ((1ULL << 32) - 1), &cfg->guest_feature);
-
-	IFCVF_WRITE_REG32(1, &cfg->guest_feature_select);
-	IFCVF_WRITE_REG32(features >> 32, &cfg->guest_feature);
-}
-
-STATIC int
-ifcvf_config_features(struct ifcvf_hw *hw)
-{
-	u64 host_features;
-
-	host_features = ifcvf_get_features(hw);
-	hw->req_features &= host_features;
-
-	ifcvf_set_features(hw, hw->req_features);
-	ifcvf_add_status(hw, IFCVF_CONFIG_STATUS_FEATURES_OK);
-
-	if (!(ifcvf_get_status(hw) & IFCVF_CONFIG_STATUS_FEATURES_OK)) {
-		DEBUGOUT("failed to set FEATURES_OK status\n");
-		return -1;
-	}
-
-	return 0;
-}
-
-STATIC void
-io_write64_twopart(u64 val, u32 *lo, u32 *hi)
-{
-	IFCVF_WRITE_REG32(val & ((1ULL << 32) - 1), lo);
-	IFCVF_WRITE_REG32(val >> 32, hi);
-}
-
-STATIC int
-ifcvf_hw_enable(struct ifcvf_hw *hw)
-{
-	struct ifcvf_pci_common_cfg *cfg;
-	u8 *lm_cfg;
-	u32 i;
-	u16 notify_off;
-
-	cfg = hw->common_cfg;
-	lm_cfg = hw->lm_cfg;
-
-	IFCVF_WRITE_REG16(0, &cfg->msix_config);
-	if (IFCVF_READ_REG16(&cfg->msix_config) == IFCVF_MSI_NO_VECTOR) {
-		DEBUGOUT("msix vec alloc failed for device config\n");
-		return -1;
-	}
-
-	for (i = 0; i < hw->nr_vring; i++) {
-		IFCVF_WRITE_REG16(i, &cfg->queue_select);
-		io_write64_twopart(hw->vring[i].desc, &cfg->queue_desc_lo,
-				&cfg->queue_desc_hi);
-		io_write64_twopart(hw->vring[i].avail, &cfg->queue_avail_lo,
-				&cfg->queue_avail_hi);
-		io_write64_twopart(hw->vring[i].used, &cfg->queue_used_lo,
-				&cfg->queue_used_hi);
-		IFCVF_WRITE_REG16(hw->vring[i].size, &cfg->queue_size);
-
-		*(u32 *)(lm_cfg + IFCVF_LM_RING_STATE_OFFSET +
-				(i / 2) * IFCVF_LM_CFG_SIZE + (i % 2) * 4) =
-			(u32)hw->vring[i].last_avail_idx |
-			((u32)hw->vring[i].last_used_idx << 16);
-
-		IFCVF_WRITE_REG16(i + 1, &cfg->queue_msix_vector);
-		if (IFCVF_READ_REG16(&cfg->queue_msix_vector) ==
-				IFCVF_MSI_NO_VECTOR) {
-			DEBUGOUT("queue %u, msix vec alloc failed\n",
-					i);
-			return -1;
-		}
-
-		notify_off = IFCVF_READ_REG16(&cfg->queue_notify_off);
-		hw->notify_addr[i] = (void *)((u8 *)hw->notify_base +
-				notify_off * hw->notify_off_multiplier);
-		IFCVF_WRITE_REG16(1, &cfg->queue_enable);
-	}
-
-	return 0;
-}
-
-STATIC void
-ifcvf_hw_disable(struct ifcvf_hw *hw)
-{
-	u32 i;
-	struct ifcvf_pci_common_cfg *cfg;
-	u32 ring_state;
-
-	cfg = hw->common_cfg;
-
-	IFCVF_WRITE_REG16(IFCVF_MSI_NO_VECTOR, &cfg->msix_config);
-	for (i = 0; i < hw->nr_vring; i++) {
-		IFCVF_WRITE_REG16(i, &cfg->queue_select);
-		IFCVF_WRITE_REG16(0, &cfg->queue_enable);
-		IFCVF_WRITE_REG16(IFCVF_MSI_NO_VECTOR, &cfg->queue_msix_vector);
-		ring_state = *(u32 *)(hw->lm_cfg + IFCVF_LM_RING_STATE_OFFSET +
-				(i / 2) * IFCVF_LM_CFG_SIZE + (i % 2) * 4);
-		hw->vring[i].last_avail_idx = (u16)(ring_state >> 16);
-		hw->vring[i].last_used_idx = (u16)(ring_state >> 16);
-	}
-}
-
-int
-ifcvf_start_hw(struct ifcvf_hw *hw)
-{
-	ifcvf_reset(hw);
-	ifcvf_add_status(hw, IFCVF_CONFIG_STATUS_ACK);
-	ifcvf_add_status(hw, IFCVF_CONFIG_STATUS_DRIVER);
-
-	if (ifcvf_config_features(hw) < 0)
-		return -1;
-
-	if (ifcvf_hw_enable(hw) < 0)
-		return -1;
-
-	ifcvf_add_status(hw, IFCVF_CONFIG_STATUS_DRIVER_OK);
-	return 0;
-}
-
-void
-ifcvf_stop_hw(struct ifcvf_hw *hw)
-{
-	ifcvf_hw_disable(hw);
-	ifcvf_reset(hw);
-}
-
-void
-ifcvf_enable_logging(struct ifcvf_hw *hw, u64 log_base, u64 log_size)
-{
-	u8 *lm_cfg;
-
-	lm_cfg = hw->lm_cfg;
-
-	*(u32 *)(lm_cfg + IFCVF_LM_BASE_ADDR_LOW) =
-		log_base & IFCVF_32_BIT_MASK;
-
-	*(u32 *)(lm_cfg + IFCVF_LM_BASE_ADDR_HIGH) =
-		(log_base >> 32) & IFCVF_32_BIT_MASK;
-
-	*(u32 *)(lm_cfg + IFCVF_LM_END_ADDR_LOW) =
-		(log_base + log_size) & IFCVF_32_BIT_MASK;
-
-	*(u32 *)(lm_cfg + IFCVF_LM_END_ADDR_HIGH) =
-		((log_base + log_size) >> 32) & IFCVF_32_BIT_MASK;
-
-	*(u32 *)(lm_cfg + IFCVF_LM_LOGGING_CTRL) = IFCVF_LM_ENABLE_VF;
-}
-
-void
-ifcvf_disable_logging(struct ifcvf_hw *hw)
-{
-	u8 *lm_cfg;
-
-	lm_cfg = hw->lm_cfg;
-	*(u32 *)(lm_cfg + IFCVF_LM_LOGGING_CTRL) = IFCVF_LM_DISABLE;
-}
-
-void
-ifcvf_notify_queue(struct ifcvf_hw *hw, u16 qid)
-{
-	IFCVF_WRITE_REG16(qid, hw->notify_addr[qid]);
-}
-
-u8
-ifcvf_get_notify_region(struct ifcvf_hw *hw)
-{
-	return hw->notify_region;
-}
-
-u64
-ifcvf_get_queue_notify_off(struct ifcvf_hw *hw, int qid)
-{
-	return (u8 *)hw->notify_addr[qid] -
-		(u8 *)hw->mem_resource[hw->notify_region].addr;
-}
diff --git a/drivers/net/ifc/base/ifcvf.h b/drivers/net/ifc/base/ifcvf.h
deleted file mode 100644
index 9be2770..0000000
--- a/drivers/net/ifc/base/ifcvf.h
+++ /dev/null
@@ -1,162 +0,0 @@
-/* SPDX-License-Identifier: BSD-3-Clause
- * Copyright(c) 2018 Intel Corporation
- */
-
-#ifndef _IFCVF_H_
-#define _IFCVF_H_
-
-#include "ifcvf_osdep.h"
-
-#define IFCVF_VENDOR_ID		0x1AF4
-#define IFCVF_DEVICE_ID		0x1041
-#define IFCVF_SUBSYS_VENDOR_ID	0x8086
-#define IFCVF_SUBSYS_DEVICE_ID	0x001A
-
-#define IFCVF_MAX_QUEUES		1
-#define VIRTIO_F_IOMMU_PLATFORM		33
-
-/* Common configuration */
-#define IFCVF_PCI_CAP_COMMON_CFG	1
-/* Notifications */
-#define IFCVF_PCI_CAP_NOTIFY_CFG	2
-/* ISR Status */
-#define IFCVF_PCI_CAP_ISR_CFG		3
-/* Device specific configuration */
-#define IFCVF_PCI_CAP_DEVICE_CFG	4
-/* PCI configuration access */
-#define IFCVF_PCI_CAP_PCI_CFG		5
-
-#define IFCVF_CONFIG_STATUS_RESET     0x00
-#define IFCVF_CONFIG_STATUS_ACK       0x01
-#define IFCVF_CONFIG_STATUS_DRIVER    0x02
-#define IFCVF_CONFIG_STATUS_DRIVER_OK 0x04
-#define IFCVF_CONFIG_STATUS_FEATURES_OK 0x08
-#define IFCVF_CONFIG_STATUS_FAILED    0x80
-
-#define IFCVF_MSI_NO_VECTOR	0xffff
-#define IFCVF_PCI_MAX_RESOURCE	6
-
-#define IFCVF_LM_CFG_SIZE		0x40
-#define IFCVF_LM_RING_STATE_OFFSET	0x20
-
-#define IFCVF_LM_LOGGING_CTRL		0x0
-
-#define IFCVF_LM_BASE_ADDR_LOW		0x10
-#define IFCVF_LM_BASE_ADDR_HIGH		0x14
-#define IFCVF_LM_END_ADDR_LOW		0x18
-#define IFCVF_LM_END_ADDR_HIGH		0x1c
-
-#define IFCVF_LM_DISABLE		0x0
-#define IFCVF_LM_ENABLE_VF		0x1
-#define IFCVF_LM_ENABLE_PF		0x3
-#define IFCVF_LOG_BASE			0x100000000000
-#define IFCVF_MEDIATED_VRING		0x200000000000
-
-#define IFCVF_32_BIT_MASK		0xffffffff
-
-
-struct ifcvf_pci_cap {
-	u8 cap_vndr;            /* Generic PCI field: PCI_CAP_ID_VNDR */
-	u8 cap_next;            /* Generic PCI field: next ptr. */
-	u8 cap_len;             /* Generic PCI field: capability length */
-	u8 cfg_type;            /* Identifies the structure. */
-	u8 bar;                 /* Where to find it. */
-	u8 padding[3];          /* Pad to full dword. */
-	u32 offset;             /* Offset within bar. */
-	u32 length;             /* Length of the structure, in bytes. */
-};
-
-struct ifcvf_pci_notify_cap {
-	struct ifcvf_pci_cap cap;
-	u32 notify_off_multiplier;  /* Multiplier for queue_notify_off. */
-};
-
-struct ifcvf_pci_common_cfg {
-	/* About the whole device. */
-	u32 device_feature_select;
-	u32 device_feature;
-	u32 guest_feature_select;
-	u32 guest_feature;
-	u16 msix_config;
-	u16 num_queues;
-	u8 device_status;
-	u8 config_generation;
-
-	/* About a specific virtqueue. */
-	u16 queue_select;
-	u16 queue_size;
-	u16 queue_msix_vector;
-	u16 queue_enable;
-	u16 queue_notify_off;
-	u32 queue_desc_lo;
-	u32 queue_desc_hi;
-	u32 queue_avail_lo;
-	u32 queue_avail_hi;
-	u32 queue_used_lo;
-	u32 queue_used_hi;
-};
-
-struct ifcvf_net_config {
-	u8    mac[6];
-	u16   status;
-	u16   max_virtqueue_pairs;
-} __attribute__((packed));
-
-struct ifcvf_pci_mem_resource {
-	u64      phys_addr; /**< Physical address, 0 if not resource. */
-	u64      len;       /**< Length of the resource. */
-	u8       *addr;     /**< Virtual address, NULL when not mapped. */
-};
-
-struct vring_info {
-	u64 desc;
-	u64 avail;
-	u64 used;
-	u16 size;
-	u16 last_avail_idx;
-	u16 last_used_idx;
-};
-
-struct ifcvf_hw {
-	u64    req_features;
-	u8     notify_region;
-	u32    notify_off_multiplier;
-	struct ifcvf_pci_common_cfg *common_cfg;
-	struct ifcvf_net_config *dev_cfg;
-	u8     *isr;
-	u16    *notify_base;
-	u16    *notify_addr[IFCVF_MAX_QUEUES * 2];
-	u8     *lm_cfg;
-	struct vring_info vring[IFCVF_MAX_QUEUES * 2];
-	u8 nr_vring;
-	struct ifcvf_pci_mem_resource mem_resource[IFCVF_PCI_MAX_RESOURCE];
-};
-
-int
-ifcvf_init_hw(struct ifcvf_hw *hw, PCI_DEV *dev);
-
-u64
-ifcvf_get_features(struct ifcvf_hw *hw);
-
-int
-ifcvf_start_hw(struct ifcvf_hw *hw);
-
-void
-ifcvf_stop_hw(struct ifcvf_hw *hw);
-
-void
-ifcvf_enable_logging(struct ifcvf_hw *hw, u64 log_base, u64 log_size);
-
-void
-ifcvf_disable_logging(struct ifcvf_hw *hw);
-
-void
-ifcvf_notify_queue(struct ifcvf_hw *hw, u16 qid);
-
-u8
-ifcvf_get_notify_region(struct ifcvf_hw *hw);
-
-u64
-ifcvf_get_queue_notify_off(struct ifcvf_hw *hw, int qid);
-
-#endif /* _IFCVF_H_ */
diff --git a/drivers/net/ifc/base/ifcvf_osdep.h b/drivers/net/ifc/base/ifcvf_osdep.h
deleted file mode 100644
index 6aef25e..0000000
--- a/drivers/net/ifc/base/ifcvf_osdep.h
+++ /dev/null
@@ -1,52 +0,0 @@
-/* SPDX-License-Identifier: BSD-3-Clause
- * Copyright(c) 2018 Intel Corporation
- */
-
-#ifndef _IFCVF_OSDEP_H_
-#define _IFCVF_OSDEP_H_
-
-#include <stdint.h>
-#include <linux/pci_regs.h>
-
-#include <rte_cycles.h>
-#include <rte_pci.h>
-#include <rte_bus_pci.h>
-#include <rte_log.h>
-#include <rte_io.h>
-
-#define DEBUGOUT(S, args...)    RTE_LOG(DEBUG, PMD, S, ##args)
-#define STATIC                  static
-
-#define msec_delay(x)	rte_delay_us_sleep(1000 * (x))
-
-#define IFCVF_READ_REG8(reg)		rte_read8(reg)
-#define IFCVF_WRITE_REG8(val, reg)	rte_write8((val), (reg))
-#define IFCVF_READ_REG16(reg)		rte_read16(reg)
-#define IFCVF_WRITE_REG16(val, reg)	rte_write16((val), (reg))
-#define IFCVF_READ_REG32(reg)		rte_read32(reg)
-#define IFCVF_WRITE_REG32(val, reg)	rte_write32((val), (reg))
-
-typedef struct rte_pci_device PCI_DEV;
-
-#define PCI_READ_CONFIG_BYTE(dev, val, where) \
-	rte_pci_read_config(dev, val, 1, where)
-
-#define PCI_READ_CONFIG_DWORD(dev, val, where) \
-	rte_pci_read_config(dev, val, 4, where)
-
-typedef uint8_t    u8;
-typedef int8_t     s8;
-typedef uint16_t   u16;
-typedef int16_t    s16;
-typedef uint32_t   u32;
-typedef int32_t    s32;
-typedef int64_t    s64;
-typedef uint64_t   u64;
-
-static inline int
-PCI_READ_CONFIG_RANGE(PCI_DEV *dev, uint32_t *val, int size, int where)
-{
-	return rte_pci_read_config(dev, val, size, where);
-}
-
-#endif /* _IFCVF_OSDEP_H_ */
diff --git a/drivers/net/ifc/ifcvf_vdpa.c b/drivers/net/ifc/ifcvf_vdpa.c
deleted file mode 100644
index da4667b..0000000
--- a/drivers/net/ifc/ifcvf_vdpa.c
+++ /dev/null
@@ -1,1280 +0,0 @@
-/* SPDX-License-Identifier: BSD-3-Clause
- * Copyright(c) 2018 Intel Corporation
- */
-
-#include <unistd.h>
-#include <pthread.h>
-#include <fcntl.h>
-#include <string.h>
-#include <sys/ioctl.h>
-#include <sys/epoll.h>
-#include <linux/virtio_net.h>
-#include <stdbool.h>
-
-#include <rte_malloc.h>
-#include <rte_memory.h>
-#include <rte_bus_pci.h>
-#include <rte_vhost.h>
-#include <rte_vdpa.h>
-#include <rte_vfio.h>
-#include <rte_spinlock.h>
-#include <rte_log.h>
-#include <rte_kvargs.h>
-#include <rte_devargs.h>
-
-#include "base/ifcvf.h"
-
-#define DRV_LOG(level, fmt, args...) \
-	rte_log(RTE_LOG_ ## level, ifcvf_vdpa_logtype, \
-		"IFCVF %s(): " fmt "\n", __func__, ##args)
-
-#ifndef PAGE_SIZE
-#define PAGE_SIZE 4096
-#endif
-
-#define IFCVF_USED_RING_LEN(size) \
-	((size) * sizeof(struct vring_used_elem) + sizeof(uint16_t) * 3)
-
-#define IFCVF_VDPA_MODE		"vdpa"
-#define IFCVF_SW_FALLBACK_LM	"sw-live-migration"
-
-static const char * const ifcvf_valid_arguments[] = {
-	IFCVF_VDPA_MODE,
-	IFCVF_SW_FALLBACK_LM,
-	NULL
-};
-
-static int ifcvf_vdpa_logtype;
-
-struct ifcvf_internal {
-	struct rte_vdpa_dev_addr dev_addr;
-	struct rte_pci_device *pdev;
-	struct ifcvf_hw hw;
-	int vfio_container_fd;
-	int vfio_group_fd;
-	int vfio_dev_fd;
-	pthread_t tid;	/* thread for notify relay */
-	int epfd;
-	int vid;
-	int did;
-	uint16_t max_queues;
-	uint64_t features;
-	rte_atomic32_t started;
-	rte_atomic32_t dev_attached;
-	rte_atomic32_t running;
-	rte_spinlock_t lock;
-	bool sw_lm;
-	bool sw_fallback_running;
-	/* mediated vring for sw fallback */
-	struct vring m_vring[IFCVF_MAX_QUEUES * 2];
-	/* eventfd for used ring interrupt */
-	int intr_fd[IFCVF_MAX_QUEUES * 2];
-};
-
-struct internal_list {
-	TAILQ_ENTRY(internal_list) next;
-	struct ifcvf_internal *internal;
-};
-
-TAILQ_HEAD(internal_list_head, internal_list);
-static struct internal_list_head internal_list =
-	TAILQ_HEAD_INITIALIZER(internal_list);
-
-static pthread_mutex_t internal_list_lock = PTHREAD_MUTEX_INITIALIZER;
-
-static void update_used_ring(struct ifcvf_internal *internal, uint16_t qid);
-
-static struct internal_list *
-find_internal_resource_by_did(int did)
-{
-	int found = 0;
-	struct internal_list *list;
-
-	pthread_mutex_lock(&internal_list_lock);
-
-	TAILQ_FOREACH(list, &internal_list, next) {
-		if (did == list->internal->did) {
-			found = 1;
-			break;
-		}
-	}
-
-	pthread_mutex_unlock(&internal_list_lock);
-
-	if (!found)
-		return NULL;
-
-	return list;
-}
-
-static struct internal_list *
-find_internal_resource_by_dev(struct rte_pci_device *pdev)
-{
-	int found = 0;
-	struct internal_list *list;
-
-	pthread_mutex_lock(&internal_list_lock);
-
-	TAILQ_FOREACH(list, &internal_list, next) {
-		if (pdev == list->internal->pdev) {
-			found = 1;
-			break;
-		}
-	}
-
-	pthread_mutex_unlock(&internal_list_lock);
-
-	if (!found)
-		return NULL;
-
-	return list;
-}
-
-static int
-ifcvf_vfio_setup(struct ifcvf_internal *internal)
-{
-	struct rte_pci_device *dev = internal->pdev;
-	char devname[RTE_DEV_NAME_MAX_LEN] = {0};
-	int iommu_group_num;
-	int i, ret;
-
-	internal->vfio_dev_fd = -1;
-	internal->vfio_group_fd = -1;
-	internal->vfio_container_fd = -1;
-
-	rte_pci_device_name(&dev->addr, devname, RTE_DEV_NAME_MAX_LEN);
-	ret = rte_vfio_get_group_num(rte_pci_get_sysfs_path(), devname,
-			&iommu_group_num);
-	if (ret <= 0) {
-		DRV_LOG(ERR, "%s failed to get IOMMU group", devname);
-		return -1;
-	}
-
-	internal->vfio_container_fd = rte_vfio_container_create();
-	if (internal->vfio_container_fd < 0)
-		return -1;
-
-	internal->vfio_group_fd = rte_vfio_container_group_bind(
-			internal->vfio_container_fd, iommu_group_num);
-	if (internal->vfio_group_fd < 0)
-		goto err;
-
-	if (rte_pci_map_device(dev))
-		goto err;
-
-	internal->vfio_dev_fd = dev->intr_handle.vfio_dev_fd;
-
-	for (i = 0; i < RTE_MIN(PCI_MAX_RESOURCE, IFCVF_PCI_MAX_RESOURCE);
-			i++) {
-		internal->hw.mem_resource[i].addr =
-			internal->pdev->mem_resource[i].addr;
-		internal->hw.mem_resource[i].phys_addr =
-			internal->pdev->mem_resource[i].phys_addr;
-		internal->hw.mem_resource[i].len =
-			internal->pdev->mem_resource[i].len;
-	}
-
-	return 0;
-
-err:
-	rte_vfio_container_destroy(internal->vfio_container_fd);
-	return -1;
-}
-
-static int
-ifcvf_dma_map(struct ifcvf_internal *internal, int do_map)
-{
-	uint32_t i;
-	int ret;
-	struct rte_vhost_memory *mem = NULL;
-	int vfio_container_fd;
-
-	ret = rte_vhost_get_mem_table(internal->vid, &mem);
-	if (ret < 0) {
-		DRV_LOG(ERR, "failed to get VM memory layout.");
-		goto exit;
-	}
-
-	vfio_container_fd = internal->vfio_container_fd;
-
-	for (i = 0; i < mem->nregions; i++) {
-		struct rte_vhost_mem_region *reg;
-
-		reg = &mem->regions[i];
-		DRV_LOG(INFO, "%s, region %u: HVA 0x%" PRIx64 ", "
-			"GPA 0x%" PRIx64 ", size 0x%" PRIx64 ".",
-			do_map ? "DMA map" : "DMA unmap", i,
-			reg->host_user_addr, reg->guest_phys_addr, reg->size);
-
-		if (do_map) {
-			ret = rte_vfio_container_dma_map(vfio_container_fd,
-				reg->host_user_addr, reg->guest_phys_addr,
-				reg->size);
-			if (ret < 0) {
-				DRV_LOG(ERR, "DMA map failed.");
-				goto exit;
-			}
-		} else {
-			ret = rte_vfio_container_dma_unmap(vfio_container_fd,
-				reg->host_user_addr, reg->guest_phys_addr,
-				reg->size);
-			if (ret < 0) {
-				DRV_LOG(ERR, "DMA unmap failed.");
-				goto exit;
-			}
-		}
-	}
-
-exit:
-	if (mem)
-		free(mem);
-	return ret;
-}
-
-static uint64_t
-hva_to_gpa(int vid, uint64_t hva)
-{
-	struct rte_vhost_memory *mem = NULL;
-	struct rte_vhost_mem_region *reg;
-	uint32_t i;
-	uint64_t gpa = 0;
-
-	if (rte_vhost_get_mem_table(vid, &mem) < 0)
-		goto exit;
-
-	for (i = 0; i < mem->nregions; i++) {
-		reg = &mem->regions[i];
-
-		if (hva >= reg->host_user_addr &&
-				hva < reg->host_user_addr + reg->size) {
-			gpa = hva - reg->host_user_addr + reg->guest_phys_addr;
-			break;
-		}
-	}
-
-exit:
-	if (mem)
-		free(mem);
-	return gpa;
-}
-
-static int
-vdpa_ifcvf_start(struct ifcvf_internal *internal)
-{
-	struct ifcvf_hw *hw = &internal->hw;
-	int i, nr_vring;
-	int vid;
-	struct rte_vhost_vring vq;
-	uint64_t gpa;
-
-	vid = internal->vid;
-	nr_vring = rte_vhost_get_vring_num(vid);
-	rte_vhost_get_negotiated_features(vid, &hw->req_features);
-
-	for (i = 0; i < nr_vring; i++) {
-		rte_vhost_get_vhost_vring(vid, i, &vq);
-		gpa = hva_to_gpa(vid, (uint64_t)(uintptr_t)vq.desc);
-		if (gpa == 0) {
-			DRV_LOG(ERR, "Fail to get GPA for descriptor ring.");
-			return -1;
-		}
-		hw->vring[i].desc = gpa;
-
-		gpa = hva_to_gpa(vid, (uint64_t)(uintptr_t)vq.avail);
-		if (gpa == 0) {
-			DRV_LOG(ERR, "Fail to get GPA for available ring.");
-			return -1;
-		}
-		hw->vring[i].avail = gpa;
-
-		gpa = hva_to_gpa(vid, (uint64_t)(uintptr_t)vq.used);
-		if (gpa == 0) {
-			DRV_LOG(ERR, "Fail to get GPA for used ring.");
-			return -1;
-		}
-		hw->vring[i].used = gpa;
-
-		hw->vring[i].size = vq.size;
-		rte_vhost_get_vring_base(vid, i, &hw->vring[i].last_avail_idx,
-				&hw->vring[i].last_used_idx);
-	}
-	hw->nr_vring = i;
-
-	return ifcvf_start_hw(&internal->hw);
-}
-
-static void
-vdpa_ifcvf_stop(struct ifcvf_internal *internal)
-{
-	struct ifcvf_hw *hw = &internal->hw;
-	uint32_t i;
-	int vid;
-	uint64_t features = 0;
-	uint64_t log_base = 0, log_size = 0;
-	uint64_t len;
-
-	vid = internal->vid;
-	ifcvf_stop_hw(hw);
-
-	for (i = 0; i < hw->nr_vring; i++)
-		rte_vhost_set_vring_base(vid, i, hw->vring[i].last_avail_idx,
-				hw->vring[i].last_used_idx);
-
-	if (internal->sw_lm)
-		return;
-
-	rte_vhost_get_negotiated_features(vid, &features);
-	if (RTE_VHOST_NEED_LOG(features)) {
-		ifcvf_disable_logging(hw);
-		rte_vhost_get_log_base(internal->vid, &log_base, &log_size);
-		rte_vfio_container_dma_unmap(internal->vfio_container_fd,
-				log_base, IFCVF_LOG_BASE, log_size);
-		/*
-		 * IFCVF marks dirty memory pages for only packet buffer,
-		 * SW helps to mark the used ring as dirty after device stops.
-		 */
-		for (i = 0; i < hw->nr_vring; i++) {
-			len = IFCVF_USED_RING_LEN(hw->vring[i].size);
-			rte_vhost_log_used_vring(vid, i, 0, len);
-		}
-	}
-}
-
-#define MSIX_IRQ_SET_BUF_LEN (sizeof(struct vfio_irq_set) + \
-		sizeof(int) * (IFCVF_MAX_QUEUES * 2 + 1))
-static int
-vdpa_enable_vfio_intr(struct ifcvf_internal *internal, bool m_rx)
-{
-	int ret;
-	uint32_t i, nr_vring;
-	char irq_set_buf[MSIX_IRQ_SET_BUF_LEN];
-	struct vfio_irq_set *irq_set;
-	int *fd_ptr;
-	struct rte_vhost_vring vring;
-	int fd;
-
-	vring.callfd = -1;
-
-	nr_vring = rte_vhost_get_vring_num(internal->vid);
-
-	irq_set = (struct vfio_irq_set *)irq_set_buf;
-	irq_set->argsz = sizeof(irq_set_buf);
-	irq_set->count = nr_vring + 1;
-	irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD |
-			 VFIO_IRQ_SET_ACTION_TRIGGER;
-	irq_set->index = VFIO_PCI_MSIX_IRQ_INDEX;
-	irq_set->start = 0;
-	fd_ptr = (int *)&irq_set->data;
-	fd_ptr[RTE_INTR_VEC_ZERO_OFFSET] = internal->pdev->intr_handle.fd;
-
-	for (i = 0; i < nr_vring; i++)
-		internal->intr_fd[i] = -1;
-
-	for (i = 0; i < nr_vring; i++) {
-		rte_vhost_get_vhost_vring(internal->vid, i, &vring);
-		fd_ptr[RTE_INTR_VEC_RXTX_OFFSET + i] = vring.callfd;
-		if ((i & 1) == 0 && m_rx == true) {
-			fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
-			if (fd < 0) {
-				DRV_LOG(ERR, "can't setup eventfd: %s",
-					strerror(errno));
-				return -1;
-			}
-			internal->intr_fd[i] = fd;
-			fd_ptr[RTE_INTR_VEC_RXTX_OFFSET + i] = fd;
-		}
-	}
-
-	ret = ioctl(internal->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
-	if (ret) {
-		DRV_LOG(ERR, "Error enabling MSI-X interrupts: %s",
-				strerror(errno));
-		return -1;
-	}
-
-	return 0;
-}
-
-static int
-vdpa_disable_vfio_intr(struct ifcvf_internal *internal)
-{
-	int ret;
-	uint32_t i, nr_vring;
-	char irq_set_buf[MSIX_IRQ_SET_BUF_LEN];
-	struct vfio_irq_set *irq_set;
-
-	irq_set = (struct vfio_irq_set *)irq_set_buf;
-	irq_set->argsz = sizeof(irq_set_buf);
-	irq_set->count = 0;
-	irq_set->flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_TRIGGER;
-	irq_set->index = VFIO_PCI_MSIX_IRQ_INDEX;
-	irq_set->start = 0;
-
-	nr_vring = rte_vhost_get_vring_num(internal->vid);
-	for (i = 0; i < nr_vring; i++) {
-		if (internal->intr_fd[i] >= 0)
-			close(internal->intr_fd[i]);
-		internal->intr_fd[i] = -1;
-	}
-
-	ret = ioctl(internal->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
-	if (ret) {
-		DRV_LOG(ERR, "Error disabling MSI-X interrupts: %s",
-				strerror(errno));
-		return -1;
-	}
-
-	return 0;
-}
-
-static void *
-notify_relay(void *arg)
-{
-	int i, kickfd, epfd, nfds = 0;
-	uint32_t qid, q_num;
-	struct epoll_event events[IFCVF_MAX_QUEUES * 2];
-	struct epoll_event ev;
-	uint64_t buf;
-	int nbytes;
-	struct rte_vhost_vring vring;
-	struct ifcvf_internal *internal = (struct ifcvf_internal *)arg;
-	struct ifcvf_hw *hw = &internal->hw;
-
-	q_num = rte_vhost_get_vring_num(internal->vid);
-
-	epfd = epoll_create(IFCVF_MAX_QUEUES * 2);
-	if (epfd < 0) {
-		DRV_LOG(ERR, "failed to create epoll instance.");
-		return NULL;
-	}
-	internal->epfd = epfd;
-
-	vring.kickfd = -1;
-	for (qid = 0; qid < q_num; qid++) {
-		ev.events = EPOLLIN | EPOLLPRI;
-		rte_vhost_get_vhost_vring(internal->vid, qid, &vring);
-		ev.data.u64 = qid | (uint64_t)vring.kickfd << 32;
-		if (epoll_ctl(epfd, EPOLL_CTL_ADD, vring.kickfd, &ev) < 0) {
-			DRV_LOG(ERR, "epoll add error: %s", strerror(errno));
-			return NULL;
-		}
-	}
-
-	for (;;) {
-		nfds = epoll_wait(epfd, events, q_num, -1);
-		if (nfds < 0) {
-			if (errno == EINTR)
-				continue;
-			DRV_LOG(ERR, "epoll_wait return fail\n");
-			return NULL;
-		}
-
-		for (i = 0; i < nfds; i++) {
-			qid = events[i].data.u32;
-			kickfd = (uint32_t)(events[i].data.u64 >> 32);
-			do {
-				nbytes = read(kickfd, &buf, 8);
-				if (nbytes < 0) {
-					if (errno == EINTR ||
-					    errno == EWOULDBLOCK ||
-					    errno == EAGAIN)
-						continue;
-					DRV_LOG(INFO, "Error reading "
-						"kickfd: %s",
-						strerror(errno));
-				}
-				break;
-			} while (1);
-
-			ifcvf_notify_queue(hw, qid);
-		}
-	}
-
-	return NULL;
-}
-
-static int
-setup_notify_relay(struct ifcvf_internal *internal)
-{
-	int ret;
-
-	ret = pthread_create(&internal->tid, NULL, notify_relay,
-			(void *)internal);
-	if (ret) {
-		DRV_LOG(ERR, "failed to create notify relay pthread.");
-		return -1;
-	}
-	return 0;
-}
-
-static int
-unset_notify_relay(struct ifcvf_internal *internal)
-{
-	void *status;
-
-	if (internal->tid) {
-		pthread_cancel(internal->tid);
-		pthread_join(internal->tid, &status);
-	}
-	internal->tid = 0;
-
-	if (internal->epfd >= 0)
-		close(internal->epfd);
-	internal->epfd = -1;
-
-	return 0;
-}
-
-static int
-update_datapath(struct ifcvf_internal *internal)
-{
-	int ret;
-
-	rte_spinlock_lock(&internal->lock);
-
-	if (!rte_atomic32_read(&internal->running) &&
-	    (rte_atomic32_read(&internal->started) &&
-	     rte_atomic32_read(&internal->dev_attached))) {
-		ret = ifcvf_dma_map(internal, 1);
-		if (ret)
-			goto err;
-
-		ret = vdpa_enable_vfio_intr(internal, 0);
-		if (ret)
-			goto err;
-
-		ret = vdpa_ifcvf_start(internal);
-		if (ret)
-			goto err;
-
-		ret = setup_notify_relay(internal);
-		if (ret)
-			goto err;
-
-		rte_atomic32_set(&internal->running, 1);
-	} else if (rte_atomic32_read(&internal->running) &&
-		   (!rte_atomic32_read(&internal->started) ||
-		    !rte_atomic32_read(&internal->dev_attached))) {
-		ret = unset_notify_relay(internal);
-		if (ret)
-			goto err;
-
-		vdpa_ifcvf_stop(internal);
-
-		ret = vdpa_disable_vfio_intr(internal);
-		if (ret)
-			goto err;
-
-		ret = ifcvf_dma_map(internal, 0);
-		if (ret)
-			goto err;
-
-		rte_atomic32_set(&internal->running, 0);
-	}
-
-	rte_spinlock_unlock(&internal->lock);
-	return 0;
-err:
-	rte_spinlock_unlock(&internal->lock);
-	return ret;
-}
-
-static int
-m_ifcvf_start(struct ifcvf_internal *internal)
-{
-	struct ifcvf_hw *hw = &internal->hw;
-	uint32_t i, nr_vring;
-	int vid, ret;
-	struct rte_vhost_vring vq;
-	void *vring_buf;
-	uint64_t m_vring_iova = IFCVF_MEDIATED_VRING;
-	uint64_t size;
-	uint64_t gpa;
-
-	memset(&vq, 0, sizeof(vq));
-	vid = internal->vid;
-	nr_vring = rte_vhost_get_vring_num(vid);
-	rte_vhost_get_negotiated_features(vid, &hw->req_features);
-
-	for (i = 0; i < nr_vring; i++) {
-		rte_vhost_get_vhost_vring(vid, i, &vq);
-
-		size = RTE_ALIGN_CEIL(vring_size(vq.size, PAGE_SIZE),
-				PAGE_SIZE);
-		vring_buf = rte_zmalloc("ifcvf", size, PAGE_SIZE);
-		vring_init(&internal->m_vring[i], vq.size, vring_buf,
-				PAGE_SIZE);
-
-		ret = rte_vfio_container_dma_map(internal->vfio_container_fd,
-			(uint64_t)(uintptr_t)vring_buf, m_vring_iova, size);
-		if (ret < 0) {
-			DRV_LOG(ERR, "mediated vring DMA map failed.");
-			goto error;
-		}
-
-		gpa = hva_to_gpa(vid, (uint64_t)(uintptr_t)vq.desc);
-		if (gpa == 0) {
-			DRV_LOG(ERR, "Fail to get GPA for descriptor ring.");
-			return -1;
-		}
-		hw->vring[i].desc = gpa;
-
-		gpa = hva_to_gpa(vid, (uint64_t)(uintptr_t)vq.avail);
-		if (gpa == 0) {
-			DRV_LOG(ERR, "Fail to get GPA for available ring.");
-			return -1;
-		}
-		hw->vring[i].avail = gpa;
-
-		/* Direct I/O for Tx queue, relay for Rx queue */
-		if (i & 1) {
-			gpa = hva_to_gpa(vid, (uint64_t)(uintptr_t)vq.used);
-			if (gpa == 0) {
-				DRV_LOG(ERR, "Fail to get GPA for used ring.");
-				return -1;
-			}
-			hw->vring[i].used = gpa;
-		} else {
-			hw->vring[i].used = m_vring_iova +
-				(char *)internal->m_vring[i].used -
-				(char *)internal->m_vring[i].desc;
-		}
-
-		hw->vring[i].size = vq.size;
-
-		rte_vhost_get_vring_base(vid, i,
-				&internal->m_vring[i].avail->idx,
-				&internal->m_vring[i].used->idx);
-
-		rte_vhost_get_vring_base(vid, i, &hw->vring[i].last_avail_idx,
-				&hw->vring[i].last_used_idx);
-
-		m_vring_iova += size;
-	}
-	hw->nr_vring = nr_vring;
-
-	return ifcvf_start_hw(&internal->hw);
-
-error:
-	for (i = 0; i < nr_vring; i++)
-		if (internal->m_vring[i].desc)
-			rte_free(internal->m_vring[i].desc);
-
-	return -1;
-}
-
-static int
-m_ifcvf_stop(struct ifcvf_internal *internal)
-{
-	int vid;
-	uint32_t i;
-	struct rte_vhost_vring vq;
-	struct ifcvf_hw *hw = &internal->hw;
-	uint64_t m_vring_iova = IFCVF_MEDIATED_VRING;
-	uint64_t size, len;
-
-	vid = internal->vid;
-	ifcvf_stop_hw(hw);
-
-	for (i = 0; i < hw->nr_vring; i++) {
-		/* synchronize remaining new used entries if any */
-		if ((i & 1) == 0)
-			update_used_ring(internal, i);
-
-		rte_vhost_get_vhost_vring(vid, i, &vq);
-		len = IFCVF_USED_RING_LEN(vq.size);
-		rte_vhost_log_used_vring(vid, i, 0, len);
-
-		size = RTE_ALIGN_CEIL(vring_size(vq.size, PAGE_SIZE),
-				PAGE_SIZE);
-		rte_vfio_container_dma_unmap(internal->vfio_container_fd,
-			(uint64_t)(uintptr_t)internal->m_vring[i].desc,
-			m_vring_iova, size);
-
-		rte_vhost_set_vring_base(vid, i, hw->vring[i].last_avail_idx,
-				hw->vring[i].last_used_idx);
-		rte_free(internal->m_vring[i].desc);
-		m_vring_iova += size;
-	}
-
-	return 0;
-}
-
-static void
-update_used_ring(struct ifcvf_internal *internal, uint16_t qid)
-{
-	rte_vdpa_relay_vring_used(internal->vid, qid, &internal->m_vring[qid]);
-	rte_vhost_vring_call(internal->vid, qid);
-}
-
-static void *
-vring_relay(void *arg)
-{
-	int i, vid, epfd, fd, nfds;
-	struct ifcvf_internal *internal = (struct ifcvf_internal *)arg;
-	struct rte_vhost_vring vring;
-	uint16_t qid, q_num;
-	struct epoll_event events[IFCVF_MAX_QUEUES * 4];
-	struct epoll_event ev;
-	int nbytes;
-	uint64_t buf;
-
-	vid = internal->vid;
-	q_num = rte_vhost_get_vring_num(vid);
-
-	/* add notify fd and interrupt fd to epoll */
-	epfd = epoll_create(IFCVF_MAX_QUEUES * 2);
-	if (epfd < 0) {
-		DRV_LOG(ERR, "failed to create epoll instance.");
-		return NULL;
-	}
-	internal->epfd = epfd;
-
-	vring.kickfd = -1;
-	for (qid = 0; qid < q_num; qid++) {
-		ev.events = EPOLLIN | EPOLLPRI;
-		rte_vhost_get_vhost_vring(vid, qid, &vring);
-		ev.data.u64 = qid << 1 | (uint64_t)vring.kickfd << 32;
-		if (epoll_ctl(epfd, EPOLL_CTL_ADD, vring.kickfd, &ev) < 0) {
-			DRV_LOG(ERR, "epoll add error: %s", strerror(errno));
-			return NULL;
-		}
-	}
-
-	for (qid = 0; qid < q_num; qid += 2) {
-		ev.events = EPOLLIN | EPOLLPRI;
-		/* leave a flag to mark it's for interrupt */
-		ev.data.u64 = 1 | qid << 1 |
-			(uint64_t)internal->intr_fd[qid] << 32;
-		if (epoll_ctl(epfd, EPOLL_CTL_ADD, internal->intr_fd[qid], &ev)
-				< 0) {
-			DRV_LOG(ERR, "epoll add error: %s", strerror(errno));
-			return NULL;
-		}
-		update_used_ring(internal, qid);
-	}
-
-	/* start relay with a first kick */
-	for (qid = 0; qid < q_num; qid++)
-		ifcvf_notify_queue(&internal->hw, qid);
-
-	/* listen to the events and react accordingly */
-	for (;;) {
-		nfds = epoll_wait(epfd, events, q_num * 2, -1);
-		if (nfds < 0) {
-			if (errno == EINTR)
-				continue;
-			DRV_LOG(ERR, "epoll_wait return fail\n");
-			return NULL;
-		}
-
-		for (i = 0; i < nfds; i++) {
-			fd = (uint32_t)(events[i].data.u64 >> 32);
-			do {
-				nbytes = read(fd, &buf, 8);
-				if (nbytes < 0) {
-					if (errno == EINTR ||
-					    errno == EWOULDBLOCK ||
-					    errno == EAGAIN)
-						continue;
-					DRV_LOG(INFO, "Error reading "
-						"kickfd: %s",
-						strerror(errno));
-				}
-				break;
-			} while (1);
-
-			qid = events[i].data.u32 >> 1;
-
-			if (events[i].data.u32 & 1)
-				update_used_ring(internal, qid);
-			else
-				ifcvf_notify_queue(&internal->hw, qid);
-		}
-	}
-
-	return NULL;
-}
-
-static int
-setup_vring_relay(struct ifcvf_internal *internal)
-{
-	int ret;
-
-	ret = pthread_create(&internal->tid, NULL, vring_relay,
-			(void *)internal);
-	if (ret) {
-		DRV_LOG(ERR, "failed to create ring relay pthread.");
-		return -1;
-	}
-	return 0;
-}
-
-static int
-unset_vring_relay(struct ifcvf_internal *internal)
-{
-	void *status;
-
-	if (internal->tid) {
-		pthread_cancel(internal->tid);
-		pthread_join(internal->tid, &status);
-	}
-	internal->tid = 0;
-
-	if (internal->epfd >= 0)
-		close(internal->epfd);
-	internal->epfd = -1;
-
-	return 0;
-}
-
-static int
-ifcvf_sw_fallback_switchover(struct ifcvf_internal *internal)
-{
-	int ret;
-	int vid = internal->vid;
-
-	/* stop the direct IO data path */
-	unset_notify_relay(internal);
-	vdpa_ifcvf_stop(internal);
-	vdpa_disable_vfio_intr(internal);
-
-	ret = rte_vhost_host_notifier_ctrl(vid, false);
-	if (ret && ret != -ENOTSUP)
-		goto error;
-
-	/* set up interrupt for interrupt relay */
-	ret = vdpa_enable_vfio_intr(internal, 1);
-	if (ret)
-		goto unmap;
-
-	/* config the VF */
-	ret = m_ifcvf_start(internal);
-	if (ret)
-		goto unset_intr;
-
-	/* set up vring relay thread */
-	ret = setup_vring_relay(internal);
-	if (ret)
-		goto stop_vf;
-
-	rte_vhost_host_notifier_ctrl(vid, true);
-
-	internal->sw_fallback_running = true;
-
-	return 0;
-
-stop_vf:
-	m_ifcvf_stop(internal);
-unset_intr:
-	vdpa_disable_vfio_intr(internal);
-unmap:
-	ifcvf_dma_map(internal, 0);
-error:
-	return -1;
-}
-
-static int
-ifcvf_dev_config(int vid)
-{
-	int did;
-	struct internal_list *list;
-	struct ifcvf_internal *internal;
-
-	did = rte_vhost_get_vdpa_device_id(vid);
-	list = find_internal_resource_by_did(did);
-	if (list == NULL) {
-		DRV_LOG(ERR, "Invalid device id: %d", did);
-		return -1;
-	}
-
-	internal = list->internal;
-	internal->vid = vid;
-	rte_atomic32_set(&internal->dev_attached, 1);
-	update_datapath(internal);
-
-	if (rte_vhost_host_notifier_ctrl(vid, true) != 0)
-		DRV_LOG(NOTICE, "vDPA (%d): software relay is used.", did);
-
-	return 0;
-}
-
-static int
-ifcvf_dev_close(int vid)
-{
-	int did;
-	struct internal_list *list;
-	struct ifcvf_internal *internal;
-
-	did = rte_vhost_get_vdpa_device_id(vid);
-	list = find_internal_resource_by_did(did);
-	if (list == NULL) {
-		DRV_LOG(ERR, "Invalid device id: %d", did);
-		return -1;
-	}
-
-	internal = list->internal;
-
-	if (internal->sw_fallback_running) {
-		/* unset ring relay */
-		unset_vring_relay(internal);
-
-		/* reset VF */
-		m_ifcvf_stop(internal);
-
-		/* remove interrupt setting */
-		vdpa_disable_vfio_intr(internal);
-
-		/* unset DMA map for guest memory */
-		ifcvf_dma_map(internal, 0);
-
-		internal->sw_fallback_running = false;
-	} else {
-		rte_atomic32_set(&internal->dev_attached, 0);
-		update_datapath(internal);
-	}
-
-	return 0;
-}
-
-static int
-ifcvf_set_features(int vid)
-{
-	uint64_t features = 0;
-	int did;
-	struct internal_list *list;
-	struct ifcvf_internal *internal;
-	uint64_t log_base = 0, log_size = 0;
-
-	did = rte_vhost_get_vdpa_device_id(vid);
-	list = find_internal_resource_by_did(did);
-	if (list == NULL) {
-		DRV_LOG(ERR, "Invalid device id: %d", did);
-		return -1;
-	}
-
-	internal = list->internal;
-	rte_vhost_get_negotiated_features(vid, &features);
-
-	if (!RTE_VHOST_NEED_LOG(features))
-		return 0;
-
-	if (internal->sw_lm) {
-		ifcvf_sw_fallback_switchover(internal);
-	} else {
-		rte_vhost_get_log_base(vid, &log_base, &log_size);
-		rte_vfio_container_dma_map(internal->vfio_container_fd,
-				log_base, IFCVF_LOG_BASE, log_size);
-		ifcvf_enable_logging(&internal->hw, IFCVF_LOG_BASE, log_size);
-	}
-
-	return 0;
-}
-
-static int
-ifcvf_get_vfio_group_fd(int vid)
-{
-	int did;
-	struct internal_list *list;
-
-	did = rte_vhost_get_vdpa_device_id(vid);
-	list = find_internal_resource_by_did(did);
-	if (list == NULL) {
-		DRV_LOG(ERR, "Invalid device id: %d", did);
-		return -1;
-	}
-
-	return list->internal->vfio_group_fd;
-}
-
-static int
-ifcvf_get_vfio_device_fd(int vid)
-{
-	int did;
-	struct internal_list *list;
-
-	did = rte_vhost_get_vdpa_device_id(vid);
-	list = find_internal_resource_by_did(did);
-	if (list == NULL) {
-		DRV_LOG(ERR, "Invalid device id: %d", did);
-		return -1;
-	}
-
-	return list->internal->vfio_dev_fd;
-}
-
-static int
-ifcvf_get_notify_area(int vid, int qid, uint64_t *offset, uint64_t *size)
-{
-	int did;
-	struct internal_list *list;
-	struct ifcvf_internal *internal;
-	struct vfio_region_info reg = { .argsz = sizeof(reg) };
-	int ret;
-
-	did = rte_vhost_get_vdpa_device_id(vid);
-	list = find_internal_resource_by_did(did);
-	if (list == NULL) {
-		DRV_LOG(ERR, "Invalid device id: %d", did);
-		return -1;
-	}
-
-	internal = list->internal;
-
-	reg.index = ifcvf_get_notify_region(&internal->hw);
-	ret = ioctl(internal->vfio_dev_fd, VFIO_DEVICE_GET_REGION_INFO, &reg);
-	if (ret) {
-		DRV_LOG(ERR, "Get not get device region info: %s",
-				strerror(errno));
-		return -1;
-	}
-
-	*offset = ifcvf_get_queue_notify_off(&internal->hw, qid) + reg.offset;
-	*size = 0x1000;
-
-	return 0;
-}
-
-static int
-ifcvf_get_queue_num(int did, uint32_t *queue_num)
-{
-	struct internal_list *list;
-
-	list = find_internal_resource_by_did(did);
-	if (list == NULL) {
-		DRV_LOG(ERR, "Invalid device id: %d", did);
-		return -1;
-	}
-
-	*queue_num = list->internal->max_queues;
-
-	return 0;
-}
-
-static int
-ifcvf_get_vdpa_features(int did, uint64_t *features)
-{
-	struct internal_list *list;
-
-	list = find_internal_resource_by_did(did);
-	if (list == NULL) {
-		DRV_LOG(ERR, "Invalid device id: %d", did);
-		return -1;
-	}
-
-	*features = list->internal->features;
-
-	return 0;
-}
-
-#define VDPA_SUPPORTED_PROTOCOL_FEATURES \
-		(1ULL << VHOST_USER_PROTOCOL_F_REPLY_ACK | \
-		 1ULL << VHOST_USER_PROTOCOL_F_SLAVE_REQ | \
-		 1ULL << VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD | \
-		 1ULL << VHOST_USER_PROTOCOL_F_HOST_NOTIFIER | \
-		 1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD)
-static int
-ifcvf_get_protocol_features(int did __rte_unused, uint64_t *features)
-{
-	*features = VDPA_SUPPORTED_PROTOCOL_FEATURES;
-	return 0;
-}
-
-static struct rte_vdpa_dev_ops ifcvf_ops = {
-	.get_queue_num = ifcvf_get_queue_num,
-	.get_features = ifcvf_get_vdpa_features,
-	.get_protocol_features = ifcvf_get_protocol_features,
-	.dev_conf = ifcvf_dev_config,
-	.dev_close = ifcvf_dev_close,
-	.set_vring_state = NULL,
-	.set_features = ifcvf_set_features,
-	.migration_done = NULL,
-	.get_vfio_group_fd = ifcvf_get_vfio_group_fd,
-	.get_vfio_device_fd = ifcvf_get_vfio_device_fd,
-	.get_notify_area = ifcvf_get_notify_area,
-};
-
-static inline int
-open_int(const char *key __rte_unused, const char *value, void *extra_args)
-{
-	uint16_t *n = extra_args;
-
-	if (value == NULL || extra_args == NULL)
-		return -EINVAL;
-
-	*n = (uint16_t)strtoul(value, NULL, 0);
-	if (*n == USHRT_MAX && errno == ERANGE)
-		return -1;
-
-	return 0;
-}
-
-static int
-ifcvf_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
-		struct rte_pci_device *pci_dev)
-{
-	uint64_t features;
-	struct ifcvf_internal *internal = NULL;
-	struct internal_list *list = NULL;
-	int vdpa_mode = 0;
-	int sw_fallback_lm = 0;
-	struct rte_kvargs *kvlist = NULL;
-	int ret = 0;
-
-	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
-		return 0;
-
-	if (!pci_dev->device.devargs)
-		return 1;
-
-	kvlist = rte_kvargs_parse(pci_dev->device.devargs->args,
-			ifcvf_valid_arguments);
-	if (kvlist == NULL)
-		return 1;
-
-	/* probe only when vdpa mode is specified */
-	if (rte_kvargs_count(kvlist, IFCVF_VDPA_MODE) == 0) {
-		rte_kvargs_free(kvlist);
-		return 1;
-	}
-
-	ret = rte_kvargs_process(kvlist, IFCVF_VDPA_MODE, &open_int,
-			&vdpa_mode);
-	if (ret < 0 || vdpa_mode == 0) {
-		rte_kvargs_free(kvlist);
-		return 1;
-	}
-
-	list = rte_zmalloc("ifcvf", sizeof(*list), 0);
-	if (list == NULL)
-		goto error;
-
-	internal = rte_zmalloc("ifcvf", sizeof(*internal), 0);
-	if (internal == NULL)
-		goto error;
-
-	internal->pdev = pci_dev;
-	rte_spinlock_init(&internal->lock);
-
-	if (ifcvf_vfio_setup(internal) < 0) {
-		DRV_LOG(ERR, "failed to setup device %s", pci_dev->name);
-		goto error;
-	}
-
-	if (ifcvf_init_hw(&internal->hw, internal->pdev) < 0) {
-		DRV_LOG(ERR, "failed to init device %s", pci_dev->name);
-		goto error;
-	}
-
-	internal->max_queues = IFCVF_MAX_QUEUES;
-	features = ifcvf_get_features(&internal->hw);
-	internal->features = (features &
-		~(1ULL << VIRTIO_F_IOMMU_PLATFORM)) |
-		(1ULL << VIRTIO_NET_F_GUEST_ANNOUNCE) |
-		(1ULL << VIRTIO_NET_F_CTRL_VQ) |
-		(1ULL << VIRTIO_NET_F_STATUS) |
-		(1ULL << VHOST_USER_F_PROTOCOL_FEATURES) |
-		(1ULL << VHOST_F_LOG_ALL);
-
-	internal->dev_addr.pci_addr = pci_dev->addr;
-	internal->dev_addr.type = PCI_ADDR;
-	list->internal = internal;
-
-	if (rte_kvargs_count(kvlist, IFCVF_SW_FALLBACK_LM)) {
-		ret = rte_kvargs_process(kvlist, IFCVF_SW_FALLBACK_LM,
-				&open_int, &sw_fallback_lm);
-		if (ret < 0)
-			goto error;
-	}
-	internal->sw_lm = sw_fallback_lm;
-
-	internal->did = rte_vdpa_register_device(&internal->dev_addr,
-				&ifcvf_ops);
-	if (internal->did < 0) {
-		DRV_LOG(ERR, "failed to register device %s", pci_dev->name);
-		goto error;
-	}
-
-	pthread_mutex_lock(&internal_list_lock);
-	TAILQ_INSERT_TAIL(&internal_list, list, next);
-	pthread_mutex_unlock(&internal_list_lock);
-
-	rte_atomic32_set(&internal->started, 1);
-	update_datapath(internal);
-
-	rte_kvargs_free(kvlist);
-	return 0;
-
-error:
-	rte_kvargs_free(kvlist);
-	rte_free(list);
-	rte_free(internal);
-	return -1;
-}
-
-static int
-ifcvf_pci_remove(struct rte_pci_device *pci_dev)
-{
-	struct ifcvf_internal *internal;
-	struct internal_list *list;
-
-	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
-		return 0;
-
-	list = find_internal_resource_by_dev(pci_dev);
-	if (list == NULL) {
-		DRV_LOG(ERR, "Invalid device: %s", pci_dev->name);
-		return -1;
-	}
-
-	internal = list->internal;
-	rte_atomic32_set(&internal->started, 0);
-	update_datapath(internal);
-
-	rte_pci_unmap_device(internal->pdev);
-	rte_vfio_container_destroy(internal->vfio_container_fd);
-	rte_vdpa_unregister_device(internal->did);
-
-	pthread_mutex_lock(&internal_list_lock);
-	TAILQ_REMOVE(&internal_list, list, next);
-	pthread_mutex_unlock(&internal_list_lock);
-
-	rte_free(list);
-	rte_free(internal);
-
-	return 0;
-}
-
-/*
- * IFCVF has the same vendor ID and device ID as virtio net PCI
- * device, with its specific subsystem vendor ID and device ID.
- */
-static const struct rte_pci_id pci_id_ifcvf_map[] = {
-	{ .class_id = RTE_CLASS_ANY_ID,
-	  .vendor_id = IFCVF_VENDOR_ID,
-	  .device_id = IFCVF_DEVICE_ID,
-	  .subsystem_vendor_id = IFCVF_SUBSYS_VENDOR_ID,
-	  .subsystem_device_id = IFCVF_SUBSYS_DEVICE_ID,
-	},
-
-	{ .vendor_id = 0, /* sentinel */
-	},
-};
-
-static struct rte_pci_driver rte_ifcvf_vdpa = {
-	.id_table = pci_id_ifcvf_map,
-	.drv_flags = 0,
-	.probe = ifcvf_pci_probe,
-	.remove = ifcvf_pci_remove,
-};
-
-RTE_PMD_REGISTER_PCI(net_ifcvf, rte_ifcvf_vdpa);
-RTE_PMD_REGISTER_PCI_TABLE(net_ifcvf, pci_id_ifcvf_map);
-RTE_PMD_REGISTER_KMOD_DEP(net_ifcvf, "* vfio-pci");
-
-RTE_INIT(ifcvf_vdpa_init_log)
-{
-	ifcvf_vdpa_logtype = rte_log_register("pmd.net.ifcvf_vdpa");
-	if (ifcvf_vdpa_logtype >= 0)
-		rte_log_set_level(ifcvf_vdpa_logtype, RTE_LOG_NOTICE);
-}
diff --git a/drivers/net/ifc/meson.build b/drivers/net/ifc/meson.build
deleted file mode 100644
index adc9ed9..0000000
--- a/drivers/net/ifc/meson.build
+++ /dev/null
@@ -1,9 +0,0 @@
-# SPDX-License-Identifier: BSD-3-Clause
-# Copyright(c) 2018 Intel Corporation
-
-build = dpdk_conf.has('RTE_LIBRTE_VHOST')
-reason = 'missing dependency, DPDK vhost library'
-allow_experimental_apis = true
-sources = files('ifcvf_vdpa.c', 'base/ifcvf.c')
-includes += include_directories('base')
-deps += 'vhost'
diff --git a/drivers/net/ifc/rte_pmd_ifc_version.map b/drivers/net/ifc/rte_pmd_ifc_version.map
deleted file mode 100644
index f9f17e4..0000000
--- a/drivers/net/ifc/rte_pmd_ifc_version.map
+++ /dev/null
@@ -1,3 +0,0 @@
-DPDK_20.0 {
-	local: *;
-};
diff --git a/drivers/net/meson.build b/drivers/net/meson.build
index c300afb..b0ea8fe 100644
--- a/drivers/net/meson.build
+++ b/drivers/net/meson.build
@@ -21,7 +21,6 @@ drivers = ['af_packet',
 	'hns3',
 	'iavf',
 	'ice',
-	'ifc',
 	'ipn3ke',
 	'ixgbe',
 	'kni',
diff --git a/drivers/vdpa/Makefile b/drivers/vdpa/Makefile
index 82a2b70..27fec96 100644
--- a/drivers/vdpa/Makefile
+++ b/drivers/vdpa/Makefile
@@ -5,4 +5,10 @@ include $(RTE_SDK)/mk/rte.vars.mk
 
 # DIRS-$(<configuration>) += <directory>
 
+ifeq ($(CONFIG_RTE_LIBRTE_VHOST),y)
+ifeq ($(CONFIG_RTE_EAL_VFIO),y)
+DIRS-$(CONFIG_RTE_LIBRTE_IFC_PMD) += ifc
+endif
+endif # $(CONFIG_RTE_LIBRTE_VHOST)
+
 include $(RTE_SDK)/mk/rte.subdir.mk
diff --git a/drivers/vdpa/ifc/Makefile b/drivers/vdpa/ifc/Makefile
new file mode 100644
index 0000000..fe227b8
--- /dev/null
+++ b/drivers/vdpa/ifc/Makefile
@@ -0,0 +1,34 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2018 Intel Corporation
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_pmd_ifc.a
+
+LDLIBS += -lpthread
+LDLIBS += -lrte_eal -lrte_pci -lrte_vhost -lrte_bus_pci
+LDLIBS += -lrte_kvargs
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
+#
+# Add extra flags for base driver source files to disable warnings in them
+#
+BASE_DRIVER_OBJS=$(sort $(patsubst %.c,%.o,$(notdir $(wildcard $(SRCDIR)/base/*.c))))
+
+VPATH += $(SRCDIR)/base
+
+EXPORT_MAP := rte_pmd_ifc_version.map
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-$(CONFIG_RTE_LIBRTE_IFC_PMD) += ifcvf_vdpa.c
+SRCS-$(CONFIG_RTE_LIBRTE_IFC_PMD) += ifcvf.c
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/vdpa/ifc/base/ifcvf.c b/drivers/vdpa/ifc/base/ifcvf.c
new file mode 100644
index 0000000..3c0b2df
--- /dev/null
+++ b/drivers/vdpa/ifc/base/ifcvf.c
@@ -0,0 +1,329 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#include "ifcvf.h"
+#include "ifcvf_osdep.h"
+
+STATIC void *
+get_cap_addr(struct ifcvf_hw *hw, struct ifcvf_pci_cap *cap)
+{
+	u8 bar = cap->bar;
+	u32 length = cap->length;
+	u32 offset = cap->offset;
+
+	if (bar > IFCVF_PCI_MAX_RESOURCE - 1) {
+		DEBUGOUT("invalid bar: %u\n", bar);
+		return NULL;
+	}
+
+	if (offset + length < offset) {
+		DEBUGOUT("offset(%u) + length(%u) overflows\n",
+			offset, length);
+		return NULL;
+	}
+
+	if (offset + length > hw->mem_resource[cap->bar].len) {
+		DEBUGOUT("offset(%u) + length(%u) overflows bar length(%u)",
+			offset, length, (u32)hw->mem_resource[cap->bar].len);
+		return NULL;
+	}
+
+	return hw->mem_resource[bar].addr + offset;
+}
+
+int
+ifcvf_init_hw(struct ifcvf_hw *hw, PCI_DEV *dev)
+{
+	int ret;
+	u8 pos;
+	struct ifcvf_pci_cap cap;
+
+	ret = PCI_READ_CONFIG_BYTE(dev, &pos, PCI_CAPABILITY_LIST);
+	if (ret < 0) {
+		DEBUGOUT("failed to read pci capability list\n");
+		return -1;
+	}
+
+	while (pos) {
+		ret = PCI_READ_CONFIG_RANGE(dev, (u32 *)&cap,
+				sizeof(cap), pos);
+		if (ret < 0) {
+			DEBUGOUT("failed to read cap at pos: %x", pos);
+			break;
+		}
+
+		if (cap.cap_vndr != PCI_CAP_ID_VNDR)
+			goto next;
+
+		DEBUGOUT("cfg type: %u, bar: %u, offset: %u, "
+				"len: %u\n", cap.cfg_type, cap.bar,
+				cap.offset, cap.length);
+
+		switch (cap.cfg_type) {
+		case IFCVF_PCI_CAP_COMMON_CFG:
+			hw->common_cfg = get_cap_addr(hw, &cap);
+			break;
+		case IFCVF_PCI_CAP_NOTIFY_CFG:
+			PCI_READ_CONFIG_DWORD(dev, &hw->notify_off_multiplier,
+					pos + sizeof(cap));
+			hw->notify_base = get_cap_addr(hw, &cap);
+			hw->notify_region = cap.bar;
+			break;
+		case IFCVF_PCI_CAP_ISR_CFG:
+			hw->isr = get_cap_addr(hw, &cap);
+			break;
+		case IFCVF_PCI_CAP_DEVICE_CFG:
+			hw->dev_cfg = get_cap_addr(hw, &cap);
+			break;
+		}
+next:
+		pos = cap.cap_next;
+	}
+
+	hw->lm_cfg = hw->mem_resource[4].addr;
+
+	if (hw->common_cfg == NULL || hw->notify_base == NULL ||
+			hw->isr == NULL || hw->dev_cfg == NULL) {
+		DEBUGOUT("capability incomplete\n");
+		return -1;
+	}
+
+	DEBUGOUT("capability mapping:\ncommon cfg: %p\n"
+			"notify base: %p\nisr cfg: %p\ndevice cfg: %p\n"
+			"multiplier: %u\n",
+			hw->common_cfg, hw->dev_cfg,
+			hw->isr, hw->notify_base,
+			hw->notify_off_multiplier);
+
+	return 0;
+}
+
+STATIC u8
+ifcvf_get_status(struct ifcvf_hw *hw)
+{
+	return IFCVF_READ_REG8(&hw->common_cfg->device_status);
+}
+
+STATIC void
+ifcvf_set_status(struct ifcvf_hw *hw, u8 status)
+{
+	IFCVF_WRITE_REG8(status, &hw->common_cfg->device_status);
+}
+
+STATIC void
+ifcvf_reset(struct ifcvf_hw *hw)
+{
+	ifcvf_set_status(hw, 0);
+
+	/* flush status write */
+	while (ifcvf_get_status(hw))
+		msec_delay(1);
+}
+
+STATIC void
+ifcvf_add_status(struct ifcvf_hw *hw, u8 status)
+{
+	if (status != 0)
+		status |= ifcvf_get_status(hw);
+
+	ifcvf_set_status(hw, status);
+	ifcvf_get_status(hw);
+}
+
+u64
+ifcvf_get_features(struct ifcvf_hw *hw)
+{
+	u32 features_lo, features_hi;
+	struct ifcvf_pci_common_cfg *cfg = hw->common_cfg;
+
+	IFCVF_WRITE_REG32(0, &cfg->device_feature_select);
+	features_lo = IFCVF_READ_REG32(&cfg->device_feature);
+
+	IFCVF_WRITE_REG32(1, &cfg->device_feature_select);
+	features_hi = IFCVF_READ_REG32(&cfg->device_feature);
+
+	return ((u64)features_hi << 32) | features_lo;
+}
+
+STATIC void
+ifcvf_set_features(struct ifcvf_hw *hw, u64 features)
+{
+	struct ifcvf_pci_common_cfg *cfg = hw->common_cfg;
+
+	IFCVF_WRITE_REG32(0, &cfg->guest_feature_select);
+	IFCVF_WRITE_REG32(features & ((1ULL << 32) - 1), &cfg->guest_feature);
+
+	IFCVF_WRITE_REG32(1, &cfg->guest_feature_select);
+	IFCVF_WRITE_REG32(features >> 32, &cfg->guest_feature);
+}
+
+STATIC int
+ifcvf_config_features(struct ifcvf_hw *hw)
+{
+	u64 host_features;
+
+	host_features = ifcvf_get_features(hw);
+	hw->req_features &= host_features;
+
+	ifcvf_set_features(hw, hw->req_features);
+	ifcvf_add_status(hw, IFCVF_CONFIG_STATUS_FEATURES_OK);
+
+	if (!(ifcvf_get_status(hw) & IFCVF_CONFIG_STATUS_FEATURES_OK)) {
+		DEBUGOUT("failed to set FEATURES_OK status\n");
+		return -1;
+	}
+
+	return 0;
+}
+
+STATIC void
+io_write64_twopart(u64 val, u32 *lo, u32 *hi)
+{
+	IFCVF_WRITE_REG32(val & ((1ULL << 32) - 1), lo);
+	IFCVF_WRITE_REG32(val >> 32, hi);
+}
+
+STATIC int
+ifcvf_hw_enable(struct ifcvf_hw *hw)
+{
+	struct ifcvf_pci_common_cfg *cfg;
+	u8 *lm_cfg;
+	u32 i;
+	u16 notify_off;
+
+	cfg = hw->common_cfg;
+	lm_cfg = hw->lm_cfg;
+
+	IFCVF_WRITE_REG16(0, &cfg->msix_config);
+	if (IFCVF_READ_REG16(&cfg->msix_config) == IFCVF_MSI_NO_VECTOR) {
+		DEBUGOUT("msix vec alloc failed for device config\n");
+		return -1;
+	}
+
+	for (i = 0; i < hw->nr_vring; i++) {
+		IFCVF_WRITE_REG16(i, &cfg->queue_select);
+		io_write64_twopart(hw->vring[i].desc, &cfg->queue_desc_lo,
+				&cfg->queue_desc_hi);
+		io_write64_twopart(hw->vring[i].avail, &cfg->queue_avail_lo,
+				&cfg->queue_avail_hi);
+		io_write64_twopart(hw->vring[i].used, &cfg->queue_used_lo,
+				&cfg->queue_used_hi);
+		IFCVF_WRITE_REG16(hw->vring[i].size, &cfg->queue_size);
+
+		*(u32 *)(lm_cfg + IFCVF_LM_RING_STATE_OFFSET +
+				(i / 2) * IFCVF_LM_CFG_SIZE + (i % 2) * 4) =
+			(u32)hw->vring[i].last_avail_idx |
+			((u32)hw->vring[i].last_used_idx << 16);
+
+		IFCVF_WRITE_REG16(i + 1, &cfg->queue_msix_vector);
+		if (IFCVF_READ_REG16(&cfg->queue_msix_vector) ==
+				IFCVF_MSI_NO_VECTOR) {
+			DEBUGOUT("queue %u, msix vec alloc failed\n",
+					i);
+			return -1;
+		}
+
+		notify_off = IFCVF_READ_REG16(&cfg->queue_notify_off);
+		hw->notify_addr[i] = (void *)((u8 *)hw->notify_base +
+				notify_off * hw->notify_off_multiplier);
+		IFCVF_WRITE_REG16(1, &cfg->queue_enable);
+	}
+
+	return 0;
+}
+
+STATIC void
+ifcvf_hw_disable(struct ifcvf_hw *hw)
+{
+	u32 i;
+	struct ifcvf_pci_common_cfg *cfg;
+	u32 ring_state;
+
+	cfg = hw->common_cfg;
+
+	IFCVF_WRITE_REG16(IFCVF_MSI_NO_VECTOR, &cfg->msix_config);
+	for (i = 0; i < hw->nr_vring; i++) {
+		IFCVF_WRITE_REG16(i, &cfg->queue_select);
+		IFCVF_WRITE_REG16(0, &cfg->queue_enable);
+		IFCVF_WRITE_REG16(IFCVF_MSI_NO_VECTOR, &cfg->queue_msix_vector);
+		ring_state = *(u32 *)(hw->lm_cfg + IFCVF_LM_RING_STATE_OFFSET +
+				(i / 2) * IFCVF_LM_CFG_SIZE + (i % 2) * 4);
+		hw->vring[i].last_avail_idx = (u16)(ring_state >> 16);
+		hw->vring[i].last_used_idx = (u16)(ring_state >> 16);
+	}
+}
+
+int
+ifcvf_start_hw(struct ifcvf_hw *hw)
+{
+	ifcvf_reset(hw);
+	ifcvf_add_status(hw, IFCVF_CONFIG_STATUS_ACK);
+	ifcvf_add_status(hw, IFCVF_CONFIG_STATUS_DRIVER);
+
+	if (ifcvf_config_features(hw) < 0)
+		return -1;
+
+	if (ifcvf_hw_enable(hw) < 0)
+		return -1;
+
+	ifcvf_add_status(hw, IFCVF_CONFIG_STATUS_DRIVER_OK);
+	return 0;
+}
+
+void
+ifcvf_stop_hw(struct ifcvf_hw *hw)
+{
+	ifcvf_hw_disable(hw);
+	ifcvf_reset(hw);
+}
+
+void
+ifcvf_enable_logging(struct ifcvf_hw *hw, u64 log_base, u64 log_size)
+{
+	u8 *lm_cfg;
+
+	lm_cfg = hw->lm_cfg;
+
+	*(u32 *)(lm_cfg + IFCVF_LM_BASE_ADDR_LOW) =
+		log_base & IFCVF_32_BIT_MASK;
+
+	*(u32 *)(lm_cfg + IFCVF_LM_BASE_ADDR_HIGH) =
+		(log_base >> 32) & IFCVF_32_BIT_MASK;
+
+	*(u32 *)(lm_cfg + IFCVF_LM_END_ADDR_LOW) =
+		(log_base + log_size) & IFCVF_32_BIT_MASK;
+
+	*(u32 *)(lm_cfg + IFCVF_LM_END_ADDR_HIGH) =
+		((log_base + log_size) >> 32) & IFCVF_32_BIT_MASK;
+
+	*(u32 *)(lm_cfg + IFCVF_LM_LOGGING_CTRL) = IFCVF_LM_ENABLE_VF;
+}
+
+void
+ifcvf_disable_logging(struct ifcvf_hw *hw)
+{
+	u8 *lm_cfg;
+
+	lm_cfg = hw->lm_cfg;
+	*(u32 *)(lm_cfg + IFCVF_LM_LOGGING_CTRL) = IFCVF_LM_DISABLE;
+}
+
+void
+ifcvf_notify_queue(struct ifcvf_hw *hw, u16 qid)
+{
+	IFCVF_WRITE_REG16(qid, hw->notify_addr[qid]);
+}
+
+u8
+ifcvf_get_notify_region(struct ifcvf_hw *hw)
+{
+	return hw->notify_region;
+}
+
+u64
+ifcvf_get_queue_notify_off(struct ifcvf_hw *hw, int qid)
+{
+	return (u8 *)hw->notify_addr[qid] -
+		(u8 *)hw->mem_resource[hw->notify_region].addr;
+}
diff --git a/drivers/vdpa/ifc/base/ifcvf.h b/drivers/vdpa/ifc/base/ifcvf.h
new file mode 100644
index 0000000..9be2770
--- /dev/null
+++ b/drivers/vdpa/ifc/base/ifcvf.h
@@ -0,0 +1,162 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#ifndef _IFCVF_H_
+#define _IFCVF_H_
+
+#include "ifcvf_osdep.h"
+
+#define IFCVF_VENDOR_ID		0x1AF4
+#define IFCVF_DEVICE_ID		0x1041
+#define IFCVF_SUBSYS_VENDOR_ID	0x8086
+#define IFCVF_SUBSYS_DEVICE_ID	0x001A
+
+#define IFCVF_MAX_QUEUES		1
+#define VIRTIO_F_IOMMU_PLATFORM		33
+
+/* Common configuration */
+#define IFCVF_PCI_CAP_COMMON_CFG	1
+/* Notifications */
+#define IFCVF_PCI_CAP_NOTIFY_CFG	2
+/* ISR Status */
+#define IFCVF_PCI_CAP_ISR_CFG		3
+/* Device specific configuration */
+#define IFCVF_PCI_CAP_DEVICE_CFG	4
+/* PCI configuration access */
+#define IFCVF_PCI_CAP_PCI_CFG		5
+
+#define IFCVF_CONFIG_STATUS_RESET     0x00
+#define IFCVF_CONFIG_STATUS_ACK       0x01
+#define IFCVF_CONFIG_STATUS_DRIVER    0x02
+#define IFCVF_CONFIG_STATUS_DRIVER_OK 0x04
+#define IFCVF_CONFIG_STATUS_FEATURES_OK 0x08
+#define IFCVF_CONFIG_STATUS_FAILED    0x80
+
+#define IFCVF_MSI_NO_VECTOR	0xffff
+#define IFCVF_PCI_MAX_RESOURCE	6
+
+#define IFCVF_LM_CFG_SIZE		0x40
+#define IFCVF_LM_RING_STATE_OFFSET	0x20
+
+#define IFCVF_LM_LOGGING_CTRL		0x0
+
+#define IFCVF_LM_BASE_ADDR_LOW		0x10
+#define IFCVF_LM_BASE_ADDR_HIGH		0x14
+#define IFCVF_LM_END_ADDR_LOW		0x18
+#define IFCVF_LM_END_ADDR_HIGH		0x1c
+
+#define IFCVF_LM_DISABLE		0x0
+#define IFCVF_LM_ENABLE_VF		0x1
+#define IFCVF_LM_ENABLE_PF		0x3
+#define IFCVF_LOG_BASE			0x100000000000
+#define IFCVF_MEDIATED_VRING		0x200000000000
+
+#define IFCVF_32_BIT_MASK		0xffffffff
+
+
+struct ifcvf_pci_cap {
+	u8 cap_vndr;            /* Generic PCI field: PCI_CAP_ID_VNDR */
+	u8 cap_next;            /* Generic PCI field: next ptr. */
+	u8 cap_len;             /* Generic PCI field: capability length */
+	u8 cfg_type;            /* Identifies the structure. */
+	u8 bar;                 /* Where to find it. */
+	u8 padding[3];          /* Pad to full dword. */
+	u32 offset;             /* Offset within bar. */
+	u32 length;             /* Length of the structure, in bytes. */
+};
+
+struct ifcvf_pci_notify_cap {
+	struct ifcvf_pci_cap cap;
+	u32 notify_off_multiplier;  /* Multiplier for queue_notify_off. */
+};
+
+struct ifcvf_pci_common_cfg {
+	/* About the whole device. */
+	u32 device_feature_select;
+	u32 device_feature;
+	u32 guest_feature_select;
+	u32 guest_feature;
+	u16 msix_config;
+	u16 num_queues;
+	u8 device_status;
+	u8 config_generation;
+
+	/* About a specific virtqueue. */
+	u16 queue_select;
+	u16 queue_size;
+	u16 queue_msix_vector;
+	u16 queue_enable;
+	u16 queue_notify_off;
+	u32 queue_desc_lo;
+	u32 queue_desc_hi;
+	u32 queue_avail_lo;
+	u32 queue_avail_hi;
+	u32 queue_used_lo;
+	u32 queue_used_hi;
+};
+
+struct ifcvf_net_config {
+	u8    mac[6];
+	u16   status;
+	u16   max_virtqueue_pairs;
+} __attribute__((packed));
+
+struct ifcvf_pci_mem_resource {
+	u64      phys_addr; /**< Physical address, 0 if not resource. */
+	u64      len;       /**< Length of the resource. */
+	u8       *addr;     /**< Virtual address, NULL when not mapped. */
+};
+
+struct vring_info {
+	u64 desc;
+	u64 avail;
+	u64 used;
+	u16 size;
+	u16 last_avail_idx;
+	u16 last_used_idx;
+};
+
+struct ifcvf_hw {
+	u64    req_features;
+	u8     notify_region;
+	u32    notify_off_multiplier;
+	struct ifcvf_pci_common_cfg *common_cfg;
+	struct ifcvf_net_config *dev_cfg;
+	u8     *isr;
+	u16    *notify_base;
+	u16    *notify_addr[IFCVF_MAX_QUEUES * 2];
+	u8     *lm_cfg;
+	struct vring_info vring[IFCVF_MAX_QUEUES * 2];
+	u8 nr_vring;
+	struct ifcvf_pci_mem_resource mem_resource[IFCVF_PCI_MAX_RESOURCE];
+};
+
+int
+ifcvf_init_hw(struct ifcvf_hw *hw, PCI_DEV *dev);
+
+u64
+ifcvf_get_features(struct ifcvf_hw *hw);
+
+int
+ifcvf_start_hw(struct ifcvf_hw *hw);
+
+void
+ifcvf_stop_hw(struct ifcvf_hw *hw);
+
+void
+ifcvf_enable_logging(struct ifcvf_hw *hw, u64 log_base, u64 log_size);
+
+void
+ifcvf_disable_logging(struct ifcvf_hw *hw);
+
+void
+ifcvf_notify_queue(struct ifcvf_hw *hw, u16 qid);
+
+u8
+ifcvf_get_notify_region(struct ifcvf_hw *hw);
+
+u64
+ifcvf_get_queue_notify_off(struct ifcvf_hw *hw, int qid);
+
+#endif /* _IFCVF_H_ */
diff --git a/drivers/vdpa/ifc/base/ifcvf_osdep.h b/drivers/vdpa/ifc/base/ifcvf_osdep.h
new file mode 100644
index 0000000..6aef25e
--- /dev/null
+++ b/drivers/vdpa/ifc/base/ifcvf_osdep.h
@@ -0,0 +1,52 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#ifndef _IFCVF_OSDEP_H_
+#define _IFCVF_OSDEP_H_
+
+#include <stdint.h>
+#include <linux/pci_regs.h>
+
+#include <rte_cycles.h>
+#include <rte_pci.h>
+#include <rte_bus_pci.h>
+#include <rte_log.h>
+#include <rte_io.h>
+
+#define DEBUGOUT(S, args...)    RTE_LOG(DEBUG, PMD, S, ##args)
+#define STATIC                  static
+
+#define msec_delay(x)	rte_delay_us_sleep(1000 * (x))
+
+#define IFCVF_READ_REG8(reg)		rte_read8(reg)
+#define IFCVF_WRITE_REG8(val, reg)	rte_write8((val), (reg))
+#define IFCVF_READ_REG16(reg)		rte_read16(reg)
+#define IFCVF_WRITE_REG16(val, reg)	rte_write16((val), (reg))
+#define IFCVF_READ_REG32(reg)		rte_read32(reg)
+#define IFCVF_WRITE_REG32(val, reg)	rte_write32((val), (reg))
+
+typedef struct rte_pci_device PCI_DEV;
+
+#define PCI_READ_CONFIG_BYTE(dev, val, where) \
+	rte_pci_read_config(dev, val, 1, where)
+
+#define PCI_READ_CONFIG_DWORD(dev, val, where) \
+	rte_pci_read_config(dev, val, 4, where)
+
+typedef uint8_t    u8;
+typedef int8_t     s8;
+typedef uint16_t   u16;
+typedef int16_t    s16;
+typedef uint32_t   u32;
+typedef int32_t    s32;
+typedef int64_t    s64;
+typedef uint64_t   u64;
+
+static inline int
+PCI_READ_CONFIG_RANGE(PCI_DEV *dev, uint32_t *val, int size, int where)
+{
+	return rte_pci_read_config(dev, val, size, where);
+}
+
+#endif /* _IFCVF_OSDEP_H_ */
diff --git a/drivers/vdpa/ifc/ifcvf_vdpa.c b/drivers/vdpa/ifc/ifcvf_vdpa.c
new file mode 100644
index 0000000..da4667b
--- /dev/null
+++ b/drivers/vdpa/ifc/ifcvf_vdpa.c
@@ -0,0 +1,1280 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#include <unistd.h>
+#include <pthread.h>
+#include <fcntl.h>
+#include <string.h>
+#include <sys/ioctl.h>
+#include <sys/epoll.h>
+#include <linux/virtio_net.h>
+#include <stdbool.h>
+
+#include <rte_malloc.h>
+#include <rte_memory.h>
+#include <rte_bus_pci.h>
+#include <rte_vhost.h>
+#include <rte_vdpa.h>
+#include <rte_vfio.h>
+#include <rte_spinlock.h>
+#include <rte_log.h>
+#include <rte_kvargs.h>
+#include <rte_devargs.h>
+
+#include "base/ifcvf.h"
+
+#define DRV_LOG(level, fmt, args...) \
+	rte_log(RTE_LOG_ ## level, ifcvf_vdpa_logtype, \
+		"IFCVF %s(): " fmt "\n", __func__, ##args)
+
+#ifndef PAGE_SIZE
+#define PAGE_SIZE 4096
+#endif
+
+#define IFCVF_USED_RING_LEN(size) \
+	((size) * sizeof(struct vring_used_elem) + sizeof(uint16_t) * 3)
+
+#define IFCVF_VDPA_MODE		"vdpa"
+#define IFCVF_SW_FALLBACK_LM	"sw-live-migration"
+
+static const char * const ifcvf_valid_arguments[] = {
+	IFCVF_VDPA_MODE,
+	IFCVF_SW_FALLBACK_LM,
+	NULL
+};
+
+static int ifcvf_vdpa_logtype;
+
+struct ifcvf_internal {
+	struct rte_vdpa_dev_addr dev_addr;
+	struct rte_pci_device *pdev;
+	struct ifcvf_hw hw;
+	int vfio_container_fd;
+	int vfio_group_fd;
+	int vfio_dev_fd;
+	pthread_t tid;	/* thread for notify relay */
+	int epfd;
+	int vid;
+	int did;
+	uint16_t max_queues;
+	uint64_t features;
+	rte_atomic32_t started;
+	rte_atomic32_t dev_attached;
+	rte_atomic32_t running;
+	rte_spinlock_t lock;
+	bool sw_lm;
+	bool sw_fallback_running;
+	/* mediated vring for sw fallback */
+	struct vring m_vring[IFCVF_MAX_QUEUES * 2];
+	/* eventfd for used ring interrupt */
+	int intr_fd[IFCVF_MAX_QUEUES * 2];
+};
+
+struct internal_list {
+	TAILQ_ENTRY(internal_list) next;
+	struct ifcvf_internal *internal;
+};
+
+TAILQ_HEAD(internal_list_head, internal_list);
+static struct internal_list_head internal_list =
+	TAILQ_HEAD_INITIALIZER(internal_list);
+
+static pthread_mutex_t internal_list_lock = PTHREAD_MUTEX_INITIALIZER;
+
+static void update_used_ring(struct ifcvf_internal *internal, uint16_t qid);
+
+static struct internal_list *
+find_internal_resource_by_did(int did)
+{
+	int found = 0;
+	struct internal_list *list;
+
+	pthread_mutex_lock(&internal_list_lock);
+
+	TAILQ_FOREACH(list, &internal_list, next) {
+		if (did == list->internal->did) {
+			found = 1;
+			break;
+		}
+	}
+
+	pthread_mutex_unlock(&internal_list_lock);
+
+	if (!found)
+		return NULL;
+
+	return list;
+}
+
+static struct internal_list *
+find_internal_resource_by_dev(struct rte_pci_device *pdev)
+{
+	int found = 0;
+	struct internal_list *list;
+
+	pthread_mutex_lock(&internal_list_lock);
+
+	TAILQ_FOREACH(list, &internal_list, next) {
+		if (pdev == list->internal->pdev) {
+			found = 1;
+			break;
+		}
+	}
+
+	pthread_mutex_unlock(&internal_list_lock);
+
+	if (!found)
+		return NULL;
+
+	return list;
+}
+
+static int
+ifcvf_vfio_setup(struct ifcvf_internal *internal)
+{
+	struct rte_pci_device *dev = internal->pdev;
+	char devname[RTE_DEV_NAME_MAX_LEN] = {0};
+	int iommu_group_num;
+	int i, ret;
+
+	internal->vfio_dev_fd = -1;
+	internal->vfio_group_fd = -1;
+	internal->vfio_container_fd = -1;
+
+	rte_pci_device_name(&dev->addr, devname, RTE_DEV_NAME_MAX_LEN);
+	ret = rte_vfio_get_group_num(rte_pci_get_sysfs_path(), devname,
+			&iommu_group_num);
+	if (ret <= 0) {
+		DRV_LOG(ERR, "%s failed to get IOMMU group", devname);
+		return -1;
+	}
+
+	internal->vfio_container_fd = rte_vfio_container_create();
+	if (internal->vfio_container_fd < 0)
+		return -1;
+
+	internal->vfio_group_fd = rte_vfio_container_group_bind(
+			internal->vfio_container_fd, iommu_group_num);
+	if (internal->vfio_group_fd < 0)
+		goto err;
+
+	if (rte_pci_map_device(dev))
+		goto err;
+
+	internal->vfio_dev_fd = dev->intr_handle.vfio_dev_fd;
+
+	for (i = 0; i < RTE_MIN(PCI_MAX_RESOURCE, IFCVF_PCI_MAX_RESOURCE);
+			i++) {
+		internal->hw.mem_resource[i].addr =
+			internal->pdev->mem_resource[i].addr;
+		internal->hw.mem_resource[i].phys_addr =
+			internal->pdev->mem_resource[i].phys_addr;
+		internal->hw.mem_resource[i].len =
+			internal->pdev->mem_resource[i].len;
+	}
+
+	return 0;
+
+err:
+	rte_vfio_container_destroy(internal->vfio_container_fd);
+	return -1;
+}
+
+static int
+ifcvf_dma_map(struct ifcvf_internal *internal, int do_map)
+{
+	uint32_t i;
+	int ret;
+	struct rte_vhost_memory *mem = NULL;
+	int vfio_container_fd;
+
+	ret = rte_vhost_get_mem_table(internal->vid, &mem);
+	if (ret < 0) {
+		DRV_LOG(ERR, "failed to get VM memory layout.");
+		goto exit;
+	}
+
+	vfio_container_fd = internal->vfio_container_fd;
+
+	for (i = 0; i < mem->nregions; i++) {
+		struct rte_vhost_mem_region *reg;
+
+		reg = &mem->regions[i];
+		DRV_LOG(INFO, "%s, region %u: HVA 0x%" PRIx64 ", "
+			"GPA 0x%" PRIx64 ", size 0x%" PRIx64 ".",
+			do_map ? "DMA map" : "DMA unmap", i,
+			reg->host_user_addr, reg->guest_phys_addr, reg->size);
+
+		if (do_map) {
+			ret = rte_vfio_container_dma_map(vfio_container_fd,
+				reg->host_user_addr, reg->guest_phys_addr,
+				reg->size);
+			if (ret < 0) {
+				DRV_LOG(ERR, "DMA map failed.");
+				goto exit;
+			}
+		} else {
+			ret = rte_vfio_container_dma_unmap(vfio_container_fd,
+				reg->host_user_addr, reg->guest_phys_addr,
+				reg->size);
+			if (ret < 0) {
+				DRV_LOG(ERR, "DMA unmap failed.");
+				goto exit;
+			}
+		}
+	}
+
+exit:
+	if (mem)
+		free(mem);
+	return ret;
+}
+
+static uint64_t
+hva_to_gpa(int vid, uint64_t hva)
+{
+	struct rte_vhost_memory *mem = NULL;
+	struct rte_vhost_mem_region *reg;
+	uint32_t i;
+	uint64_t gpa = 0;
+
+	if (rte_vhost_get_mem_table(vid, &mem) < 0)
+		goto exit;
+
+	for (i = 0; i < mem->nregions; i++) {
+		reg = &mem->regions[i];
+
+		if (hva >= reg->host_user_addr &&
+				hva < reg->host_user_addr + reg->size) {
+			gpa = hva - reg->host_user_addr + reg->guest_phys_addr;
+			break;
+		}
+	}
+
+exit:
+	if (mem)
+		free(mem);
+	return gpa;
+}
+
+static int
+vdpa_ifcvf_start(struct ifcvf_internal *internal)
+{
+	struct ifcvf_hw *hw = &internal->hw;
+	int i, nr_vring;
+	int vid;
+	struct rte_vhost_vring vq;
+	uint64_t gpa;
+
+	vid = internal->vid;
+	nr_vring = rte_vhost_get_vring_num(vid);
+	rte_vhost_get_negotiated_features(vid, &hw->req_features);
+
+	for (i = 0; i < nr_vring; i++) {
+		rte_vhost_get_vhost_vring(vid, i, &vq);
+		gpa = hva_to_gpa(vid, (uint64_t)(uintptr_t)vq.desc);
+		if (gpa == 0) {
+			DRV_LOG(ERR, "Fail to get GPA for descriptor ring.");
+			return -1;
+		}
+		hw->vring[i].desc = gpa;
+
+		gpa = hva_to_gpa(vid, (uint64_t)(uintptr_t)vq.avail);
+		if (gpa == 0) {
+			DRV_LOG(ERR, "Fail to get GPA for available ring.");
+			return -1;
+		}
+		hw->vring[i].avail = gpa;
+
+		gpa = hva_to_gpa(vid, (uint64_t)(uintptr_t)vq.used);
+		if (gpa == 0) {
+			DRV_LOG(ERR, "Fail to get GPA for used ring.");
+			return -1;
+		}
+		hw->vring[i].used = gpa;
+
+		hw->vring[i].size = vq.size;
+		rte_vhost_get_vring_base(vid, i, &hw->vring[i].last_avail_idx,
+				&hw->vring[i].last_used_idx);
+	}
+	hw->nr_vring = i;
+
+	return ifcvf_start_hw(&internal->hw);
+}
+
+static void
+vdpa_ifcvf_stop(struct ifcvf_internal *internal)
+{
+	struct ifcvf_hw *hw = &internal->hw;
+	uint32_t i;
+	int vid;
+	uint64_t features = 0;
+	uint64_t log_base = 0, log_size = 0;
+	uint64_t len;
+
+	vid = internal->vid;
+	ifcvf_stop_hw(hw);
+
+	for (i = 0; i < hw->nr_vring; i++)
+		rte_vhost_set_vring_base(vid, i, hw->vring[i].last_avail_idx,
+				hw->vring[i].last_used_idx);
+
+	if (internal->sw_lm)
+		return;
+
+	rte_vhost_get_negotiated_features(vid, &features);
+	if (RTE_VHOST_NEED_LOG(features)) {
+		ifcvf_disable_logging(hw);
+		rte_vhost_get_log_base(internal->vid, &log_base, &log_size);
+		rte_vfio_container_dma_unmap(internal->vfio_container_fd,
+				log_base, IFCVF_LOG_BASE, log_size);
+		/*
+		 * IFCVF marks dirty memory pages for only packet buffer,
+		 * SW helps to mark the used ring as dirty after device stops.
+		 */
+		for (i = 0; i < hw->nr_vring; i++) {
+			len = IFCVF_USED_RING_LEN(hw->vring[i].size);
+			rte_vhost_log_used_vring(vid, i, 0, len);
+		}
+	}
+}
+
+#define MSIX_IRQ_SET_BUF_LEN (sizeof(struct vfio_irq_set) + \
+		sizeof(int) * (IFCVF_MAX_QUEUES * 2 + 1))
+static int
+vdpa_enable_vfio_intr(struct ifcvf_internal *internal, bool m_rx)
+{
+	int ret;
+	uint32_t i, nr_vring;
+	char irq_set_buf[MSIX_IRQ_SET_BUF_LEN];
+	struct vfio_irq_set *irq_set;
+	int *fd_ptr;
+	struct rte_vhost_vring vring;
+	int fd;
+
+	vring.callfd = -1;
+
+	nr_vring = rte_vhost_get_vring_num(internal->vid);
+
+	irq_set = (struct vfio_irq_set *)irq_set_buf;
+	irq_set->argsz = sizeof(irq_set_buf);
+	irq_set->count = nr_vring + 1;
+	irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD |
+			 VFIO_IRQ_SET_ACTION_TRIGGER;
+	irq_set->index = VFIO_PCI_MSIX_IRQ_INDEX;
+	irq_set->start = 0;
+	fd_ptr = (int *)&irq_set->data;
+	fd_ptr[RTE_INTR_VEC_ZERO_OFFSET] = internal->pdev->intr_handle.fd;
+
+	for (i = 0; i < nr_vring; i++)
+		internal->intr_fd[i] = -1;
+
+	for (i = 0; i < nr_vring; i++) {
+		rte_vhost_get_vhost_vring(internal->vid, i, &vring);
+		fd_ptr[RTE_INTR_VEC_RXTX_OFFSET + i] = vring.callfd;
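+		/*
+		 * For Rx queues (even indices) in SW relay mode, replace the
+		 * guest callfd with a local eventfd so the relay thread can
+		 * intercept the HW used-ring interrupt.
+		 */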
+		if ((i & 1) == 0 && m_rx == true) {
+			fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
+			if (fd < 0) {
+				DRV_LOG(ERR, "can't setup eventfd: %s",
+					strerror(errno));
+				return -1;
+			}
+			internal->intr_fd[i] = fd;
+			fd_ptr[RTE_INTR_VEC_RXTX_OFFSET + i] = fd;
+		}
+	}
+
+	ret = ioctl(internal->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
+	if (ret) {
+		DRV_LOG(ERR, "Error enabling MSI-X interrupts: %s",
+				strerror(errno));
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+vdpa_disable_vfio_intr(struct ifcvf_internal *internal)
+{
+	int ret;
+	uint32_t i, nr_vring;
+	char irq_set_buf[MSIX_IRQ_SET_BUF_LEN];
+	struct vfio_irq_set *irq_set;
+
+	irq_set = (struct vfio_irq_set *)irq_set_buf;
+	irq_set->argsz = sizeof(irq_set_buf);
+	irq_set->count = 0;
+	irq_set->flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_TRIGGER;
+	irq_set->index = VFIO_PCI_MSIX_IRQ_INDEX;
+	irq_set->start = 0;
+
+	nr_vring = rte_vhost_get_vring_num(internal->vid);
+	for (i = 0; i < nr_vring; i++) {
+		if (internal->intr_fd[i] >= 0)
+			close(internal->intr_fd[i]);
+		internal->intr_fd[i] = -1;
+	}
+
+	ret = ioctl(internal->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
+	if (ret) {
+		DRV_LOG(ERR, "Error disabling MSI-X interrupts: %s",
+				strerror(errno));
+		return -1;
+	}
+
+	return 0;
+}
+
+static void *
+notify_relay(void *arg)
+{
+	int i, kickfd, epfd, nfds = 0;
+	uint32_t qid, q_num;
+	struct epoll_event events[IFCVF_MAX_QUEUES * 2];
+	struct epoll_event ev;
+	uint64_t buf;
+	int nbytes;
+	struct rte_vhost_vring vring;
+	struct ifcvf_internal *internal = (struct ifcvf_internal *)arg;
+	struct ifcvf_hw *hw = &internal->hw;
+
+	q_num = rte_vhost_get_vring_num(internal->vid);
+
+	epfd = epoll_create(IFCVF_MAX_QUEUES * 2);
+	if (epfd < 0) {
+		DRV_LOG(ERR, "failed to create epoll instance.");
+		return NULL;
+	}
+	internal->epfd = epfd;
+
+	vring.kickfd = -1;
+	for (qid = 0; qid < q_num; qid++) {
+		ev.events = EPOLLIN | EPOLLPRI;
+		rte_vhost_get_vhost_vring(internal->vid, qid, &vring);
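+		/* Pack the queue id in the low 32 bits and the kick fd in
+		 * the high 32 bits of the epoll user data.
+		 */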
+		ev.data.u64 = qid | (uint64_t)vring.kickfd << 32;
+		if (epoll_ctl(epfd, EPOLL_CTL_ADD, vring.kickfd, &ev) < 0) {
+			DRV_LOG(ERR, "epoll add error: %s", strerror(errno));
+			return NULL;
+		}
+	}
+
+	for (;;) {
+		nfds = epoll_wait(epfd, events, q_num, -1);
+		if (nfds < 0) {
+			if (errno == EINTR)
+				continue;
+			DRV_LOG(ERR, "epoll_wait failed");
+			return NULL;
+		}
+
+		for (i = 0; i < nfds; i++) {
+			qid = events[i].data.u32;
+			kickfd = (uint32_t)(events[i].data.u64 >> 32);
+			do {
+				nbytes = read(kickfd, &buf, 8);
+				if (nbytes < 0) {
+					if (errno == EINTR ||
+					    errno == EWOULDBLOCK ||
+					    errno == EAGAIN)
+						continue;
+					DRV_LOG(INFO, "Error reading "
+						"kickfd: %s",
+						strerror(errno));
+				}
+				break;
+			} while (1);
+
+			ifcvf_notify_queue(hw, qid);
+		}
+	}
+
+	return NULL;
+}
+
+static int
+setup_notify_relay(struct ifcvf_internal *internal)
+{
+	int ret;
+
+	ret = pthread_create(&internal->tid, NULL, notify_relay,
+			(void *)internal);
+	if (ret) {
+		DRV_LOG(ERR, "failed to create notify relay pthread.");
+		return -1;
+	}
+	return 0;
+}
+
+static int
+unset_notify_relay(struct ifcvf_internal *internal)
+{
+	void *status;
+
+	if (internal->tid) {
+		pthread_cancel(internal->tid);
+		pthread_join(internal->tid, &status);
+	}
+	internal->tid = 0;
+
+	if (internal->epfd >= 0)
+		close(internal->epfd);
+	internal->epfd = -1;
+
+	return 0;
+}
+
+static int
+update_datapath(struct ifcvf_internal *internal)
+{
+	int ret;
+
+	rte_spinlock_lock(&internal->lock);
+
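+	/*
+	 * Start the HW datapath only when the device is started and a vhost
+	 * device is attached; tear it down once either condition is cleared.
+	 */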
+	if (!rte_atomic32_read(&internal->running) &&
+	    (rte_atomic32_read(&internal->started) &&
+	     rte_atomic32_read(&internal->dev_attached))) {
+		ret = ifcvf_dma_map(internal, 1);
+		if (ret)
+			goto err;
+
+		ret = vdpa_enable_vfio_intr(internal, 0);
+		if (ret)
+			goto err;
+
+		ret = vdpa_ifcvf_start(internal);
+		if (ret)
+			goto err;
+
+		ret = setup_notify_relay(internal);
+		if (ret)
+			goto err;
+
+		rte_atomic32_set(&internal->running, 1);
+	} else if (rte_atomic32_read(&internal->running) &&
+		   (!rte_atomic32_read(&internal->started) ||
+		    !rte_atomic32_read(&internal->dev_attached))) {
+		ret = unset_notify_relay(internal);
+		if (ret)
+			goto err;
+
+		vdpa_ifcvf_stop(internal);
+
+		ret = vdpa_disable_vfio_intr(internal);
+		if (ret)
+			goto err;
+
+		ret = ifcvf_dma_map(internal, 0);
+		if (ret)
+			goto err;
+
+		rte_atomic32_set(&internal->running, 0);
+	}
+
+	rte_spinlock_unlock(&internal->lock);
+	return 0;
+err:
+	rte_spinlock_unlock(&internal->lock);
+	return ret;
+}
+
+static int
+m_ifcvf_start(struct ifcvf_internal *internal)
+{
+	struct ifcvf_hw *hw = &internal->hw;
+	uint32_t i, nr_vring;
+	int vid, ret;
+	struct rte_vhost_vring vq;
+	void *vring_buf;
+	uint64_t m_vring_iova = IFCVF_MEDIATED_VRING;
+	uint64_t size;
+	uint64_t gpa;
+
+	memset(&vq, 0, sizeof(vq));
+	vid = internal->vid;
+	nr_vring = rte_vhost_get_vring_num(vid);
+	rte_vhost_get_negotiated_features(vid, &hw->req_features);
+
+	for (i = 0; i < nr_vring; i++) {
+		rte_vhost_get_vhost_vring(vid, i, &vq);
+
+		size = RTE_ALIGN_CEIL(vring_size(vq.size, PAGE_SIZE),
+				PAGE_SIZE);
+		vring_buf = rte_zmalloc("ifcvf", size, PAGE_SIZE);
+		vring_init(&internal->m_vring[i], vq.size, vring_buf,
+				PAGE_SIZE);
+
+		ret = rte_vfio_container_dma_map(internal->vfio_container_fd,
+			(uint64_t)(uintptr_t)vring_buf, m_vring_iova, size);
+		if (ret < 0) {
+			DRV_LOG(ERR, "mediated vring DMA map failed.");
+			goto error;
+		}
+
+		gpa = hva_to_gpa(vid, (uint64_t)(uintptr_t)vq.desc);
+		if (gpa == 0) {
+			DRV_LOG(ERR, "Fail to get GPA for descriptor ring.");
+			return -1;
+		}
+		hw->vring[i].desc = gpa;
+
+		gpa = hva_to_gpa(vid, (uint64_t)(uintptr_t)vq.avail);
+		if (gpa == 0) {
+			DRV_LOG(ERR, "Fail to get GPA for available ring.");
+			return -1;
+		}
+		hw->vring[i].avail = gpa;
+
+		/* Direct I/O for Tx queue, relay for Rx queue */
+		if (i & 1) {
+			gpa = hva_to_gpa(vid, (uint64_t)(uintptr_t)vq.used);
+			if (gpa == 0) {
+				DRV_LOG(ERR, "Fail to get GPA for used ring.");
+				return -1;
+			}
+			hw->vring[i].used = gpa;
+		} else {
+			hw->vring[i].used = m_vring_iova +
+				(char *)internal->m_vring[i].used -
+				(char *)internal->m_vring[i].desc;
+		}
+
+		hw->vring[i].size = vq.size;
+
+		rte_vhost_get_vring_base(vid, i,
+				&internal->m_vring[i].avail->idx,
+				&internal->m_vring[i].used->idx);
+
+		rte_vhost_get_vring_base(vid, i, &hw->vring[i].last_avail_idx,
+				&hw->vring[i].last_used_idx);
+
+		m_vring_iova += size;
+	}
+	hw->nr_vring = nr_vring;
+
+	return ifcvf_start_hw(&internal->hw);
+
+error:
+	for (i = 0; i < nr_vring; i++)
+		if (internal->m_vring[i].desc)
+			rte_free(internal->m_vring[i].desc);
+
+	return -1;
+}
+
+static int
+m_ifcvf_stop(struct ifcvf_internal *internal)
+{
+	int vid;
+	uint32_t i;
+	struct rte_vhost_vring vq;
+	struct ifcvf_hw *hw = &internal->hw;
+	uint64_t m_vring_iova = IFCVF_MEDIATED_VRING;
+	uint64_t size, len;
+
+	vid = internal->vid;
+	ifcvf_stop_hw(hw);
+
+	for (i = 0; i < hw->nr_vring; i++) {
+		/* synchronize remaining new used entries if any */
+		if ((i & 1) == 0)
+			update_used_ring(internal, i);
+
+		rte_vhost_get_vhost_vring(vid, i, &vq);
+		len = IFCVF_USED_RING_LEN(vq.size);
+		rte_vhost_log_used_vring(vid, i, 0, len);
+
+		size = RTE_ALIGN_CEIL(vring_size(vq.size, PAGE_SIZE),
+				PAGE_SIZE);
+		rte_vfio_container_dma_unmap(internal->vfio_container_fd,
+			(uint64_t)(uintptr_t)internal->m_vring[i].desc,
+			m_vring_iova, size);
+
+		rte_vhost_set_vring_base(vid, i, hw->vring[i].last_avail_idx,
+				hw->vring[i].last_used_idx);
+		rte_free(internal->m_vring[i].desc);
+		m_vring_iova += size;
+	}
+
+	return 0;
+}
+
+static void
+update_used_ring(struct ifcvf_internal *internal, uint16_t qid)
+{
+	rte_vdpa_relay_vring_used(internal->vid, qid, &internal->m_vring[qid]);
+	rte_vhost_vring_call(internal->vid, qid);
+}
+
+static void *
+vring_relay(void *arg)
+{
+	int i, vid, epfd, fd, nfds;
+	struct ifcvf_internal *internal = (struct ifcvf_internal *)arg;
+	struct rte_vhost_vring vring;
+	uint16_t qid, q_num;
+	struct epoll_event events[IFCVF_MAX_QUEUES * 4];
+	struct epoll_event ev;
+	int nbytes;
+	uint64_t buf;
+
+	vid = internal->vid;
+	q_num = rte_vhost_get_vring_num(vid);
+
+	/* add notify fd and interrupt fd to epoll */
+	epfd = epoll_create(IFCVF_MAX_QUEUES * 2);
+	if (epfd < 0) {
+		DRV_LOG(ERR, "failed to create epoll instance.");
+		return NULL;
+	}
+	internal->epfd = epfd;
+
+	vring.kickfd = -1;
+	for (qid = 0; qid < q_num; qid++) {
+		ev.events = EPOLLIN | EPOLLPRI;
+		rte_vhost_get_vhost_vring(vid, qid, &vring);
+		ev.data.u64 = qid << 1 | (uint64_t)vring.kickfd << 32;
+		if (epoll_ctl(epfd, EPOLL_CTL_ADD, vring.kickfd, &ev) < 0) {
+			DRV_LOG(ERR, "epoll add error: %s", strerror(errno));
+			return NULL;
+		}
+	}
+
+	for (qid = 0; qid < q_num; qid += 2) {
+		ev.events = EPOLLIN | EPOLLPRI;
+		/* leave a flag to mark it's for interrupt */
+		ev.data.u64 = 1 | qid << 1 |
+			(uint64_t)internal->intr_fd[qid] << 32;
+		if (epoll_ctl(epfd, EPOLL_CTL_ADD, internal->intr_fd[qid], &ev)
+				< 0) {
+			DRV_LOG(ERR, "epoll add error: %s", strerror(errno));
+			return NULL;
+		}
+		update_used_ring(internal, qid);
+	}
+
+	/* start relay with a first kick */
+	for (qid = 0; qid < q_num; qid++)
+		ifcvf_notify_queue(&internal->hw, qid);
+
+	/* listen to the events and react accordingly */
+	for (;;) {
+		nfds = epoll_wait(epfd, events, q_num * 2, -1);
+		if (nfds < 0) {
+			if (errno == EINTR)
+				continue;
+			DRV_LOG(ERR, "epoll_wait failed");
+			return NULL;
+		}
+
+		for (i = 0; i < nfds; i++) {
+			fd = (uint32_t)(events[i].data.u64 >> 32);
+			do {
+				nbytes = read(fd, &buf, 8);
+				if (nbytes < 0) {
+					if (errno == EINTR ||
+					    errno == EWOULDBLOCK ||
+					    errno == EAGAIN)
+						continue;
+					DRV_LOG(INFO, "Error reading "
+						"kickfd: %s",
+						strerror(errno));
+				}
+				break;
+			} while (1);
+
+			qid = events[i].data.u32 >> 1;
+
+			if (events[i].data.u32 & 1)
+				update_used_ring(internal, qid);
+			else
+				ifcvf_notify_queue(&internal->hw, qid);
+		}
+	}
+
+	return NULL;
+}
+
+static int
+setup_vring_relay(struct ifcvf_internal *internal)
+{
+	int ret;
+
+	ret = pthread_create(&internal->tid, NULL, vring_relay,
+			(void *)internal);
+	if (ret) {
+		DRV_LOG(ERR, "failed to create ring relay pthread.");
+		return -1;
+	}
+	return 0;
+}
+
+static int
+unset_vring_relay(struct ifcvf_internal *internal)
+{
+	void *status;
+
+	if (internal->tid) {
+		pthread_cancel(internal->tid);
+		pthread_join(internal->tid, &status);
+	}
+	internal->tid = 0;
+
+	if (internal->epfd >= 0)
+		close(internal->epfd);
+	internal->epfd = -1;
+
+	return 0;
+}
+
+static int
+ifcvf_sw_fallback_switchover(struct ifcvf_internal *internal)
+{
+	int ret;
+	int vid = internal->vid;
+
+	/* stop the direct IO data path */
+	unset_notify_relay(internal);
+	vdpa_ifcvf_stop(internal);
+	vdpa_disable_vfio_intr(internal);
+
+	ret = rte_vhost_host_notifier_ctrl(vid, false);
+	if (ret && ret != -ENOTSUP)
+		goto error;
+
+	/* set up interrupt for interrupt relay */
+	ret = vdpa_enable_vfio_intr(internal, 1);
+	if (ret)
+		goto unmap;
+
+	/* config the VF */
+	ret = m_ifcvf_start(internal);
+	if (ret)
+		goto unset_intr;
+
+	/* set up vring relay thread */
+	ret = setup_vring_relay(internal);
+	if (ret)
+		goto stop_vf;
+
+	rte_vhost_host_notifier_ctrl(vid, true);
+
+	internal->sw_fallback_running = true;
+
+	return 0;
+
+stop_vf:
+	m_ifcvf_stop(internal);
+unset_intr:
+	vdpa_disable_vfio_intr(internal);
+unmap:
+	ifcvf_dma_map(internal, 0);
+error:
+	return -1;
+}
+
+static int
+ifcvf_dev_config(int vid)
+{
+	int did;
+	struct internal_list *list;
+	struct ifcvf_internal *internal;
+
+	did = rte_vhost_get_vdpa_device_id(vid);
+	list = find_internal_resource_by_did(did);
+	if (list == NULL) {
+		DRV_LOG(ERR, "Invalid device id: %d", did);
+		return -1;
+	}
+
+	internal = list->internal;
+	internal->vid = vid;
+	rte_atomic32_set(&internal->dev_attached, 1);
+	update_datapath(internal);
+
+	if (rte_vhost_host_notifier_ctrl(vid, true) != 0)
+		DRV_LOG(NOTICE, "vDPA (%d): software relay is used.", did);
+
+	return 0;
+}
+
+static int
+ifcvf_dev_close(int vid)
+{
+	int did;
+	struct internal_list *list;
+	struct ifcvf_internal *internal;
+
+	did = rte_vhost_get_vdpa_device_id(vid);
+	list = find_internal_resource_by_did(did);
+	if (list == NULL) {
+		DRV_LOG(ERR, "Invalid device id: %d", did);
+		return -1;
+	}
+
+	internal = list->internal;
+
+	if (internal->sw_fallback_running) {
+		/* unset ring relay */
+		unset_vring_relay(internal);
+
+		/* reset VF */
+		m_ifcvf_stop(internal);
+
+		/* remove interrupt setting */
+		vdpa_disable_vfio_intr(internal);
+
+		/* unset DMA map for guest memory */
+		ifcvf_dma_map(internal, 0);
+
+		internal->sw_fallback_running = false;
+	} else {
+		rte_atomic32_set(&internal->dev_attached, 0);
+		update_datapath(internal);
+	}
+
+	return 0;
+}
+
+static int
+ifcvf_set_features(int vid)
+{
+	uint64_t features = 0;
+	int did;
+	struct internal_list *list;
+	struct ifcvf_internal *internal;
+	uint64_t log_base = 0, log_size = 0;
+
+	did = rte_vhost_get_vdpa_device_id(vid);
+	list = find_internal_resource_by_did(did);
+	if (list == NULL) {
+		DRV_LOG(ERR, "Invalid device id: %d", did);
+		return -1;
+	}
+
+	internal = list->internal;
+	rte_vhost_get_negotiated_features(vid, &features);
+
+	if (!RTE_VHOST_NEED_LOG(features))
+		return 0;
+
+	if (internal->sw_lm) {
+		ifcvf_sw_fallback_switchover(internal);
+	} else {
+		rte_vhost_get_log_base(vid, &log_base, &log_size);
+		rte_vfio_container_dma_map(internal->vfio_container_fd,
+				log_base, IFCVF_LOG_BASE, log_size);
+		ifcvf_enable_logging(&internal->hw, IFCVF_LOG_BASE, log_size);
+	}
+
+	return 0;
+}
+
+static int
+ifcvf_get_vfio_group_fd(int vid)
+{
+	int did;
+	struct internal_list *list;
+
+	did = rte_vhost_get_vdpa_device_id(vid);
+	list = find_internal_resource_by_did(did);
+	if (list == NULL) {
+		DRV_LOG(ERR, "Invalid device id: %d", did);
+		return -1;
+	}
+
+	return list->internal->vfio_group_fd;
+}
+
+static int
+ifcvf_get_vfio_device_fd(int vid)
+{
+	int did;
+	struct internal_list *list;
+
+	did = rte_vhost_get_vdpa_device_id(vid);
+	list = find_internal_resource_by_did(did);
+	if (list == NULL) {
+		DRV_LOG(ERR, "Invalid device id: %d", did);
+		return -1;
+	}
+
+	return list->internal->vfio_dev_fd;
+}
+
+static int
+ifcvf_get_notify_area(int vid, int qid, uint64_t *offset, uint64_t *size)
+{
+	int did;
+	struct internal_list *list;
+	struct ifcvf_internal *internal;
+	struct vfio_region_info reg = { .argsz = sizeof(reg) };
+	int ret;
+
+	did = rte_vhost_get_vdpa_device_id(vid);
+	list = find_internal_resource_by_did(did);
+	if (list == NULL) {
+		DRV_LOG(ERR, "Invalid device id: %d", did);
+		return -1;
+	}
+
+	internal = list->internal;
+
+	reg.index = ifcvf_get_notify_region(&internal->hw);
+	ret = ioctl(internal->vfio_dev_fd, VFIO_DEVICE_GET_REGION_INFO, &reg);
+	if (ret) {
+		DRV_LOG(ERR, "Failed to get device region info: %s",
+				strerror(errno));
+		return -1;
+	}
+
+	*offset = ifcvf_get_queue_notify_off(&internal->hw, qid) + reg.offset;
+	*size = 0x1000;
+
+	return 0;
+}
+
+static int
+ifcvf_get_queue_num(int did, uint32_t *queue_num)
+{
+	struct internal_list *list;
+
+	list = find_internal_resource_by_did(did);
+	if (list == NULL) {
+		DRV_LOG(ERR, "Invalid device id: %d", did);
+		return -1;
+	}
+
+	*queue_num = list->internal->max_queues;
+
+	return 0;
+}
+
+static int
+ifcvf_get_vdpa_features(int did, uint64_t *features)
+{
+	struct internal_list *list;
+
+	list = find_internal_resource_by_did(did);
+	if (list == NULL) {
+		DRV_LOG(ERR, "Invalid device id: %d", did);
+		return -1;
+	}
+
+	*features = list->internal->features;
+
+	return 0;
+}
+
+#define VDPA_SUPPORTED_PROTOCOL_FEATURES \
+		(1ULL << VHOST_USER_PROTOCOL_F_REPLY_ACK | \
+		 1ULL << VHOST_USER_PROTOCOL_F_SLAVE_REQ | \
+		 1ULL << VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD | \
+		 1ULL << VHOST_USER_PROTOCOL_F_HOST_NOTIFIER | \
+		 1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD)
+static int
+ifcvf_get_protocol_features(int did __rte_unused, uint64_t *features)
+{
+	*features = VDPA_SUPPORTED_PROTOCOL_FEATURES;
+	return 0;
+}
+
+static struct rte_vdpa_dev_ops ifcvf_ops = {
+	.get_queue_num = ifcvf_get_queue_num,
+	.get_features = ifcvf_get_vdpa_features,
+	.get_protocol_features = ifcvf_get_protocol_features,
+	.dev_conf = ifcvf_dev_config,
+	.dev_close = ifcvf_dev_close,
+	.set_vring_state = NULL,
+	.set_features = ifcvf_set_features,
+	.migration_done = NULL,
+	.get_vfio_group_fd = ifcvf_get_vfio_group_fd,
+	.get_vfio_device_fd = ifcvf_get_vfio_device_fd,
+	.get_notify_area = ifcvf_get_notify_area,
+};
+
+static inline int
+open_int(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	uint16_t *n = extra_args;
+
+	if (value == NULL || extra_args == NULL)
+		return -EINVAL;
+
+	*n = (uint16_t)strtoul(value, NULL, 0);
+	if (*n == USHRT_MAX && errno == ERANGE)
+		return -1;
+
+	return 0;
+}
+
+static int
+ifcvf_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
+		struct rte_pci_device *pci_dev)
+{
+	uint64_t features;
+	struct ifcvf_internal *internal = NULL;
+	struct internal_list *list = NULL;
+	int vdpa_mode = 0;
+	int sw_fallback_lm = 0;
+	struct rte_kvargs *kvlist = NULL;
+	int ret = 0;
+
+	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+		return 0;
+
+	if (!pci_dev->device.devargs)
+		return 1;
+
+	kvlist = rte_kvargs_parse(pci_dev->device.devargs->args,
+			ifcvf_valid_arguments);
+	if (kvlist == NULL)
+		return 1;
+
+	/* probe only when vdpa mode is specified */
+	if (rte_kvargs_count(kvlist, IFCVF_VDPA_MODE) == 0) {
+		rte_kvargs_free(kvlist);
+		return 1;
+	}
+
+	ret = rte_kvargs_process(kvlist, IFCVF_VDPA_MODE, &open_int,
+			&vdpa_mode);
+	if (ret < 0 || vdpa_mode == 0) {
+		rte_kvargs_free(kvlist);
+		return 1;
+	}
+
+	list = rte_zmalloc("ifcvf", sizeof(*list), 0);
+	if (list == NULL)
+		goto error;
+
+	internal = rte_zmalloc("ifcvf", sizeof(*internal), 0);
+	if (internal == NULL)
+		goto error;
+
+	internal->pdev = pci_dev;
+	rte_spinlock_init(&internal->lock);
+
+	if (ifcvf_vfio_setup(internal) < 0) {
+		DRV_LOG(ERR, "failed to setup device %s", pci_dev->name);
+		goto error;
+	}
+
+	if (ifcvf_init_hw(&internal->hw, internal->pdev) < 0) {
+		DRV_LOG(ERR, "failed to init device %s", pci_dev->name);
+		goto error;
+	}
+
+	internal->max_queues = IFCVF_MAX_QUEUES;
+	features = ifcvf_get_features(&internal->hw);
+	internal->features = (features &
+		~(1ULL << VIRTIO_F_IOMMU_PLATFORM)) |
+		(1ULL << VIRTIO_NET_F_GUEST_ANNOUNCE) |
+		(1ULL << VIRTIO_NET_F_CTRL_VQ) |
+		(1ULL << VIRTIO_NET_F_STATUS) |
+		(1ULL << VHOST_USER_F_PROTOCOL_FEATURES) |
+		(1ULL << VHOST_F_LOG_ALL);
+
+	internal->dev_addr.pci_addr = pci_dev->addr;
+	internal->dev_addr.type = PCI_ADDR;
+	list->internal = internal;
+
+	if (rte_kvargs_count(kvlist, IFCVF_SW_FALLBACK_LM)) {
+		ret = rte_kvargs_process(kvlist, IFCVF_SW_FALLBACK_LM,
+				&open_int, &sw_fallback_lm);
+		if (ret < 0)
+			goto error;
+	}
+	internal->sw_lm = sw_fallback_lm;
+
+	internal->did = rte_vdpa_register_device(&internal->dev_addr,
+				&ifcvf_ops);
+	if (internal->did < 0) {
+		DRV_LOG(ERR, "failed to register device %s", pci_dev->name);
+		goto error;
+	}
+
+	pthread_mutex_lock(&internal_list_lock);
+	TAILQ_INSERT_TAIL(&internal_list, list, next);
+	pthread_mutex_unlock(&internal_list_lock);
+
+	rte_atomic32_set(&internal->started, 1);
+	update_datapath(internal);
+
+	rte_kvargs_free(kvlist);
+	return 0;
+
+error:
+	rte_kvargs_free(kvlist);
+	rte_free(list);
+	rte_free(internal);
+	return -1;
+}
+
+static int
+ifcvf_pci_remove(struct rte_pci_device *pci_dev)
+{
+	struct ifcvf_internal *internal;
+	struct internal_list *list;
+
+	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+		return 0;
+
+	list = find_internal_resource_by_dev(pci_dev);
+	if (list == NULL) {
+		DRV_LOG(ERR, "Invalid device: %s", pci_dev->name);
+		return -1;
+	}
+
+	internal = list->internal;
+	rte_atomic32_set(&internal->started, 0);
+	update_datapath(internal);
+
+	rte_pci_unmap_device(internal->pdev);
+	rte_vfio_container_destroy(internal->vfio_container_fd);
+	rte_vdpa_unregister_device(internal->did);
+
+	pthread_mutex_lock(&internal_list_lock);
+	TAILQ_REMOVE(&internal_list, list, next);
+	pthread_mutex_unlock(&internal_list_lock);
+
+	rte_free(list);
+	rte_free(internal);
+
+	return 0;
+}
+
+/*
+ * IFCVF has the same vendor ID and device ID as virtio net PCI
+ * device, with its specific subsystem vendor ID and device ID.
+ */
+static const struct rte_pci_id pci_id_ifcvf_map[] = {
+	{ .class_id = RTE_CLASS_ANY_ID,
+	  .vendor_id = IFCVF_VENDOR_ID,
+	  .device_id = IFCVF_DEVICE_ID,
+	  .subsystem_vendor_id = IFCVF_SUBSYS_VENDOR_ID,
+	  .subsystem_device_id = IFCVF_SUBSYS_DEVICE_ID,
+	},
+
+	{ .vendor_id = 0, /* sentinel */
+	},
+};
+
+static struct rte_pci_driver rte_ifcvf_vdpa = {
+	.id_table = pci_id_ifcvf_map,
+	.drv_flags = 0,
+	.probe = ifcvf_pci_probe,
+	.remove = ifcvf_pci_remove,
+};
+
+RTE_PMD_REGISTER_PCI(net_ifcvf, rte_ifcvf_vdpa);
+RTE_PMD_REGISTER_PCI_TABLE(net_ifcvf, pci_id_ifcvf_map);
+RTE_PMD_REGISTER_KMOD_DEP(net_ifcvf, "* vfio-pci");
+
+RTE_INIT(ifcvf_vdpa_init_log)
+{
+	ifcvf_vdpa_logtype = rte_log_register("pmd.net.ifcvf_vdpa");
+	if (ifcvf_vdpa_logtype >= 0)
+		rte_log_set_level(ifcvf_vdpa_logtype, RTE_LOG_NOTICE);
+}
diff --git a/drivers/vdpa/ifc/meson.build b/drivers/vdpa/ifc/meson.build
new file mode 100644
index 0000000..adc9ed9
--- /dev/null
+++ b/drivers/vdpa/ifc/meson.build
@@ -0,0 +1,9 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2018 Intel Corporation
+
+build = dpdk_conf.has('RTE_LIBRTE_VHOST')
+reason = 'missing dependency, DPDK vhost library'
+allow_experimental_apis = true
+sources = files('ifcvf_vdpa.c', 'base/ifcvf.c')
+includes += include_directories('base')
+deps += 'vhost'
diff --git a/drivers/vdpa/ifc/rte_pmd_ifc_version.map b/drivers/vdpa/ifc/rte_pmd_ifc_version.map
new file mode 100644
index 0000000..f9f17e4
--- /dev/null
+++ b/drivers/vdpa/ifc/rte_pmd_ifc_version.map
@@ -0,0 +1,3 @@
+DPDK_20.0 {
+	local: *;
+};
diff --git a/drivers/vdpa/meson.build b/drivers/vdpa/meson.build
index a839ff5..fd164d3 100644
--- a/drivers/vdpa/meson.build
+++ b/drivers/vdpa/meson.build
@@ -1,7 +1,7 @@
 #   SPDX-License-Identifier: BSD-3-Clause
 #   Copyright 2019 Mellanox Technologies, Ltd
 
-drivers = []
+drivers = ['ifc']
 std_deps = ['bus_pci', 'kvargs']
 std_deps += ['vhost']
 config_flag_fmt = 'RTE_LIBRTE_@0@_PMD'
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA device drivers
  2020-01-09 10:53               ` Xu, Rosen
@ 2020-01-09 11:34                 ` Matan Azrad
  2020-01-10  2:38                   ` Xu, Rosen
  0 siblings, 1 reply; 50+ messages in thread
From: Matan Azrad @ 2020-01-09 11:34 UTC (permalink / raw)
  To: Xu, Rosen, Thomas Monjalon
  Cc: Maxime Coquelin, Bie, Tiwei, Wang, Zhihong, Wang, Xiao W, Yigit,
	Ferruh, dev, Pei, Andy, Roni Bar Yanai



From: Xu, Rosen <rosen.xu@intel.com>
> > -----Original Message-----
> > From: Thomas Monjalon <thomas@monjalon.net>
> > Sent: Thursday, January 09, 2020 16:41
> > To: Xu, Rosen <rosen.xu@intel.com>
> > Cc: Matan Azrad <matan@mellanox.com>; Maxime Coquelin
> > <maxime.coquelin@redhat.com>; Bie, Tiwei <tiwei.bie@intel.com>; Wang,
> > Zhihong <zhihong.wang@intel.com>; Wang, Xiao W
> > <xiao.w.wang@intel.com>; Yigit, Ferruh <ferruh.yigit@intel.com>;
> > dev@dpdk.org; Pei, Andy <andy.pei@intel.com>
> > Subject: Re: [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA
> > device drivers
> >
> > 09/01/2020 03:27, Xu, Rosen:
> > > Hi,
> > >
> > > From: Thomas Monjalon <thomas@monjalon.net>
> > > > 08/01/2020 13:39, Xu, Rosen:
> > > > > From: Matan Azrad <matan@mellanox.com>
> > > > > > From: Xu, Rosen
> > > > > > > Did you think about OVS DPDK?
> > > > > > > vDPA is a basic module for OVS; currently it still requires
> > > > > > > some exception-path packet processing in OVS, so it still
> > > > > > > needs to integrate with eth_dev.
> > > > > >
> > > > > > I don't understand your question.
> > > > > >
> > > > > > What do you mean by "integrate eth_dev"?
> > > > >
> > > > > My question is: in the OVS DPDK scenario the vDPA device
> > > > > implements eth_dev ops, so creating a new class and moving the
> > > > > ifc code to this new class is not OK.
> > > >
> > > > 1/ I don't understand the relation with OVS.
> > > >
> > > > 2/ no, vDPA device implements vDPA ops.
> > > > If it implements ethdev ops, it is an ethdev device.
> > > >
> > > > Please show an example of what you claim.
> > >
> > > Answers of 1 and 2.
> > >
> > > In OVS DPDK, each network device (such as NIC, vHost etc.) of DPDK
> > > needs to be implemented as rte_eth_dev and to provide eth_dev_ops
> > > such as packet TX/RX for OVS.
> >
> > No, OVS is also using the vhost API for vhost port.
> 
> Yes, vhost pmd is not a good example.
> 
> > > Take vHost (Virtio back end) for example; OVS starts up a vHost
> > > interface like this:
> > > ovs-vsctl add-port br0 vhost-user-1 -- set Interface vhost-user-1
> > > type=dpdkvhostuser
> > > drivers/net/vhost implements vHost as rte_eth_dev and is integrated
> > > in OVS.
> > > OVS can send/receive packets to/from the VM with rte_eth_tx_burst()
> > > and rte_eth_rx_burst(), which call the eth_dev_ops implementation of
> > > drivers/net/vhost.
> >
> > No, it is using rte_vhost_dequeue_burst() and
> > rte_vhost_enqueue_burst() which are not in ethdev.
> >
> > > vDPA is also a Virtio back end and works like vHost; same as vHost,
> > > it will be implemented as rte_eth_dev and also be integrated into OVS.
> >
> > No, vDPA is not "implemented as rte_eth_dev".
> 
> Currently, vDPA isn't integrated with OVS.
> 
> > > So, it's not ok to move ifc code from drivers/net.
> >
> > drivers/net/ifc has no ethdev implementation at all.
> 
> Since OVS hasn't integrated vDPA yet, it doesn't implement rte_eth_dev, but
> there are many discussions in the OVS community about vDPA, some from
> Mellanox; it seems a vDPA port will be implemented as an rte_eth_dev port in
> OVS in the near future.
> https://patchwork.ozlabs.org/patch/1178474/
> 
> Matan,
> Could you clarify how OVS integrates vDPA in the Mellanox patch?
> 
> >
> > Rosen, I'm sorry, these arguments look irrelevant, so I won't consider
> > them as blocking the integration of this patch.
> 
> What I mentioned is not blocking the integration of this patch; I just want
> to get clarification from Matan on how to integrate a vDPA port in OVS.


Hi

OVS, like any other application, should use the current vDPA API to attach a probed vDPA device to a vhost device.
See the example application in /examples/vdpa; a rough sketch of the flow is below (illustrative only, based on the current vhost/vDPA API; the helper name is mine, not taken from the example).
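
#include <rte_vhost.h>
#include <rte_vdpa.h>

static int
attach_vdpa_to_vhost(const char *socket_path, struct rte_vdpa_dev_addr *addr)
{
	int did;

	/* Find the device id of a vDPA device that was already probed. */
	did = rte_vdpa_find_device_id(addr);
	if (did < 0)
		return -1;

	/* Create the vhost-user socket and bind the vDPA device to it. */
	if (rte_vhost_driver_register(socket_path, 0) < 0)
		return -1;
	if (rte_vhost_driver_attach_vdpa_device(socket_path, did) < 0)
		return -1;

	/* The vDPA driver configures the HW datapath when QEMU connects. */
	return rte_vhost_driver_start(socket_path);
}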

Here we just introduce a new class to hold all the vDPA drivers; there is no change in the API.

As I understand, no vDPA device is currently integrated in OVS.

I think it can be integrated only when full offload is integrated, since the vDPA device forwards the traffic from the HW directly to the virtio queue. Once that is in place, I guess the offload will be configured via the representor of the vDPA device (VF), which is managed by an ethdev device.


Matan.

 


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dpdk-dev] [PATCH v2 3/3] drivers: move ifc driver to the vDPA class
  2020-01-09 11:00   ` [dpdk-dev] [PATCH v2 3/3] drivers: move ifc driver to the vDPA class Matan Azrad
@ 2020-01-09 17:25     ` Matan Azrad
  2020-01-10  1:55       ` Wang, Haiyue
  2020-01-13 22:57     ` Thomas Monjalon
  1 sibling, 1 reply; 50+ messages in thread
From: Matan Azrad @ 2020-01-09 17:25 UTC (permalink / raw)
  To: Matan Azrad, Maxime Coquelin, Tiwei Bie, Zhihong Wang, Xiao Wang
  Cc: Ferruh Yigit, dev, Thomas Monjalon, Andrew Rybchenko

Small typo inline.

From: Matan Azrad
> A new vDPA class was recently introduced.
> 
> IFC driver implements the vDPA operations, hence it should be moved to
> the vDPA class.
> 
> Move it.
> 
> Signed-off-by: Matan Azrad <matan@mellanox.com>
> ---
>  MAINTAINERS                              |   14 +-
>  doc/guides/nics/features/ifcvf.ini       |    8 -
>  doc/guides/nics/ifc.rst                  |  106 ---
>  doc/guides/nics/index.rst                |    1 -
>  doc/guides/vdpadevs/features/ifcvf.ini   |    8 +
>  doc/guides/vdpadevs/ifc.rst              |  106 +++
>  doc/guides/vdpadevs/index.rst            |    1 +
>  drivers/net/Makefile                     |    3 -
>  drivers/net/ifc/Makefile                 |   34 -
>  drivers/net/ifc/base/ifcvf.c             |  329 --------
>  drivers/net/ifc/base/ifcvf.h             |  162 ----
>  drivers/net/ifc/base/ifcvf_osdep.h       |   52 --
>  drivers/net/ifc/ifcvf_vdpa.c             | 1280 ------------------------------
>  drivers/net/ifc/meson.build              |    9 -
>  drivers/net/ifc/rte_pmd_ifc_version.map  |    3 -
>  drivers/net/meson.build                  |    1 -
>  drivers/vdpa/Makefile                    |    6 +
>  drivers/vdpa/ifc/Makefile                |   34 +
>  drivers/vdpa/ifc/base/ifcvf.c            |  329 ++++++++
>  drivers/vdpa/ifc/base/ifcvf.h            |  162 ++++
>  drivers/vdpa/ifc/base/ifcvf_osdep.h      |   52 ++
>  drivers/vdpa/ifc/ifcvf_vdpa.c            | 1280 ++++++++++++++++++++++++++++++
>  drivers/vdpa/ifc/meson.build             |    9 +
>  drivers/vdpa/ifc/rte_pmd_ifc_version.map |    3 +
>  drivers/vdpa/meson.build                 |    2 +-
>  25 files changed, 1997 insertions(+), 1997 deletions(-)
>  delete mode 100644 doc/guides/nics/features/ifcvf.ini
>  delete mode 100644 doc/guides/nics/ifc.rst
>  create mode 100644 doc/guides/vdpadevs/features/ifcvf.ini
>  create mode 100644 doc/guides/vdpadevs/ifc.rst
>  delete mode 100644 drivers/net/ifc/Makefile
>  delete mode 100644 drivers/net/ifc/base/ifcvf.c
>  delete mode 100644 drivers/net/ifc/base/ifcvf.h
>  delete mode 100644 drivers/net/ifc/base/ifcvf_osdep.h
>  delete mode 100644 drivers/net/ifc/ifcvf_vdpa.c
>  delete mode 100644 drivers/net/ifc/meson.build
>  delete mode 100644 drivers/net/ifc/rte_pmd_ifc_version.map
>  create mode 100644 drivers/vdpa/ifc/Makefile
>  create mode 100644 drivers/vdpa/ifc/base/ifcvf.c
>  create mode 100644 drivers/vdpa/ifc/base/ifcvf.h
>  create mode 100644 drivers/vdpa/ifc/base/ifcvf_osdep.h
>  create mode 100644 drivers/vdpa/ifc/ifcvf_vdpa.c
>  create mode 100644 drivers/vdpa/ifc/meson.build
>  create mode 100644 drivers/vdpa/ifc/rte_pmd_ifc_version.map
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 17c2df7..16facba 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -679,14 +679,6 @@ T: git://dpdk.org/next/dpdk-next-net-intel
>  F: drivers/net/iavf/
>  F: doc/guides/nics/features/iavf*.ini
> 
> -Intel ifc
> -M: Xiao Wang <xiao.w.wang@intel.com>
> -T: git://dpdk.org/next/dpdk-next-net-intel
> -F: drivers/net/ifc/
> -F: doc/guides/nics/ifc.rst
> -F: doc/guides/nics/features/ifc*.ini
> -
> -Intel ice

Removing this line is a typo.
It will be fixed in the next version if needed, or during integration.

>  M: Qiming Yang <qiming.yang@intel.com>
>  M: Wenzhuo Lu <wenzhuo.lu@intel.com>
>  T: git://dpdk.org/next/dpdk-next-net-intel
> @@ -1093,6 +1085,12 @@ vDPA Drivers
>  ------------
>  T: git://dpdk.org/next/dpdk-next-virtio
> 
> +Intel ifc
> +M: Xiao Wang <xiao.w.wang@intel.com>
> +F: drivers/vdpa/ifc/
> +F: doc/guides/vdpadevs/ifc.rst
> +F: doc/guides/vdpadevs/features/ifcvf.ini
> +
> 
>  Eventdev Drivers
>  ----------------
> diff --git a/doc/guides/nics/features/ifcvf.ini
> b/doc/guides/nics/features/ifcvf.ini
> deleted file mode 100644
> index ef1fc47..0000000
> --- a/doc/guides/nics/features/ifcvf.ini
> +++ /dev/null
> @@ -1,8 +0,0 @@
> -;
> -; Supported features of the 'ifcvf' vDPA driver.
> -;
> -; Refer to default.ini for the full list of available PMD features.
> -;
> -[Features]
> -x86-32               = Y
> -x86-64               = Y
> diff --git a/doc/guides/nics/ifc.rst b/doc/guides/nics/ifc.rst
> deleted file mode 100644
> index 12a2a34..0000000
> --- a/doc/guides/nics/ifc.rst
> +++ /dev/null
> @@ -1,106 +0,0 @@
> -..  SPDX-License-Identifier: BSD-3-Clause
> -    Copyright(c) 2018 Intel Corporation.
> -
> -IFCVF vDPA driver
> -=================
> -
> -The IFCVF vDPA (vhost data path acceleration) driver provides support for
> the
> -Intel FPGA 100G VF (IFCVF). IFCVF's datapath is virtio ring compatible, it
> -works as a HW vhost backend which can send/receive packets to/from virtio
> -directly by DMA. Besides, it supports dirty page logging and device state
> -report/restore, this driver enables its vDPA functionality.
> -
> -
> -Pre-Installation Configuration
> -------------------------------
> -
> -Config File Options
> -~~~~~~~~~~~~~~~~~~~
> -
> -The following option can be modified in the ``config`` file.
> -
> -- ``CONFIG_RTE_LIBRTE_IFC_PMD`` (default ``y`` for linux)
> -
> -  Toggle compilation of the ``librte_pmd_ifc`` driver.
> -
> -
> -IFCVF vDPA Implementation
> --------------------------
> -
> -IFCVF's vendor ID and device ID are same as that of virtio net pci device,
> -with its specific subsystem vendor ID and device ID. To let the device be
> -probed by IFCVF driver, adding "vdpa=1" parameter helps to specify that
> this
> -device is to be used in vDPA mode, rather than polling mode, virtio pmd will
> -skip when it detects this message. If no this parameter specified, device
> -will not be used as a vDPA device, and it will be driven by virtio pmd.
> -
> -Different VF devices serve different virtio frontends which are in different
> -VMs, so each VF needs to have its own DMA address translation service.
> During
> -the driver probe a new container is created for this device, with this
> -container vDPA driver can program DMA remapping table with the VM's
> memory
> -region information.
> -
> -The device argument "sw-live-migration=1" will configure the driver into SW
> -assisted live migration mode. In this mode, the driver will set up a SW relay
> -thread when LM happens, this thread will help device to log dirty pages.
> Thus
> -this mode does not require HW to implement a dirty page logging function
> block,
> -but will consume some percentage of CPU resource depending on the
> network
> -throughput. If no this parameter specified, driver will rely on device's logging
> -capability.
> -
> -Key IFCVF vDPA driver ops
> -~~~~~~~~~~~~~~~~~~~~~~~~~
> -
> -- ifcvf_dev_config:
> -  Enable VF data path with virtio information provided by vhost lib, including
> -  IOMMU programming to enable VF DMA to VM's memory, VFIO interrupt
> setup to
> -  route HW interrupt to virtio driver, create notify relay thread to translate
> -  virtio driver's kick to a MMIO write onto HW, HW queues configuration.
> -
> -  This function gets called to set up HW data path backend when virtio driver
> -  in VM gets ready.
> -
> -- ifcvf_dev_close:
> -  Revoke all the setup in ifcvf_dev_config.
> -
> -  This function gets called when virtio driver stops device in VM.
> -
> -To create a vhost port with IFC VF
> -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> -
> -- Create a vhost socket and assign a VF's device ID to this socket via
> -  vhost API. When QEMU vhost connection gets ready, the assigned VF will
> -  get configured automatically.
> -
> -
> -Features
> ---------
> -
> -Features of the IFCVF driver are:
> -
> -- Compatibility with virtio 0.95 and 1.0.
> -- SW assisted vDPA live migration.
> -
> -
> -Prerequisites
> --------------
> -
> -- Platform with IOMMU feature. IFC VF needs address translation service to
> -  Rx/Tx directly with virtio driver in VM.
> -
> -
> -Limitations
> ------------
> -
> -Dependency on vfio-pci
> -~~~~~~~~~~~~~~~~~~~~~~
> -
> -vDPA driver needs to setup VF MSIX interrupts, each queue's interrupt
> vector
> -is mapped to a callfd associated with a virtio ring. Currently only vfio-pci
> -allows multiple interrupts, so the IFCVF driver is dependent on vfio-pci.
> -
> -Live Migration with VIRTIO_NET_F_GUEST_ANNOUNCE
> -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> -
> -IFC VF doesn't support RARP packet generation, virtio frontend supporting
> -VIRTIO_NET_F_GUEST_ANNOUNCE feature can help to do that.
> diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst
> index d61c27f..8c540c0 100644
> --- a/doc/guides/nics/index.rst
> +++ b/doc/guides/nics/index.rst
> @@ -31,7 +31,6 @@ Network Interface Controller Drivers
>      hns3
>      i40e
>      ice
> -    ifc
>      igb
>      ipn3ke
>      ixgbe
> diff --git a/doc/guides/vdpadevs/features/ifcvf.ini
> b/doc/guides/vdpadevs/features/ifcvf.ini
> new file mode 100644
> index 0000000..ef1fc47
> --- /dev/null
> +++ b/doc/guides/vdpadevs/features/ifcvf.ini
> @@ -0,0 +1,8 @@
> +;
> +; Supported features of the 'ifcvf' vDPA driver.
> +;
> +; Refer to default.ini for the full list of available PMD features.
> +;
> +[Features]
> +x86-32               = Y
> +x86-64               = Y
> diff --git a/doc/guides/vdpadevs/ifc.rst b/doc/guides/vdpadevs/ifc.rst
> new file mode 100644
> index 0000000..12a2a34
> --- /dev/null
> +++ b/doc/guides/vdpadevs/ifc.rst
> @@ -0,0 +1,106 @@
> +..  SPDX-License-Identifier: BSD-3-Clause
> +    Copyright(c) 2018 Intel Corporation.
> +
> +IFCVF vDPA driver
> +=================
> +
> +The IFCVF vDPA (vhost data path acceleration) driver provides support for
> the
> +Intel FPGA 100G VF (IFCVF). IFCVF's datapath is virtio ring compatible, it
> +works as a HW vhost backend which can send/receive packets to/from
> virtio
> +directly by DMA. Besides, it supports dirty page logging and device state
> +report/restore, this driver enables its vDPA functionality.
> +
> +
> +Pre-Installation Configuration
> +------------------------------
> +
> +Config File Options
> +~~~~~~~~~~~~~~~~~~~
> +
> +The following option can be modified in the ``config`` file.
> +
> +- ``CONFIG_RTE_LIBRTE_IFC_PMD`` (default ``y`` for linux)
> +
> +  Toggle compilation of the ``librte_pmd_ifc`` driver.
> +
> +
> +IFCVF vDPA Implementation
> +-------------------------
> +
> +IFCVF's vendor ID and device ID are same as that of virtio net pci device,
> +with its specific subsystem vendor ID and device ID. To let the device be
> +probed by IFCVF driver, adding "vdpa=1" parameter helps to specify that
> this
> +device is to be used in vDPA mode, rather than polling mode, virtio pmd will
> +skip when it detects this message. If no this parameter specified, device
> +will not be used as a vDPA device, and it will be driven by virtio pmd.
> +
> +Different VF devices serve different virtio frontends which are in different
> +VMs, so each VF needs to have its own DMA address translation service.
> During
> +the driver probe a new container is created for this device, with this
> +container vDPA driver can program DMA remapping table with the VM's
> memory
> +region information.
> +
> +The device argument "sw-live-migration=1" will configure the driver into SW
> +assisted live migration mode. In this mode, the driver will set up a SW relay
> +thread when LM happens, this thread will help device to log dirty pages.
> Thus
> +this mode does not require HW to implement a dirty page logging function
> block,
> +but will consume some percentage of CPU resource depending on the
> network
> +throughput. If no this parameter specified, driver will rely on device's
> logging
> +capability.
> +
> +Key IFCVF vDPA driver ops
> +~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +- ifcvf_dev_config:
> +  Enable VF data path with virtio information provided by vhost lib, including
> +  IOMMU programming to enable VF DMA to VM's memory, VFIO interrupt
> setup to
> +  route HW interrupt to virtio driver, create notify relay thread to translate
> +  virtio driver's kick to a MMIO write onto HW, HW queues configuration.
> +
> +  This function gets called to set up HW data path backend when virtio driver
> +  in VM gets ready.
> +
> +- ifcvf_dev_close:
> +  Revoke all the setup in ifcvf_dev_config.
> +
> +  This function gets called when virtio driver stops device in VM.
> +
> +To create a vhost port with IFC VF
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +- Create a vhost socket and assign a VF's device ID to this socket via
> +  vhost API. When QEMU vhost connection gets ready, the assigned VF will
> +  get configured automatically.
> +
> +
> +Features
> +--------
> +
> +Features of the IFCVF driver are:
> +
> +- Compatibility with virtio 0.95 and 1.0.
> +- SW assisted vDPA live migration.
> +
> +
> +Prerequisites
> +-------------
> +
> +- Platform with IOMMU feature. IFC VF needs address translation service to
> +  Rx/Tx directly with virtio driver in VM.
> +
> +
> +Limitations
> +-----------
> +
> +Dependency on vfio-pci
> +~~~~~~~~~~~~~~~~~~~~~~
> +
> +vDPA driver needs to setup VF MSIX interrupts, each queue's interrupt
> vector
> +is mapped to a callfd associated with a virtio ring. Currently only vfio-pci
> +allows multiple interrupts, so the IFCVF driver is dependent on vfio-pci.
> +
> +Live Migration with VIRTIO_NET_F_GUEST_ANNOUNCE
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +IFC VF doesn't support RARP packet generation, virtio frontend supporting
> +VIRTIO_NET_F_GUEST_ANNOUNCE feature can help to do that.
> diff --git a/doc/guides/vdpadevs/index.rst b/doc/guides/vdpadevs/index.rst
> index 89e2b03..6cf0827 100644
> --- a/doc/guides/vdpadevs/index.rst
> +++ b/doc/guides/vdpadevs/index.rst
> @@ -12,3 +12,4 @@ which can be used from an application through vhost
> API.
>      :numbered:
> 
>      features_overview
> +    ifc
> diff --git a/drivers/net/Makefile b/drivers/net/Makefile
> index cee3036..cca3c44 100644
> --- a/drivers/net/Makefile
> +++ b/drivers/net/Makefile
> @@ -71,9 +71,6 @@ endif # $(CONFIG_RTE_LIBRTE_SCHED)
> 
>  ifeq ($(CONFIG_RTE_LIBRTE_VHOST),y)
>  DIRS-$(CONFIG_RTE_LIBRTE_PMD_VHOST) += vhost
> -ifeq ($(CONFIG_RTE_EAL_VFIO),y)
> -DIRS-$(CONFIG_RTE_LIBRTE_IFC_PMD) += ifc
> -endif
>  endif # $(CONFIG_RTE_LIBRTE_VHOST)
> 
>  ifeq ($(CONFIG_RTE_LIBRTE_MVPP2_PMD),y)
> diff --git a/drivers/net/ifc/Makefile b/drivers/net/ifc/Makefile
> deleted file mode 100644
> index fe227b8..0000000
> --- a/drivers/net/ifc/Makefile
> +++ /dev/null
> @@ -1,34 +0,0 @@
> -# SPDX-License-Identifier: BSD-3-Clause
> -# Copyright(c) 2018 Intel Corporation
> -
> -include $(RTE_SDK)/mk/rte.vars.mk
> -
> -#
> -# library name
> -#
> -LIB = librte_pmd_ifc.a
> -
> -LDLIBS += -lpthread
> -LDLIBS += -lrte_eal -lrte_pci -lrte_vhost -lrte_bus_pci
> -LDLIBS += -lrte_kvargs
> -
> -CFLAGS += -O3
> -CFLAGS += $(WERROR_FLAGS)
> -CFLAGS += -DALLOW_EXPERIMENTAL_API
> -
> -#
> -# Add extra flags for base driver source files to disable warnings in them
> -#
> -BASE_DRIVER_OBJS=$(sort $(patsubst %.c,%.o,$(notdir $(wildcard
> $(SRCDIR)/base/*.c))))
> -
> -VPATH += $(SRCDIR)/base
> -
> -EXPORT_MAP := rte_pmd_ifc_version.map
> -
> -#
> -# all source are stored in SRCS-y
> -#
> -SRCS-$(CONFIG_RTE_LIBRTE_IFC_PMD) += ifcvf_vdpa.c
> -SRCS-$(CONFIG_RTE_LIBRTE_IFC_PMD) += ifcvf.c
> -
> -include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/drivers/net/ifc/base/ifcvf.c b/drivers/net/ifc/base/ifcvf.c
> deleted file mode 100644
> index 3c0b2df..0000000
> --- a/drivers/net/ifc/base/ifcvf.c
> +++ /dev/null
> @@ -1,329 +0,0 @@
> -/* SPDX-License-Identifier: BSD-3-Clause
> - * Copyright(c) 2018 Intel Corporation
> - */
> -
> -#include "ifcvf.h"
> -#include "ifcvf_osdep.h"
> -
> -STATIC void *
> -get_cap_addr(struct ifcvf_hw *hw, struct ifcvf_pci_cap *cap)
> -{
> -	u8 bar = cap->bar;
> -	u32 length = cap->length;
> -	u32 offset = cap->offset;
> -
> -	if (bar > IFCVF_PCI_MAX_RESOURCE - 1) {
> -		DEBUGOUT("invalid bar: %u\n", bar);
> -		return NULL;
> -	}
> -
> -	if (offset + length < offset) {
> -		DEBUGOUT("offset(%u) + length(%u) overflows\n",
> -			offset, length);
> -		return NULL;
> -	}
> -
> -	if (offset + length > hw->mem_resource[cap->bar].len) {
> -		DEBUGOUT("offset(%u) + length(%u) overflows bar
> length(%u)",
> -			offset, length, (u32)hw->mem_resource[cap-
> >bar].len);
> -		return NULL;
> -	}
> -
> -	return hw->mem_resource[bar].addr + offset;
> -}
> -
> -int
> -ifcvf_init_hw(struct ifcvf_hw *hw, PCI_DEV *dev)
> -{
> -	int ret;
> -	u8 pos;
> -	struct ifcvf_pci_cap cap;
> -
> -	ret = PCI_READ_CONFIG_BYTE(dev, &pos, PCI_CAPABILITY_LIST);
> -	if (ret < 0) {
> -		DEBUGOUT("failed to read pci capability list\n");
> -		return -1;
> -	}
> -
> -	while (pos) {
> -		ret = PCI_READ_CONFIG_RANGE(dev, (u32 *)&cap,
> -				sizeof(cap), pos);
> -		if (ret < 0) {
> -			DEBUGOUT("failed to read cap at pos: %x", pos);
> -			break;
> -		}
> -
> -		if (cap.cap_vndr != PCI_CAP_ID_VNDR)
> -			goto next;
> -
> -		DEBUGOUT("cfg type: %u, bar: %u, offset: %u, "
> -				"len: %u\n", cap.cfg_type, cap.bar,
> -				cap.offset, cap.length);
> -
> -		switch (cap.cfg_type) {
> -		case IFCVF_PCI_CAP_COMMON_CFG:
> -			hw->common_cfg = get_cap_addr(hw, &cap);
> -			break;
> -		case IFCVF_PCI_CAP_NOTIFY_CFG:
> -			PCI_READ_CONFIG_DWORD(dev, &hw-
> >notify_off_multiplier,
> -					pos + sizeof(cap));
> -			hw->notify_base = get_cap_addr(hw, &cap);
> -			hw->notify_region = cap.bar;
> -			break;
> -		case IFCVF_PCI_CAP_ISR_CFG:
> -			hw->isr = get_cap_addr(hw, &cap);
> -			break;
> -		case IFCVF_PCI_CAP_DEVICE_CFG:
> -			hw->dev_cfg = get_cap_addr(hw, &cap);
> -			break;
> -		}
> -next:
> -		pos = cap.cap_next;
> -	}
> -
> -	hw->lm_cfg = hw->mem_resource[4].addr;
> -
> -	if (hw->common_cfg == NULL || hw->notify_base == NULL ||
> -			hw->isr == NULL || hw->dev_cfg == NULL) {
> -		DEBUGOUT("capability incomplete\n");
> -		return -1;
> -	}
> -
> -	DEBUGOUT("capability mapping:\ncommon cfg: %p\n"
> -			"notify base: %p\nisr cfg: %p\ndevice cfg: %p\n"
> -			"multiplier: %u\n",
> -			hw->common_cfg, hw->dev_cfg,
> -			hw->isr, hw->notify_base,
> -			hw->notify_off_multiplier);
> -
> -	return 0;
> -}
> -
> -STATIC u8
> -ifcvf_get_status(struct ifcvf_hw *hw)
> -{
> -	return IFCVF_READ_REG8(&hw->common_cfg->device_status);
> -}
> -
> -STATIC void
> -ifcvf_set_status(struct ifcvf_hw *hw, u8 status)
> -{
> -	IFCVF_WRITE_REG8(status, &hw->common_cfg->device_status);
> -}
> -
> -STATIC void
> -ifcvf_reset(struct ifcvf_hw *hw)
> -{
> -	ifcvf_set_status(hw, 0);
> -
> -	/* flush status write */
> -	while (ifcvf_get_status(hw))
> -		msec_delay(1);
> -}
> -
> -STATIC void
> -ifcvf_add_status(struct ifcvf_hw *hw, u8 status)
> -{
> -	if (status != 0)
> -		status |= ifcvf_get_status(hw);
> -
> -	ifcvf_set_status(hw, status);
> -	ifcvf_get_status(hw);
> -}
> -
> -u64
> -ifcvf_get_features(struct ifcvf_hw *hw)
> -{
> -	u32 features_lo, features_hi;
> -	struct ifcvf_pci_common_cfg *cfg = hw->common_cfg;
> -
> -	IFCVF_WRITE_REG32(0, &cfg->device_feature_select);
> -	features_lo = IFCVF_READ_REG32(&cfg->device_feature);
> -
> -	IFCVF_WRITE_REG32(1, &cfg->device_feature_select);
> -	features_hi = IFCVF_READ_REG32(&cfg->device_feature);
> -
> -	return ((u64)features_hi << 32) | features_lo;
> -}
> -
> -STATIC void
> -ifcvf_set_features(struct ifcvf_hw *hw, u64 features)
> -{
> -	struct ifcvf_pci_common_cfg *cfg = hw->common_cfg;
> -
> -	IFCVF_WRITE_REG32(0, &cfg->guest_feature_select);
> -	IFCVF_WRITE_REG32(features & ((1ULL << 32) - 1), &cfg-
> >guest_feature);
> -
> -	IFCVF_WRITE_REG32(1, &cfg->guest_feature_select);
> -	IFCVF_WRITE_REG32(features >> 32, &cfg->guest_feature);
> -}
> -
> -STATIC int
> -ifcvf_config_features(struct ifcvf_hw *hw)
> -{
> -	u64 host_features;
> -
> -	host_features = ifcvf_get_features(hw);
> -	hw->req_features &= host_features;
> -
> -	ifcvf_set_features(hw, hw->req_features);
> -	ifcvf_add_status(hw, IFCVF_CONFIG_STATUS_FEATURES_OK);
> -
> -	if (!(ifcvf_get_status(hw) &
> IFCVF_CONFIG_STATUS_FEATURES_OK)) {
> -		DEBUGOUT("failed to set FEATURES_OK status\n");
> -		return -1;
> -	}
> -
> -	return 0;
> -}
> -
> -STATIC void
> -io_write64_twopart(u64 val, u32 *lo, u32 *hi)
> -{
> -	IFCVF_WRITE_REG32(val & ((1ULL << 32) - 1), lo);
> -	IFCVF_WRITE_REG32(val >> 32, hi);
> -}
> -
> -STATIC int
> -ifcvf_hw_enable(struct ifcvf_hw *hw)
> -{
> -	struct ifcvf_pci_common_cfg *cfg;
> -	u8 *lm_cfg;
> -	u32 i;
> -	u16 notify_off;
> -
> -	cfg = hw->common_cfg;
> -	lm_cfg = hw->lm_cfg;
> -
> -	IFCVF_WRITE_REG16(0, &cfg->msix_config);
> -	if (IFCVF_READ_REG16(&cfg->msix_config) ==
> IFCVF_MSI_NO_VECTOR) {
> -		DEBUGOUT("msix vec alloc failed for device config\n");
> -		return -1;
> -	}
> -
> -	for (i = 0; i < hw->nr_vring; i++) {
> -		IFCVF_WRITE_REG16(i, &cfg->queue_select);
> -		io_write64_twopart(hw->vring[i].desc, &cfg-
> >queue_desc_lo,
> -				&cfg->queue_desc_hi);
> -		io_write64_twopart(hw->vring[i].avail, &cfg-
> >queue_avail_lo,
> -				&cfg->queue_avail_hi);
> -		io_write64_twopart(hw->vring[i].used, &cfg-
> >queue_used_lo,
> -				&cfg->queue_used_hi);
> -		IFCVF_WRITE_REG16(hw->vring[i].size, &cfg->queue_size);
> -
> -		*(u32 *)(lm_cfg + IFCVF_LM_RING_STATE_OFFSET +
> -				(i / 2) * IFCVF_LM_CFG_SIZE + (i % 2) * 4) =
> -			(u32)hw->vring[i].last_avail_idx |
> -			((u32)hw->vring[i].last_used_idx << 16);
> -
> -		IFCVF_WRITE_REG16(i + 1, &cfg->queue_msix_vector);
> -		if (IFCVF_READ_REG16(&cfg->queue_msix_vector) ==
> -				IFCVF_MSI_NO_VECTOR) {
> -			DEBUGOUT("queue %u, msix vec alloc failed\n",
> -					i);
> -			return -1;
> -		}
> -
> -		notify_off = IFCVF_READ_REG16(&cfg->queue_notify_off);
> -		hw->notify_addr[i] = (void *)((u8 *)hw->notify_base +
> -				notify_off * hw->notify_off_multiplier);
> -		IFCVF_WRITE_REG16(1, &cfg->queue_enable);
> -	}
> -
> -	return 0;
> -}
> -
> -STATIC void
> -ifcvf_hw_disable(struct ifcvf_hw *hw)
> -{
> -	u32 i;
> -	struct ifcvf_pci_common_cfg *cfg;
> -	u32 ring_state;
> -
> -	cfg = hw->common_cfg;
> -
> -	IFCVF_WRITE_REG16(IFCVF_MSI_NO_VECTOR, &cfg->msix_config);
> -	for (i = 0; i < hw->nr_vring; i++) {
> -		IFCVF_WRITE_REG16(i, &cfg->queue_select);
> -		IFCVF_WRITE_REG16(0, &cfg->queue_enable);
> -		IFCVF_WRITE_REG16(IFCVF_MSI_NO_VECTOR, &cfg-
> >queue_msix_vector);
> -		ring_state = *(u32 *)(hw->lm_cfg +
> IFCVF_LM_RING_STATE_OFFSET +
> -				(i / 2) * IFCVF_LM_CFG_SIZE + (i % 2) * 4);
> -		hw->vring[i].last_avail_idx = (u16)(ring_state >> 16);
> -		hw->vring[i].last_used_idx = (u16)(ring_state >> 16);
> -	}
> -}
> -
> -int
> -ifcvf_start_hw(struct ifcvf_hw *hw)
> -{
> -	ifcvf_reset(hw);
> -	ifcvf_add_status(hw, IFCVF_CONFIG_STATUS_ACK);
> -	ifcvf_add_status(hw, IFCVF_CONFIG_STATUS_DRIVER);
> -
> -	if (ifcvf_config_features(hw) < 0)
> -		return -1;
> -
> -	if (ifcvf_hw_enable(hw) < 0)
> -		return -1;
> -
> -	ifcvf_add_status(hw, IFCVF_CONFIG_STATUS_DRIVER_OK);
> -	return 0;
> -}
> -
> -void
> -ifcvf_stop_hw(struct ifcvf_hw *hw)
> -{
> -	ifcvf_hw_disable(hw);
> -	ifcvf_reset(hw);
> -}
> -
> -void
> -ifcvf_enable_logging(struct ifcvf_hw *hw, u64 log_base, u64 log_size)
> -{
> -	u8 *lm_cfg;
> -
> -	lm_cfg = hw->lm_cfg;
> -
> -	*(u32 *)(lm_cfg + IFCVF_LM_BASE_ADDR_LOW) =
> -		log_base & IFCVF_32_BIT_MASK;
> -
> -	*(u32 *)(lm_cfg + IFCVF_LM_BASE_ADDR_HIGH) =
> -		(log_base >> 32) & IFCVF_32_BIT_MASK;
> -
> -	*(u32 *)(lm_cfg + IFCVF_LM_END_ADDR_LOW) =
> -		(log_base + log_size) & IFCVF_32_BIT_MASK;
> -
> -	*(u32 *)(lm_cfg + IFCVF_LM_END_ADDR_HIGH) =
> -		((log_base + log_size) >> 32) & IFCVF_32_BIT_MASK;
> -
> -	*(u32 *)(lm_cfg + IFCVF_LM_LOGGING_CTRL) =
> IFCVF_LM_ENABLE_VF;
> -}
> -
> -void
> -ifcvf_disable_logging(struct ifcvf_hw *hw)
> -{
> -	u8 *lm_cfg;
> -
> -	lm_cfg = hw->lm_cfg;
> -	*(u32 *)(lm_cfg + IFCVF_LM_LOGGING_CTRL) = IFCVF_LM_DISABLE;
> -}
> -
> -void
> -ifcvf_notify_queue(struct ifcvf_hw *hw, u16 qid)
> -{
> -	IFCVF_WRITE_REG16(qid, hw->notify_addr[qid]);
> -}
> -
> -u8
> -ifcvf_get_notify_region(struct ifcvf_hw *hw)
> -{
> -	return hw->notify_region;
> -}
> -
> -u64
> -ifcvf_get_queue_notify_off(struct ifcvf_hw *hw, int qid)
> -{
> -	return (u8 *)hw->notify_addr[qid] -
> -		(u8 *)hw->mem_resource[hw->notify_region].addr;
> -}
> diff --git a/drivers/net/ifc/base/ifcvf.h b/drivers/net/ifc/base/ifcvf.h
> deleted file mode 100644
> index 9be2770..0000000
> --- a/drivers/net/ifc/base/ifcvf.h
> +++ /dev/null
> @@ -1,162 +0,0 @@
> -/* SPDX-License-Identifier: BSD-3-Clause
> - * Copyright(c) 2018 Intel Corporation
> - */
> -
> -#ifndef _IFCVF_H_
> -#define _IFCVF_H_
> -
> -#include "ifcvf_osdep.h"
> -
> -#define IFCVF_VENDOR_ID		0x1AF4
> -#define IFCVF_DEVICE_ID		0x1041
> -#define IFCVF_SUBSYS_VENDOR_ID	0x8086
> -#define IFCVF_SUBSYS_DEVICE_ID	0x001A
> -
> -#define IFCVF_MAX_QUEUES		1
> -#define VIRTIO_F_IOMMU_PLATFORM		33
> -
> -/* Common configuration */
> -#define IFCVF_PCI_CAP_COMMON_CFG	1
> -/* Notifications */
> -#define IFCVF_PCI_CAP_NOTIFY_CFG	2
> -/* ISR Status */
> -#define IFCVF_PCI_CAP_ISR_CFG		3
> -/* Device specific configuration */
> -#define IFCVF_PCI_CAP_DEVICE_CFG	4
> -/* PCI configuration access */
> -#define IFCVF_PCI_CAP_PCI_CFG		5
> -
> -#define IFCVF_CONFIG_STATUS_RESET     0x00
> -#define IFCVF_CONFIG_STATUS_ACK       0x01
> -#define IFCVF_CONFIG_STATUS_DRIVER    0x02
> -#define IFCVF_CONFIG_STATUS_DRIVER_OK 0x04
> -#define IFCVF_CONFIG_STATUS_FEATURES_OK 0x08
> -#define IFCVF_CONFIG_STATUS_FAILED    0x80
> -
> -#define IFCVF_MSI_NO_VECTOR	0xffff
> -#define IFCVF_PCI_MAX_RESOURCE	6
> -
> -#define IFCVF_LM_CFG_SIZE		0x40
> -#define IFCVF_LM_RING_STATE_OFFSET	0x20
> -
> -#define IFCVF_LM_LOGGING_CTRL		0x0
> -
> -#define IFCVF_LM_BASE_ADDR_LOW		0x10
> -#define IFCVF_LM_BASE_ADDR_HIGH		0x14
> -#define IFCVF_LM_END_ADDR_LOW		0x18
> -#define IFCVF_LM_END_ADDR_HIGH		0x1c
> -
> -#define IFCVF_LM_DISABLE		0x0
> -#define IFCVF_LM_ENABLE_VF		0x1
> -#define IFCVF_LM_ENABLE_PF		0x3
> -#define IFCVF_LOG_BASE			0x100000000000
> -#define IFCVF_MEDIATED_VRING		0x200000000000
> -
> -#define IFCVF_32_BIT_MASK		0xffffffff
> -
> -
> -struct ifcvf_pci_cap {
> -	u8 cap_vndr;            /* Generic PCI field: PCI_CAP_ID_VNDR */
> -	u8 cap_next;            /* Generic PCI field: next ptr. */
> -	u8 cap_len;             /* Generic PCI field: capability length */
> -	u8 cfg_type;            /* Identifies the structure. */
> -	u8 bar;                 /* Where to find it. */
> -	u8 padding[3];          /* Pad to full dword. */
> -	u32 offset;             /* Offset within bar. */
> -	u32 length;             /* Length of the structure, in bytes. */
> -};
> -
> -struct ifcvf_pci_notify_cap {
> -	struct ifcvf_pci_cap cap;
> -	u32 notify_off_multiplier;  /* Multiplier for queue_notify_off. */
> -};
> -
> -struct ifcvf_pci_common_cfg {
> -	/* About the whole device. */
> -	u32 device_feature_select;
> -	u32 device_feature;
> -	u32 guest_feature_select;
> -	u32 guest_feature;
> -	u16 msix_config;
> -	u16 num_queues;
> -	u8 device_status;
> -	u8 config_generation;
> -
> -	/* About a specific virtqueue. */
> -	u16 queue_select;
> -	u16 queue_size;
> -	u16 queue_msix_vector;
> -	u16 queue_enable;
> -	u16 queue_notify_off;
> -	u32 queue_desc_lo;
> -	u32 queue_desc_hi;
> -	u32 queue_avail_lo;
> -	u32 queue_avail_hi;
> -	u32 queue_used_lo;
> -	u32 queue_used_hi;
> -};
> -
> -struct ifcvf_net_config {
> -	u8    mac[6];
> -	u16   status;
> -	u16   max_virtqueue_pairs;
> -} __attribute__((packed));
> -
> -struct ifcvf_pci_mem_resource {
> -	u64      phys_addr; /**< Physical address, 0 if not resource. */
> -	u64      len;       /**< Length of the resource. */
> -	u8       *addr;     /**< Virtual address, NULL when not mapped. */
> -};
> -
> -struct vring_info {
> -	u64 desc;
> -	u64 avail;
> -	u64 used;
> -	u16 size;
> -	u16 last_avail_idx;
> -	u16 last_used_idx;
> -};
> -
> -struct ifcvf_hw {
> -	u64    req_features;
> -	u8     notify_region;
> -	u32    notify_off_multiplier;
> -	struct ifcvf_pci_common_cfg *common_cfg;
> -	struct ifcvf_net_config *dev_cfg;
> -	u8     *isr;
> -	u16    *notify_base;
> -	u16    *notify_addr[IFCVF_MAX_QUEUES * 2];
> -	u8     *lm_cfg;
> -	struct vring_info vring[IFCVF_MAX_QUEUES * 2];
> -	u8 nr_vring;
> -	struct ifcvf_pci_mem_resource
> mem_resource[IFCVF_PCI_MAX_RESOURCE];
> -};
> -
> -int
> -ifcvf_init_hw(struct ifcvf_hw *hw, PCI_DEV *dev);
> -
> -u64
> -ifcvf_get_features(struct ifcvf_hw *hw);
> -
> -int
> -ifcvf_start_hw(struct ifcvf_hw *hw);
> -
> -void
> -ifcvf_stop_hw(struct ifcvf_hw *hw);
> -
> -void
> -ifcvf_enable_logging(struct ifcvf_hw *hw, u64 log_base, u64 log_size);
> -
> -void
> -ifcvf_disable_logging(struct ifcvf_hw *hw);
> -
> -void
> -ifcvf_notify_queue(struct ifcvf_hw *hw, u16 qid);
> -
> -u8
> -ifcvf_get_notify_region(struct ifcvf_hw *hw);
> -
> -u64
> -ifcvf_get_queue_notify_off(struct ifcvf_hw *hw, int qid);
> -
> -#endif /* _IFCVF_H_ */
> diff --git a/drivers/net/ifc/base/ifcvf_osdep.h
> b/drivers/net/ifc/base/ifcvf_osdep.h
> deleted file mode 100644
> index 6aef25e..0000000
> --- a/drivers/net/ifc/base/ifcvf_osdep.h
> +++ /dev/null
> @@ -1,52 +0,0 @@
> -/* SPDX-License-Identifier: BSD-3-Clause
> - * Copyright(c) 2018 Intel Corporation
> - */
> -
> -#ifndef _IFCVF_OSDEP_H_
> -#define _IFCVF_OSDEP_H_
> -
> -#include <stdint.h>
> -#include <linux/pci_regs.h>
> -
> -#include <rte_cycles.h>
> -#include <rte_pci.h>
> -#include <rte_bus_pci.h>
> -#include <rte_log.h>
> -#include <rte_io.h>
> -
> -#define DEBUGOUT(S, args...)    RTE_LOG(DEBUG, PMD, S, ##args)
> -#define STATIC                  static
> -
> -#define msec_delay(x)	rte_delay_us_sleep(1000 * (x))
> -
> -#define IFCVF_READ_REG8(reg)		rte_read8(reg)
> -#define IFCVF_WRITE_REG8(val, reg)	rte_write8((val), (reg))
> -#define IFCVF_READ_REG16(reg)		rte_read16(reg)
> -#define IFCVF_WRITE_REG16(val, reg)	rte_write16((val), (reg))
> -#define IFCVF_READ_REG32(reg)		rte_read32(reg)
> -#define IFCVF_WRITE_REG32(val, reg)	rte_write32((val), (reg))
> -
> -typedef struct rte_pci_device PCI_DEV;
> -
> -#define PCI_READ_CONFIG_BYTE(dev, val, where) \
> -	rte_pci_read_config(dev, val, 1, where)
> -
> -#define PCI_READ_CONFIG_DWORD(dev, val, where) \
> -	rte_pci_read_config(dev, val, 4, where)
> -
> -typedef uint8_t    u8;
> -typedef int8_t     s8;
> -typedef uint16_t   u16;
> -typedef int16_t    s16;
> -typedef uint32_t   u32;
> -typedef int32_t    s32;
> -typedef int64_t    s64;
> -typedef uint64_t   u64;
> -
> -static inline int
> -PCI_READ_CONFIG_RANGE(PCI_DEV *dev, uint32_t *val, int size, int
> where)
> -{
> -	return rte_pci_read_config(dev, val, size, where);
> -}
> -
> -#endif /* _IFCVF_OSDEP_H_ */
> diff --git a/drivers/net/ifc/ifcvf_vdpa.c b/drivers/net/ifc/ifcvf_vdpa.c
> deleted file mode 100644
> index da4667b..0000000
> --- a/drivers/net/ifc/ifcvf_vdpa.c
> +++ /dev/null
> @@ -1,1280 +0,0 @@
> -/* SPDX-License-Identifier: BSD-3-Clause
> - * Copyright(c) 2018 Intel Corporation
> - */
> -
> -#include <unistd.h>
> -#include <pthread.h>
> -#include <fcntl.h>
> -#include <string.h>
> -#include <sys/ioctl.h>
> -#include <sys/epoll.h>
> -#include <linux/virtio_net.h>
> -#include <stdbool.h>
> -
> -#include <rte_malloc.h>
> -#include <rte_memory.h>
> -#include <rte_bus_pci.h>
> -#include <rte_vhost.h>
> -#include <rte_vdpa.h>
> -#include <rte_vfio.h>
> -#include <rte_spinlock.h>
> -#include <rte_log.h>
> -#include <rte_kvargs.h>
> -#include <rte_devargs.h>
> -
> -#include "base/ifcvf.h"
> -
> -#define DRV_LOG(level, fmt, args...) \
> -	rte_log(RTE_LOG_ ## level, ifcvf_vdpa_logtype, \
> -		"IFCVF %s(): " fmt "\n", __func__, ##args)
> -
> -#ifndef PAGE_SIZE
> -#define PAGE_SIZE 4096
> -#endif
> -
> -#define IFCVF_USED_RING_LEN(size) \
> -	((size) * sizeof(struct vring_used_elem) + sizeof(uint16_t) * 3)
> -
> -#define IFCVF_VDPA_MODE		"vdpa"
> -#define IFCVF_SW_FALLBACK_LM	"sw-live-migration"
> -
> -static const char * const ifcvf_valid_arguments[] = {
> -	IFCVF_VDPA_MODE,
> -	IFCVF_SW_FALLBACK_LM,
> -	NULL
> -};
> -
> -static int ifcvf_vdpa_logtype;
> -
> -struct ifcvf_internal {
> -	struct rte_vdpa_dev_addr dev_addr;
> -	struct rte_pci_device *pdev;
> -	struct ifcvf_hw hw;
> -	int vfio_container_fd;
> -	int vfio_group_fd;
> -	int vfio_dev_fd;
> -	pthread_t tid;	/* thread for notify relay */
> -	int epfd;
> -	int vid;
> -	int did;
> -	uint16_t max_queues;
> -	uint64_t features;
> -	rte_atomic32_t started;
> -	rte_atomic32_t dev_attached;
> -	rte_atomic32_t running;
> -	rte_spinlock_t lock;
> -	bool sw_lm;
> -	bool sw_fallback_running;
> -	/* mediated vring for sw fallback */
> -	struct vring m_vring[IFCVF_MAX_QUEUES * 2];
> -	/* eventfd for used ring interrupt */
> -	int intr_fd[IFCVF_MAX_QUEUES * 2];
> -};
> -
> -struct internal_list {
> -	TAILQ_ENTRY(internal_list) next;
> -	struct ifcvf_internal *internal;
> -};
> -
> -TAILQ_HEAD(internal_list_head, internal_list);
> -static struct internal_list_head internal_list =
> -	TAILQ_HEAD_INITIALIZER(internal_list);
> -
> -static pthread_mutex_t internal_list_lock = PTHREAD_MUTEX_INITIALIZER;
> -
> -static void update_used_ring(struct ifcvf_internal *internal, uint16_t qid);
> -
> -static struct internal_list *
> -find_internal_resource_by_did(int did)
> -{
> -	int found = 0;
> -	struct internal_list *list;
> -
> -	pthread_mutex_lock(&internal_list_lock);
> -
> -	TAILQ_FOREACH(list, &internal_list, next) {
> -		if (did == list->internal->did) {
> -			found = 1;
> -			break;
> -		}
> -	}
> -
> -	pthread_mutex_unlock(&internal_list_lock);
> -
> -	if (!found)
> -		return NULL;
> -
> -	return list;
> -}
> -
> -static struct internal_list *
> -find_internal_resource_by_dev(struct rte_pci_device *pdev)
> -{
> -	int found = 0;
> -	struct internal_list *list;
> -
> -	pthread_mutex_lock(&internal_list_lock);
> -
> -	TAILQ_FOREACH(list, &internal_list, next) {
> -		if (pdev == list->internal->pdev) {
> -			found = 1;
> -			break;
> -		}
> -	}
> -
> -	pthread_mutex_unlock(&internal_list_lock);
> -
> -	if (!found)
> -		return NULL;
> -
> -	return list;
> -}
> -
> -static int
> -ifcvf_vfio_setup(struct ifcvf_internal *internal)
> -{
> -	struct rte_pci_device *dev = internal->pdev;
> -	char devname[RTE_DEV_NAME_MAX_LEN] = {0};
> -	int iommu_group_num;
> -	int i, ret;
> -
> -	internal->vfio_dev_fd = -1;
> -	internal->vfio_group_fd = -1;
> -	internal->vfio_container_fd = -1;
> -
> -	rte_pci_device_name(&dev->addr, devname,
> RTE_DEV_NAME_MAX_LEN);
> -	ret = rte_vfio_get_group_num(rte_pci_get_sysfs_path(), devname,
> -			&iommu_group_num);
> -	if (ret <= 0) {
> -		DRV_LOG(ERR, "%s failed to get IOMMU group", devname);
> -		return -1;
> -	}
> -
> -	internal->vfio_container_fd = rte_vfio_container_create();
> -	if (internal->vfio_container_fd < 0)
> -		return -1;
> -
> -	internal->vfio_group_fd = rte_vfio_container_group_bind(
> -			internal->vfio_container_fd, iommu_group_num);
> -	if (internal->vfio_group_fd < 0)
> -		goto err;
> -
> -	if (rte_pci_map_device(dev))
> -		goto err;
> -
> -	internal->vfio_dev_fd = dev->intr_handle.vfio_dev_fd;
> -
> -	for (i = 0; i < RTE_MIN(PCI_MAX_RESOURCE,
> IFCVF_PCI_MAX_RESOURCE);
> -			i++) {
> -		internal->hw.mem_resource[i].addr =
> -			internal->pdev->mem_resource[i].addr;
> -		internal->hw.mem_resource[i].phys_addr =
> -			internal->pdev->mem_resource[i].phys_addr;
> -		internal->hw.mem_resource[i].len =
> -			internal->pdev->mem_resource[i].len;
> -	}
> -
> -	return 0;
> -
> -err:
> -	rte_vfio_container_destroy(internal->vfio_container_fd);
> -	return -1;
> -}
> -
> -static int
> -ifcvf_dma_map(struct ifcvf_internal *internal, int do_map)
> -{
> -	uint32_t i;
> -	int ret;
> -	struct rte_vhost_memory *mem = NULL;
> -	int vfio_container_fd;
> -
> -	ret = rte_vhost_get_mem_table(internal->vid, &mem);
> -	if (ret < 0) {
> -		DRV_LOG(ERR, "failed to get VM memory layout.");
> -		goto exit;
> -	}
> -
> -	vfio_container_fd = internal->vfio_container_fd;
> -
> -	for (i = 0; i < mem->nregions; i++) {
> -		struct rte_vhost_mem_region *reg;
> -
> -		reg = &mem->regions[i];
> -		DRV_LOG(INFO, "%s, region %u: HVA 0x%" PRIx64 ", "
> -			"GPA 0x%" PRIx64 ", size 0x%" PRIx64 ".",
> -			do_map ? "DMA map" : "DMA unmap", i,
> -			reg->host_user_addr, reg->guest_phys_addr, reg-
> >size);
> -
> -		if (do_map) {
> -			ret =
> rte_vfio_container_dma_map(vfio_container_fd,
> -				reg->host_user_addr, reg-
> >guest_phys_addr,
> -				reg->size);
> -			if (ret < 0) {
> -				DRV_LOG(ERR, "DMA map failed.");
> -				goto exit;
> -			}
> -		} else {
> -			ret =
> rte_vfio_container_dma_unmap(vfio_container_fd,
> -				reg->host_user_addr, reg-
> >guest_phys_addr,
> -				reg->size);
> -			if (ret < 0) {
> -				DRV_LOG(ERR, "DMA unmap failed.");
> -				goto exit;
> -			}
> -		}
> -	}
> -
> -exit:
> -	if (mem)
> -		free(mem);
> -	return ret;
> -}
> -
> -static uint64_t
> -hva_to_gpa(int vid, uint64_t hva)
> -{
> -	struct rte_vhost_memory *mem = NULL;
> -	struct rte_vhost_mem_region *reg;
> -	uint32_t i;
> -	uint64_t gpa = 0;
> -
> -	if (rte_vhost_get_mem_table(vid, &mem) < 0)
> -		goto exit;
> -
> -	for (i = 0; i < mem->nregions; i++) {
> -		reg = &mem->regions[i];
> -
> -		if (hva >= reg->host_user_addr &&
> -				hva < reg->host_user_addr + reg->size) {
> -			gpa = hva - reg->host_user_addr + reg-
> >guest_phys_addr;
> -			break;
> -		}
> -	}
> -
> -exit:
> -	if (mem)
> -		free(mem);
> -	return gpa;
> -}
> -
> -static int
> -vdpa_ifcvf_start(struct ifcvf_internal *internal)
> -{
> -	struct ifcvf_hw *hw = &internal->hw;
> -	int i, nr_vring;
> -	int vid;
> -	struct rte_vhost_vring vq;
> -	uint64_t gpa;
> -
> -	vid = internal->vid;
> -	nr_vring = rte_vhost_get_vring_num(vid);
> -	rte_vhost_get_negotiated_features(vid, &hw->req_features);
> -
> -	for (i = 0; i < nr_vring; i++) {
> -		rte_vhost_get_vhost_vring(vid, i, &vq);
> -		gpa = hva_to_gpa(vid, (uint64_t)(uintptr_t)vq.desc);
> -		if (gpa == 0) {
> -			DRV_LOG(ERR, "Fail to get GPA for descriptor ring.");
> -			return -1;
> -		}
> -		hw->vring[i].desc = gpa;
> -
> -		gpa = hva_to_gpa(vid, (uint64_t)(uintptr_t)vq.avail);
> -		if (gpa == 0) {
> -			DRV_LOG(ERR, "Fail to get GPA for available ring.");
> -			return -1;
> -		}
> -		hw->vring[i].avail = gpa;
> -
> -		gpa = hva_to_gpa(vid, (uint64_t)(uintptr_t)vq.used);
> -		if (gpa == 0) {
> -			DRV_LOG(ERR, "Fail to get GPA for used ring.");
> -			return -1;
> -		}
> -		hw->vring[i].used = gpa;
> -
> -		hw->vring[i].size = vq.size;
> -		rte_vhost_get_vring_base(vid, i, &hw-
> >vring[i].last_avail_idx,
> -				&hw->vring[i].last_used_idx);
> -	}
> -	hw->nr_vring = i;
> -
> -	return ifcvf_start_hw(&internal->hw);
> -}
> -
> -static void
> -vdpa_ifcvf_stop(struct ifcvf_internal *internal)
> -{
> -	struct ifcvf_hw *hw = &internal->hw;
> -	uint32_t i;
> -	int vid;
> -	uint64_t features = 0;
> -	uint64_t log_base = 0, log_size = 0;
> -	uint64_t len;
> -
> -	vid = internal->vid;
> -	ifcvf_stop_hw(hw);
> -
> -	for (i = 0; i < hw->nr_vring; i++)
> -		rte_vhost_set_vring_base(vid, i, hw->vring[i].last_avail_idx,
> -				hw->vring[i].last_used_idx);
> -
> -	if (internal->sw_lm)
> -		return;
> -
> -	rte_vhost_get_negotiated_features(vid, &features);
> -	if (RTE_VHOST_NEED_LOG(features)) {
> -		ifcvf_disable_logging(hw);
> -		rte_vhost_get_log_base(internal->vid, &log_base,
> &log_size);
> -		rte_vfio_container_dma_unmap(internal-
> >vfio_container_fd,
> -				log_base, IFCVF_LOG_BASE, log_size);
> -		/*
> -		 * IFCVF marks dirty memory pages for only packet buffer,
> -		 * SW helps to mark the used ring as dirty after device stops.
> -		 */
> -		for (i = 0; i < hw->nr_vring; i++) {
> -			len = IFCVF_USED_RING_LEN(hw->vring[i].size);
> -			rte_vhost_log_used_vring(vid, i, 0, len);
> -		}
> -	}
> -}
> -
> -#define MSIX_IRQ_SET_BUF_LEN (sizeof(struct vfio_irq_set) + \
> -		sizeof(int) * (IFCVF_MAX_QUEUES * 2 + 1))
> -static int
> -vdpa_enable_vfio_intr(struct ifcvf_internal *internal, bool m_rx)
> -{
> -	int ret;
> -	uint32_t i, nr_vring;
> -	char irq_set_buf[MSIX_IRQ_SET_BUF_LEN];
> -	struct vfio_irq_set *irq_set;
> -	int *fd_ptr;
> -	struct rte_vhost_vring vring;
> -	int fd;
> -
> -	vring.callfd = -1;
> -
> -	nr_vring = rte_vhost_get_vring_num(internal->vid);
> -
> -	irq_set = (struct vfio_irq_set *)irq_set_buf;
> -	irq_set->argsz = sizeof(irq_set_buf);
> -	irq_set->count = nr_vring + 1;
> -	irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD |
> -			 VFIO_IRQ_SET_ACTION_TRIGGER;
> -	irq_set->index = VFIO_PCI_MSIX_IRQ_INDEX;
> -	irq_set->start = 0;
> -	fd_ptr = (int *)&irq_set->data;
> -	fd_ptr[RTE_INTR_VEC_ZERO_OFFSET] = internal->pdev-
> >intr_handle.fd;
> -
> -	for (i = 0; i < nr_vring; i++)
> -		internal->intr_fd[i] = -1;
> -
> -	for (i = 0; i < nr_vring; i++) {
> -		rte_vhost_get_vhost_vring(internal->vid, i, &vring);
> -		fd_ptr[RTE_INTR_VEC_RXTX_OFFSET + i] = vring.callfd;
> -		if ((i & 1) == 0 && m_rx == true) {
> -			fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
> -			if (fd < 0) {
> -				DRV_LOG(ERR, "can't setup eventfd: %s",
> -					strerror(errno));
> -				return -1;
> -			}
> -			internal->intr_fd[i] = fd;
> -			fd_ptr[RTE_INTR_VEC_RXTX_OFFSET + i] = fd;
> -		}
> -	}
> -
> -	ret = ioctl(internal->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
> -	if (ret) {
> -		DRV_LOG(ERR, "Error enabling MSI-X interrupts: %s",
> -				strerror(errno));
> -		return -1;
> -	}
> -
> -	return 0;
> -}
> -
> -static int
> -vdpa_disable_vfio_intr(struct ifcvf_internal *internal)
> -{
> -	int ret;
> -	uint32_t i, nr_vring;
> -	char irq_set_buf[MSIX_IRQ_SET_BUF_LEN];
> -	struct vfio_irq_set *irq_set;
> -
> -	irq_set = (struct vfio_irq_set *)irq_set_buf;
> -	irq_set->argsz = sizeof(irq_set_buf);
> -	irq_set->count = 0;
> -	irq_set->flags = VFIO_IRQ_SET_DATA_NONE |
> VFIO_IRQ_SET_ACTION_TRIGGER;
> -	irq_set->index = VFIO_PCI_MSIX_IRQ_INDEX;
> -	irq_set->start = 0;
> -
> -	nr_vring = rte_vhost_get_vring_num(internal->vid);
> -	for (i = 0; i < nr_vring; i++) {
> -		if (internal->intr_fd[i] >= 0)
> -			close(internal->intr_fd[i]);
> -		internal->intr_fd[i] = -1;
> -	}
> -
> -	ret = ioctl(internal->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
> -	if (ret) {
> -		DRV_LOG(ERR, "Error disabling MSI-X interrupts: %s",
> -				strerror(errno));
> -		return -1;
> -	}
> -
> -	return 0;
> -}
> -
> -static void *
> -notify_relay(void *arg)
> -{
> -	int i, kickfd, epfd, nfds = 0;
> -	uint32_t qid, q_num;
> -	struct epoll_event events[IFCVF_MAX_QUEUES * 2];
> -	struct epoll_event ev;
> -	uint64_t buf;
> -	int nbytes;
> -	struct rte_vhost_vring vring;
> -	struct ifcvf_internal *internal = (struct ifcvf_internal *)arg;
> -	struct ifcvf_hw *hw = &internal->hw;
> -
> -	q_num = rte_vhost_get_vring_num(internal->vid);
> -
> -	epfd = epoll_create(IFCVF_MAX_QUEUES * 2);
> -	if (epfd < 0) {
> -		DRV_LOG(ERR, "failed to create epoll instance.");
> -		return NULL;
> -	}
> -	internal->epfd = epfd;
> -
> -	vring.kickfd = -1;
> -	for (qid = 0; qid < q_num; qid++) {
> -		ev.events = EPOLLIN | EPOLLPRI;
> -		rte_vhost_get_vhost_vring(internal->vid, qid, &vring);
> -		ev.data.u64 = qid | (uint64_t)vring.kickfd << 32;
> -		if (epoll_ctl(epfd, EPOLL_CTL_ADD, vring.kickfd, &ev) < 0) {
> -			DRV_LOG(ERR, "epoll add error: %s",
> strerror(errno));
> -			return NULL;
> -		}
> -	}
> -
> -	for (;;) {
> -		nfds = epoll_wait(epfd, events, q_num, -1);
> -		if (nfds < 0) {
> -			if (errno == EINTR)
> -				continue;
> -			DRV_LOG(ERR, "epoll_wait return fail\n");
> -			return NULL;
> -		}
> -
> -		for (i = 0; i < nfds; i++) {
> -			qid = events[i].data.u32;
> -			kickfd = (uint32_t)(events[i].data.u64 >> 32);
> -			do {
> -				nbytes = read(kickfd, &buf, 8);
> -				if (nbytes < 0) {
> -					if (errno == EINTR ||
> -					    errno == EWOULDBLOCK ||
> -					    errno == EAGAIN)
> -						continue;
> -					DRV_LOG(INFO, "Error reading "
> -						"kickfd: %s",
> -						strerror(errno));
> -				}
> -				break;
> -			} while (1);
> -
> -			ifcvf_notify_queue(hw, qid);
> -		}
> -	}
> -
> -	return NULL;
> -}
> -
> -static int
> -setup_notify_relay(struct ifcvf_internal *internal)
> -{
> -	int ret;
> -
> -	ret = pthread_create(&internal->tid, NULL, notify_relay,
> -			(void *)internal);
> -	if (ret) {
> -		DRV_LOG(ERR, "failed to create notify relay pthread.");
> -		return -1;
> -	}
> -	return 0;
> -}
> -
> -static int
> -unset_notify_relay(struct ifcvf_internal *internal)
> -{
> -	void *status;
> -
> -	if (internal->tid) {
> -		pthread_cancel(internal->tid);
> -		pthread_join(internal->tid, &status);
> -	}
> -	internal->tid = 0;
> -
> -	if (internal->epfd >= 0)
> -		close(internal->epfd);
> -	internal->epfd = -1;
> -
> -	return 0;
> -}
> -
> -static int
> -update_datapath(struct ifcvf_internal *internal)
> -{
> -	int ret;
> -
> -	rte_spinlock_lock(&internal->lock);
> -
> -	if (!rte_atomic32_read(&internal->running) &&
> -	    (rte_atomic32_read(&internal->started) &&
> -	     rte_atomic32_read(&internal->dev_attached))) {
> -		ret = ifcvf_dma_map(internal, 1);
> -		if (ret)
> -			goto err;
> -
> -		ret = vdpa_enable_vfio_intr(internal, 0);
> -		if (ret)
> -			goto err;
> -
> -		ret = vdpa_ifcvf_start(internal);
> -		if (ret)
> -			goto err;
> -
> -		ret = setup_notify_relay(internal);
> -		if (ret)
> -			goto err;
> -
> -		rte_atomic32_set(&internal->running, 1);
> -	} else if (rte_atomic32_read(&internal->running) &&
> -		   (!rte_atomic32_read(&internal->started) ||
> -		    !rte_atomic32_read(&internal->dev_attached))) {
> -		ret = unset_notify_relay(internal);
> -		if (ret)
> -			goto err;
> -
> -		vdpa_ifcvf_stop(internal);
> -
> -		ret = vdpa_disable_vfio_intr(internal);
> -		if (ret)
> -			goto err;
> -
> -		ret = ifcvf_dma_map(internal, 0);
> -		if (ret)
> -			goto err;
> -
> -		rte_atomic32_set(&internal->running, 0);
> -	}
> -
> -	rte_spinlock_unlock(&internal->lock);
> -	return 0;
> -err:
> -	rte_spinlock_unlock(&internal->lock);
> -	return ret;
> -}
> -
> -static int
> -m_ifcvf_start(struct ifcvf_internal *internal)
> -{
> -	struct ifcvf_hw *hw = &internal->hw;
> -	uint32_t i, nr_vring;
> -	int vid, ret;
> -	struct rte_vhost_vring vq;
> -	void *vring_buf;
> -	uint64_t m_vring_iova = IFCVF_MEDIATED_VRING;
> -	uint64_t size;
> -	uint64_t gpa;
> -
> -	memset(&vq, 0, sizeof(vq));
> -	vid = internal->vid;
> -	nr_vring = rte_vhost_get_vring_num(vid);
> -	rte_vhost_get_negotiated_features(vid, &hw->req_features);
> -
> -	for (i = 0; i < nr_vring; i++) {
> -		rte_vhost_get_vhost_vring(vid, i, &vq);
> -
> -		size = RTE_ALIGN_CEIL(vring_size(vq.size, PAGE_SIZE),
> -				PAGE_SIZE);
> -		vring_buf = rte_zmalloc("ifcvf", size, PAGE_SIZE);
> -		vring_init(&internal->m_vring[i], vq.size, vring_buf,
> -				PAGE_SIZE);
> -
> -		ret = rte_vfio_container_dma_map(internal-
> >vfio_container_fd,
> -			(uint64_t)(uintptr_t)vring_buf, m_vring_iova, size);
> -		if (ret < 0) {
> -			DRV_LOG(ERR, "mediated vring DMA map failed.");
> -			goto error;
> -		}
> -
> -		gpa = hva_to_gpa(vid, (uint64_t)(uintptr_t)vq.desc);
> -		if (gpa == 0) {
> -			DRV_LOG(ERR, "Fail to get GPA for descriptor ring.");
> -			return -1;
> -		}
> -		hw->vring[i].desc = gpa;
> -
> -		gpa = hva_to_gpa(vid, (uint64_t)(uintptr_t)vq.avail);
> -		if (gpa == 0) {
> -			DRV_LOG(ERR, "Fail to get GPA for available ring.");
> -			return -1;
> -		}
> -		hw->vring[i].avail = gpa;
> -
> -		/* Direct I/O for Tx queue, relay for Rx queue */
> -		if (i & 1) {
> -			gpa = hva_to_gpa(vid, (uint64_t)(uintptr_t)vq.used);
> -			if (gpa == 0) {
> -				DRV_LOG(ERR, "Fail to get GPA for used
> ring.");
> -				return -1;
> -			}
> -			hw->vring[i].used = gpa;
> -		} else {
> -			hw->vring[i].used = m_vring_iova +
> -				(char *)internal->m_vring[i].used -
> -				(char *)internal->m_vring[i].desc;
> -		}
> -
> -		hw->vring[i].size = vq.size;
> -
> -		rte_vhost_get_vring_base(vid, i,
> -				&internal->m_vring[i].avail->idx,
> -				&internal->m_vring[i].used->idx);
> -
> -		rte_vhost_get_vring_base(vid, i, &hw-
> >vring[i].last_avail_idx,
> -				&hw->vring[i].last_used_idx);
> -
> -		m_vring_iova += size;
> -	}
> -	hw->nr_vring = nr_vring;
> -
> -	return ifcvf_start_hw(&internal->hw);
> -
> -error:
> -	for (i = 0; i < nr_vring; i++)
> -		if (internal->m_vring[i].desc)
> -			rte_free(internal->m_vring[i].desc);
> -
> -	return -1;
> -}
> -
> -static int
> -m_ifcvf_stop(struct ifcvf_internal *internal)
> -{
> -	int vid;
> -	uint32_t i;
> -	struct rte_vhost_vring vq;
> -	struct ifcvf_hw *hw = &internal->hw;
> -	uint64_t m_vring_iova = IFCVF_MEDIATED_VRING;
> -	uint64_t size, len;
> -
> -	vid = internal->vid;
> -	ifcvf_stop_hw(hw);
> -
> -	for (i = 0; i < hw->nr_vring; i++) {
> -		/* synchronize remaining new used entries if any */
> -		if ((i & 1) == 0)
> -			update_used_ring(internal, i);
> -
> -		rte_vhost_get_vhost_vring(vid, i, &vq);
> -		len = IFCVF_USED_RING_LEN(vq.size);
> -		rte_vhost_log_used_vring(vid, i, 0, len);
> -
> -		size = RTE_ALIGN_CEIL(vring_size(vq.size, PAGE_SIZE),
> -				PAGE_SIZE);
> -		rte_vfio_container_dma_unmap(internal-
> >vfio_container_fd,
> -			(uint64_t)(uintptr_t)internal->m_vring[i].desc,
> -			m_vring_iova, size);
> -
> -		rte_vhost_set_vring_base(vid, i, hw->vring[i].last_avail_idx,
> -				hw->vring[i].last_used_idx);
> -		rte_free(internal->m_vring[i].desc);
> -		m_vring_iova += size;
> -	}
> -
> -	return 0;
> -}
> -
> -static void
> -update_used_ring(struct ifcvf_internal *internal, uint16_t qid)
> -{
> -	rte_vdpa_relay_vring_used(internal->vid, qid, &internal-
> >m_vring[qid]);
> -	rte_vhost_vring_call(internal->vid, qid);
> -}
> -
> -static void *
> -vring_relay(void *arg)
> -{
> -	int i, vid, epfd, fd, nfds;
> -	struct ifcvf_internal *internal = (struct ifcvf_internal *)arg;
> -	struct rte_vhost_vring vring;
> -	uint16_t qid, q_num;
> -	struct epoll_event events[IFCVF_MAX_QUEUES * 4];
> -	struct epoll_event ev;
> -	int nbytes;
> -	uint64_t buf;
> -
> -	vid = internal->vid;
> -	q_num = rte_vhost_get_vring_num(vid);
> -
> -	/* add notify fd and interrupt fd to epoll */
> -	epfd = epoll_create(IFCVF_MAX_QUEUES * 2);
> -	if (epfd < 0) {
> -		DRV_LOG(ERR, "failed to create epoll instance.");
> -		return NULL;
> -	}
> -	internal->epfd = epfd;
> -
> -	vring.kickfd = -1;
> -	for (qid = 0; qid < q_num; qid++) {
> -		ev.events = EPOLLIN | EPOLLPRI;
> -		rte_vhost_get_vhost_vring(vid, qid, &vring);
> -		ev.data.u64 = qid << 1 | (uint64_t)vring.kickfd << 32;
> -		if (epoll_ctl(epfd, EPOLL_CTL_ADD, vring.kickfd, &ev) < 0) {
> -			DRV_LOG(ERR, "epoll add error: %s",
> strerror(errno));
> -			return NULL;
> -		}
> -	}
> -
> -	for (qid = 0; qid < q_num; qid += 2) {
> -		ev.events = EPOLLIN | EPOLLPRI;
> -		/* leave a flag to mark it's for interrupt */
> -		ev.data.u64 = 1 | qid << 1 |
> -			(uint64_t)internal->intr_fd[qid] << 32;
> -		if (epoll_ctl(epfd, EPOLL_CTL_ADD, internal->intr_fd[qid],
> &ev)
> -				< 0) {
> -			DRV_LOG(ERR, "epoll add error: %s",
> strerror(errno));
> -			return NULL;
> -		}
> -		update_used_ring(internal, qid);
> -	}
> -
> -	/* start relay with a first kick */
> -	for (qid = 0; qid < q_num; qid++)
> -		ifcvf_notify_queue(&internal->hw, qid);
> -
> -	/* listen to the events and react accordingly */
> -	for (;;) {
> -		nfds = epoll_wait(epfd, events, q_num * 2, -1);
> -		if (nfds < 0) {
> -			if (errno == EINTR)
> -				continue;
> -			DRV_LOG(ERR, "epoll_wait return fail\n");
> -			return NULL;
> -		}
> -
> -		for (i = 0; i < nfds; i++) {
> -			fd = (uint32_t)(events[i].data.u64 >> 32);
> -			do {
> -				nbytes = read(fd, &buf, 8);
> -				if (nbytes < 0) {
> -					if (errno == EINTR ||
> -					    errno == EWOULDBLOCK ||
> -					    errno == EAGAIN)
> -						continue;
> -					DRV_LOG(INFO, "Error reading "
> -						"kickfd: %s",
> -						strerror(errno));
> -				}
> -				break;
> -			} while (1);
> -
> -			qid = events[i].data.u32 >> 1;
> -
> -			if (events[i].data.u32 & 1)
> -				update_used_ring(internal, qid);
> -			else
> -				ifcvf_notify_queue(&internal->hw, qid);
> -		}
> -	}
> -
> -	return NULL;
> -}
> -
> -static int
> -setup_vring_relay(struct ifcvf_internal *internal)
> -{
> -	int ret;
> -
> -	ret = pthread_create(&internal->tid, NULL, vring_relay,
> -			(void *)internal);
> -	if (ret) {
> -		DRV_LOG(ERR, "failed to create ring relay pthread.");
> -		return -1;
> -	}
> -	return 0;
> -}
> -
> -static int
> -unset_vring_relay(struct ifcvf_internal *internal)
> -{
> -	void *status;
> -
> -	if (internal->tid) {
> -		pthread_cancel(internal->tid);
> -		pthread_join(internal->tid, &status);
> -	}
> -	internal->tid = 0;
> -
> -	if (internal->epfd >= 0)
> -		close(internal->epfd);
> -	internal->epfd = -1;
> -
> -	return 0;
> -}
> -
> -static int
> -ifcvf_sw_fallback_switchover(struct ifcvf_internal *internal)
> -{
> -	int ret;
> -	int vid = internal->vid;
> -
> -	/* stop the direct IO data path */
> -	unset_notify_relay(internal);
> -	vdpa_ifcvf_stop(internal);
> -	vdpa_disable_vfio_intr(internal);
> -
> -	ret = rte_vhost_host_notifier_ctrl(vid, false);
> -	if (ret && ret != -ENOTSUP)
> -		goto error;
> -
> -	/* set up interrupt for interrupt relay */
> -	ret = vdpa_enable_vfio_intr(internal, 1);
> -	if (ret)
> -		goto unmap;
> -
> -	/* config the VF */
> -	ret = m_ifcvf_start(internal);
> -	if (ret)
> -		goto unset_intr;
> -
> -	/* set up vring relay thread */
> -	ret = setup_vring_relay(internal);
> -	if (ret)
> -		goto stop_vf;
> -
> -	rte_vhost_host_notifier_ctrl(vid, true);
> -
> -	internal->sw_fallback_running = true;
> -
> -	return 0;
> -
> -stop_vf:
> -	m_ifcvf_stop(internal);
> -unset_intr:
> -	vdpa_disable_vfio_intr(internal);
> -unmap:
> -	ifcvf_dma_map(internal, 0);
> -error:
> -	return -1;
> -}
> -
> -static int
> -ifcvf_dev_config(int vid)
> -{
> -	int did;
> -	struct internal_list *list;
> -	struct ifcvf_internal *internal;
> -
> -	did = rte_vhost_get_vdpa_device_id(vid);
> -	list = find_internal_resource_by_did(did);
> -	if (list == NULL) {
> -		DRV_LOG(ERR, "Invalid device id: %d", did);
> -		return -1;
> -	}
> -
> -	internal = list->internal;
> -	internal->vid = vid;
> -	rte_atomic32_set(&internal->dev_attached, 1);
> -	update_datapath(internal);
> -
> -	if (rte_vhost_host_notifier_ctrl(vid, true) != 0)
> -		DRV_LOG(NOTICE, "vDPA (%d): software relay is used.", did);
> -
> -	return 0;
> -}
> -
> -static int
> -ifcvf_dev_close(int vid)
> -{
> -	int did;
> -	struct internal_list *list;
> -	struct ifcvf_internal *internal;
> -
> -	did = rte_vhost_get_vdpa_device_id(vid);
> -	list = find_internal_resource_by_did(did);
> -	if (list == NULL) {
> -		DRV_LOG(ERR, "Invalid device id: %d", did);
> -		return -1;
> -	}
> -
> -	internal = list->internal;
> -
> -	if (internal->sw_fallback_running) {
> -		/* unset ring relay */
> -		unset_vring_relay(internal);
> -
> -		/* reset VF */
> -		m_ifcvf_stop(internal);
> -
> -		/* remove interrupt setting */
> -		vdpa_disable_vfio_intr(internal);
> -
> -		/* unset DMA map for guest memory */
> -		ifcvf_dma_map(internal, 0);
> -
> -		internal->sw_fallback_running = false;
> -	} else {
> -		rte_atomic32_set(&internal->dev_attached, 0);
> -		update_datapath(internal);
> -	}
> -
> -	return 0;
> -}
> -
> -static int
> -ifcvf_set_features(int vid)
> -{
> -	uint64_t features = 0;
> -	int did;
> -	struct internal_list *list;
> -	struct ifcvf_internal *internal;
> -	uint64_t log_base = 0, log_size = 0;
> -
> -	did = rte_vhost_get_vdpa_device_id(vid);
> -	list = find_internal_resource_by_did(did);
> -	if (list == NULL) {
> -		DRV_LOG(ERR, "Invalid device id: %d", did);
> -		return -1;
> -	}
> -
> -	internal = list->internal;
> -	rte_vhost_get_negotiated_features(vid, &features);
> -
> -	if (!RTE_VHOST_NEED_LOG(features))
> -		return 0;
> -
> -	if (internal->sw_lm) {
> -		ifcvf_sw_fallback_switchover(internal);
> -	} else {
> -		rte_vhost_get_log_base(vid, &log_base, &log_size);
> -		rte_vfio_container_dma_map(internal->vfio_container_fd,
> -				log_base, IFCVF_LOG_BASE, log_size);
> -		ifcvf_enable_logging(&internal->hw, IFCVF_LOG_BASE,
> log_size);
> -	}
> -
> -	return 0;
> -}
> -
> -static int
> -ifcvf_get_vfio_group_fd(int vid)
> -{
> -	int did;
> -	struct internal_list *list;
> -
> -	did = rte_vhost_get_vdpa_device_id(vid);
> -	list = find_internal_resource_by_did(did);
> -	if (list == NULL) {
> -		DRV_LOG(ERR, "Invalid device id: %d", did);
> -		return -1;
> -	}
> -
> -	return list->internal->vfio_group_fd;
> -}
> -
> -static int
> -ifcvf_get_vfio_device_fd(int vid)
> -{
> -	int did;
> -	struct internal_list *list;
> -
> -	did = rte_vhost_get_vdpa_device_id(vid);
> -	list = find_internal_resource_by_did(did);
> -	if (list == NULL) {
> -		DRV_LOG(ERR, "Invalid device id: %d", did);
> -		return -1;
> -	}
> -
> -	return list->internal->vfio_dev_fd;
> -}
> -
> -static int
> -ifcvf_get_notify_area(int vid, int qid, uint64_t *offset, uint64_t *size)
> -{
> -	int did;
> -	struct internal_list *list;
> -	struct ifcvf_internal *internal;
> -	struct vfio_region_info reg = { .argsz = sizeof(reg) };
> -	int ret;
> -
> -	did = rte_vhost_get_vdpa_device_id(vid);
> -	list = find_internal_resource_by_did(did);
> -	if (list == NULL) {
> -		DRV_LOG(ERR, "Invalid device id: %d", did);
> -		return -1;
> -	}
> -
> -	internal = list->internal;
> -
> -	reg.index = ifcvf_get_notify_region(&internal->hw);
> -	ret = ioctl(internal->vfio_dev_fd, VFIO_DEVICE_GET_REGION_INFO,
> &reg);
> -	if (ret) {
> -		DRV_LOG(ERR, "Get not get device region info: %s",
> -				strerror(errno));
> -		return -1;
> -	}
> -
> -	*offset = ifcvf_get_queue_notify_off(&internal->hw, qid) +
> reg.offset;
> -	*size = 0x1000;
> -
> -	return 0;
> -}
> -
> -static int
> -ifcvf_get_queue_num(int did, uint32_t *queue_num)
> -{
> -	struct internal_list *list;
> -
> -	list = find_internal_resource_by_did(did);
> -	if (list == NULL) {
> -		DRV_LOG(ERR, "Invalid device id: %d", did);
> -		return -1;
> -	}
> -
> -	*queue_num = list->internal->max_queues;
> -
> -	return 0;
> -}
> -
> -static int
> -ifcvf_get_vdpa_features(int did, uint64_t *features)
> -{
> -	struct internal_list *list;
> -
> -	list = find_internal_resource_by_did(did);
> -	if (list == NULL) {
> -		DRV_LOG(ERR, "Invalid device id: %d", did);
> -		return -1;
> -	}
> -
> -	*features = list->internal->features;
> -
> -	return 0;
> -}
> -
> -#define VDPA_SUPPORTED_PROTOCOL_FEATURES \
> -		(1ULL << VHOST_USER_PROTOCOL_F_REPLY_ACK | \
> -		 1ULL << VHOST_USER_PROTOCOL_F_SLAVE_REQ | \
> -		 1ULL << VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD | \
> -		 1ULL << VHOST_USER_PROTOCOL_F_HOST_NOTIFIER | \
> -		 1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD)
> -static int
> -ifcvf_get_protocol_features(int did __rte_unused, uint64_t *features)
> -{
> -	*features = VDPA_SUPPORTED_PROTOCOL_FEATURES;
> -	return 0;
> -}
> -
> -static struct rte_vdpa_dev_ops ifcvf_ops = {
> -	.get_queue_num = ifcvf_get_queue_num,
> -	.get_features = ifcvf_get_vdpa_features,
> -	.get_protocol_features = ifcvf_get_protocol_features,
> -	.dev_conf = ifcvf_dev_config,
> -	.dev_close = ifcvf_dev_close,
> -	.set_vring_state = NULL,
> -	.set_features = ifcvf_set_features,
> -	.migration_done = NULL,
> -	.get_vfio_group_fd = ifcvf_get_vfio_group_fd,
> -	.get_vfio_device_fd = ifcvf_get_vfio_device_fd,
> -	.get_notify_area = ifcvf_get_notify_area,
> -};
> -
> -static inline int
> -open_int(const char *key __rte_unused, const char *value, void
> *extra_args)
> -{
> -	uint16_t *n = extra_args;
> -
> -	if (value == NULL || extra_args == NULL)
> -		return -EINVAL;
> -
> -	*n = (uint16_t)strtoul(value, NULL, 0);
> -	if (*n == USHRT_MAX && errno == ERANGE)
> -		return -1;
> -
> -	return 0;
> -}
> -
> -static int
> -ifcvf_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
> -		struct rte_pci_device *pci_dev)
> -{
> -	uint64_t features;
> -	struct ifcvf_internal *internal = NULL;
> -	struct internal_list *list = NULL;
> -	int vdpa_mode = 0;
> -	int sw_fallback_lm = 0;
> -	struct rte_kvargs *kvlist = NULL;
> -	int ret = 0;
> -
> -	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
> -		return 0;
> -
> -	if (!pci_dev->device.devargs)
> -		return 1;
> -
> -	kvlist = rte_kvargs_parse(pci_dev->device.devargs->args,
> -			ifcvf_valid_arguments);
> -	if (kvlist == NULL)
> -		return 1;
> -
> -	/* probe only when vdpa mode is specified */
> -	if (rte_kvargs_count(kvlist, IFCVF_VDPA_MODE) == 0) {
> -		rte_kvargs_free(kvlist);
> -		return 1;
> -	}
> -
> -	ret = rte_kvargs_process(kvlist, IFCVF_VDPA_MODE, &open_int,
> -			&vdpa_mode);
> -	if (ret < 0 || vdpa_mode == 0) {
> -		rte_kvargs_free(kvlist);
> -		return 1;
> -	}
> -
> -	list = rte_zmalloc("ifcvf", sizeof(*list), 0);
> -	if (list == NULL)
> -		goto error;
> -
> -	internal = rte_zmalloc("ifcvf", sizeof(*internal), 0);
> -	if (internal == NULL)
> -		goto error;
> -
> -	internal->pdev = pci_dev;
> -	rte_spinlock_init(&internal->lock);
> -
> -	if (ifcvf_vfio_setup(internal) < 0) {
> -		DRV_LOG(ERR, "failed to setup device %s", pci_dev->name);
> -		goto error;
> -	}
> -
> -	if (ifcvf_init_hw(&internal->hw, internal->pdev) < 0) {
> -		DRV_LOG(ERR, "failed to init device %s", pci_dev->name);
> -		goto error;
> -	}
> -
> -	internal->max_queues = IFCVF_MAX_QUEUES;
> -	features = ifcvf_get_features(&internal->hw);
> -	internal->features = (features &
> -		~(1ULL << VIRTIO_F_IOMMU_PLATFORM)) |
> -		(1ULL << VIRTIO_NET_F_GUEST_ANNOUNCE) |
> -		(1ULL << VIRTIO_NET_F_CTRL_VQ) |
> -		(1ULL << VIRTIO_NET_F_STATUS) |
> -		(1ULL << VHOST_USER_F_PROTOCOL_FEATURES) |
> -		(1ULL << VHOST_F_LOG_ALL);
> -
> -	internal->dev_addr.pci_addr = pci_dev->addr;
> -	internal->dev_addr.type = PCI_ADDR;
> -	list->internal = internal;
> -
> -	if (rte_kvargs_count(kvlist, IFCVF_SW_FALLBACK_LM)) {
> -		ret = rte_kvargs_process(kvlist, IFCVF_SW_FALLBACK_LM,
> -				&open_int, &sw_fallback_lm);
> -		if (ret < 0)
> -			goto error;
> -	}
> -	internal->sw_lm = sw_fallback_lm;
> -
> -	internal->did = rte_vdpa_register_device(&internal->dev_addr,
> -				&ifcvf_ops);
> -	if (internal->did < 0) {
> -		DRV_LOG(ERR, "failed to register device %s", pci_dev-
> >name);
> -		goto error;
> -	}
> -
> -	pthread_mutex_lock(&internal_list_lock);
> -	TAILQ_INSERT_TAIL(&internal_list, list, next);
> -	pthread_mutex_unlock(&internal_list_lock);
> -
> -	rte_atomic32_set(&internal->started, 1);
> -	update_datapath(internal);
> -
> -	rte_kvargs_free(kvlist);
> -	return 0;
> -
> -error:
> -	rte_kvargs_free(kvlist);
> -	rte_free(list);
> -	rte_free(internal);
> -	return -1;
> -}
> -
> -static int
> -ifcvf_pci_remove(struct rte_pci_device *pci_dev)
> -{
> -	struct ifcvf_internal *internal;
> -	struct internal_list *list;
> -
> -	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
> -		return 0;
> -
> -	list = find_internal_resource_by_dev(pci_dev);
> -	if (list == NULL) {
> -		DRV_LOG(ERR, "Invalid device: %s", pci_dev->name);
> -		return -1;
> -	}
> -
> -	internal = list->internal;
> -	rte_atomic32_set(&internal->started, 0);
> -	update_datapath(internal);
> -
> -	rte_pci_unmap_device(internal->pdev);
> -	rte_vfio_container_destroy(internal->vfio_container_fd);
> -	rte_vdpa_unregister_device(internal->did);
> -
> -	pthread_mutex_lock(&internal_list_lock);
> -	TAILQ_REMOVE(&internal_list, list, next);
> -	pthread_mutex_unlock(&internal_list_lock);
> -
> -	rte_free(list);
> -	rte_free(internal);
> -
> -	return 0;
> -}
> -
> -/*
> - * IFCVF has the same vendor ID and device ID as virtio net PCI
> - * device, with its specific subsystem vendor ID and device ID.
> - */
> -static const struct rte_pci_id pci_id_ifcvf_map[] = {
> -	{ .class_id = RTE_CLASS_ANY_ID,
> -	  .vendor_id = IFCVF_VENDOR_ID,
> -	  .device_id = IFCVF_DEVICE_ID,
> -	  .subsystem_vendor_id = IFCVF_SUBSYS_VENDOR_ID,
> -	  .subsystem_device_id = IFCVF_SUBSYS_DEVICE_ID,
> -	},
> -
> -	{ .vendor_id = 0, /* sentinel */
> -	},
> -};
> -
> -static struct rte_pci_driver rte_ifcvf_vdpa = {
> -	.id_table = pci_id_ifcvf_map,
> -	.drv_flags = 0,
> -	.probe = ifcvf_pci_probe,
> -	.remove = ifcvf_pci_remove,
> -};
> -
> -RTE_PMD_REGISTER_PCI(net_ifcvf, rte_ifcvf_vdpa);
> -RTE_PMD_REGISTER_PCI_TABLE(net_ifcvf, pci_id_ifcvf_map);
> -RTE_PMD_REGISTER_KMOD_DEP(net_ifcvf, "* vfio-pci");
> -
> -RTE_INIT(ifcvf_vdpa_init_log)
> -{
> -	ifcvf_vdpa_logtype = rte_log_register("pmd.net.ifcvf_vdpa");
> -	if (ifcvf_vdpa_logtype >= 0)
> -		rte_log_set_level(ifcvf_vdpa_logtype, RTE_LOG_NOTICE);
> -}
> diff --git a/drivers/net/ifc/meson.build b/drivers/net/ifc/meson.build
> deleted file mode 100644
> index adc9ed9..0000000
> --- a/drivers/net/ifc/meson.build
> +++ /dev/null
> @@ -1,9 +0,0 @@
> -# SPDX-License-Identifier: BSD-3-Clause
> -# Copyright(c) 2018 Intel Corporation
> -
> -build = dpdk_conf.has('RTE_LIBRTE_VHOST')
> -reason = 'missing dependency, DPDK vhost library'
> -allow_experimental_apis = true
> -sources = files('ifcvf_vdpa.c', 'base/ifcvf.c')
> -includes += include_directories('base')
> -deps += 'vhost'
> diff --git a/drivers/net/ifc/rte_pmd_ifc_version.map
> b/drivers/net/ifc/rte_pmd_ifc_version.map
> deleted file mode 100644
> index f9f17e4..0000000
> --- a/drivers/net/ifc/rte_pmd_ifc_version.map
> +++ /dev/null
> @@ -1,3 +0,0 @@
> -DPDK_20.0 {
> -	local: *;
> -};
> diff --git a/drivers/net/meson.build b/drivers/net/meson.build
> index c300afb..b0ea8fe 100644
> --- a/drivers/net/meson.build
> +++ b/drivers/net/meson.build
> @@ -21,7 +21,6 @@ drivers = ['af_packet',
>  	'hns3',
>  	'iavf',
>  	'ice',
> -	'ifc',
>  	'ipn3ke',
>  	'ixgbe',
>  	'kni',
> diff --git a/drivers/vdpa/Makefile b/drivers/vdpa/Makefile
> index 82a2b70..27fec96 100644
> --- a/drivers/vdpa/Makefile
> +++ b/drivers/vdpa/Makefile
> @@ -5,4 +5,10 @@ include $(RTE_SDK)/mk/rte.vars.mk
> 
>  # DIRS-$(<configuration>) += <directory>
> 
> +ifeq ($(CONFIG_RTE_LIBRTE_VHOST),y)
> +ifeq ($(CONFIG_RTE_EAL_VFIO),y)
> +DIRS-$(CONFIG_RTE_LIBRTE_IFC_PMD) += ifc
> +endif
> +endif # $(CONFIG_RTE_LIBRTE_VHOST)
> +
>  include $(RTE_SDK)/mk/rte.subdir.mk
> diff --git a/drivers/vdpa/ifc/Makefile b/drivers/vdpa/ifc/Makefile
> new file mode 100644
> index 0000000..fe227b8
> --- /dev/null
> +++ b/drivers/vdpa/ifc/Makefile
> @@ -0,0 +1,34 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(c) 2018 Intel Corporation
> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +#
> +# library name
> +#
> +LIB = librte_pmd_ifc.a
> +
> +LDLIBS += -lpthread
> +LDLIBS += -lrte_eal -lrte_pci -lrte_vhost -lrte_bus_pci
> +LDLIBS += -lrte_kvargs
> +
> +CFLAGS += -O3
> +CFLAGS += $(WERROR_FLAGS)
> +CFLAGS += -DALLOW_EXPERIMENTAL_API
> +
> +#
> +# Add extra flags for base driver source files to disable warnings in them
> +#
> +BASE_DRIVER_OBJS=$(sort $(patsubst %.c,%.o,$(notdir $(wildcard
> $(SRCDIR)/base/*.c))))
> +
> +VPATH += $(SRCDIR)/base
> +
> +EXPORT_MAP := rte_pmd_ifc_version.map
> +
> +#
> +# all source are stored in SRCS-y
> +#
> +SRCS-$(CONFIG_RTE_LIBRTE_IFC_PMD) += ifcvf_vdpa.c
> +SRCS-$(CONFIG_RTE_LIBRTE_IFC_PMD) += ifcvf.c
> +
> +include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/drivers/vdpa/ifc/base/ifcvf.c b/drivers/vdpa/ifc/base/ifcvf.c
> new file mode 100644
> index 0000000..3c0b2df
> --- /dev/null
> +++ b/drivers/vdpa/ifc/base/ifcvf.c
> @@ -0,0 +1,329 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2018 Intel Corporation
> + */
> +
> +#include "ifcvf.h"
> +#include "ifcvf_osdep.h"
> +
> +STATIC void *
> +get_cap_addr(struct ifcvf_hw *hw, struct ifcvf_pci_cap *cap)
> +{
> +	u8 bar = cap->bar;
> +	u32 length = cap->length;
> +	u32 offset = cap->offset;
> +
> +	if (bar > IFCVF_PCI_MAX_RESOURCE - 1) {
> +		DEBUGOUT("invalid bar: %u\n", bar);
> +		return NULL;
> +	}
> +
> +	if (offset + length < offset) {
> +		DEBUGOUT("offset(%u) + length(%u) overflows\n",
> +			offset, length);
> +		return NULL;
> +	}
> +
> +	if (offset + length > hw->mem_resource[cap->bar].len) {
> +		DEBUGOUT("offset(%u) + length(%u) overflows bar
> length(%u)",
> +			offset, length, (u32)hw->mem_resource[cap-
> >bar].len);
> +		return NULL;
> +	}
> +
> +	return hw->mem_resource[bar].addr + offset;
> +}
> +
> +int
> +ifcvf_init_hw(struct ifcvf_hw *hw, PCI_DEV *dev)
> +{
> +	int ret;
> +	u8 pos;
> +	struct ifcvf_pci_cap cap;
> +
> +	ret = PCI_READ_CONFIG_BYTE(dev, &pos, PCI_CAPABILITY_LIST);
> +	if (ret < 0) {
> +		DEBUGOUT("failed to read pci capability list\n");
> +		return -1;
> +	}
> +
> +	while (pos) {
> +		ret = PCI_READ_CONFIG_RANGE(dev, (u32 *)&cap,
> +				sizeof(cap), pos);
> +		if (ret < 0) {
> +			DEBUGOUT("failed to read cap at pos: %x", pos);
> +			break;
> +		}
> +
> +		if (cap.cap_vndr != PCI_CAP_ID_VNDR)
> +			goto next;
> +
> +		DEBUGOUT("cfg type: %u, bar: %u, offset: %u, "
> +				"len: %u\n", cap.cfg_type, cap.bar,
> +				cap.offset, cap.length);
> +
> +		switch (cap.cfg_type) {
> +		case IFCVF_PCI_CAP_COMMON_CFG:
> +			hw->common_cfg = get_cap_addr(hw, &cap);
> +			break;
> +		case IFCVF_PCI_CAP_NOTIFY_CFG:
> +			PCI_READ_CONFIG_DWORD(dev, &hw-
> >notify_off_multiplier,
> +					pos + sizeof(cap));
> +			hw->notify_base = get_cap_addr(hw, &cap);
> +			hw->notify_region = cap.bar;
> +			break;
> +		case IFCVF_PCI_CAP_ISR_CFG:
> +			hw->isr = get_cap_addr(hw, &cap);
> +			break;
> +		case IFCVF_PCI_CAP_DEVICE_CFG:
> +			hw->dev_cfg = get_cap_addr(hw, &cap);
> +			break;
> +		}
> +next:
> +		pos = cap.cap_next;
> +	}
> +
> +	hw->lm_cfg = hw->mem_resource[4].addr;
> +
> +	if (hw->common_cfg == NULL || hw->notify_base == NULL ||
> +			hw->isr == NULL || hw->dev_cfg == NULL) {
> +		DEBUGOUT("capability incomplete\n");
> +		return -1;
> +	}
> +
> +	DEBUGOUT("capability mapping:\ncommon cfg: %p\n"
> +			"notify base: %p\nisr cfg: %p\ndevice cfg: %p\n"
> +			"multiplier: %u\n",
> +			hw->common_cfg, hw->dev_cfg,
> +			hw->isr, hw->notify_base,
> +			hw->notify_off_multiplier);
> +
> +	return 0;
> +}
> +
> +STATIC u8
> +ifcvf_get_status(struct ifcvf_hw *hw)
> +{
> +	return IFCVF_READ_REG8(&hw->common_cfg->device_status);
> +}
> +
> +STATIC void
> +ifcvf_set_status(struct ifcvf_hw *hw, u8 status)
> +{
> +	IFCVF_WRITE_REG8(status, &hw->common_cfg->device_status);
> +}
> +
> +STATIC void
> +ifcvf_reset(struct ifcvf_hw *hw)
> +{
> +	ifcvf_set_status(hw, 0);
> +
> +	/* flush status write */
> +	while (ifcvf_get_status(hw))
> +		msec_delay(1);
> +}
> +
> +STATIC void
> +ifcvf_add_status(struct ifcvf_hw *hw, u8 status)
> +{
> +	if (status != 0)
> +		status |= ifcvf_get_status(hw);
> +
> +	ifcvf_set_status(hw, status);
> +	ifcvf_get_status(hw);
> +}
> +
> +u64
> +ifcvf_get_features(struct ifcvf_hw *hw)
> +{
> +	u32 features_lo, features_hi;
> +	struct ifcvf_pci_common_cfg *cfg = hw->common_cfg;
> +
> +	IFCVF_WRITE_REG32(0, &cfg->device_feature_select);
> +	features_lo = IFCVF_READ_REG32(&cfg->device_feature);
> +
> +	IFCVF_WRITE_REG32(1, &cfg->device_feature_select);
> +	features_hi = IFCVF_READ_REG32(&cfg->device_feature);
> +
> +	return ((u64)features_hi << 32) | features_lo;
> +}
> +
> +STATIC void
> +ifcvf_set_features(struct ifcvf_hw *hw, u64 features)
> +{
> +	struct ifcvf_pci_common_cfg *cfg = hw->common_cfg;
> +
> +	IFCVF_WRITE_REG32(0, &cfg->guest_feature_select);
> +	IFCVF_WRITE_REG32(features & ((1ULL << 32) - 1), &cfg-
> >guest_feature);
> +
> +	IFCVF_WRITE_REG32(1, &cfg->guest_feature_select);
> +	IFCVF_WRITE_REG32(features >> 32, &cfg->guest_feature);
> +}
> +
> +STATIC int
> +ifcvf_config_features(struct ifcvf_hw *hw)
> +{
> +	u64 host_features;
> +
> +	host_features = ifcvf_get_features(hw);
> +	hw->req_features &= host_features;
> +
> +	ifcvf_set_features(hw, hw->req_features);
> +	ifcvf_add_status(hw, IFCVF_CONFIG_STATUS_FEATURES_OK);
> +
> +	if (!(ifcvf_get_status(hw) & IFCVF_CONFIG_STATUS_FEATURES_OK)) {
> +		DEBUGOUT("failed to set FEATURES_OK status\n");
> +		return -1;
> +	}
> +
> +	return 0;
> +}
> +
> +STATIC void
> +io_write64_twopart(u64 val, u32 *lo, u32 *hi)
> +{
> +	IFCVF_WRITE_REG32(val & ((1ULL << 32) - 1), lo);
> +	IFCVF_WRITE_REG32(val >> 32, hi);
> +}
> +
> +STATIC int
> +ifcvf_hw_enable(struct ifcvf_hw *hw)
> +{
> +	struct ifcvf_pci_common_cfg *cfg;
> +	u8 *lm_cfg;
> +	u32 i;
> +	u16 notify_off;
> +
> +	cfg = hw->common_cfg;
> +	lm_cfg = hw->lm_cfg;
> +
> +	IFCVF_WRITE_REG16(0, &cfg->msix_config);
> +	if (IFCVF_READ_REG16(&cfg->msix_config) == IFCVF_MSI_NO_VECTOR) {
> +		DEBUGOUT("msix vec alloc failed for device config\n");
> +		return -1;
> +	}
> +
> +	for (i = 0; i < hw->nr_vring; i++) {
> +		IFCVF_WRITE_REG16(i, &cfg->queue_select);
> +		io_write64_twopart(hw->vring[i].desc, &cfg->queue_desc_lo,
> +				&cfg->queue_desc_hi);
> +		io_write64_twopart(hw->vring[i].avail, &cfg->queue_avail_lo,
> +				&cfg->queue_avail_hi);
> +		io_write64_twopart(hw->vring[i].used, &cfg->queue_used_lo,
> +				&cfg->queue_used_hi);
> +		IFCVF_WRITE_REG16(hw->vring[i].size, &cfg->queue_size);
> +
> +		*(u32 *)(lm_cfg + IFCVF_LM_RING_STATE_OFFSET +
> +				(i / 2) * IFCVF_LM_CFG_SIZE + (i % 2) * 4) =
> +			(u32)hw->vring[i].last_avail_idx |
> +			((u32)hw->vring[i].last_used_idx << 16);
> +
> +		IFCVF_WRITE_REG16(i + 1, &cfg->queue_msix_vector);
> +		if (IFCVF_READ_REG16(&cfg->queue_msix_vector) ==
> +				IFCVF_MSI_NO_VECTOR) {
> +			DEBUGOUT("queue %u, msix vec alloc failed\n",
> +					i);
> +			return -1;
> +		}
> +
> +		notify_off = IFCVF_READ_REG16(&cfg->queue_notify_off);
> +		hw->notify_addr[i] = (void *)((u8 *)hw->notify_base +
> +				notify_off * hw->notify_off_multiplier);
> +		IFCVF_WRITE_REG16(1, &cfg->queue_enable);
> +	}
> +
> +	return 0;
> +}
> +
> +STATIC void
> +ifcvf_hw_disable(struct ifcvf_hw *hw)
> +{
> +	u32 i;
> +	struct ifcvf_pci_common_cfg *cfg;
> +	u32 ring_state;
> +
> +	cfg = hw->common_cfg;
> +
> +	IFCVF_WRITE_REG16(IFCVF_MSI_NO_VECTOR, &cfg->msix_config);
> +	for (i = 0; i < hw->nr_vring; i++) {
> +		IFCVF_WRITE_REG16(i, &cfg->queue_select);
> +		IFCVF_WRITE_REG16(0, &cfg->queue_enable);
> +		IFCVF_WRITE_REG16(IFCVF_MSI_NO_VECTOR, &cfg->queue_msix_vector);
> +		ring_state = *(u32 *)(hw->lm_cfg + IFCVF_LM_RING_STATE_OFFSET +
> +				(i / 2) * IFCVF_LM_CFG_SIZE + (i % 2) * 4);
> +		hw->vring[i].last_avail_idx = (u16)(ring_state >> 16);
> +		hw->vring[i].last_used_idx = (u16)(ring_state >> 16);
> +	}
> +}
> +
> +int
> +ifcvf_start_hw(struct ifcvf_hw *hw)
> +{
> +	ifcvf_reset(hw);
> +	ifcvf_add_status(hw, IFCVF_CONFIG_STATUS_ACK);
> +	ifcvf_add_status(hw, IFCVF_CONFIG_STATUS_DRIVER);
> +
> +	if (ifcvf_config_features(hw) < 0)
> +		return -1;
> +
> +	if (ifcvf_hw_enable(hw) < 0)
> +		return -1;
> +
> +	ifcvf_add_status(hw, IFCVF_CONFIG_STATUS_DRIVER_OK);
> +	return 0;
> +}
> +
> +void
> +ifcvf_stop_hw(struct ifcvf_hw *hw)
> +{
> +	ifcvf_hw_disable(hw);
> +	ifcvf_reset(hw);
> +}
> +
> +void
> +ifcvf_enable_logging(struct ifcvf_hw *hw, u64 log_base, u64 log_size)
> +{
> +	u8 *lm_cfg;
> +
> +	lm_cfg = hw->lm_cfg;
> +
> +	*(u32 *)(lm_cfg + IFCVF_LM_BASE_ADDR_LOW) =
> +		log_base & IFCVF_32_BIT_MASK;
> +
> +	*(u32 *)(lm_cfg + IFCVF_LM_BASE_ADDR_HIGH) =
> +		(log_base >> 32) & IFCVF_32_BIT_MASK;
> +
> +	*(u32 *)(lm_cfg + IFCVF_LM_END_ADDR_LOW) =
> +		(log_base + log_size) & IFCVF_32_BIT_MASK;
> +
> +	*(u32 *)(lm_cfg + IFCVF_LM_END_ADDR_HIGH) =
> +		((log_base + log_size) >> 32) & IFCVF_32_BIT_MASK;
> +
> +	*(u32 *)(lm_cfg + IFCVF_LM_LOGGING_CTRL) = IFCVF_LM_ENABLE_VF;
> +}
> +
> +void
> +ifcvf_disable_logging(struct ifcvf_hw *hw)
> +{
> +	u8 *lm_cfg;
> +
> +	lm_cfg = hw->lm_cfg;
> +	*(u32 *)(lm_cfg + IFCVF_LM_LOGGING_CTRL) = IFCVF_LM_DISABLE;
> +}
> +
> +void
> +ifcvf_notify_queue(struct ifcvf_hw *hw, u16 qid)
> +{
> +	IFCVF_WRITE_REG16(qid, hw->notify_addr[qid]);
> +}
> +
> +u8
> +ifcvf_get_notify_region(struct ifcvf_hw *hw)
> +{
> +	return hw->notify_region;
> +}
> +
> +u64
> +ifcvf_get_queue_notify_off(struct ifcvf_hw *hw, int qid)
> +{
> +	return (u8 *)hw->notify_addr[qid] -
> +		(u8 *)hw->mem_resource[hw->notify_region].addr;
> +}
> diff --git a/drivers/vdpa/ifc/base/ifcvf.h b/drivers/vdpa/ifc/base/ifcvf.h
> new file mode 100644
> index 0000000..9be2770
> --- /dev/null
> +++ b/drivers/vdpa/ifc/base/ifcvf.h
> @@ -0,0 +1,162 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2018 Intel Corporation
> + */
> +
> +#ifndef _IFCVF_H_
> +#define _IFCVF_H_
> +
> +#include "ifcvf_osdep.h"
> +
> +#define IFCVF_VENDOR_ID		0x1AF4
> +#define IFCVF_DEVICE_ID		0x1041
> +#define IFCVF_SUBSYS_VENDOR_ID	0x8086
> +#define IFCVF_SUBSYS_DEVICE_ID	0x001A
> +
> +#define IFCVF_MAX_QUEUES		1
> +#define VIRTIO_F_IOMMU_PLATFORM		33
> +
> +/* Common configuration */
> +#define IFCVF_PCI_CAP_COMMON_CFG	1
> +/* Notifications */
> +#define IFCVF_PCI_CAP_NOTIFY_CFG	2
> +/* ISR Status */
> +#define IFCVF_PCI_CAP_ISR_CFG		3
> +/* Device specific configuration */
> +#define IFCVF_PCI_CAP_DEVICE_CFG	4
> +/* PCI configuration access */
> +#define IFCVF_PCI_CAP_PCI_CFG		5
> +
> +#define IFCVF_CONFIG_STATUS_RESET     0x00
> +#define IFCVF_CONFIG_STATUS_ACK       0x01
> +#define IFCVF_CONFIG_STATUS_DRIVER    0x02
> +#define IFCVF_CONFIG_STATUS_DRIVER_OK 0x04
> +#define IFCVF_CONFIG_STATUS_FEATURES_OK 0x08
> +#define IFCVF_CONFIG_STATUS_FAILED    0x80
> +
> +#define IFCVF_MSI_NO_VECTOR	0xffff
> +#define IFCVF_PCI_MAX_RESOURCE	6
> +
> +#define IFCVF_LM_CFG_SIZE		0x40
> +#define IFCVF_LM_RING_STATE_OFFSET	0x20
> +
> +#define IFCVF_LM_LOGGING_CTRL		0x0
> +
> +#define IFCVF_LM_BASE_ADDR_LOW		0x10
> +#define IFCVF_LM_BASE_ADDR_HIGH		0x14
> +#define IFCVF_LM_END_ADDR_LOW		0x18
> +#define IFCVF_LM_END_ADDR_HIGH		0x1c
> +
> +#define IFCVF_LM_DISABLE		0x0
> +#define IFCVF_LM_ENABLE_VF		0x1
> +#define IFCVF_LM_ENABLE_PF		0x3
> +#define IFCVF_LOG_BASE			0x100000000000
> +#define IFCVF_MEDIATED_VRING		0x200000000000
> +
> +#define IFCVF_32_BIT_MASK		0xffffffff
> +
> +
> +struct ifcvf_pci_cap {
> +	u8 cap_vndr;            /* Generic PCI field: PCI_CAP_ID_VNDR */
> +	u8 cap_next;            /* Generic PCI field: next ptr. */
> +	u8 cap_len;             /* Generic PCI field: capability length */
> +	u8 cfg_type;            /* Identifies the structure. */
> +	u8 bar;                 /* Where to find it. */
> +	u8 padding[3];          /* Pad to full dword. */
> +	u32 offset;             /* Offset within bar. */
> +	u32 length;             /* Length of the structure, in bytes. */
> +};
> +
> +struct ifcvf_pci_notify_cap {
> +	struct ifcvf_pci_cap cap;
> +	u32 notify_off_multiplier;  /* Multiplier for queue_notify_off. */
> +};
> +
> +struct ifcvf_pci_common_cfg {
> +	/* About the whole device. */
> +	u32 device_feature_select;
> +	u32 device_feature;
> +	u32 guest_feature_select;
> +	u32 guest_feature;
> +	u16 msix_config;
> +	u16 num_queues;
> +	u8 device_status;
> +	u8 config_generation;
> +
> +	/* About a specific virtqueue. */
> +	u16 queue_select;
> +	u16 queue_size;
> +	u16 queue_msix_vector;
> +	u16 queue_enable;
> +	u16 queue_notify_off;
> +	u32 queue_desc_lo;
> +	u32 queue_desc_hi;
> +	u32 queue_avail_lo;
> +	u32 queue_avail_hi;
> +	u32 queue_used_lo;
> +	u32 queue_used_hi;
> +};
> +
> +struct ifcvf_net_config {
> +	u8    mac[6];
> +	u16   status;
> +	u16   max_virtqueue_pairs;
> +} __attribute__((packed));
> +
> +struct ifcvf_pci_mem_resource {
> +	u64      phys_addr; /**< Physical address, 0 if not resource. */
> +	u64      len;       /**< Length of the resource. */
> +	u8       *addr;     /**< Virtual address, NULL when not mapped. */
> +};
> +
> +struct vring_info {
> +	u64 desc;
> +	u64 avail;
> +	u64 used;
> +	u16 size;
> +	u16 last_avail_idx;
> +	u16 last_used_idx;
> +};
> +
> +struct ifcvf_hw {
> +	u64    req_features;
> +	u8     notify_region;
> +	u32    notify_off_multiplier;
> +	struct ifcvf_pci_common_cfg *common_cfg;
> +	struct ifcvf_net_config *dev_cfg;
> +	u8     *isr;
> +	u16    *notify_base;
> +	u16    *notify_addr[IFCVF_MAX_QUEUES * 2];
> +	u8     *lm_cfg;
> +	struct vring_info vring[IFCVF_MAX_QUEUES * 2];
> +	u8 nr_vring;
> +	struct ifcvf_pci_mem_resource mem_resource[IFCVF_PCI_MAX_RESOURCE];
> +};
> +
> +int
> +ifcvf_init_hw(struct ifcvf_hw *hw, PCI_DEV *dev);
> +
> +u64
> +ifcvf_get_features(struct ifcvf_hw *hw);
> +
> +int
> +ifcvf_start_hw(struct ifcvf_hw *hw);
> +
> +void
> +ifcvf_stop_hw(struct ifcvf_hw *hw);
> +
> +void
> +ifcvf_enable_logging(struct ifcvf_hw *hw, u64 log_base, u64 log_size);
> +
> +void
> +ifcvf_disable_logging(struct ifcvf_hw *hw);
> +
> +void
> +ifcvf_notify_queue(struct ifcvf_hw *hw, u16 qid);
> +
> +u8
> +ifcvf_get_notify_region(struct ifcvf_hw *hw);
> +
> +u64
> +ifcvf_get_queue_notify_off(struct ifcvf_hw *hw, int qid);
> +
> +#endif /* _IFCVF_H_ */
> diff --git a/drivers/vdpa/ifc/base/ifcvf_osdep.h b/drivers/vdpa/ifc/base/ifcvf_osdep.h
> new file mode 100644
> index 0000000..6aef25e
> --- /dev/null
> +++ b/drivers/vdpa/ifc/base/ifcvf_osdep.h
> @@ -0,0 +1,52 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2018 Intel Corporation
> + */
> +
> +#ifndef _IFCVF_OSDEP_H_
> +#define _IFCVF_OSDEP_H_
> +
> +#include <stdint.h>
> +#include <linux/pci_regs.h>
> +
> +#include <rte_cycles.h>
> +#include <rte_pci.h>
> +#include <rte_bus_pci.h>
> +#include <rte_log.h>
> +#include <rte_io.h>
> +
> +#define DEBUGOUT(S, args...)    RTE_LOG(DEBUG, PMD, S, ##args)
> +#define STATIC                  static
> +
> +#define msec_delay(x)	rte_delay_us_sleep(1000 * (x))
> +
> +#define IFCVF_READ_REG8(reg)		rte_read8(reg)
> +#define IFCVF_WRITE_REG8(val, reg)	rte_write8((val), (reg))
> +#define IFCVF_READ_REG16(reg)		rte_read16(reg)
> +#define IFCVF_WRITE_REG16(val, reg)	rte_write16((val), (reg))
> +#define IFCVF_READ_REG32(reg)		rte_read32(reg)
> +#define IFCVF_WRITE_REG32(val, reg)	rte_write32((val), (reg))
> +
> +typedef struct rte_pci_device PCI_DEV;
> +
> +#define PCI_READ_CONFIG_BYTE(dev, val, where) \
> +	rte_pci_read_config(dev, val, 1, where)
> +
> +#define PCI_READ_CONFIG_DWORD(dev, val, where) \
> +	rte_pci_read_config(dev, val, 4, where)
> +
> +typedef uint8_t    u8;
> +typedef int8_t     s8;
> +typedef uint16_t   u16;
> +typedef int16_t    s16;
> +typedef uint32_t   u32;
> +typedef int32_t    s32;
> +typedef int64_t    s64;
> +typedef uint64_t   u64;
> +
> +static inline int
> +PCI_READ_CONFIG_RANGE(PCI_DEV *dev, uint32_t *val, int size, int where)
> +{
> +	return rte_pci_read_config(dev, val, size, where);
> +}
> +
> +#endif /* _IFCVF_OSDEP_H_ */
> diff --git a/drivers/vdpa/ifc/ifcvf_vdpa.c b/drivers/vdpa/ifc/ifcvf_vdpa.c
> new file mode 100644
> index 0000000..da4667b
> --- /dev/null
> +++ b/drivers/vdpa/ifc/ifcvf_vdpa.c
> @@ -0,0 +1,1280 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2018 Intel Corporation
> + */
> +
> +#include <unistd.h>
> +#include <pthread.h>
> +#include <fcntl.h>
> +#include <string.h>
> +#include <sys/ioctl.h>
> +#include <sys/epoll.h>
> +#include <linux/virtio_net.h>
> +#include <stdbool.h>
> +
> +#include <rte_malloc.h>
> +#include <rte_memory.h>
> +#include <rte_bus_pci.h>
> +#include <rte_vhost.h>
> +#include <rte_vdpa.h>
> +#include <rte_vfio.h>
> +#include <rte_spinlock.h>
> +#include <rte_log.h>
> +#include <rte_kvargs.h>
> +#include <rte_devargs.h>
> +
> +#include "base/ifcvf.h"
> +
> +#define DRV_LOG(level, fmt, args...) \
> +	rte_log(RTE_LOG_ ## level, ifcvf_vdpa_logtype, \
> +		"IFCVF %s(): " fmt "\n", __func__, ##args)
> +
> +#ifndef PAGE_SIZE
> +#define PAGE_SIZE 4096
> +#endif
> +
> +#define IFCVF_USED_RING_LEN(size) \
> +	((size) * sizeof(struct vring_used_elem) + sizeof(uint16_t) * 3)
> +
> +#define IFCVF_VDPA_MODE		"vdpa"
> +#define IFCVF_SW_FALLBACK_LM	"sw-live-migration"
> +
> +static const char * const ifcvf_valid_arguments[] = {
> +	IFCVF_VDPA_MODE,
> +	IFCVF_SW_FALLBACK_LM,
> +	NULL
> +};
> +
> +static int ifcvf_vdpa_logtype;
> +
> +struct ifcvf_internal {
> +	struct rte_vdpa_dev_addr dev_addr;
> +	struct rte_pci_device *pdev;
> +	struct ifcvf_hw hw;
> +	int vfio_container_fd;
> +	int vfio_group_fd;
> +	int vfio_dev_fd;
> +	pthread_t tid;	/* thread for notify relay */
> +	int epfd;
> +	int vid;
> +	int did;
> +	uint16_t max_queues;
> +	uint64_t features;
> +	rte_atomic32_t started;
> +	rte_atomic32_t dev_attached;
> +	rte_atomic32_t running;
> +	rte_spinlock_t lock;
> +	bool sw_lm;
> +	bool sw_fallback_running;
> +	/* mediated vring for sw fallback */
> +	struct vring m_vring[IFCVF_MAX_QUEUES * 2];
> +	/* eventfd for used ring interrupt */
> +	int intr_fd[IFCVF_MAX_QUEUES * 2];
> +};
> +
> +struct internal_list {
> +	TAILQ_ENTRY(internal_list) next;
> +	struct ifcvf_internal *internal;
> +};
> +
> +TAILQ_HEAD(internal_list_head, internal_list);
> +static struct internal_list_head internal_list =
> +	TAILQ_HEAD_INITIALIZER(internal_list);
> +
> +static pthread_mutex_t internal_list_lock = PTHREAD_MUTEX_INITIALIZER;
> +
> +static void update_used_ring(struct ifcvf_internal *internal, uint16_t qid);
> +
> +static struct internal_list *
> +find_internal_resource_by_did(int did)
> +{
> +	int found = 0;
> +	struct internal_list *list;
> +
> +	pthread_mutex_lock(&internal_list_lock);
> +
> +	TAILQ_FOREACH(list, &internal_list, next) {
> +		if (did == list->internal->did) {
> +			found = 1;
> +			break;
> +		}
> +	}
> +
> +	pthread_mutex_unlock(&internal_list_lock);
> +
> +	if (!found)
> +		return NULL;
> +
> +	return list;
> +}
> +
> +static struct internal_list *
> +find_internal_resource_by_dev(struct rte_pci_device *pdev)
> +{
> +	int found = 0;
> +	struct internal_list *list;
> +
> +	pthread_mutex_lock(&internal_list_lock);
> +
> +	TAILQ_FOREACH(list, &internal_list, next) {
> +		if (pdev == list->internal->pdev) {
> +			found = 1;
> +			break;
> +		}
> +	}
> +
> +	pthread_mutex_unlock(&internal_list_lock);
> +
> +	if (!found)
> +		return NULL;
> +
> +	return list;
> +}
> +
> +static int
> +ifcvf_vfio_setup(struct ifcvf_internal *internal)
> +{
> +	struct rte_pci_device *dev = internal->pdev;
> +	char devname[RTE_DEV_NAME_MAX_LEN] = {0};
> +	int iommu_group_num;
> +	int i, ret;
> +
> +	internal->vfio_dev_fd = -1;
> +	internal->vfio_group_fd = -1;
> +	internal->vfio_container_fd = -1;
> +
> +	rte_pci_device_name(&dev->addr, devname, RTE_DEV_NAME_MAX_LEN);
> +	ret = rte_vfio_get_group_num(rte_pci_get_sysfs_path(), devname,
> +			&iommu_group_num);
> +	if (ret <= 0) {
> +		DRV_LOG(ERR, "%s failed to get IOMMU group", devname);
> +		return -1;
> +	}
> +
> +	internal->vfio_container_fd = rte_vfio_container_create();
> +	if (internal->vfio_container_fd < 0)
> +		return -1;
> +
> +	internal->vfio_group_fd = rte_vfio_container_group_bind(
> +			internal->vfio_container_fd, iommu_group_num);
> +	if (internal->vfio_group_fd < 0)
> +		goto err;
> +
> +	if (rte_pci_map_device(dev))
> +		goto err;
> +
> +	internal->vfio_dev_fd = dev->intr_handle.vfio_dev_fd;
> +
> +	for (i = 0; i < RTE_MIN(PCI_MAX_RESOURCE, IFCVF_PCI_MAX_RESOURCE);
> +			i++) {
> +		internal->hw.mem_resource[i].addr =
> +			internal->pdev->mem_resource[i].addr;
> +		internal->hw.mem_resource[i].phys_addr =
> +			internal->pdev->mem_resource[i].phys_addr;
> +		internal->hw.mem_resource[i].len =
> +			internal->pdev->mem_resource[i].len;
> +	}
> +
> +	return 0;
> +
> +err:
> +	rte_vfio_container_destroy(internal->vfio_container_fd);
> +	return -1;
> +}
> +
> +static int
> +ifcvf_dma_map(struct ifcvf_internal *internal, int do_map)
> +{
> +	uint32_t i;
> +	int ret;
> +	struct rte_vhost_memory *mem = NULL;
> +	int vfio_container_fd;
> +
> +	ret = rte_vhost_get_mem_table(internal->vid, &mem);
> +	if (ret < 0) {
> +		DRV_LOG(ERR, "failed to get VM memory layout.");
> +		goto exit;
> +	}
> +
> +	vfio_container_fd = internal->vfio_container_fd;
> +
> +	for (i = 0; i < mem->nregions; i++) {
> +		struct rte_vhost_mem_region *reg;
> +
> +		reg = &mem->regions[i];
> +		DRV_LOG(INFO, "%s, region %u: HVA 0x%" PRIx64 ", "
> +			"GPA 0x%" PRIx64 ", size 0x%" PRIx64 ".",
> +			do_map ? "DMA map" : "DMA unmap", i,
> +			reg->host_user_addr, reg->guest_phys_addr, reg->size);
> +
> +		if (do_map) {
> +			ret = rte_vfio_container_dma_map(vfio_container_fd,
> +				reg->host_user_addr, reg->guest_phys_addr,
> +				reg->size);
> +			if (ret < 0) {
> +				DRV_LOG(ERR, "DMA map failed.");
> +				goto exit;
> +			}
> +		} else {
> +			ret = rte_vfio_container_dma_unmap(vfio_container_fd,
> +				reg->host_user_addr, reg->guest_phys_addr,
> +				reg->size);
> +			if (ret < 0) {
> +				DRV_LOG(ERR, "DMA unmap failed.");
> +				goto exit;
> +			}
> +		}
> +	}
> +
> +exit:
> +	if (mem)
> +		free(mem);
> +	return ret;
> +}
> +
> +static uint64_t
> +hva_to_gpa(int vid, uint64_t hva)
> +{
> +	struct rte_vhost_memory *mem = NULL;
> +	struct rte_vhost_mem_region *reg;
> +	uint32_t i;
> +	uint64_t gpa = 0;
> +
> +	if (rte_vhost_get_mem_table(vid, &mem) < 0)
> +		goto exit;
> +
> +	for (i = 0; i < mem->nregions; i++) {
> +		reg = &mem->regions[i];
> +
> +		if (hva >= reg->host_user_addr &&
> +				hva < reg->host_user_addr + reg->size) {
> +			gpa = hva - reg->host_user_addr + reg->guest_phys_addr;
> +			break;
> +		}
> +	}
> +
> +exit:
> +	if (mem)
> +		free(mem);
> +	return gpa;
> +}
> +
> +static int
> +vdpa_ifcvf_start(struct ifcvf_internal *internal)
> +{
> +	struct ifcvf_hw *hw = &internal->hw;
> +	int i, nr_vring;
> +	int vid;
> +	struct rte_vhost_vring vq;
> +	uint64_t gpa;
> +
> +	vid = internal->vid;
> +	nr_vring = rte_vhost_get_vring_num(vid);
> +	rte_vhost_get_negotiated_features(vid, &hw->req_features);
> +
> +	for (i = 0; i < nr_vring; i++) {
> +		rte_vhost_get_vhost_vring(vid, i, &vq);
> +		gpa = hva_to_gpa(vid, (uint64_t)(uintptr_t)vq.desc);
> +		if (gpa == 0) {
> +			DRV_LOG(ERR, "Fail to get GPA for descriptor ring.");
> +			return -1;
> +		}
> +		hw->vring[i].desc = gpa;
> +
> +		gpa = hva_to_gpa(vid, (uint64_t)(uintptr_t)vq.avail);
> +		if (gpa == 0) {
> +			DRV_LOG(ERR, "Fail to get GPA for available ring.");
> +			return -1;
> +		}
> +		hw->vring[i].avail = gpa;
> +
> +		gpa = hva_to_gpa(vid, (uint64_t)(uintptr_t)vq.used);
> +		if (gpa == 0) {
> +			DRV_LOG(ERR, "Fail to get GPA for used ring.");
> +			return -1;
> +		}
> +		hw->vring[i].used = gpa;
> +
> +		hw->vring[i].size = vq.size;
> +		rte_vhost_get_vring_base(vid, i, &hw->vring[i].last_avail_idx,
> +				&hw->vring[i].last_used_idx);
> +	}
> +	hw->nr_vring = i;
> +
> +	return ifcvf_start_hw(&internal->hw);
> +}
> +
> +static void
> +vdpa_ifcvf_stop(struct ifcvf_internal *internal)
> +{
> +	struct ifcvf_hw *hw = &internal->hw;
> +	uint32_t i;
> +	int vid;
> +	uint64_t features = 0;
> +	uint64_t log_base = 0, log_size = 0;
> +	uint64_t len;
> +
> +	vid = internal->vid;
> +	ifcvf_stop_hw(hw);
> +
> +	for (i = 0; i < hw->nr_vring; i++)
> +		rte_vhost_set_vring_base(vid, i, hw->vring[i].last_avail_idx,
> +				hw->vring[i].last_used_idx);
> +
> +	if (internal->sw_lm)
> +		return;
> +
> +	rte_vhost_get_negotiated_features(vid, &features);
> +	if (RTE_VHOST_NEED_LOG(features)) {
> +		ifcvf_disable_logging(hw);
> +		rte_vhost_get_log_base(internal->vid, &log_base, &log_size);
> +		rte_vfio_container_dma_unmap(internal->vfio_container_fd,
> +				log_base, IFCVF_LOG_BASE, log_size);
> +		/*
> +		 * IFCVF marks dirty memory pages for only packet buffer,
> +		 * SW helps to mark the used ring as dirty after device stops.
> +		 */
> +		for (i = 0; i < hw->nr_vring; i++) {
> +			len = IFCVF_USED_RING_LEN(hw->vring[i].size);
> +			rte_vhost_log_used_vring(vid, i, 0, len);
> +		}
> +	}
> +}
> +
> +#define MSIX_IRQ_SET_BUF_LEN (sizeof(struct vfio_irq_set) + \
> +		sizeof(int) * (IFCVF_MAX_QUEUES * 2 + 1))
> +static int
> +vdpa_enable_vfio_intr(struct ifcvf_internal *internal, bool m_rx)
> +{
> +	int ret;
> +	uint32_t i, nr_vring;
> +	char irq_set_buf[MSIX_IRQ_SET_BUF_LEN];
> +	struct vfio_irq_set *irq_set;
> +	int *fd_ptr;
> +	struct rte_vhost_vring vring;
> +	int fd;
> +
> +	vring.callfd = -1;
> +
> +	nr_vring = rte_vhost_get_vring_num(internal->vid);
> +
> +	irq_set = (struct vfio_irq_set *)irq_set_buf;
> +	irq_set->argsz = sizeof(irq_set_buf);
> +	irq_set->count = nr_vring + 1;
> +	irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD |
> +			 VFIO_IRQ_SET_ACTION_TRIGGER;
> +	irq_set->index = VFIO_PCI_MSIX_IRQ_INDEX;
> +	irq_set->start = 0;
> +	fd_ptr = (int *)&irq_set->data;
> +	fd_ptr[RTE_INTR_VEC_ZERO_OFFSET] = internal->pdev->intr_handle.fd;
> +
> +	for (i = 0; i < nr_vring; i++)
> +		internal->intr_fd[i] = -1;
> +
> +	for (i = 0; i < nr_vring; i++) {
> +		rte_vhost_get_vhost_vring(internal->vid, i, &vring);
> +		fd_ptr[RTE_INTR_VEC_RXTX_OFFSET + i] = vring.callfd;
> +		if ((i & 1) == 0 && m_rx == true) {
> +			fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
> +			if (fd < 0) {
> +				DRV_LOG(ERR, "can't setup eventfd: %s",
> +					strerror(errno));
> +				return -1;
> +			}
> +			internal->intr_fd[i] = fd;
> +			fd_ptr[RTE_INTR_VEC_RXTX_OFFSET + i] = fd;
> +		}
> +	}
> +
> +	ret = ioctl(internal->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
> +	if (ret) {
> +		DRV_LOG(ERR, "Error enabling MSI-X interrupts: %s",
> +				strerror(errno));
> +		return -1;
> +	}
> +
> +	return 0;
> +}
> +
> +static int
> +vdpa_disable_vfio_intr(struct ifcvf_internal *internal)
> +{
> +	int ret;
> +	uint32_t i, nr_vring;
> +	char irq_set_buf[MSIX_IRQ_SET_BUF_LEN];
> +	struct vfio_irq_set *irq_set;
> +
> +	irq_set = (struct vfio_irq_set *)irq_set_buf;
> +	irq_set->argsz = sizeof(irq_set_buf);
> +	irq_set->count = 0;
> +	irq_set->flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_TRIGGER;
> +	irq_set->index = VFIO_PCI_MSIX_IRQ_INDEX;
> +	irq_set->start = 0;
> +
> +	nr_vring = rte_vhost_get_vring_num(internal->vid);
> +	for (i = 0; i < nr_vring; i++) {
> +		if (internal->intr_fd[i] >= 0)
> +			close(internal->intr_fd[i]);
> +		internal->intr_fd[i] = -1;
> +	}
> +
> +	ret = ioctl(internal->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
> +	if (ret) {
> +		DRV_LOG(ERR, "Error disabling MSI-X interrupts: %s",
> +				strerror(errno));
> +		return -1;
> +	}
> +
> +	return 0;
> +}
> +
> +static void *
> +notify_relay(void *arg)
> +{
> +	int i, kickfd, epfd, nfds = 0;
> +	uint32_t qid, q_num;
> +	struct epoll_event events[IFCVF_MAX_QUEUES * 2];
> +	struct epoll_event ev;
> +	uint64_t buf;
> +	int nbytes;
> +	struct rte_vhost_vring vring;
> +	struct ifcvf_internal *internal = (struct ifcvf_internal *)arg;
> +	struct ifcvf_hw *hw = &internal->hw;
> +
> +	q_num = rte_vhost_get_vring_num(internal->vid);
> +
> +	epfd = epoll_create(IFCVF_MAX_QUEUES * 2);
> +	if (epfd < 0) {
> +		DRV_LOG(ERR, "failed to create epoll instance.");
> +		return NULL;
> +	}
> +	internal->epfd = epfd;
> +
> +	vring.kickfd = -1;
> +	for (qid = 0; qid < q_num; qid++) {
> +		ev.events = EPOLLIN | EPOLLPRI;
> +		rte_vhost_get_vhost_vring(internal->vid, qid, &vring);
> +		ev.data.u64 = qid | (uint64_t)vring.kickfd << 32;
> +		if (epoll_ctl(epfd, EPOLL_CTL_ADD, vring.kickfd, &ev) < 0) {
> +			DRV_LOG(ERR, "epoll add error: %s", strerror(errno));
> +			return NULL;
> +		}
> +	}
> +
> +	for (;;) {
> +		nfds = epoll_wait(epfd, events, q_num, -1);
> +		if (nfds < 0) {
> +			if (errno == EINTR)
> +				continue;
> +			DRV_LOG(ERR, "epoll_wait return fail\n");
> +			return NULL;
> +		}
> +
> +		for (i = 0; i < nfds; i++) {
> +			qid = events[i].data.u32;
> +			kickfd = (uint32_t)(events[i].data.u64 >> 32);
> +			do {
> +				nbytes = read(kickfd, &buf, 8);
> +				if (nbytes < 0) {
> +					if (errno == EINTR ||
> +					    errno == EWOULDBLOCK ||
> +					    errno == EAGAIN)
> +						continue;
> +					DRV_LOG(INFO, "Error reading "
> +						"kickfd: %s",
> +						strerror(errno));
> +				}
> +				break;
> +			} while (1);
> +
> +			ifcvf_notify_queue(hw, qid);
> +		}
> +	}
> +
> +	return NULL;
> +}
> +
> +static int
> +setup_notify_relay(struct ifcvf_internal *internal)
> +{
> +	int ret;
> +
> +	ret = pthread_create(&internal->tid, NULL, notify_relay,
> +			(void *)internal);
> +	if (ret) {
> +		DRV_LOG(ERR, "failed to create notify relay pthread.");
> +		return -1;
> +	}
> +	return 0;
> +}
> +
> +static int
> +unset_notify_relay(struct ifcvf_internal *internal)
> +{
> +	void *status;
> +
> +	if (internal->tid) {
> +		pthread_cancel(internal->tid);
> +		pthread_join(internal->tid, &status);
> +	}
> +	internal->tid = 0;
> +
> +	if (internal->epfd >= 0)
> +		close(internal->epfd);
> +	internal->epfd = -1;
> +
> +	return 0;
> +}
> +
> +static int
> +update_datapath(struct ifcvf_internal *internal)
> +{
> +	int ret;
> +
> +	rte_spinlock_lock(&internal->lock);
> +
> +	if (!rte_atomic32_read(&internal->running) &&
> +	    (rte_atomic32_read(&internal->started) &&
> +	     rte_atomic32_read(&internal->dev_attached))) {
> +		ret = ifcvf_dma_map(internal, 1);
> +		if (ret)
> +			goto err;
> +
> +		ret = vdpa_enable_vfio_intr(internal, 0);
> +		if (ret)
> +			goto err;
> +
> +		ret = vdpa_ifcvf_start(internal);
> +		if (ret)
> +			goto err;
> +
> +		ret = setup_notify_relay(internal);
> +		if (ret)
> +			goto err;
> +
> +		rte_atomic32_set(&internal->running, 1);
> +	} else if (rte_atomic32_read(&internal->running) &&
> +		   (!rte_atomic32_read(&internal->started) ||
> +		    !rte_atomic32_read(&internal->dev_attached))) {
> +		ret = unset_notify_relay(internal);
> +		if (ret)
> +			goto err;
> +
> +		vdpa_ifcvf_stop(internal);
> +
> +		ret = vdpa_disable_vfio_intr(internal);
> +		if (ret)
> +			goto err;
> +
> +		ret = ifcvf_dma_map(internal, 0);
> +		if (ret)
> +			goto err;
> +
> +		rte_atomic32_set(&internal->running, 0);
> +	}
> +
> +	rte_spinlock_unlock(&internal->lock);
> +	return 0;
> +err:
> +	rte_spinlock_unlock(&internal->lock);
> +	return ret;
> +}
> +
> +static int
> +m_ifcvf_start(struct ifcvf_internal *internal)
> +{
> +	struct ifcvf_hw *hw = &internal->hw;
> +	uint32_t i, nr_vring;
> +	int vid, ret;
> +	struct rte_vhost_vring vq;
> +	void *vring_buf;
> +	uint64_t m_vring_iova = IFCVF_MEDIATED_VRING;
> +	uint64_t size;
> +	uint64_t gpa;
> +
> +	memset(&vq, 0, sizeof(vq));
> +	vid = internal->vid;
> +	nr_vring = rte_vhost_get_vring_num(vid);
> +	rte_vhost_get_negotiated_features(vid, &hw->req_features);
> +
> +	for (i = 0; i < nr_vring; i++) {
> +		rte_vhost_get_vhost_vring(vid, i, &vq);
> +
> +		size = RTE_ALIGN_CEIL(vring_size(vq.size, PAGE_SIZE),
> +				PAGE_SIZE);
> +		vring_buf = rte_zmalloc("ifcvf", size, PAGE_SIZE);
> +		vring_init(&internal->m_vring[i], vq.size, vring_buf,
> +				PAGE_SIZE);
> +
> +		ret = rte_vfio_container_dma_map(internal->vfio_container_fd,
> +			(uint64_t)(uintptr_t)vring_buf, m_vring_iova, size);
> +		if (ret < 0) {
> +			DRV_LOG(ERR, "mediated vring DMA map failed.");
> +			goto error;
> +		}
> +
> +		gpa = hva_to_gpa(vid, (uint64_t)(uintptr_t)vq.desc);
> +		if (gpa == 0) {
> +			DRV_LOG(ERR, "Fail to get GPA for descriptor ring.");
> +			return -1;
> +		}
> +		hw->vring[i].desc = gpa;
> +
> +		gpa = hva_to_gpa(vid, (uint64_t)(uintptr_t)vq.avail);
> +		if (gpa == 0) {
> +			DRV_LOG(ERR, "Fail to get GPA for available ring.");
> +			return -1;
> +		}
> +		hw->vring[i].avail = gpa;
> +
> +		/* Direct I/O for Tx queue, relay for Rx queue */
> +		if (i & 1) {
> +			gpa = hva_to_gpa(vid, (uint64_t)(uintptr_t)vq.used);
> +			if (gpa == 0) {
> +				DRV_LOG(ERR, "Fail to get GPA for used ring.");
> +				return -1;
> +			}
> +			hw->vring[i].used = gpa;
> +		} else {
> +			hw->vring[i].used = m_vring_iova +
> +				(char *)internal->m_vring[i].used -
> +				(char *)internal->m_vring[i].desc;
> +		}
> +
> +		hw->vring[i].size = vq.size;
> +
> +		rte_vhost_get_vring_base(vid, i,
> +				&internal->m_vring[i].avail->idx,
> +				&internal->m_vring[i].used->idx);
> +
> +		rte_vhost_get_vring_base(vid, i, &hw->vring[i].last_avail_idx,
> +				&hw->vring[i].last_used_idx);
> +
> +		m_vring_iova += size;
> +	}
> +	hw->nr_vring = nr_vring;
> +
> +	return ifcvf_start_hw(&internal->hw);
> +
> +error:
> +	for (i = 0; i < nr_vring; i++)
> +		if (internal->m_vring[i].desc)
> +			rte_free(internal->m_vring[i].desc);
> +
> +	return -1;
> +}
> +
> +static int
> +m_ifcvf_stop(struct ifcvf_internal *internal)
> +{
> +	int vid;
> +	uint32_t i;
> +	struct rte_vhost_vring vq;
> +	struct ifcvf_hw *hw = &internal->hw;
> +	uint64_t m_vring_iova = IFCVF_MEDIATED_VRING;
> +	uint64_t size, len;
> +
> +	vid = internal->vid;
> +	ifcvf_stop_hw(hw);
> +
> +	for (i = 0; i < hw->nr_vring; i++) {
> +		/* synchronize remaining new used entries if any */
> +		if ((i & 1) == 0)
> +			update_used_ring(internal, i);
> +
> +		rte_vhost_get_vhost_vring(vid, i, &vq);
> +		len = IFCVF_USED_RING_LEN(vq.size);
> +		rte_vhost_log_used_vring(vid, i, 0, len);
> +
> +		size = RTE_ALIGN_CEIL(vring_size(vq.size, PAGE_SIZE),
> +				PAGE_SIZE);
> +		rte_vfio_container_dma_unmap(internal->vfio_container_fd,
> +			(uint64_t)(uintptr_t)internal->m_vring[i].desc,
> +			m_vring_iova, size);
> +
> +		rte_vhost_set_vring_base(vid, i, hw->vring[i].last_avail_idx,
> +				hw->vring[i].last_used_idx);
> +		rte_free(internal->m_vring[i].desc);
> +		m_vring_iova += size;
> +	}
> +
> +	return 0;
> +}
> +
> +static void
> +update_used_ring(struct ifcvf_internal *internal, uint16_t qid)
> +{
> +	rte_vdpa_relay_vring_used(internal->vid, qid, &internal->m_vring[qid]);
> +	rte_vhost_vring_call(internal->vid, qid);
> +}
> +
> +static void *
> +vring_relay(void *arg)
> +{
> +	int i, vid, epfd, fd, nfds;
> +	struct ifcvf_internal *internal = (struct ifcvf_internal *)arg;
> +	struct rte_vhost_vring vring;
> +	uint16_t qid, q_num;
> +	struct epoll_event events[IFCVF_MAX_QUEUES * 4];
> +	struct epoll_event ev;
> +	int nbytes;
> +	uint64_t buf;
> +
> +	vid = internal->vid;
> +	q_num = rte_vhost_get_vring_num(vid);
> +
> +	/* add notify fd and interrupt fd to epoll */
> +	epfd = epoll_create(IFCVF_MAX_QUEUES * 2);
> +	if (epfd < 0) {
> +		DRV_LOG(ERR, "failed to create epoll instance.");
> +		return NULL;
> +	}
> +	internal->epfd = epfd;
> +
> +	vring.kickfd = -1;
> +	for (qid = 0; qid < q_num; qid++) {
> +		ev.events = EPOLLIN | EPOLLPRI;
> +		rte_vhost_get_vhost_vring(vid, qid, &vring);
> +		ev.data.u64 = qid << 1 | (uint64_t)vring.kickfd << 32;
> +		if (epoll_ctl(epfd, EPOLL_CTL_ADD, vring.kickfd, &ev) < 0) {
> +			DRV_LOG(ERR, "epoll add error: %s", strerror(errno));
> +			return NULL;
> +		}
> +	}
> +
> +	for (qid = 0; qid < q_num; qid += 2) {
> +		ev.events = EPOLLIN | EPOLLPRI;
> +		/* leave a flag to mark it's for interrupt */
> +		ev.data.u64 = 1 | qid << 1 |
> +			(uint64_t)internal->intr_fd[qid] << 32;
> +		if (epoll_ctl(epfd, EPOLL_CTL_ADD, internal->intr_fd[qid], &ev)
> +				< 0) {
> +			DRV_LOG(ERR, "epoll add error: %s", strerror(errno));
> +			return NULL;
> +		}
> +		update_used_ring(internal, qid);
> +	}
> +
> +	/* start relay with a first kick */
> +	for (qid = 0; qid < q_num; qid++)
> +		ifcvf_notify_queue(&internal->hw, qid);
> +
> +	/* listen to the events and react accordingly */
> +	for (;;) {
> +		nfds = epoll_wait(epfd, events, q_num * 2, -1);
> +		if (nfds < 0) {
> +			if (errno == EINTR)
> +				continue;
> +			DRV_LOG(ERR, "epoll_wait return fail\n");
> +			return NULL;
> +		}
> +
> +		for (i = 0; i < nfds; i++) {
> +			fd = (uint32_t)(events[i].data.u64 >> 32);
> +			do {
> +				nbytes = read(fd, &buf, 8);
> +				if (nbytes < 0) {
> +					if (errno == EINTR ||
> +					    errno == EWOULDBLOCK ||
> +					    errno == EAGAIN)
> +						continue;
> +					DRV_LOG(INFO, "Error reading "
> +						"kickfd: %s",
> +						strerror(errno));
> +				}
> +				break;
> +			} while (1);
> +
> +			qid = events[i].data.u32 >> 1;
> +
> +			if (events[i].data.u32 & 1)
> +				update_used_ring(internal, qid);
> +			else
> +				ifcvf_notify_queue(&internal->hw, qid);
> +		}
> +	}
> +
> +	return NULL;
> +}
> +
> +static int
> +setup_vring_relay(struct ifcvf_internal *internal)
> +{
> +	int ret;
> +
> +	ret = pthread_create(&internal->tid, NULL, vring_relay,
> +			(void *)internal);
> +	if (ret) {
> +		DRV_LOG(ERR, "failed to create ring relay pthread.");
> +		return -1;
> +	}
> +	return 0;
> +}
> +
> +static int
> +unset_vring_relay(struct ifcvf_internal *internal)
> +{
> +	void *status;
> +
> +	if (internal->tid) {
> +		pthread_cancel(internal->tid);
> +		pthread_join(internal->tid, &status);
> +	}
> +	internal->tid = 0;
> +
> +	if (internal->epfd >= 0)
> +		close(internal->epfd);
> +	internal->epfd = -1;
> +
> +	return 0;
> +}
> +
> +static int
> +ifcvf_sw_fallback_switchover(struct ifcvf_internal *internal)
> +{
> +	int ret;
> +	int vid = internal->vid;
> +
> +	/* stop the direct IO data path */
> +	unset_notify_relay(internal);
> +	vdpa_ifcvf_stop(internal);
> +	vdpa_disable_vfio_intr(internal);
> +
> +	ret = rte_vhost_host_notifier_ctrl(vid, false);
> +	if (ret && ret != -ENOTSUP)
> +		goto error;
> +
> +	/* set up interrupt for interrupt relay */
> +	ret = vdpa_enable_vfio_intr(internal, 1);
> +	if (ret)
> +		goto unmap;
> +
> +	/* config the VF */
> +	ret = m_ifcvf_start(internal);
> +	if (ret)
> +		goto unset_intr;
> +
> +	/* set up vring relay thread */
> +	ret = setup_vring_relay(internal);
> +	if (ret)
> +		goto stop_vf;
> +
> +	rte_vhost_host_notifier_ctrl(vid, true);
> +
> +	internal->sw_fallback_running = true;
> +
> +	return 0;
> +
> +stop_vf:
> +	m_ifcvf_stop(internal);
> +unset_intr:
> +	vdpa_disable_vfio_intr(internal);
> +unmap:
> +	ifcvf_dma_map(internal, 0);
> +error:
> +	return -1;
> +}
> +
> +static int
> +ifcvf_dev_config(int vid)
> +{
> +	int did;
> +	struct internal_list *list;
> +	struct ifcvf_internal *internal;
> +
> +	did = rte_vhost_get_vdpa_device_id(vid);
> +	list = find_internal_resource_by_did(did);
> +	if (list == NULL) {
> +		DRV_LOG(ERR, "Invalid device id: %d", did);
> +		return -1;
> +	}
> +
> +	internal = list->internal;
> +	internal->vid = vid;
> +	rte_atomic32_set(&internal->dev_attached, 1);
> +	update_datapath(internal);
> +
> +	if (rte_vhost_host_notifier_ctrl(vid, true) != 0)
> +		DRV_LOG(NOTICE, "vDPA (%d): software relay is used.", did);
> +
> +	return 0;
> +}
> +
> +static int
> +ifcvf_dev_close(int vid)
> +{
> +	int did;
> +	struct internal_list *list;
> +	struct ifcvf_internal *internal;
> +
> +	did = rte_vhost_get_vdpa_device_id(vid);
> +	list = find_internal_resource_by_did(did);
> +	if (list == NULL) {
> +		DRV_LOG(ERR, "Invalid device id: %d", did);
> +		return -1;
> +	}
> +
> +	internal = list->internal;
> +
> +	if (internal->sw_fallback_running) {
> +		/* unset ring relay */
> +		unset_vring_relay(internal);
> +
> +		/* reset VF */
> +		m_ifcvf_stop(internal);
> +
> +		/* remove interrupt setting */
> +		vdpa_disable_vfio_intr(internal);
> +
> +		/* unset DMA map for guest memory */
> +		ifcvf_dma_map(internal, 0);
> +
> +		internal->sw_fallback_running = false;
> +	} else {
> +		rte_atomic32_set(&internal->dev_attached, 0);
> +		update_datapath(internal);
> +	}
> +
> +	return 0;
> +}
> +
> +static int
> +ifcvf_set_features(int vid)
> +{
> +	uint64_t features = 0;
> +	int did;
> +	struct internal_list *list;
> +	struct ifcvf_internal *internal;
> +	uint64_t log_base = 0, log_size = 0;
> +
> +	did = rte_vhost_get_vdpa_device_id(vid);
> +	list = find_internal_resource_by_did(did);
> +	if (list == NULL) {
> +		DRV_LOG(ERR, "Invalid device id: %d", did);
> +		return -1;
> +	}
> +
> +	internal = list->internal;
> +	rte_vhost_get_negotiated_features(vid, &features);
> +
> +	if (!RTE_VHOST_NEED_LOG(features))
> +		return 0;
> +
> +	if (internal->sw_lm) {
> +		ifcvf_sw_fallback_switchover(internal);
> +	} else {
> +		rte_vhost_get_log_base(vid, &log_base, &log_size);
> +		rte_vfio_container_dma_map(internal->vfio_container_fd,
> +				log_base, IFCVF_LOG_BASE, log_size);
> +		ifcvf_enable_logging(&internal->hw, IFCVF_LOG_BASE, log_size);
> +	}
> +
> +	return 0;
> +}
> +
> +static int
> +ifcvf_get_vfio_group_fd(int vid)
> +{
> +	int did;
> +	struct internal_list *list;
> +
> +	did = rte_vhost_get_vdpa_device_id(vid);
> +	list = find_internal_resource_by_did(did);
> +	if (list == NULL) {
> +		DRV_LOG(ERR, "Invalid device id: %d", did);
> +		return -1;
> +	}
> +
> +	return list->internal->vfio_group_fd;
> +}
> +
> +static int
> +ifcvf_get_vfio_device_fd(int vid)
> +{
> +	int did;
> +	struct internal_list *list;
> +
> +	did = rte_vhost_get_vdpa_device_id(vid);
> +	list = find_internal_resource_by_did(did);
> +	if (list == NULL) {
> +		DRV_LOG(ERR, "Invalid device id: %d", did);
> +		return -1;
> +	}
> +
> +	return list->internal->vfio_dev_fd;
> +}
> +
> +static int
> +ifcvf_get_notify_area(int vid, int qid, uint64_t *offset, uint64_t *size)
> +{
> +	int did;
> +	struct internal_list *list;
> +	struct ifcvf_internal *internal;
> +	struct vfio_region_info reg = { .argsz = sizeof(reg) };
> +	int ret;
> +
> +	did = rte_vhost_get_vdpa_device_id(vid);
> +	list = find_internal_resource_by_did(did);
> +	if (list == NULL) {
> +		DRV_LOG(ERR, "Invalid device id: %d", did);
> +		return -1;
> +	}
> +
> +	internal = list->internal;
> +
> +	reg.index = ifcvf_get_notify_region(&internal->hw);
> +	ret = ioctl(internal->vfio_dev_fd, VFIO_DEVICE_GET_REGION_INFO, &reg);
> +	if (ret) {
> +		DRV_LOG(ERR, "Get not get device region info: %s",
> +				strerror(errno));
> +		return -1;
> +	}
> +
> +	*offset = ifcvf_get_queue_notify_off(&internal->hw, qid) + reg.offset;
> +	*size = 0x1000;
> +
> +	return 0;
> +}
> +
> +static int
> +ifcvf_get_queue_num(int did, uint32_t *queue_num)
> +{
> +	struct internal_list *list;
> +
> +	list = find_internal_resource_by_did(did);
> +	if (list == NULL) {
> +		DRV_LOG(ERR, "Invalid device id: %d", did);
> +		return -1;
> +	}
> +
> +	*queue_num = list->internal->max_queues;
> +
> +	return 0;
> +}
> +
> +static int
> +ifcvf_get_vdpa_features(int did, uint64_t *features)
> +{
> +	struct internal_list *list;
> +
> +	list = find_internal_resource_by_did(did);
> +	if (list == NULL) {
> +		DRV_LOG(ERR, "Invalid device id: %d", did);
> +		return -1;
> +	}
> +
> +	*features = list->internal->features;
> +
> +	return 0;
> +}
> +
> +#define VDPA_SUPPORTED_PROTOCOL_FEATURES \
> +		(1ULL << VHOST_USER_PROTOCOL_F_REPLY_ACK | \
> +		 1ULL << VHOST_USER_PROTOCOL_F_SLAVE_REQ | \
> +		 1ULL << VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD | \
> +		 1ULL << VHOST_USER_PROTOCOL_F_HOST_NOTIFIER | \
> +		 1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD)
> +static int
> +ifcvf_get_protocol_features(int did __rte_unused, uint64_t *features)
> +{
> +	*features = VDPA_SUPPORTED_PROTOCOL_FEATURES;
> +	return 0;
> +}
> +
> +static struct rte_vdpa_dev_ops ifcvf_ops = {
> +	.get_queue_num = ifcvf_get_queue_num,
> +	.get_features = ifcvf_get_vdpa_features,
> +	.get_protocol_features = ifcvf_get_protocol_features,
> +	.dev_conf = ifcvf_dev_config,
> +	.dev_close = ifcvf_dev_close,
> +	.set_vring_state = NULL,
> +	.set_features = ifcvf_set_features,
> +	.migration_done = NULL,
> +	.get_vfio_group_fd = ifcvf_get_vfio_group_fd,
> +	.get_vfio_device_fd = ifcvf_get_vfio_device_fd,
> +	.get_notify_area = ifcvf_get_notify_area,
> +};
> +
> +static inline int
> +open_int(const char *key __rte_unused, const char *value, void *extra_args)
> +{
> +	uint16_t *n = extra_args;
> +
> +	if (value == NULL || extra_args == NULL)
> +		return -EINVAL;
> +
> +	*n = (uint16_t)strtoul(value, NULL, 0);
> +	if (*n == USHRT_MAX && errno == ERANGE)
> +		return -1;
> +
> +	return 0;
> +}
> +
> +static int
> +ifcvf_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
> +		struct rte_pci_device *pci_dev)
> +{
> +	uint64_t features;
> +	struct ifcvf_internal *internal = NULL;
> +	struct internal_list *list = NULL;
> +	int vdpa_mode = 0;
> +	int sw_fallback_lm = 0;
> +	struct rte_kvargs *kvlist = NULL;
> +	int ret = 0;
> +
> +	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
> +		return 0;
> +
> +	if (!pci_dev->device.devargs)
> +		return 1;
> +
> +	kvlist = rte_kvargs_parse(pci_dev->device.devargs->args,
> +			ifcvf_valid_arguments);
> +	if (kvlist == NULL)
> +		return 1;
> +
> +	/* probe only when vdpa mode is specified */
> +	if (rte_kvargs_count(kvlist, IFCVF_VDPA_MODE) == 0) {
> +		rte_kvargs_free(kvlist);
> +		return 1;
> +	}
> +
> +	ret = rte_kvargs_process(kvlist, IFCVF_VDPA_MODE, &open_int,
> +			&vdpa_mode);
> +	if (ret < 0 || vdpa_mode == 0) {
> +		rte_kvargs_free(kvlist);
> +		return 1;
> +	}
> +
> +	list = rte_zmalloc("ifcvf", sizeof(*list), 0);
> +	if (list == NULL)
> +		goto error;
> +
> +	internal = rte_zmalloc("ifcvf", sizeof(*internal), 0);
> +	if (internal == NULL)
> +		goto error;
> +
> +	internal->pdev = pci_dev;
> +	rte_spinlock_init(&internal->lock);
> +
> +	if (ifcvf_vfio_setup(internal) < 0) {
> +		DRV_LOG(ERR, "failed to setup device %s", pci_dev->name);
> +		goto error;
> +	}
> +
> +	if (ifcvf_init_hw(&internal->hw, internal->pdev) < 0) {
> +		DRV_LOG(ERR, "failed to init device %s", pci_dev->name);
> +		goto error;
> +	}
> +
> +	internal->max_queues = IFCVF_MAX_QUEUES;
> +	features = ifcvf_get_features(&internal->hw);
> +	internal->features = (features &
> +		~(1ULL << VIRTIO_F_IOMMU_PLATFORM)) |
> +		(1ULL << VIRTIO_NET_F_GUEST_ANNOUNCE) |
> +		(1ULL << VIRTIO_NET_F_CTRL_VQ) |
> +		(1ULL << VIRTIO_NET_F_STATUS) |
> +		(1ULL << VHOST_USER_F_PROTOCOL_FEATURES) |
> +		(1ULL << VHOST_F_LOG_ALL);
> +
> +	internal->dev_addr.pci_addr = pci_dev->addr;
> +	internal->dev_addr.type = PCI_ADDR;
> +	list->internal = internal;
> +
> +	if (rte_kvargs_count(kvlist, IFCVF_SW_FALLBACK_LM)) {
> +		ret = rte_kvargs_process(kvlist, IFCVF_SW_FALLBACK_LM,
> +				&open_int, &sw_fallback_lm);
> +		if (ret < 0)
> +			goto error;
> +	}
> +	internal->sw_lm = sw_fallback_lm;
> +
> +	internal->did = rte_vdpa_register_device(&internal->dev_addr,
> +				&ifcvf_ops);
> +	if (internal->did < 0) {
> +		DRV_LOG(ERR, "failed to register device %s", pci_dev->name);
> +		goto error;
> +	}
> +
> +	pthread_mutex_lock(&internal_list_lock);
> +	TAILQ_INSERT_TAIL(&internal_list, list, next);
> +	pthread_mutex_unlock(&internal_list_lock);
> +
> +	rte_atomic32_set(&internal->started, 1);
> +	update_datapath(internal);
> +
> +	rte_kvargs_free(kvlist);
> +	return 0;
> +
> +error:
> +	rte_kvargs_free(kvlist);
> +	rte_free(list);
> +	rte_free(internal);
> +	return -1;
> +}
> +
> +static int
> +ifcvf_pci_remove(struct rte_pci_device *pci_dev)
> +{
> +	struct ifcvf_internal *internal;
> +	struct internal_list *list;
> +
> +	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
> +		return 0;
> +
> +	list = find_internal_resource_by_dev(pci_dev);
> +	if (list == NULL) {
> +		DRV_LOG(ERR, "Invalid device: %s", pci_dev->name);
> +		return -1;
> +	}
> +
> +	internal = list->internal;
> +	rte_atomic32_set(&internal->started, 0);
> +	update_datapath(internal);
> +
> +	rte_pci_unmap_device(internal->pdev);
> +	rte_vfio_container_destroy(internal->vfio_container_fd);
> +	rte_vdpa_unregister_device(internal->did);
> +
> +	pthread_mutex_lock(&internal_list_lock);
> +	TAILQ_REMOVE(&internal_list, list, next);
> +	pthread_mutex_unlock(&internal_list_lock);
> +
> +	rte_free(list);
> +	rte_free(internal);
> +
> +	return 0;
> +}
> +
> +/*
> + * IFCVF has the same vendor ID and device ID as virtio net PCI
> + * device, with its specific subsystem vendor ID and device ID.
> + */
> +static const struct rte_pci_id pci_id_ifcvf_map[] = {
> +	{ .class_id = RTE_CLASS_ANY_ID,
> +	  .vendor_id = IFCVF_VENDOR_ID,
> +	  .device_id = IFCVF_DEVICE_ID,
> +	  .subsystem_vendor_id = IFCVF_SUBSYS_VENDOR_ID,
> +	  .subsystem_device_id = IFCVF_SUBSYS_DEVICE_ID,
> +	},
> +
> +	{ .vendor_id = 0, /* sentinel */
> +	},
> +};
> +
> +static struct rte_pci_driver rte_ifcvf_vdpa = {
> +	.id_table = pci_id_ifcvf_map,
> +	.drv_flags = 0,
> +	.probe = ifcvf_pci_probe,
> +	.remove = ifcvf_pci_remove,
> +};
> +
> +RTE_PMD_REGISTER_PCI(net_ifcvf, rte_ifcvf_vdpa);
> +RTE_PMD_REGISTER_PCI_TABLE(net_ifcvf, pci_id_ifcvf_map);
> +RTE_PMD_REGISTER_KMOD_DEP(net_ifcvf, "* vfio-pci");
> +
> +RTE_INIT(ifcvf_vdpa_init_log)
> +{
> +	ifcvf_vdpa_logtype = rte_log_register("pmd.net.ifcvf_vdpa");
> +	if (ifcvf_vdpa_logtype >= 0)
> +		rte_log_set_level(ifcvf_vdpa_logtype, RTE_LOG_NOTICE);
> +}
> diff --git a/drivers/vdpa/ifc/meson.build b/drivers/vdpa/ifc/meson.build
> new file mode 100644
> index 0000000..adc9ed9
> --- /dev/null
> +++ b/drivers/vdpa/ifc/meson.build
> @@ -0,0 +1,9 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(c) 2018 Intel Corporation
> +
> +build = dpdk_conf.has('RTE_LIBRTE_VHOST')
> +reason = 'missing dependency, DPDK vhost library'
> +allow_experimental_apis = true
> +sources = files('ifcvf_vdpa.c', 'base/ifcvf.c')
> +includes += include_directories('base')
> +deps += 'vhost'
> diff --git a/drivers/vdpa/ifc/rte_pmd_ifc_version.map b/drivers/vdpa/ifc/rte_pmd_ifc_version.map
> new file mode 100644
> index 0000000..f9f17e4
> --- /dev/null
> +++ b/drivers/vdpa/ifc/rte_pmd_ifc_version.map
> @@ -0,0 +1,3 @@
> +DPDK_20.0 {
> +	local: *;
> +};
> diff --git a/drivers/vdpa/meson.build b/drivers/vdpa/meson.build
> index a839ff5..fd164d3 100644
> --- a/drivers/vdpa/meson.build
> +++ b/drivers/vdpa/meson.build
> @@ -1,7 +1,7 @@
>  #   SPDX-License-Identifier: BSD-3-Clause
>  #   Copyright 2019 Mellanox Technologies, Ltd
> 
> -drivers = []
> +drivers = ['ifc']
>  std_deps = ['bus_pci', 'kvargs']
>  std_deps += ['vhost']
>  config_flag_fmt = 'RTE_LIBRTE_@0@_PMD'
> --
> 1.8.3.1
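
As a usage note for anyone testing the moved driver: the probe path above only
takes the device when the "vdpa=1" devarg is given, so a typical invocation of
the vdpa example app looks roughly like the line below (PCI address, binary
path and socket path prefix are placeholders; the VF must be bound to vfio-pci
first, as the KMOD_DEP at the end of the patch indicates):

    ./examples/vdpa/build/vdpa -w 0000:06:00.3,vdpa=1 -- --iface /tmp/vdpa-socket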


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dpdk-dev] [PATCH v2 3/3] drivers: move ifc driver to the vDPA class
  2020-01-09 17:25     ` Matan Azrad
@ 2020-01-10  1:55       ` Wang, Haiyue
  2020-01-10  9:07         ` Matan Azrad
  0 siblings, 1 reply; 50+ messages in thread
From: Wang, Haiyue @ 2020-01-10  1:55 UTC (permalink / raw)
  To: Matan Azrad, Maxime Coquelin, Bie, Tiwei, Wang, Zhihong, Wang, Xiao W
  Cc: Yigit, Ferruh, dev, Thomas Monjalon, Andrew Rybchenko

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Matan Azrad
> Sent: Friday, January 10, 2020 01:26
> To: Matan Azrad <matan@mellanox.com>; Maxime Coquelin <maxime.coquelin@redhat.com>; Bie, Tiwei
> <tiwei.bie@intel.com>; Wang, Zhihong <zhihong.wang@intel.com>; Wang, Xiao W <xiao.w.wang@intel.com>
> Cc: Yigit, Ferruh <ferruh.yigit@intel.com>; dev@dpdk.org; Thomas Monjalon <thomas@monjalon.net>;
> Andrew Rybchenko <arybchenko@solarflare.com>
> Subject: Re: [dpdk-dev] [PATCH v2 3/3] drivers: move ifc driver to the vDPA class
> 
> Small typo inline.
> 
> From: Matan Azrad
> > A new vDPA class was recently introduced.
> >
> > IFC driver implements the vDPA operations, hence it should be moved to
> > the vDPA class.
> >
> > Move it.
> >
> > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > ---
> >  MAINTAINERS                              |   14 +-
> >  doc/guides/nics/features/ifcvf.ini       |    8 -
> >  doc/guides/nics/ifc.rst                  |  106 ---
> >  doc/guides/nics/index.rst                |    1 -
> >  doc/guides/vdpadevs/features/ifcvf.ini   |    8 +
> >  doc/guides/vdpadevs/ifc.rst              |  106 +++
> >  doc/guides/vdpadevs/index.rst            |    1 +
> >  drivers/net/Makefile                     |    3 -
> >  drivers/net/ifc/Makefile                 |   34 -
> >  drivers/net/ifc/base/ifcvf.c             |  329 --------
> >  drivers/net/ifc/base/ifcvf.h             |  162 ----
> >  drivers/net/ifc/base/ifcvf_osdep.h       |   52 --
> >  drivers/net/ifc/ifcvf_vdpa.c             | 1280 ------------------------------
> >  drivers/net/ifc/meson.build              |    9 -
> >  drivers/net/ifc/rte_pmd_ifc_version.map  |    3 -
> >  drivers/net/meson.build                  |    1 -
> >  drivers/vdpa/Makefile                    |    6 +
> >  drivers/vdpa/ifc/Makefile                |   34 +
> >  drivers/vdpa/ifc/base/ifcvf.c            |  329 ++++++++
> >  drivers/vdpa/ifc/base/ifcvf.h            |  162 ++++
> >  drivers/vdpa/ifc/base/ifcvf_osdep.h      |   52 ++
> >  drivers/vdpa/ifc/ifcvf_vdpa.c            | 1280
> > ++++++++++++++++++++++++++++++
> >  drivers/vdpa/ifc/meson.build             |    9 +
> >  drivers/vdpa/ifc/rte_pmd_ifc_version.map |    3 +
> >  drivers/vdpa/meson.build                 |    2 +-
> >  25 files changed, 1997 insertions(+), 1997 deletions(-)
> >  delete mode 100644 doc/guides/nics/features/ifcvf.ini
> >  delete mode 100644 doc/guides/nics/ifc.rst
> >  create mode 100644 doc/guides/vdpadevs/features/ifcvf.ini
> >  create mode 100644 doc/guides/vdpadevs/ifc.rst
> >  delete mode 100644 drivers/net/ifc/Makefile
> >  delete mode 100644 drivers/net/ifc/base/ifcvf.c
> >  delete mode 100644 drivers/net/ifc/base/ifcvf.h
> >  delete mode 100644 drivers/net/ifc/base/ifcvf_osdep.h
> >  delete mode 100644 drivers/net/ifc/ifcvf_vdpa.c
> >  delete mode 100644 drivers/net/ifc/meson.build
> >  delete mode 100644 drivers/net/ifc/rte_pmd_ifc_version.map
> >  create mode 100644 drivers/vdpa/ifc/Makefile
> >  create mode 100644 drivers/vdpa/ifc/base/ifcvf.c
> >  create mode 100644 drivers/vdpa/ifc/base/ifcvf.h
> >  create mode 100644 drivers/vdpa/ifc/base/ifcvf_osdep.h
> >  create mode 100644 drivers/vdpa/ifc/ifcvf_vdpa.c
> >  create mode 100644 drivers/vdpa/ifc/meson.build
> >  create mode 100644 drivers/vdpa/ifc/rte_pmd_ifc_version.map
> >

git mv drivers/net/ifc/ drivers/vdpa/ifc  ? ;-)
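
(If the move is regenerated with git rename detection the patch shrinks to a
few lines; a possible sequence, assuming a clean working tree:

    git mv drivers/net/ifc drivers/vdpa/ifc
    git commit -s
    git format-patch -1 -M

where -M makes format-patch emit the move as renames instead of the
delete/create pairs shown in the diffstat above.)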

> > --
> > 1.8.3.1


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA device drivers
  2020-01-09 11:34                 ` Matan Azrad
@ 2020-01-10  2:38                   ` Xu, Rosen
  2020-01-10  9:21                     ` Thomas Monjalon
  0 siblings, 1 reply; 50+ messages in thread
From: Xu, Rosen @ 2020-01-10  2:38 UTC (permalink / raw)
  To: Matan Azrad, Thomas Monjalon
  Cc: Maxime Coquelin, Bie, Tiwei, Wang, Zhihong, Wang, Xiao W, Yigit,
	Ferruh, dev, Pei, Andy, Roni Bar Yanai



> -----Original Message-----
> From: Matan Azrad <matan@mellanox.com>
> Sent: Thursday, January 09, 2020 19:34
> To: Xu, Rosen <rosen.xu@intel.com>; Thomas Monjalon
> <thomas@monjalon.net>
> Cc: Maxime Coquelin <maxime.coquelin@redhat.com>; Bie, Tiwei
> <tiwei.bie@intel.com>; Wang, Zhihong <zhihong.wang@intel.com>; Wang,
> Xiao W <xiao.w.wang@intel.com>; Yigit, Ferruh <ferruh.yigit@intel.com>;
> dev@dpdk.org; Pei, Andy <andy.pei@intel.com>; Roni Bar Yanai
> <roniba@mellanox.com>
> Subject: RE: [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA device
> drivers
> 
> 
> 
> From: Xu, Rosen <rosen.xu@intel.com>
> > > -----Original Message-----
> > > From: Thomas Monjalon <thomas@monjalon.net>
> > > Sent: Thursday, January 09, 2020 16:41
> > > To: Xu, Rosen <rosen.xu@intel.com>
> > > Cc: Matan Azrad <matan@mellanox.com>; Maxime Coquelin
> > > <maxime.coquelin@redhat.com>; Bie, Tiwei <tiwei.bie@intel.com>;
> > > Wang, Zhihong <zhihong.wang@intel.com>; Wang, Xiao W
> > > <xiao.w.wang@intel.com>; Yigit, Ferruh <ferruh.yigit@intel.com>;
> > > dev@dpdk.org; Pei, Andy <andy.pei@intel.com>
> > > Subject: Re: [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA
> > > device drivers
> > >
> > > 09/01/2020 03:27, Xu, Rosen:
> > > > Hi,
> > > >
> > > > From: Thomas Monjalon <thomas@monjalon.net>
> > > > > 08/01/2020 13:39, Xu, Rosen:
> > > > > > From: Matan Azrad <matan@mellanox.com>
> > > > > > > From: Xu, Rosen
> > > > > > > > Did you think about OVS DPDK?
> > > > > > > > vDPA is a basic module for OVS, currently it will take
> > > > > > > > some exception path packet processing for OVS, so it still
> > > > > > > > needs to integrate
> > > > > eth_dev.
> > > > > > >
> > > > > > > I don't understand your question.
> > > > > > >
> > > > > > > What do you mean by "integrate eth_dev"?
> > > > > >
> > > > > > My questions is in OVS DPDK scenario vDPA device implements
> > > > > > eth_dev ops, so create a new class and move ifc code to this
> > > > > > new class
> > > is not ok.
> > > > >
> > > > > 1/ I don't understand the relation with OVS.
> > > > >
> > > > > 2/ no, vDPA device implements vDPA ops.
> > > > > If it implements ethdev ops, it is an ethdev device.
> > > > >
> > > > > Please show an example of what you claim.
> > > >
> > > > Answers of 1 and 2.
> > > >
> > > > In OVS DPDK, each network device(such as NIC, vHost etc) of DPDK
> > > > needs to be implemented as rte_eth_dev and provides eth_dev_ops
> > such
> > > > as
> > > packet TX/RX for OVS.
> > >
> > > No, OVS is also using the vhost API for vhost port.
> >
> > Yes, vhost pmd is not a good example.
> >
> > > > Take vHost(Virtio back end) for example, OVS startups vHost
> > > > interface like
> > > this:
> > > > ovs-vsctl add-port br0 vhost-user-1 -- set Interface vhost-user-1
> > > > type=dpdkvhostuser drivers/net/vhost implements vHost as
> > rte_eth_dev
> > > and integrated in OVS.
> > > > OVS can send/receive packets to/from VM with rte_eth_tx_burst()
> > > > rte_eth_rx_burst() which call eth_dev_ops implementation of
> > > drivers/net/vhost.
> > >
> > > No, it is using rte_vhost_dequeue_burst() and
> > > rte_vhost_enqueue_burst() which are not in ethdev.
> > >
> > > > vDPA is also Virtio back end and works like vHost, same as vHost,
> > > > it will be implemented as rte_eth_dev and also be integrated into OVS.
> > >
> > > No, vDPA is not "implemented as rte_eth_dev".
> >
> > Currently, vDPA isn't integrated with OVS.
> >
> > > > So, it's not ok to move ifc code from drivers/net.
> > >
> > > drivers/net/ifc has no ethdev implementation at all.
> >
> > Since OVS hasn't integrated vDPA yet, it doesn't implement rte_eth_dev, but
> > there are many discussions in the OVS community about vDPA, some from
> > Mellanox, and it seems a vDPA port will be implemented as an rte_eth_dev port
> > in OVS in the near future.
> > https://patchwork.ozlabs.org/patch/1178474/
> >
> > Matan,
> > Could you clarify how OVS integrates vDPA in Mellanox patch?
> >
> > >
> > > Rosen, I'm sorry, these arguments look irrelevant, so I won't
> > > consider them as blocking the integration of this patch.
> >
> > What I mentioned is not blocking the integration of this patch, I just
> > want to get clarification from Matan how to integrate vDPA port in OVS.
> 
> 
> Hi
> 
> OVS, like any other application, should use the current vDPA API to attach a
> probed vdpa device to a vhost device.
> See example application /examples/vdpa.
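> 
> For illustration only, here is a minimal sketch of that attach flow using the
> same public calls the vdpa example relies on; the function name, socket path
> and PCI-address argument are made-up placeholders, not code from the example:
> 
> 	#include <rte_pci.h>
> 	#include <rte_vdpa.h>
> 	#include <rte_vhost.h>
> 
> 	/* Attach an already-probed vDPA device to a new vhost-user socket. */
> 	static int
> 	attach_vdpa(const char *path, const struct rte_pci_addr *pci)
> 	{
> 		struct rte_vdpa_dev_addr addr = {
> 			.type = PCI_ADDR,
> 			.pci_addr = *pci,	/* VF address, e.g. taken from devargs */
> 		};
> 		int did = rte_vdpa_find_device_id(&addr);
> 
> 		if (did < 0)
> 			return -1;
> 		if (rte_vhost_driver_register(path, RTE_VHOST_USER_CLIENT) != 0)
> 			return -1;
> 		if (rte_vhost_driver_attach_vdpa_device(path, did) != 0)
> 			return -1;
> 		return rte_vhost_driver_start(path);
> 	}
> 
> The example application additionally registers new_device/destroy_device
> callbacks with rte_vhost_driver_callback_register() before starting the driver.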
> 
> Here, we just introduce a new class to hold all the vDPA drivers, no change in
> the API.
> 
> As I understand, no vDPA device is currently integrated in OVS.
> 
> I think it can be integrated only when full offload is integrated, since the
> vDPA device forwards the traffic from the HW directly to the virtio queue.
> Once that is in place, I guess the offload will be configured by the representor
> of the vdpa device(VF) which is managed by an ethdev device.
>
> 
> Matan.
> 
Hi,

I'm still confused about your last sentence, "the representor of the vdpa device(VF) which is managed by an ethdev device".
Is my understanding correct that there is some connection and dependency between the rte_eth_dev and the vdpa device?
Am I right or any other explanations from you?

Rosen.


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA device drivers
  2020-01-09 10:42                   ` Maxime Coquelin
@ 2020-01-10  2:40                     ` Xu, Rosen
  0 siblings, 0 replies; 50+ messages in thread
From: Xu, Rosen @ 2020-01-10  2:40 UTC (permalink / raw)
  To: Maxime Coquelin, Thomas Monjalon
  Cc: Matan Azrad, Bie, Tiwei, Wang, Zhihong, Wang, Xiao W, Yigit,
	Ferruh, dev, Pei, Andy



> -----Original Message-----
> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> Sent: Thursday, January 09, 2020 18:42
> To: Xu, Rosen <rosen.xu@intel.com>; Thomas Monjalon
> <thomas@monjalon.net>
> Cc: Matan Azrad <matan@mellanox.com>; Bie, Tiwei <tiwei.bie@intel.com>;
> Wang, Zhihong <zhihong.wang@intel.com>; Wang, Xiao W
> <xiao.w.wang@intel.com>; Yigit, Ferruh <ferruh.yigit@intel.com>;
> dev@dpdk.org; Pei, Andy <andy.pei@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA device
> drivers
> 
> 
> 
> On 1/9/20 10:49 AM, Xu, Rosen wrote:
> >
> >
> >> -----Original Message-----
> >> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> >> Sent: Thursday, January 09, 2020 17:24
> >> To: Thomas Monjalon <thomas@monjalon.net>; Xu, Rosen
> >> <rosen.xu@intel.com>
> >> Cc: Matan Azrad <matan@mellanox.com>; Bie, Tiwei
> >> <tiwei.bie@intel.com>; Wang, Zhihong <zhihong.wang@intel.com>; Wang,
> >> Xiao W <xiao.w.wang@intel.com>; Yigit, Ferruh
> >> <ferruh.yigit@intel.com>; dev@dpdk.org; Pei, Andy
> >> <andy.pei@intel.com>
> >> Subject: Re: [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA
> >> device drivers
> >>
> >>
> >>
> >> On 1/9/20 9:41 AM, Thomas Monjalon wrote:
> >>> 09/01/2020 03:27, Xu, Rosen:
> >>>> Hi,
> >>>>
> >>>> From: Thomas Monjalon <thomas@monjalon.net>
> >>>>> 08/01/2020 13:39, Xu, Rosen:
> >>>>>> From: Matan Azrad <matan@mellanox.com>
> >>>>>>> From: Xu, Rosen
> >>>>>>>> Did you think about OVS DPDK?
> >>>>>>>> vDPA is a basic module for OVS, currently it will take some
> >>>>>>>> exception path packet processing for OVS, so it still needs to
> >>>>>>>> integrate
> >>>>> eth_dev.
> >>>>>>>
> >>>>>>> I don't understand your question.
> >>>>>>>
> >>>>>>> What do you mean by "integrate eth_dev"?
> >>>>>>
> >>>>>> My questions is in OVS DPDK scenario vDPA device implements
> >>>>>> eth_dev ops, so create a new class and move ifc code to this new
> class is not ok.
> >>>>>
> >>>>> 1/ I don't understand the relation with OVS.
> >>>>>
> >>>>> 2/ no, vDPA device implements vDPA ops.
> >>>>> If it implements ethdev ops, it is an ethdev device.
> >>>>>
> >>>>> Please show an example of what you claim.
> >>>>
> >>>> Answers of 1 and 2.
> >>>>
> >>>> In OVS DPDK, each network device(such as NIC, vHost etc) of DPDK
> >>>> needs to be implemented as rte_eth_dev and provides eth_dev_ops
> >>>> such
> >> as packet TX/RX for OVS.
> >>>
> >>> No, OVS is also using the vhost API for vhost port.
> >>>
> >>>> Take vHost(Virtio back end) for example, OVS startups vHost
> >>>> interface like
> >> this:
> >>>> ovs-vsctl add-port br0 vhost-user-1 -- set Interface vhost-user-1
> >>>> type=dpdkvhostuser drivers/net/vhost implements vHost as
> >>>> rte_eth_dev
> >> and integrated in OVS.
> >>>> OVS can send/receive packets to/from VM with rte_eth_tx_burst()
> >>>> rte_eth_rx_burst() which call eth_dev_ops implementation of
> >> drivers/net/vhost.
> >>>
> >>> No, it is using rte_vhost_dequeue_burst() and
> >>> rte_vhost_enqueue_burst() which are not in ethdev.
> >>>
> >>>> vDPA is also Virtio back end and works like vHost, same as vHost,
> >>>> it will be implemented as rte_eth_dev and also be integrated into OVS.
> >>>
> >>> No, vDPA is not "implemented as rte_eth_dev".
> >>>
> >>>> So, it's not ok to move ifc code from drivers/net.
> >>>
> >>> drivers/net/ifc has no ethdev implementation at all.
> >>>
> >>>
> >>> Rosen, I'm sorry, these arguments look irrelevant, so I won't
> >>> consider them as blocking the integration of this patch.
> >>>
> >>>
> >>
> >> I agree with Thomas, the vDPA drivers do not implement the ethdev ops.
> >
> > For OVS hasn't integrated vDPA, it doesn't implement ethdev ops, but
> > there are many discussions in OVS community about vDPA, it seems vDPA
> > will be supported in OVS in the near feature.
> 
> I agree with this statement, but if you look at Mellanox series being reviewed,
> it is defining a new type of port and not use the regular DPDK port type.
> 
> >> And OVS does not use the Vhost PMD for the Vhost-user ports, but
> >> directly call the librte_vhost APIs.
> >
> > I'm afraid you are wrong, pls read these documents which introduce how
> to use vHost-user PMD in OVS:
> > http://docs.openvswitch.org/en/latest/topics/dpdk/vhost-user/
> > http://docs.openvswitch.org/en/latest/topics/dpdk/pmd/
> 
> I can confirm that below command to add ports is not using Vhost PMD but
> directly the  librte_vhost API:
> 
> $ ovs-vsctl add-port br0 dpdkvhostclient0 \
>     -- set Interface dpdkvhostclient0 type=dpdkvhostuserclient \
>        options:vhost-server-path=/tmp/dpdkvhostclient0
> 
> Please check the OVS source code.
> 
> It is possible  to use the Vhost PMD as a regular DPDK port, but this is not
> with above command, and not the recommended way.
> 
> >> Regards,
> >> Maxime
> >
Hi,

The points and questions I mentioned are in Matan's email thread.

Thanks,
Rosen


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dpdk-dev] [PATCH v2 3/3] drivers: move ifc driver to the vDPA class
  2020-01-10  1:55       ` Wang, Haiyue
@ 2020-01-10  9:07         ` Matan Azrad
  2020-01-10  9:13           ` Thomas Monjalon
  0 siblings, 1 reply; 50+ messages in thread
From: Matan Azrad @ 2020-01-10  9:07 UTC (permalink / raw)
  To: Wang, Haiyue, Maxime Coquelin, Bie, Tiwei, Wang, Zhihong, Wang, Xiao W
  Cc: Yigit, Ferruh, dev, Thomas Monjalon, Andrew Rybchenko



From: Wang, Haiyue
> > -----Original Message-----
> > From: dev <dev-bounces@dpdk.org> On Behalf Of Matan Azrad
> > Sent: Friday, January 10, 2020 01:26
> > To: Matan Azrad <matan@mellanox.com>; Maxime Coquelin
> > <maxime.coquelin@redhat.com>; Bie, Tiwei <tiwei.bie@intel.com>; Wang,
> > Zhihong <zhihong.wang@intel.com>; Wang, Xiao W
> <xiao.w.wang@intel.com>
> > Cc: Yigit, Ferruh <ferruh.yigit@intel.com>; dev@dpdk.org; Thomas
> > Monjalon <thomas@monjalon.net>; Andrew Rybchenko
> > <arybchenko@solarflare.com>
> > Subject: Re: [dpdk-dev] [PATCH v2 3/3] drivers: move ifc driver to the
> > vDPA class
> >
> > Small typo inline.
> >
> > From: Matan Azrad
> > > A new vDPA class was recently introduced.
> > >
> > > IFC driver implements the vDPA operations, hence it should be moved
> > > to the vDPA class.
> > >
> > > Move it.
> > >
> > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > ---
> > >  MAINTAINERS                              |   14 +-
> > >  doc/guides/nics/features/ifcvf.ini       |    8 -
> > >  doc/guides/nics/ifc.rst                  |  106 ---
> > >  doc/guides/nics/index.rst                |    1 -
> > >  doc/guides/vdpadevs/features/ifcvf.ini   |    8 +
> > >  doc/guides/vdpadevs/ifc.rst              |  106 +++
> > >  doc/guides/vdpadevs/index.rst            |    1 +
> > >  drivers/net/Makefile                     |    3 -
> > >  drivers/net/ifc/Makefile                 |   34 -
> > >  drivers/net/ifc/base/ifcvf.c             |  329 --------
> > >  drivers/net/ifc/base/ifcvf.h             |  162 ----
> > >  drivers/net/ifc/base/ifcvf_osdep.h       |   52 --
> > >  drivers/net/ifc/ifcvf_vdpa.c             | 1280 ------------------------------
> > >  drivers/net/ifc/meson.build              |    9 -
> > >  drivers/net/ifc/rte_pmd_ifc_version.map  |    3 -
> > >  drivers/net/meson.build                  |    1 -
> > >  drivers/vdpa/Makefile                    |    6 +
> > >  drivers/vdpa/ifc/Makefile                |   34 +
> > >  drivers/vdpa/ifc/base/ifcvf.c            |  329 ++++++++
> > >  drivers/vdpa/ifc/base/ifcvf.h            |  162 ++++
> > >  drivers/vdpa/ifc/base/ifcvf_osdep.h      |   52 ++
> > >  drivers/vdpa/ifc/ifcvf_vdpa.c            | 1280
> > > ++++++++++++++++++++++++++++++
> > >  drivers/vdpa/ifc/meson.build             |    9 +
> > >  drivers/vdpa/ifc/rte_pmd_ifc_version.map |    3 +
> > >  drivers/vdpa/meson.build                 |    2 +-
> > >  25 files changed, 1997 insertions(+), 1997 deletions(-)  delete
> > > mode 100644 doc/guides/nics/features/ifcvf.ini
> > >  delete mode 100644 doc/guides/nics/ifc.rst  create mode 100644
> > > doc/guides/vdpadevs/features/ifcvf.ini
> > >  create mode 100644 doc/guides/vdpadevs/ifc.rst  delete mode 100644
> > > drivers/net/ifc/Makefile  delete mode 100644
> > > drivers/net/ifc/base/ifcvf.c  delete mode 100644
> > > drivers/net/ifc/base/ifcvf.h  delete mode 100644
> > > drivers/net/ifc/base/ifcvf_osdep.h
> > >  delete mode 100644 drivers/net/ifc/ifcvf_vdpa.c  delete mode 100644
> > > drivers/net/ifc/meson.build  delete mode 100644
> > > drivers/net/ifc/rte_pmd_ifc_version.map
> > >  create mode 100644 drivers/vdpa/ifc/Makefile  create mode 100644
> > > drivers/vdpa/ifc/base/ifcvf.c  create mode 100644
> > > drivers/vdpa/ifc/base/ifcvf.h  create mode 100644
> > > drivers/vdpa/ifc/base/ifcvf_osdep.h
> > >  create mode 100644 drivers/vdpa/ifc/ifcvf_vdpa.c  create mode
> > > 100644 drivers/vdpa/ifc/meson.build  create mode 100644
> > > drivers/vdpa/ifc/rte_pmd_ifc_version.map
> > >
> 
> git mv drivers/net/ifc/ drivers/vdpa/ifc  ? ;-)

Yes, plus more file moves in the docs (you can see them as renames in git 😊).
I also adjusted the class Makefiles/meson files to remove ifc from net and add it under vdpa,
and updated the MAINTAINERS file, etc.
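
For reference, a minimal sketch of the move itself (paths taken from the diffstat quoted
above; the build-system and MAINTAINERS edits are then done by hand on top of it):

$ git mv drivers/net/ifc drivers/vdpa/ifc
$ git mv doc/guides/nics/ifc.rst doc/guides/vdpadevs/ifc.rst
$ git mv doc/guides/nics/features/ifcvf.ini doc/guides/vdpadevs/features/ifcvf.ini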

> > > --
> > > 1.8.3.1


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dpdk-dev] [PATCH v2 3/3] drivers: move ifc driver to the vDPA class
  2020-01-10  9:07         ` Matan Azrad
@ 2020-01-10  9:13           ` Thomas Monjalon
  2020-01-10 12:31             ` Wang, Haiyue
  0 siblings, 1 reply; 50+ messages in thread
From: Thomas Monjalon @ 2020-01-10  9:13 UTC (permalink / raw)
  To: Wang, Haiyue, Matan Azrad
  Cc: Maxime Coquelin, Bie, Tiwei, Wang, Zhihong, Wang, Xiao W, Yigit,
	Ferruh, dev, Andrew Rybchenko

10/01/2020 10:07, Matan Azrad:
> From: Wang, Haiyue
> > From: Matan Azrad
> > > >  delete mode 100644 doc/guides/nics/ifc.rst  create mode 100644
> > > > doc/guides/vdpadevs/features/ifcvf.ini
> > > >  create mode 100644 doc/guides/vdpadevs/ifc.rst  delete mode 100644
> > > > drivers/net/ifc/Makefile  delete mode 100644
> > > > drivers/net/ifc/base/ifcvf.c  delete mode 100644
> > > > drivers/net/ifc/base/ifcvf.h  delete mode 100644
> > > > drivers/net/ifc/base/ifcvf_osdep.h
> > > >  delete mode 100644 drivers/net/ifc/ifcvf_vdpa.c  delete mode 100644
> > > > drivers/net/ifc/meson.build  delete mode 100644
> > > > drivers/net/ifc/rte_pmd_ifc_version.map
> > > >  create mode 100644 drivers/vdpa/ifc/Makefile  create mode 100644
> > > > drivers/vdpa/ifc/base/ifcvf.c  create mode 100644
> > > > drivers/vdpa/ifc/base/ifcvf.h  create mode 100644
> > > > drivers/vdpa/ifc/base/ifcvf_osdep.h
> > > >  create mode 100644 drivers/vdpa/ifc/ifcvf_vdpa.c  create mode
> > > > 100644 drivers/vdpa/ifc/meson.build  create mode 100644
> > > > drivers/vdpa/ifc/rte_pmd_ifc_version.map
> > > >
> > 
> > git mv drivers/net/ifc/ drivers/vdpa/ifc  ? ;-)
> 
> Yes, and more file move in docs. (you can see like rename in git 😊)
> Adjusted also the classes makefiles\measons to remove from net and to add in vdpa.
> Also MAINTAINER file etc...

I think the comment from Haiyue was about the files deleted and created
in the patch, instead of being moved (renamed).
As far as I know, this is for 2 reasons:
	- there is no move in internal git representation
	- some versions of git-diff do not detect moves properly,
	  but the -M option may help
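
For example (assuming a reasonably recent git), rename detection can be forced when
generating or checking the patch:

$ git format-patch -M -1         # emit the last commit with renames detected
$ git diff -M --stat HEAD~1      # quick local check of the resulting stat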



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA device drivers
  2020-01-10  2:38                   ` Xu, Rosen
@ 2020-01-10  9:21                     ` Thomas Monjalon
  2020-01-10 14:18                       ` Xu, Rosen
  0 siblings, 1 reply; 50+ messages in thread
From: Thomas Monjalon @ 2020-01-10  9:21 UTC (permalink / raw)
  To: Matan Azrad, Xu, Rosen
  Cc: Maxime Coquelin, Bie, Tiwei, Wang, Zhihong, Wang, Xiao W, Yigit,
	Ferruh, dev, Pei, Andy, Roni Bar Yanai

10/01/2020 03:38, Xu, Rosen:
> From: Matan Azrad <matan@mellanox.com>
> > From: Xu, Rosen <rosen.xu@intel.com>
> > > From: Thomas Monjalon <thomas@monjalon.net>
> > > > 09/01/2020 03:27, Xu, Rosen:
> > > > > From: Thomas Monjalon <thomas@monjalon.net>
> > > > > > 08/01/2020 13:39, Xu, Rosen:
> > > > > > > From: Matan Azrad <matan@mellanox.com>
> > > > > > > > From: Xu, Rosen
> > > > > > > > > Did you think about OVS DPDK?
> > > > > > > > > vDPA is a basic module for OVS, currently it will take
> > > > > > > > > some exception path packet processing for OVS, so it still
> > > > > > > > > needs to integrate
> > > > > > eth_dev.
> > > > > > > >
> > > > > > > > I don't understand your question.
> > > > > > > >
> > > > > > > > What do you mean by "integrate eth_dev"?
> > > > > > >
> > > > > > > My questions is in OVS DPDK scenario vDPA device implements
> > > > > > > eth_dev ops, so create a new class and move ifc code to this
> > > > > > > new class
> > > > is not ok.
> > > > > >
> > > > > > 1/ I don't understand the relation with OVS.
> > > > > >
> > > > > > 2/ no, vDPA device implements vDPA ops.
> > > > > > If it implements ethdev ops, it is an ethdev device.
> > > > > >
> > > > > > Please show an example of what you claim.
> > > > >
> > > > > Answers of 1 and 2.
> > > > >
> > > > > In OVS DPDK, each network device(such as NIC, vHost etc) of DPDK
> > > > > needs to be implemented as rte_eth_dev and provides eth_dev_ops
> > > such
> > > > > as
> > > > packet TX/RX for OVS.
> > > >
> > > > No, OVS is also using the vhost API for vhost port.
> > >
> > > Yes, vhost pmd is not a good example.
> > >
> > > > > Take vHost(Virtio back end) for example, OVS startups vHost
> > > > > interface like
> > > > this:
> > > > > ovs-vsctl add-port br0 vhost-user-1 -- set Interface vhost-user-1
> > > > > type=dpdkvhostuser drivers/net/vhost implements vHost as
> > > rte_eth_dev
> > > > and integrated in OVS.
> > > > > OVS can send/receive packets to/from VM with rte_eth_tx_burst()
> > > > > rte_eth_rx_burst() which call eth_dev_ops implementation of
> > > > drivers/net/vhost.
> > > >
> > > > No, it is using rte_vhost_dequeue_burst() and
> > > > rte_vhost_enqueue_burst() which are not in ethdev.
> > > >
> > > > > vDPA is also Virtio back end and works like vHost, same as vHost,
> > > > > it will be implemented as rte_eth_dev and also be integrated into OVS.
> > > >
> > > > No, vDPA is not "implemented as rte_eth_dev".
> > >
> > > Currently, vDPA isn't integrated with OVS.
> > >
> > > > > So, it's not ok to move ifc code from drivers/net.
> > > >
> > > > drivers/net/ifc has no ethdev implementation at all.
> > >
> > > For OVS hasn't integrated vDPA, it doesn't implement rte_eth_dev, but
> > > there are many discussions in OVS community about vDPA, some are from
> > > Mellanox, it seems vDPA port will be implemented as rte_eth_dev port
> > > in OVS in the near feature.
> > > https://patchwork.ozlabs.org/patch/1178474/
> > >
> > > Matan,
> > > Could you clarify how OVS integrates vDPA in Mellanox patch?
> > >
> > > >
> > > > Rosen, I'm sorry, these arguments look irrelevant, so I won't
> > > > consider them as blocking the integration of this patch.
> > >
> > > What I mentioned is not blocking the integration of this patch, I just
> > > want to get clarification from Matan how to integrate vDPA port in OVS.
> > 
> > 
> > Hi
> > 
> > OVS like any other application should use the current API of vDPA to attach a
> > probed vdpa device to a vhost device.
> > See example application /examples/vdpa.
> > 
> > Here, we just introduce a new class to hold all the vDPA drivers, no change in
> > the API.
> > 
> > As I understand, no vDPA device is currently integrated in OVS.
> > 
> > I think it can be integrated only when a full offload will be integrated since
> > the vDPA device forward the traffic from the HW directly to the virtio queue,
> > once it will be there, I guess the offload will be configured by the representor
> > of the vdpa device(VF) which is managed by an ethdev device.
> >
> > 
> > Matan.
> > 
> Hi,
> 
> I'm still confused about your last sentence " the representor of the vdpa device(VF) which is managed by an ethdev device".
> My understanding is that there are some connections and dependency between rte_eth_dev and vdpa device?
> Am I right or any other explanations from you?

A vDPA port does not allow any ethdev operations (like rte_flow).
In order to configure some offloads on the device, OVS needs an ethdev port.
In Mellanox case, an ethdev VF representor port can be instantiated.
So we may have two ports for the same device:
	- vDPA for data path with the VM
	- ethdev for offloads control path
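
As a rough illustration only (the PCI addresses, socket path and representor devargs
below are assumptions based on the IFC and Mellanox usage models, and the vdpa
example-app commands may differ between DPDK versions):

# data path: hand the VF to the vdpa example app and attach it to a vhost-user socket
# (a driver-specific devarg, e.g. vdpa=1 for IFC, may be needed; see the driver guide)
$ ./examples/vdpa/build/vdpa -c 0x2 -n 4 -w 0000:06:00.2 -- --interactive
vdpa> create /tmp/sock-virtio0 0000:06:00.2

# control path: the VF representor remains a regular ethdev port, e.g. added to OVS
$ ovs-vsctl add-port br0 vf0rep -- set Interface vf0rep type=dpdk \
      options:dpdk-devargs=0000:06:00.0,representor=[0]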



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dpdk-dev] [PATCH v2 3/3] drivers: move ifc driver to the vDPA class
  2020-01-10  9:13           ` Thomas Monjalon
@ 2020-01-10 12:31             ` Wang, Haiyue
  2020-01-10 12:34               ` Maxime Coquelin
  0 siblings, 1 reply; 50+ messages in thread
From: Wang, Haiyue @ 2020-01-10 12:31 UTC (permalink / raw)
  To: Thomas Monjalon, Matan Azrad
  Cc: Maxime Coquelin, Bie, Tiwei, Wang, Zhihong, Wang, Xiao W, Yigit,
	Ferruh, dev, Andrew Rybchenko

> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Friday, January 10, 2020 17:14
> To: Wang, Haiyue <haiyue.wang@intel.com>; Matan Azrad <matan@mellanox.com>
> Cc: Maxime Coquelin <maxime.coquelin@redhat.com>; Bie, Tiwei <tiwei.bie@intel.com>; Wang, Zhihong
> <zhihong.wang@intel.com>; Wang, Xiao W <xiao.w.wang@intel.com>; Yigit, Ferruh <ferruh.yigit@intel.com>;
> dev@dpdk.org; Andrew Rybchenko <arybchenko@solarflare.com>
> Subject: Re: [dpdk-dev] [PATCH v2 3/3] drivers: move ifc driver to the vDPA class
> 
> 10/01/2020 10:07, Matan Azrad:
> > From: Wang, Haiyue
> > > From: Matan Azrad
> > > > >  delete mode 100644 doc/guides/nics/ifc.rst  create mode 100644
> > > > > doc/guides/vdpadevs/features/ifcvf.ini
> > > > >  create mode 100644 doc/guides/vdpadevs/ifc.rst  delete mode 100644
> > > > > drivers/net/ifc/Makefile  delete mode 100644
> > > > > drivers/net/ifc/base/ifcvf.c  delete mode 100644
> > > > > drivers/net/ifc/base/ifcvf.h  delete mode 100644
> > > > > drivers/net/ifc/base/ifcvf_osdep.h
> > > > >  delete mode 100644 drivers/net/ifc/ifcvf_vdpa.c  delete mode 100644
> > > > > drivers/net/ifc/meson.build  delete mode 100644
> > > > > drivers/net/ifc/rte_pmd_ifc_version.map
> > > > >  create mode 100644 drivers/vdpa/ifc/Makefile  create mode 100644
> > > > > drivers/vdpa/ifc/base/ifcvf.c  create mode 100644
> > > > > drivers/vdpa/ifc/base/ifcvf.h  create mode 100644
> > > > > drivers/vdpa/ifc/base/ifcvf_osdep.h
> > > > >  create mode 100644 drivers/vdpa/ifc/ifcvf_vdpa.c  create mode
> > > > > 100644 drivers/vdpa/ifc/meson.build  create mode 100644
> > > > > drivers/vdpa/ifc/rte_pmd_ifc_version.map
> > > > >
> > >
> > > git mv drivers/net/ifc/ drivers/vdpa/ifc  ? ;-)
> >
> > Yes, and more file move in docs. (you can see like rename in git 😊)
> > Adjusted also the classes makefiles\measons to remove from net and to add in vdpa.
> > Also MAINTAINER file etc...
> 
> I think the comment from Haiyue was about the files deleted and created
> in the patch, instead of being moved (renamed).

Yes, I moved one Intel PMD's code that way, and the resulting patch is very small; it looks like:

 rename drivers/{net/iavf/base => common/iavf}/iavf_common.c (100%)
 rename drivers/{net/iavf/base => common/iavf}/iavf_devids.h (100%)

Details: https://patchwork.dpdk.org/patch/64384/

> As far as I know, this is for 2 reasons:
> 	- there is no move in internal git representation
> 	- some versions of git-diff does not detect moves properly,
> 	  but option -M may help
> 



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dpdk-dev] [PATCH v2 3/3] drivers: move ifc driver to the vDPA class
  2020-01-10 12:31             ` Wang, Haiyue
@ 2020-01-10 12:34               ` Maxime Coquelin
  2020-01-10 12:59                 ` Thomas Monjalon
  0 siblings, 1 reply; 50+ messages in thread
From: Maxime Coquelin @ 2020-01-10 12:34 UTC (permalink / raw)
  To: Wang, Haiyue, Thomas Monjalon, Matan Azrad
  Cc: Bie, Tiwei, Wang, Zhihong, Wang, Xiao W, Yigit, Ferruh, dev,
	Andrew Rybchenko



On 1/10/20 1:31 PM, Wang, Haiyue wrote:
>> -----Original Message-----
>> From: Thomas Monjalon <thomas@monjalon.net>
>> Sent: Friday, January 10, 2020 17:14
>> To: Wang, Haiyue <haiyue.wang@intel.com>; Matan Azrad <matan@mellanox.com>
>> Cc: Maxime Coquelin <maxime.coquelin@redhat.com>; Bie, Tiwei <tiwei.bie@intel.com>; Wang, Zhihong
>> <zhihong.wang@intel.com>; Wang, Xiao W <xiao.w.wang@intel.com>; Yigit, Ferruh <ferruh.yigit@intel.com>;
>> dev@dpdk.org; Andrew Rybchenko <arybchenko@solarflare.com>
>> Subject: Re: [dpdk-dev] [PATCH v2 3/3] drivers: move ifc driver to the vDPA class
>>
>> 10/01/2020 10:07, Matan Azrad:
>>> From: Wang, Haiyue
>>>> From: Matan Azrad
>>>>>>  delete mode 100644 doc/guides/nics/ifc.rst  create mode 100644
>>>>>> doc/guides/vdpadevs/features/ifcvf.ini
>>>>>>  create mode 100644 doc/guides/vdpadevs/ifc.rst  delete mode 100644
>>>>>> drivers/net/ifc/Makefile  delete mode 100644
>>>>>> drivers/net/ifc/base/ifcvf.c  delete mode 100644
>>>>>> drivers/net/ifc/base/ifcvf.h  delete mode 100644
>>>>>> drivers/net/ifc/base/ifcvf_osdep.h
>>>>>>  delete mode 100644 drivers/net/ifc/ifcvf_vdpa.c  delete mode 100644
>>>>>> drivers/net/ifc/meson.build  delete mode 100644
>>>>>> drivers/net/ifc/rte_pmd_ifc_version.map
>>>>>>  create mode 100644 drivers/vdpa/ifc/Makefile  create mode 100644
>>>>>> drivers/vdpa/ifc/base/ifcvf.c  create mode 100644
>>>>>> drivers/vdpa/ifc/base/ifcvf.h  create mode 100644
>>>>>> drivers/vdpa/ifc/base/ifcvf_osdep.h
>>>>>>  create mode 100644 drivers/vdpa/ifc/ifcvf_vdpa.c  create mode
>>>>>> 100644 drivers/vdpa/ifc/meson.build  create mode 100644
>>>>>> drivers/vdpa/ifc/rte_pmd_ifc_version.map
>>>>>>
>>>>
>>>> git mv drivers/net/ifc/ drivers/vdpa/ifc  ? ;-)
>>>
>>> Yes, and more file move in docs. (you can see like rename in git 😊)
>>> Adjusted also the classes makefiles\measons to remove from net and to add in vdpa.
>>> Also MAINTAINER file etc...
>>
>> I think the comment from Haiyue was about the files deleted and created
>> in the patch, instead of being moved (renamed).
> 
> Yes, I moved one Intel PMD code, then the patch is very small, it is like:
> 
>  rename drivers/{net/iavf/base => common/iavf}/iavf_common.c (100%)
>  rename drivers/{net/iavf/base => common/iavf}/iavf_devids.h (100%)
> 
> detail is : https://patchwork.dpdk.org/patch/64384/

Nice, and the advantage of doing so is that git would be smarter when
doing backports to stable branches.

>> As far as I know, this is for 2 reasons:
>> 	- there is no move in internal git representation
>> 	- some versions of git-diff does not detect moves properly,
>> 	  but option -M may help
>>
> 
> 


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dpdk-dev] [PATCH v2 3/3] drivers: move ifc driver to the vDPA class
  2020-01-10 12:34               ` Maxime Coquelin
@ 2020-01-10 12:59                 ` Thomas Monjalon
  2020-01-10 19:17                   ` Kevin Traynor
  0 siblings, 1 reply; 50+ messages in thread
From: Thomas Monjalon @ 2020-01-10 12:59 UTC (permalink / raw)
  To: Wang, Haiyue, Maxime Coquelin
  Cc: Matan Azrad, Bie, Tiwei, Wang, Zhihong, Wang, Xiao W, Yigit,
	Ferruh, dev, Andrew Rybchenko

10/01/2020 13:34, Maxime Coquelin:
> On 1/10/20 1:31 PM, Wang, Haiyue wrote:
> > From: Thomas Monjalon <thomas@monjalon.net>
> >> 10/01/2020 10:07, Matan Azrad:
> >>> From: Wang, Haiyue
> >>>> From: Matan Azrad
> >>>>>>  delete mode 100644 doc/guides/nics/ifc.rst  create mode 100644
> >>>>>> doc/guides/vdpadevs/features/ifcvf.ini
> >>>>>>  create mode 100644 doc/guides/vdpadevs/ifc.rst  delete mode 100644
> >>>>>> drivers/net/ifc/Makefile  delete mode 100644
> >>>>>> drivers/net/ifc/base/ifcvf.c  delete mode 100644
> >>>>>> drivers/net/ifc/base/ifcvf.h  delete mode 100644
> >>>>>> drivers/net/ifc/base/ifcvf_osdep.h
> >>>>>>  delete mode 100644 drivers/net/ifc/ifcvf_vdpa.c  delete mode 100644
> >>>>>> drivers/net/ifc/meson.build  delete mode 100644
> >>>>>> drivers/net/ifc/rte_pmd_ifc_version.map
> >>>>>>  create mode 100644 drivers/vdpa/ifc/Makefile  create mode 100644
> >>>>>> drivers/vdpa/ifc/base/ifcvf.c  create mode 100644
> >>>>>> drivers/vdpa/ifc/base/ifcvf.h  create mode 100644
> >>>>>> drivers/vdpa/ifc/base/ifcvf_osdep.h
> >>>>>>  create mode 100644 drivers/vdpa/ifc/ifcvf_vdpa.c  create mode
> >>>>>> 100644 drivers/vdpa/ifc/meson.build  create mode 100644
> >>>>>> drivers/vdpa/ifc/rte_pmd_ifc_version.map
> >>>>>>
> >>>>
> >>>> git mv drivers/net/ifc/ drivers/vdpa/ifc  ? ;-)
> >>>
> >>> Yes, and more file move in docs. (you can see like rename in git 😊)
> >>> Adjusted also the classes makefiles\measons to remove from net and to add in vdpa.
> >>> Also MAINTAINER file etc...
> >>
> >> I think the comment from Haiyue was about the files deleted and created
> >> in the patch, instead of being moved (renamed).
> > 
> > Yes, I moved one Intel PMD code, then the patch is very small, it is like:
> > 
> >  rename drivers/{net/iavf/base => common/iavf}/iavf_common.c (100%)
> >  rename drivers/{net/iavf/base => common/iavf}/iavf_devids.h (100%)
> > 
> > detail is : https://patchwork.dpdk.org/patch/64384/
> 
> Nice, and the advantage of doing so is that git would be smarter when
> doing backport to stable branch.

No, git won't be smarter.
As explained below, the format of the diff does not change the internal
git representation of the change.
Move/rename is just nice formatting produced by smart detection.

> >> As far as I know, this is for 2 reasons:
> >> 	- there is no move in internal git representation
> >> 	- some versions of git-diff does not detect moves properly,
> >> 	  but option -M may help




^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA device drivers
  2020-01-10  9:21                     ` Thomas Monjalon
@ 2020-01-10 14:18                       ` Xu, Rosen
  2020-01-10 16:27                         ` Thomas Monjalon
  0 siblings, 1 reply; 50+ messages in thread
From: Xu, Rosen @ 2020-01-10 14:18 UTC (permalink / raw)
  To: Thomas Monjalon, Matan Azrad
  Cc: Maxime Coquelin, Bie, Tiwei, Wang, Zhihong, Wang, Xiao W, Yigit,
	Ferruh, dev, Pei, Andy, Roni Bar Yanai



> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Friday, January 10, 2020 17:21
> To: Matan Azrad <matan@mellanox.com>; Xu, Rosen <rosen.xu@intel.com>
> Cc: Maxime Coquelin <maxime.coquelin@redhat.com>; Bie, Tiwei
> <tiwei.bie@intel.com>; Wang, Zhihong <zhihong.wang@intel.com>; Wang,
> Xiao W <xiao.w.wang@intel.com>; Yigit, Ferruh <ferruh.yigit@intel.com>;
> dev@dpdk.org; Pei, Andy <andy.pei@intel.com>; Roni Bar Yanai
> <roniba@mellanox.com>
> Subject: Re: [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA device
> drivers
> 
> 10/01/2020 03:38, Xu, Rosen:
> > From: Matan Azrad <matan@mellanox.com>
> > > From: Xu, Rosen <rosen.xu@intel.com>
> > > > From: Thomas Monjalon <thomas@monjalon.net>
> > > > > 09/01/2020 03:27, Xu, Rosen:
> > > > > > From: Thomas Monjalon <thomas@monjalon.net>
> > > > > > > 08/01/2020 13:39, Xu, Rosen:
> > > > > > > > From: Matan Azrad <matan@mellanox.com>
> > > > > > > > > From: Xu, Rosen
> > > > > > > > > > Did you think about OVS DPDK?
> > > > > > > > > > vDPA is a basic module for OVS, currently it will take
> > > > > > > > > > some exception path packet processing for OVS, so it
> > > > > > > > > > still needs to integrate
> > > > > > > eth_dev.
> > > > > > > > >
> > > > > > > > > I don't understand your question.
> > > > > > > > >
> > > > > > > > > What do you mean by "integrate eth_dev"?
> > > > > > > >
> > > > > > > > My questions is in OVS DPDK scenario vDPA device
> > > > > > > > implements eth_dev ops, so create a new class and move ifc
> > > > > > > > code to this new class
> > > > > is not ok.
> > > > > > >
> > > > > > > 1/ I don't understand the relation with OVS.
> > > > > > >
> > > > > > > 2/ no, vDPA device implements vDPA ops.
> > > > > > > If it implements ethdev ops, it is an ethdev device.
> > > > > > >
> > > > > > > Please show an example of what you claim.
> > > > > >
> > > > > > Answers of 1 and 2.
> > > > > >
> > > > > > In OVS DPDK, each network device(such as NIC, vHost etc) of
> > > > > > DPDK needs to be implemented as rte_eth_dev and provides
> > > > > > eth_dev_ops
> > > > such
> > > > > > as
> > > > > packet TX/RX for OVS.
> > > > >
> > > > > No, OVS is also using the vhost API for vhost port.
> > > >
> > > > Yes, vhost pmd is not a good example.
> > > >
> > > > > > Take vHost(Virtio back end) for example, OVS startups vHost
> > > > > > interface like
> > > > > this:
> > > > > > ovs-vsctl add-port br0 vhost-user-1 -- set Interface
> > > > > > vhost-user-1 type=dpdkvhostuser drivers/net/vhost implements
> > > > > > vHost as
> > > > rte_eth_dev
> > > > > and integrated in OVS.
> > > > > > OVS can send/receive packets to/from VM with
> > > > > > rte_eth_tx_burst()
> > > > > > rte_eth_rx_burst() which call eth_dev_ops implementation of
> > > > > drivers/net/vhost.
> > > > >
> > > > > No, it is using rte_vhost_dequeue_burst() and
> > > > > rte_vhost_enqueue_burst() which are not in ethdev.
> > > > >
> > > > > > vDPA is also Virtio back end and works like vHost, same as
> > > > > > vHost, it will be implemented as rte_eth_dev and also be integrated
> into OVS.
> > > > >
> > > > > No, vDPA is not "implemented as rte_eth_dev".
> > > >
> > > > Currently, vDPA isn't integrated with OVS.
> > > >
> > > > > > So, it's not ok to move ifc code from drivers/net.
> > > > >
> > > > > drivers/net/ifc has no ethdev implementation at all.
> > > >
> > > > For OVS hasn't integrated vDPA, it doesn't implement rte_eth_dev,
> > > > but there are many discussions in OVS community about vDPA, some
> > > > are from Mellanox, it seems vDPA port will be implemented as
> > > > rte_eth_dev port in OVS in the near feature.
> > > > https://patchwork.ozlabs.org/patch/1178474/
> > > >
> > > > Matan,
> > > > Could you clarify how OVS integrates vDPA in Mellanox patch?
> > > >
> > > > >
> > > > > Rosen, I'm sorry, these arguments look irrelevant, so I won't
> > > > > consider them as blocking the integration of this patch.
> > > >
> > > > What I mentioned is not blocking the integration of this patch, I
> > > > just want to get clarification from Matan how to integrate vDPA port in
> OVS.
> > >
> > >
> > > Hi
> > >
> > > OVS like any other application should use the current API of vDPA to
> > > attach a probed vdpa device to a vhost device.
> > > See example application /examples/vdpa.
> > >
> > > Here, we just introduce a new class to hold all the vDPA drivers, no
> > > change in the API.
> > >
> > > As I understand, no vDPA device is currently integrated in OVS.
> > >
> > > I think it can be integrated only when a full offload will be
> > > integrated since the vDPA device forward the traffic from the HW
> > > directly to the virtio queue, once it will be there, I guess the
> > > offload will be configured by the representor of the vdpa device(VF)
> which is managed by an ethdev device.
> > >
> > >
> > > Matan.
> > >
> > Hi,
> >
> > I'm still confused about your last sentence " the representor of the vdpa
> device(VF) which is managed by an ethdev device".
> > My understanding is that there are some connections and dependency
> between rte_eth_dev and vdpa device?
> > Am I right or any other explanations from you?
> 
> A vDPA port does not allow any ethdev operations (like rte_flow).
> In order to configure some offloads on the device, OVS needs an ethdev port.
> In Mellanox case, an ethdev VF representor port can be instantiated.
> So we may have two ports for the same device:
> 	- vDPA for data path with the VM
> 	- ethdev for offloads control path

It's obvious that OVS needs these two functions of the same device.
On the DPDK side, I have some concerns about how to scan, probe and tie together these two ports
of the same device. Could you explain it?
As far as I know, each network device provided by DPDK is identified as a port in OVS;
could you give more clarification on how to connect or tie together these two
functions of the same device on the OVS side?


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA device drivers
  2020-01-10 14:18                       ` Xu, Rosen
@ 2020-01-10 16:27                         ` Thomas Monjalon
  0 siblings, 0 replies; 50+ messages in thread
From: Thomas Monjalon @ 2020-01-10 16:27 UTC (permalink / raw)
  To: Xu, Rosen
  Cc: Matan Azrad, Maxime Coquelin, Bie, Tiwei, Wang, Zhihong, Wang,
	Xiao W, Yigit, Ferruh, dev, Pei, Andy, Roni Bar Yanai

10/01/2020 15:18, Xu, Rosen:
> From: Thomas Monjalon <thomas@monjalon.net>
> > 10/01/2020 03:38, Xu, Rosen:
> > > From: Matan Azrad <matan@mellanox.com>
> > > > From: Xu, Rosen <rosen.xu@intel.com>
> > > > > From: Thomas Monjalon <thomas@monjalon.net>
> > > > > > 09/01/2020 03:27, Xu, Rosen:
> > > > > > > From: Thomas Monjalon <thomas@monjalon.net>
> > > > > > > > 08/01/2020 13:39, Xu, Rosen:
> > > > > > > > > From: Matan Azrad <matan@mellanox.com>
> > > > > > > > > > From: Xu, Rosen
> > > > > > > > > > > Did you think about OVS DPDK?
> > > > > > > > > > > vDPA is a basic module for OVS, currently it will take
> > > > > > > > > > > some exception path packet processing for OVS, so it
> > > > > > > > > > > still needs to integrate
> > > > > > > > eth_dev.
> > > > > > > > > >
> > > > > > > > > > I don't understand your question.
> > > > > > > > > >
> > > > > > > > > > What do you mean by "integrate eth_dev"?
> > > > > > > > >
> > > > > > > > > My questions is in OVS DPDK scenario vDPA device
> > > > > > > > > implements eth_dev ops, so create a new class and move ifc
> > > > > > > > > code to this new class
> > > > > > is not ok.
> > > > > > > >
> > > > > > > > 1/ I don't understand the relation with OVS.
> > > > > > > >
> > > > > > > > 2/ no, vDPA device implements vDPA ops.
> > > > > > > > If it implements ethdev ops, it is an ethdev device.
> > > > > > > >
> > > > > > > > Please show an example of what you claim.
> > > > > > >
> > > > > > > Answers of 1 and 2.
> > > > > > >
> > > > > > > In OVS DPDK, each network device(such as NIC, vHost etc) of
> > > > > > > DPDK needs to be implemented as rte_eth_dev and provides
> > > > > > > eth_dev_ops
> > > > > such
> > > > > > > as
> > > > > > packet TX/RX for OVS.
> > > > > >
> > > > > > No, OVS is also using the vhost API for vhost port.
> > > > >
> > > > > Yes, vhost pmd is not a good example.
> > > > >
> > > > > > > Take vHost(Virtio back end) for example, OVS startups vHost
> > > > > > > interface like
> > > > > > this:
> > > > > > > ovs-vsctl add-port br0 vhost-user-1 -- set Interface
> > > > > > > vhost-user-1 type=dpdkvhostuser drivers/net/vhost implements
> > > > > > > vHost as
> > > > > rte_eth_dev
> > > > > > and integrated in OVS.
> > > > > > > OVS can send/receive packets to/from VM with
> > > > > > > rte_eth_tx_burst()
> > > > > > > rte_eth_rx_burst() which call eth_dev_ops implementation of
> > > > > > drivers/net/vhost.
> > > > > >
> > > > > > No, it is using rte_vhost_dequeue_burst() and
> > > > > > rte_vhost_enqueue_burst() which are not in ethdev.
> > > > > >
> > > > > > > vDPA is also Virtio back end and works like vHost, same as
> > > > > > > vHost, it will be implemented as rte_eth_dev and also be integrated
> > into OVS.
> > > > > >
> > > > > > No, vDPA is not "implemented as rte_eth_dev".
> > > > >
> > > > > Currently, vDPA isn't integrated with OVS.
> > > > >
> > > > > > > So, it's not ok to move ifc code from drivers/net.
> > > > > >
> > > > > > drivers/net/ifc has no ethdev implementation at all.
> > > > >
> > > > > For OVS hasn't integrated vDPA, it doesn't implement rte_eth_dev,
> > > > > but there are many discussions in OVS community about vDPA, some
> > > > > are from Mellanox, it seems vDPA port will be implemented as
> > > > > rte_eth_dev port in OVS in the near feature.
> > > > > https://patchwork.ozlabs.org/patch/1178474/
> > > > >
> > > > > Matan,
> > > > > Could you clarify how OVS integrates vDPA in Mellanox patch?
> > > > >
> > > > > >
> > > > > > Rosen, I'm sorry, these arguments look irrelevant, so I won't
> > > > > > consider them as blocking the integration of this patch.
> > > > >
> > > > > What I mentioned is not blocking the integration of this patch, I
> > > > > just want to get clarification from Matan how to integrate vDPA port in
> > OVS.
> > > >
> > > >
> > > > Hi
> > > >
> > > > OVS like any other application should use the current API of vDPA to
> > > > attach a probed vdpa device to a vhost device.
> > > > See example application /examples/vdpa.
> > > >
> > > > Here, we just introduce a new class to hold all the vDPA drivers, no
> > > > change in the API.
> > > >
> > > > As I understand, no vDPA device is currently integrated in OVS.
> > > >
> > > > I think it can be integrated only when a full offload will be
> > > > integrated since the vDPA device forward the traffic from the HW
> > > > directly to the virtio queue, once it will be there, I guess the
> > > > offload will be configured by the representor of the vdpa device(VF)
> > which is managed by an ethdev device.
> > > >
> > > >
> > > > Matan.
> > > >
> > > Hi,
> > >
> > > I'm still confused about your last sentence " the representor of the vdpa
> > device(VF) which is managed by an ethdev device".
> > > My understanding is that there are some connections and dependency
> > between rte_eth_dev and vdpa device?
> > > Am I right or any other explanations from you?
> > 
> > A vDPA port does not allow any ethdev operations (like rte_flow).
> > In order to configure some offloads on the device, OVS needs an ethdev port.
> > In Mellanox case, an ethdev VF representor port can be instantiated.
> > So we may have two ports for the same device:
> > 	- vDPA for data path with the VM
> > 	- ethdev for offloads control path
> 
> It's obviously that OVS needs these two functions of same device.
> In DPDK part, I have some concerns about how to scan, probe and bond these two ports
> of same device. Could you introduce it?
> As far as I know, each network device provided by DPDK is identified as a port in OVS,
> do you mind to take more clarification about how to connect or bond these two
> functions of same device in OVS part?

This is a different discussion.
You are asking for a design review of vDPA integration in OVS.
I think it should be discussed separately.
And I have no time for such a discussion currently. It will come later.

As it does not block this series, I will stop here, sorry.
Feel free to start a thread proposing how to integrate vDPA with offloads,
either in general, or specifically for IFC.



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/3] doc: add vDPA feature table
  2020-01-09 11:00   ` [dpdk-dev] [PATCH v2 2/3] doc: add vDPA feature table Matan Azrad
@ 2020-01-10 18:26     ` Thomas Monjalon
  2020-01-13 22:40     ` Thomas Monjalon
  1 sibling, 0 replies; 50+ messages in thread
From: Thomas Monjalon @ 2020-01-10 18:26 UTC (permalink / raw)
  To: Andrew Rybchenko, Matan Azrad
  Cc: Maxime Coquelin, Tiwei Bie, Zhihong Wang, Xiao Wang, dev,
	Ferruh Yigit, dev

09/01/2020 12:00, Matan Azrad:
> +This section explains the supported features that are listed in the table below.
> +
> +  * csum - Device can handle packets with partial checksum.
> +  * guest csum - Guest can handle packets with partial checksum.
> +  * mac - Device has given MAC address.
> +  * gso - Device can handle packets with any GSO type.
> +  * guest tso4 - Guest can receive TSOv4.
> +  * guest tso6 - Guest can receive TSOv6.
> +  * ecn - Device can receive TSO with ECN.
> +  * ufo - Device can receive UFO.
> +  * host tso4 - Device can receive TSOv4.
> +  * host tso6 - Device can receive TSOv6.
> +  * mrg rxbuf - Guest can merge receive buffers.
> +  * ctrl vq - Control channel is available.
> +  * ctrl rx - Control channel RX mode support.
> +  * any layout - Device can handle any descriptor layout.
> +  * guest announce - Guest can send gratuitous packets.
> +  * mq - Device supports Receive Flow Steering.
> +  * version 1 - v1.0 compliant.
> +  * log all - Device can log all write descriptors (live migration).
> +  * indirect desc - Indirect buffer descriptors support.
> +  * event idx - Support for avail_idx and used_idx fields.
> +  * mtu - Host can advise the guest with its maximum supported MTU.
> +  * in_order - Device can use descriptors in ring order.
> +  * IOMMU platform - Device support IOMMU addresses.
> +  * packed - Device support packed virtio queues.
> +  * proto mq - Support the number of queues query.
> +  * proto log shmfd - Guest support setting log base.
> +  * proto rarp - Host can broadcast a fake RARP after live migration.
> +  * proto reply ack - Host support requested operation status ack.
> +  * proto host notifier - Host can register memory region based host notifiers.
> +  * proto pagefault - Slave expose page-fault FD for migration process.
> +  * BSD nic_uio - BSD ``nic_uio`` module supported.
> +  * Linux VFIO - Works with ``vfio-pci`` kernel module.
> +  * Other kdrv - Kernel module other than above ones supported.
> +  * ARMv7 - Support armv7 architecture.
> +  * ARMv8 - Support armv8a (64bit) architecture.
> +  * Power8 - Support PowerPC architecture.
> +  * x86-32 - Support 32bits x86 architecture.
> +  * x86-64 - Support 64bits x86 architecture.
> +  * Usage doc - Documentation describes usage, In ``doc/guides/vdpadevs/``.
> +  * Design doc - Documentation describes design. In ``doc/guides/vdpadevs/``.
> +  * Perf doc - Documentation describes performance values, In ``doc/perf/``.

It may be appropriate to use the RST syntax for definitions:
	https://docutils.sourceforge.io/docs/user/rst/quickref.html#definition-lists

Andrew proposed to describe each feature with the same properties as for ethdev:
	http://code.dpdk.org/dpdk/latest/source/doc/guides/nics/features.rst
You replied that it would be redundant for each feature.
In order to be more specific, the ethdev feature properties are:
	- [uses] = input fields and constants
	- [implements] = dev_ops functions
	- [provides] = output fields
	- [related] = API function

The API is very simple:
	http://code.dpdk.org/dpdk/latest/source/lib/librte_vhost/rte_vdpa.h
The relevant dev_ops are mentioned in the note below.

> +.. note::
> +
> +   Most of the features capabilities should be provided by the drivers via the
> +   next vDPA operations: ``get_features`` and ``get_protocol_features``.

I don't see what else could be filled in for [uses], [implements] and [provides]
in the case of vDPA, so I suggest keeping it simple as it is.




^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dpdk-dev] [PATCH v2 3/3] drivers: move ifc driver to the vDPA class
  2020-01-10 12:59                 ` Thomas Monjalon
@ 2020-01-10 19:17                   ` Kevin Traynor
  0 siblings, 0 replies; 50+ messages in thread
From: Kevin Traynor @ 2020-01-10 19:17 UTC (permalink / raw)
  To: Thomas Monjalon, Wang, Haiyue, Maxime Coquelin
  Cc: Matan Azrad, Bie, Tiwei, Wang, Zhihong, Wang, Xiao W, Yigit,
	Ferruh, dev, Andrew Rybchenko

On 10/01/2020 12:59, Thomas Monjalon wrote:
> 10/01/2020 13:34, Maxime Coquelin:
>> On 1/10/20 1:31 PM, Wang, Haiyue wrote:
>>> From: Thomas Monjalon <thomas@monjalon.net>
>>>> 10/01/2020 10:07, Matan Azrad:
>>>>> From: Wang, Haiyue
>>>>>> From: Matan Azrad
>>>>>>>>  delete mode 100644 doc/guides/nics/ifc.rst  create mode 100644
>>>>>>>> doc/guides/vdpadevs/features/ifcvf.ini
>>>>>>>>  create mode 100644 doc/guides/vdpadevs/ifc.rst  delete mode 100644
>>>>>>>> drivers/net/ifc/Makefile  delete mode 100644
>>>>>>>> drivers/net/ifc/base/ifcvf.c  delete mode 100644
>>>>>>>> drivers/net/ifc/base/ifcvf.h  delete mode 100644
>>>>>>>> drivers/net/ifc/base/ifcvf_osdep.h
>>>>>>>>  delete mode 100644 drivers/net/ifc/ifcvf_vdpa.c  delete mode 100644
>>>>>>>> drivers/net/ifc/meson.build  delete mode 100644
>>>>>>>> drivers/net/ifc/rte_pmd_ifc_version.map
>>>>>>>>  create mode 100644 drivers/vdpa/ifc/Makefile  create mode 100644
>>>>>>>> drivers/vdpa/ifc/base/ifcvf.c  create mode 100644
>>>>>>>> drivers/vdpa/ifc/base/ifcvf.h  create mode 100644
>>>>>>>> drivers/vdpa/ifc/base/ifcvf_osdep.h
>>>>>>>>  create mode 100644 drivers/vdpa/ifc/ifcvf_vdpa.c  create mode
>>>>>>>> 100644 drivers/vdpa/ifc/meson.build  create mode 100644
>>>>>>>> drivers/vdpa/ifc/rte_pmd_ifc_version.map
>>>>>>>>
>>>>>>
>>>>>> git mv drivers/net/ifc/ drivers/vdpa/ifc  ? ;-)
>>>>>
>>>>> Yes, and more file move in docs. (you can see like rename in git 😊)
>>>>> Adjusted also the classes makefiles\measons to remove from net and to add in vdpa.
>>>>> Also MAINTAINER file etc...
>>>>
>>>> I think the comment from Haiyue was about the files deleted and created
>>>> in the patch, instead of being moved (renamed).
>>>
>>> Yes, I moved one Intel PMD code, then the patch is very small, it is like:
>>>
>>>  rename drivers/{net/iavf/base => common/iavf}/iavf_common.c (100%)
>>>  rename drivers/{net/iavf/base => common/iavf}/iavf_devids.h (100%)
>>>
>>> detail is : https://patchwork.dpdk.org/patch/64384/
>>
>> Nice, and the advantage of doing so is that git would be smarter when
>> doing backport to stable branch.
> 
> No, git won't be smarter.
> As explained below, the format of the diff does not change the internal
> git representation of the change.
> Move/Rename is just a nice formatting done by smart detection.
> 

+1. Thanks for considering it; I just checked and confirmed this with a
test backport on one of the "moved" files. (BTW, see the stats below when the patch is
applied with git version 2.21.1.)

 MAINTAINERS                                       | 14 ++++++--------
 doc/guides/nics/index.rst                         |  1 -
 doc/guides/{nics => vdpadevs}/features/ifcvf.ini  |  0
 doc/guides/{nics => vdpadevs}/ifc.rst             |  0
 doc/guides/vdpadevs/index.rst                     |  1 +
 drivers/net/Makefile                              |  3 ---
 drivers/net/meson.build                           |  1 -
 drivers/vdpa/Makefile                             |  6 ++++++
 drivers/{net => vdpa}/ifc/Makefile                |  0
 drivers/{net => vdpa}/ifc/base/ifcvf.c            |  0
 drivers/{net => vdpa}/ifc/base/ifcvf.h            |  0
 drivers/{net => vdpa}/ifc/base/ifcvf_osdep.h      |  0
 drivers/{net => vdpa}/ifc/ifcvf_vdpa.c            |  0
 drivers/{net => vdpa}/ifc/meson.build             |  0
 drivers/{net => vdpa}/ifc/rte_pmd_ifc_version.map |  0
 drivers/vdpa/meson.build                          |  2 +-
 16 files changed, 14 insertions(+), 14 deletions(-)


>>>> As far as I know, this is for 2 reasons:
>>>> 	- there is no move in internal git representation
>>>> 	- some versions of git-diff does not detect moves properly,
>>>> 	  but option -M may help
> 
> 
> 


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/3] doc: add vDPA feature table
  2020-01-09 11:00   ` [dpdk-dev] [PATCH v2 2/3] doc: add vDPA feature table Matan Azrad
  2020-01-10 18:26     ` Thomas Monjalon
@ 2020-01-13 22:40     ` Thomas Monjalon
  1 sibling, 0 replies; 50+ messages in thread
From: Thomas Monjalon @ 2020-01-13 22:40 UTC (permalink / raw)
  To: Matan Azrad
  Cc: Maxime Coquelin, Tiwei Bie, Zhihong Wang, Xiao Wang, dev,
	Ferruh Yigit, dev, Andrew Rybchenko

09/01/2020 12:00, Matan Azrad:
> +Useful links
> +============
> +
> +  * `OASIS: Virtual I/O Device (VIRTIO) Version 1.1 <https://docs.oasis-open.org/virtio/virtio/v1.1/csprd01/virtio-v1.1-csprd01>`_.
> +  * `QEMU: Vhost-user Protocol <https://qemu.weilnetz.de/doc/interop/vhost-user>`_.

The .html suffix is missing from these links.



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dpdk-dev] [PATCH v2 3/3] drivers: move ifc driver to the vDPA class
  2020-01-09 11:00   ` [dpdk-dev] [PATCH v2 3/3] drivers: move ifc driver to the vDPA class Matan Azrad
  2020-01-09 17:25     ` Matan Azrad
@ 2020-01-13 22:57     ` Thomas Monjalon
  1 sibling, 0 replies; 50+ messages in thread
From: Thomas Monjalon @ 2020-01-13 22:57 UTC (permalink / raw)
  To: Matan Azrad
  Cc: Maxime Coquelin, Tiwei Bie, Zhihong Wang, Xiao Wang, dev,
	Ferruh Yigit, dev, Andrew Rybchenko

09/01/2020 12:00, Matan Azrad:
> --- a/drivers/vdpa/Makefile
> +++ b/drivers/vdpa/Makefile
> +ifeq ($(CONFIG_RTE_LIBRTE_VHOST),y)
> +ifeq ($(CONFIG_RTE_EAL_VFIO),y)
> +DIRS-$(CONFIG_RTE_LIBRTE_IFC_PMD) += ifc
> +endif
> +endif # $(CONFIG_RTE_LIBRTE_VHOST)

All vDPA drivers will need the vhost lib.
As it is already a dependency in drivers/Makefile,
I will remove the vhost ifeq/endif from here.



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dpdk-dev] [PATCH v2 0/3] Introduce new class for vDPA device drivers
  2020-01-09 11:00 ` [dpdk-dev] [PATCH v2 " Matan Azrad
                     ` (2 preceding siblings ...)
  2020-01-09 11:00   ` [dpdk-dev] [PATCH v2 3/3] drivers: move ifc driver to the vDPA class Matan Azrad
@ 2020-01-13 23:08   ` Thomas Monjalon
  3 siblings, 0 replies; 50+ messages in thread
From: Thomas Monjalon @ 2020-01-13 23:08 UTC (permalink / raw)
  To: Matan Azrad
  Cc: Maxime Coquelin, Tiwei Bie, Zhihong Wang, Xiao Wang, dev,
	Ferruh Yigit, dev, Andrew Rybchenko, rosen.xu

09/01/2020 12:00, Matan Azrad:
> v2:
> Apply comments from Maxime Coquelin, Andrew Rybchenko and Tiwei Bie.
> 
> 
> Matan Azrad (3):
>   drivers: introduce vDPA class
>   doc: add vDPA feature table
>   drivers: move ifc driver to the vDPA class

I've fixed a few minor things as discussed in this thread.

Summary of other discussions:
- Rosen said he "is not blocking the integration of this patch".
- ifc features need to be filled in a separate patch.
- ifc patches will be merged in the next-virtio tree starting now.
- Andrew and Tiwei asked for some changes in the doc which are addressed in v2
  or justified (the features description is kept simple for now).

This is the very first step for this new drivers directory,
and we will surely apply some improvements in this area
when adding more drivers.

Applied, thanks



^ permalink raw reply	[flat|nested] 50+ messages in thread

end of thread, other threads:[~2020-01-13 23:08 UTC | newest]

Thread overview: 50+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-25 15:19 [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA device drivers Matan Azrad
2019-12-25 15:19 ` [dpdk-dev] [PATCH v1 1/3] drivers: introduce vDPA class Matan Azrad
2020-01-07 17:32   ` Maxime Coquelin
2020-01-08 21:28     ` Thomas Monjalon
2020-01-09  8:00       ` Maxime Coquelin
2019-12-25 15:19 ` [dpdk-dev] [PATCH v1 2/3] doc: add vDPA feature table Matan Azrad
2020-01-07 17:39   ` Maxime Coquelin
2020-01-08  5:28     ` Tiwei Bie
2020-01-08  7:20       ` Andrew Rybchenko
2020-01-08 10:42         ` Matan Azrad
2020-01-08 13:11           ` Andrew Rybchenko
2020-01-08 17:01             ` Matan Azrad
2020-01-09  2:15           ` Tiwei Bie
2020-01-09  8:08             ` Matan Azrad
2019-12-25 15:19 ` [dpdk-dev] [PATCH v1 3/3] drivers: move ifc driver to the vDPA class Matan Azrad
2020-01-07 18:17   ` Maxime Coquelin
2020-01-07  7:57 ` [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA device drivers Matan Azrad
2020-01-08  5:44   ` Xu, Rosen
2020-01-08 10:45     ` Matan Azrad
2020-01-08 12:39       ` Xu, Rosen
2020-01-08 12:58         ` Thomas Monjalon
2020-01-09  2:27           ` Xu, Rosen
2020-01-09  8:41             ` Thomas Monjalon
2020-01-09  9:23               ` Maxime Coquelin
2020-01-09  9:49                 ` Xu, Rosen
2020-01-09 10:42                   ` Maxime Coquelin
2020-01-10  2:40                     ` Xu, Rosen
2020-01-09 10:42                   ` Maxime Coquelin
2020-01-09 10:53               ` Xu, Rosen
2020-01-09 11:34                 ` Matan Azrad
2020-01-10  2:38                   ` Xu, Rosen
2020-01-10  9:21                     ` Thomas Monjalon
2020-01-10 14:18                       ` Xu, Rosen
2020-01-10 16:27                         ` Thomas Monjalon
2020-01-09 11:00 ` [dpdk-dev] [PATCH v2 " Matan Azrad
2020-01-09 11:00   ` [dpdk-dev] [PATCH v2 1/3] drivers: introduce vDPA class Matan Azrad
2020-01-09 11:00   ` [dpdk-dev] [PATCH v2 2/3] doc: add vDPA feature table Matan Azrad
2020-01-10 18:26     ` Thomas Monjalon
2020-01-13 22:40     ` Thomas Monjalon
2020-01-09 11:00   ` [dpdk-dev] [PATCH v2 3/3] drivers: move ifc driver to the vDPA class Matan Azrad
2020-01-09 17:25     ` Matan Azrad
2020-01-10  1:55       ` Wang, Haiyue
2020-01-10  9:07         ` Matan Azrad
2020-01-10  9:13           ` Thomas Monjalon
2020-01-10 12:31             ` Wang, Haiyue
2020-01-10 12:34               ` Maxime Coquelin
2020-01-10 12:59                 ` Thomas Monjalon
2020-01-10 19:17                   ` Kevin Traynor
2020-01-13 22:57     ` Thomas Monjalon
2020-01-13 23:08   ` [dpdk-dev] [PATCH v2 0/3] Introduce new class for vDPA device drivers Thomas Monjalon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).