DPDK patches and discussions
* [PATCH 00/10] support software live migration
@ 2024-04-26  7:48 Chaoyong He
  2024-04-26  7:48 ` [PATCH 01/10] mailmap: add new contributor Chaoyong He
                   ` (9 more replies)
  0 siblings, 10 replies; 12+ messages in thread
From: Chaoyong He @ 2024-04-26  7:48 UTC (permalink / raw)
  To: dev; +Cc: oss-drivers, Chaoyong He

This patch series adds support for the software live migration
feature to the NFP vDPA device.

Xinying Yu (10):
  mailmap: add new contributor
  vdpa/nfp: fix logic in hardware init
  vdpa/nfp: fix the logic of reconfiguration
  vdpa/nfp: refactor the logic of datapath update
  vdpa/nfp: add the live migration logic
  vdpa/nfp: add the interrupt logic of vring relay
  vdpa/nfp: setup the VF configure
  vdpa/nfp: recover the ring index on new host
  vdpa/nfp: setup vring relay thread
  doc: update nfp document

 .mailmap                             |   1 +
 doc/guides/vdpadevs/nfp.rst          |   9 +
 drivers/common/nfp/nfp_common_ctrl.h |  11 +-
 drivers/vdpa/nfp/nfp_vdpa.c          | 440 +++++++++++++++++++++++++--
 drivers/vdpa/nfp/nfp_vdpa_core.c     | 135 ++++++--
 drivers/vdpa/nfp/nfp_vdpa_core.h     |  14 +
 6 files changed, 564 insertions(+), 46 deletions(-)

-- 
2.39.1



* [PATCH 01/10] mailmap: add new contributor
  2024-04-26  7:48 [PATCH 00/10] support software live migration Chaoyong He
@ 2024-04-26  7:48 ` Chaoyong He
  2024-04-26  7:48 ` [PATCH 02/10] vdpa/nfp: fix logic in hardware init Chaoyong He
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Chaoyong He @ 2024-04-26  7:48 UTC (permalink / raw)
  To: dev; +Cc: oss-drivers, Xinying Yu, Chaoyong He, Long Wu, Peng Zhang

From: Xinying Yu <xinying.yu@corigine.com>

Add new contributor.

Signed-off-by: Xinying Yu <xinying.yu@corigine.com>
Reviewed-by: Chaoyong He <chaoyong.he@corigine.com>
Reviewed-by: Long Wu <long.wu@corigine.com>
Reviewed-by: Peng Zhang <peng.zhang@corigine.com>
---
 .mailmap | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.mailmap b/.mailmap
index 848603bfc5..84a6ff84f9 100644
--- a/.mailmap
+++ b/.mailmap
@@ -1597,6 +1597,7 @@ Xieming Katty <katty.xieming@huawei.com>
 Xinfeng Zhao <xinfengx.zhao@intel.com>
 Xingguang He <xingguang.he@intel.com>
 Xingyou Chen <niatlantice@gmail.com>
+Xinying Yu <xinying.yu@corigine.com>
 Xin Long <longxin.xl@alibaba-inc.com>
 Xi Zhang <xix.zhang@intel.com>
 Xuan Ding <xuan.ding@intel.com>
-- 
2.39.1



* [PATCH 02/10] vdpa/nfp: fix logic in hardware init
  2024-04-26  7:48 [PATCH 00/10] support software live migration Chaoyong He
  2024-04-26  7:48 ` [PATCH 01/10] mailmap: add new contributor Chaoyong He
@ 2024-04-26  7:48 ` Chaoyong He
  2024-04-26  7:48 ` [PATCH 03/10] vdpa/nfp: fix the logic of reconfiguration Chaoyong He
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Chaoyong He @ 2024-04-26  7:48 UTC (permalink / raw)
  To: dev; +Cc: oss-drivers, Xinying Yu, chaoyong.he, stable, Long Wu, Peng Zhang

From: Xinying Yu <xinying.yu@corigine.com>

Reconfiguring the NIC fails because the queue configuration
pointer is never initialized.
Fix this by adding the missing initialization logic.

Fixes: d89f4990c14e ("vdpa/nfp: add hardware init")
Cc: chaoyong.he@corigine.com
Cc: stable@dpdk.org

Signed-off-by: Xinying Yu <xinying.yu@corigine.com>
Reviewed-by: Chaoyong He <chaoyong.he@corigine.com>
Reviewed-by: Long Wu <long.wu@corigine.com>
Reviewed-by: Peng Zhang <peng.zhang@corigine.com>
---
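Note: the pointer arithmetic added to nfp_vdpa_hw_init() can be sketched in
isolation as below. This is only an illustration under stated assumptions:
the slot size SKETCH_QCP_QUEUE_ADDR_SZ (0x800) and the helper name
sketch_qcp_cfg() are hypothetical stand-ins, and the real values come from
the NFP common headers and from the NFP_NET_CFG_START_TXQ register.

```c
#include <stdint.h>

/* Assumed size of one queue controller slot; the driver takes the real
 * value from NFP_QCP_QUEUE_ADDR_SZ in the NFP common headers. */
#define SKETCH_QCP_QUEUE_ADDR_SZ 0x800u

/* Hypothetical helper mirroring the arithmetic in the patch:
 * bar2_base stands in for pci_dev->mem_resource[2].addr and
 * start_txq for the value read from NFP_NET_CFG_START_TXQ. */
static uint8_t *
sketch_qcp_cfg(uint8_t *bar2_base, uint32_t start_txq)
{
	uint32_t tx_bar_off = start_txq * SKETCH_QCP_QUEUE_ADDR_SZ;
	uint8_t *tx_bar = bar2_base + tx_bar_off;

	/* The patch points qcp_cfg one slot past the start of the Tx area. */
	return tx_bar + SKETCH_QCP_QUEUE_ADDR_SZ;
}
```

With a start queue of 2, the returned pointer would sit 0x1800 bytes into
the BAR under the assumed slot size.
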
 drivers/vdpa/nfp/nfp_vdpa_core.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/vdpa/nfp/nfp_vdpa_core.c b/drivers/vdpa/nfp/nfp_vdpa_core.c
index 7b877605e4..291798196c 100644
--- a/drivers/vdpa/nfp/nfp_vdpa_core.c
+++ b/drivers/vdpa/nfp/nfp_vdpa_core.c
@@ -55,7 +55,10 @@ nfp_vdpa_hw_init(struct nfp_vdpa_hw *vdpa_hw,
 		struct rte_pci_device *pci_dev)
 {
 	uint32_t queue;
+	uint8_t *tx_bar;
+	uint32_t start_q;
 	struct nfp_hw *hw;
+	uint32_t tx_bar_off;
 	uint8_t *notify_base;
 
 	hw = &vdpa_hw->super;
@@ -82,6 +85,12 @@ nfp_vdpa_hw_init(struct nfp_vdpa_hw *vdpa_hw,
 				idx + 1, vdpa_hw->notify_addr[idx + 1]);
 	}
 
+	/* NFP vDPA cfg queue setup */
+	start_q = nn_cfg_readl(hw, NFP_NET_CFG_START_TXQ);
+	tx_bar_off = start_q * NFP_QCP_QUEUE_ADDR_SZ;
+	tx_bar = (uint8_t *)pci_dev->mem_resource[2].addr + tx_bar_off;
+	hw->qcp_cfg = tx_bar + NFP_QCP_QUEUE_ADDR_SZ;
+
 	vdpa_hw->features = (1ULL << VIRTIO_F_VERSION_1) |
 			(1ULL << VIRTIO_F_IN_ORDER) |
 			(1ULL << VHOST_USER_F_PROTOCOL_FEATURES);
-- 
2.39.1



* [PATCH 03/10] vdpa/nfp: fix the logic of reconfiguration
  2024-04-26  7:48 [PATCH 00/10] support software live migration Chaoyong He
  2024-04-26  7:48 ` [PATCH 01/10] mailmap: add new contributor Chaoyong He
  2024-04-26  7:48 ` [PATCH 02/10] vdpa/nfp: fix logic in hardware init Chaoyong He
@ 2024-04-26  7:48 ` Chaoyong He
  2024-04-26  7:48 ` [PATCH 04/10] vdpa/nfp: refactor the logic of datapath update Chaoyong He
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Chaoyong He @ 2024-04-26  7:48 UTC (permalink / raw)
  To: dev; +Cc: oss-drivers, Xinying Yu, chaoyong.he, stable, Long Wu, Peng Zhang

From: Xinying Yu <xinying.yu@corigine.com>

The vDPA control bits are located in the extended control word,
so 'nfp_ext_reconfig()' must be used rather than 'nfp_reconfig()'.

Also replace the misused 'NFP_NET_CFG_CTRL_SCATTER' macro
with 'NFP_NET_CFG_CTRL_VIRTIO'.

Fixes: b47a0373903f ("vdpa/nfp: add datapath update")
Cc: chaoyong.he@corigine.com
Cc: stable@dpdk.org

Signed-off-by: Xinying Yu <xinying.yu@corigine.com>
Reviewed-by: Chaoyong He <chaoyong.he@corigine.com>
Reviewed-by: Long Wu <long.wu@corigine.com>
Reviewed-by: Peng Zhang <peng.zhang@corigine.com>
---
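Note: the resulting two-step ordering (extended word first, then the base
word with only the enable bit) can be sketched as below. The struct and the
sketch_* functions are hypothetical stand-ins for the device and for
nfp_ext_reconfig()/nfp_reconfig(); the VIRTIO and IN_ORDER bit positions
are taken from the header hunk in this patch, while the enable bit at
position 0 is an assumption.

```c
#include <stdint.h>

/* Bit positions as defined in the nfp_common_ctrl.h hunk of this patch. */
#define SKETCH_CTRL_VIRTIO   (0x1u << 10)
#define SKETCH_CTRL_IN_ORDER (0x1u << 11)
/* Assumed position of the base-word enable bit. */
#define SKETCH_CTRL_ENABLE   (0x1u << 0)

/* Stand-in for the device: records what each control word received. */
struct sketch_hw {
	uint32_t ctrl;
	uint32_t ctrl_ext;
};

/* Hypothetical stand-ins for nfp_reconfig() and nfp_ext_reconfig(). */
static int
sketch_reconfig(struct sketch_hw *hw, uint32_t ctrl)
{
	hw->ctrl = ctrl;
	return 0;
}

static int
sketch_ext_reconfig(struct sketch_hw *hw, uint32_t ctrl_ext)
{
	hw->ctrl_ext = ctrl_ext;
	return 0;
}

/* The virtio offload bits go to the extended word; the base word only
 * carries the enable bit, as in the patched nfp_vdpa_hw_start(). */
static int
sketch_vf_start(struct sketch_hw *hw)
{
	uint32_t new_ext_ctrl = SKETCH_CTRL_VIRTIO | SKETCH_CTRL_IN_ORDER;

	if (sketch_ext_reconfig(hw, new_ext_ctrl) != 0)
		return -1;

	return sketch_reconfig(hw, SKETCH_CTRL_ENABLE);
}
```

The point of the sketch is only that the offload bits no longer leak into
the base control word.
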
 drivers/common/nfp/nfp_common_ctrl.h |  1 +
 drivers/vdpa/nfp/nfp_vdpa_core.c     | 16 ++++++++++++----
 2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/common/nfp/nfp_common_ctrl.h b/drivers/common/nfp/nfp_common_ctrl.h
index 6badf769fc..a0e62b063d 100644
--- a/drivers/common/nfp/nfp_common_ctrl.h
+++ b/drivers/common/nfp/nfp_common_ctrl.h
@@ -184,6 +184,7 @@ struct nfp_net_fw_ver {
 #define NFP_NET_CFG_CTRL_IPSEC_LM_LOOKUP  (0x1 << 4) /**< SA long match lookup */
 #define NFP_NET_CFG_CTRL_MULTI_PF         (0x1 << 5)
 #define NFP_NET_CFG_CTRL_FLOW_STEER       (0x1 << 8) /**< Flow Steering */
+#define NFP_NET_CFG_CTRL_VIRTIO           (0x1 << 10) /**< Virtio offload */
 #define NFP_NET_CFG_CTRL_IN_ORDER         (0x1 << 11) /**< Virtio in-order flag */
 #define NFP_NET_CFG_CTRL_USO              (0x1 << 16) /**< UDP segmentation offload */
 
diff --git a/drivers/vdpa/nfp/nfp_vdpa_core.c b/drivers/vdpa/nfp/nfp_vdpa_core.c
index 291798196c..6d07356581 100644
--- a/drivers/vdpa/nfp/nfp_vdpa_core.c
+++ b/drivers/vdpa/nfp/nfp_vdpa_core.c
@@ -101,7 +101,7 @@ nfp_vdpa_hw_init(struct nfp_vdpa_hw *vdpa_hw,
 static uint32_t
 nfp_vdpa_check_offloads(void)
 {
-	return NFP_NET_CFG_CTRL_SCATTER |
+	return NFP_NET_CFG_CTRL_VIRTIO  |
 			NFP_NET_CFG_CTRL_IN_ORDER;
 }
 
@@ -112,6 +112,7 @@ nfp_vdpa_hw_start(struct nfp_vdpa_hw *vdpa_hw,
 	int ret;
 	uint32_t update;
 	uint32_t new_ctrl;
+	uint32_t new_ext_ctrl;
 	struct timespec wait_tst;
 	struct nfp_hw *hw = &vdpa_hw->super;
 	uint8_t mac_addr[RTE_ETHER_ADDR_LEN];
@@ -131,8 +132,6 @@ nfp_vdpa_hw_start(struct nfp_vdpa_hw *vdpa_hw,
 	nfp_disable_queues(hw);
 	nfp_enable_queues(hw, NFP_VDPA_MAX_QUEUES, NFP_VDPA_MAX_QUEUES);
 
-	new_ctrl = nfp_vdpa_check_offloads();
-
 	nn_cfg_writel(hw, NFP_NET_CFG_MTU, 9216);
 	nn_cfg_writel(hw, NFP_NET_CFG_FLBUFSZ, 10240);
 
@@ -147,8 +146,17 @@ nfp_vdpa_hw_start(struct nfp_vdpa_hw *vdpa_hw,
 	/* Writing new MAC to the specific port BAR address */
 	nfp_write_mac(hw, (uint8_t *)mac_addr);
 
+	new_ext_ctrl = nfp_vdpa_check_offloads();
+
+	update = NFP_NET_CFG_UPDATE_GEN;
+	ret = nfp_ext_reconfig(hw, new_ext_ctrl, update);
+	if (ret != 0)
+		return -EIO;
+
+	hw->ctrl_ext = new_ext_ctrl;
+
 	/* Enable device */
-	new_ctrl |= NFP_NET_CFG_CTRL_ENABLE;
+	new_ctrl = NFP_NET_CFG_CTRL_ENABLE;
 
 	/* Signal the NIC about the change */
 	update = NFP_NET_CFG_UPDATE_MACADDR |
-- 
2.39.1



* [PATCH 04/10] vdpa/nfp: refactor the logic of datapath update
  2024-04-26  7:48 [PATCH 00/10] support software live migration Chaoyong He
                   ` (2 preceding siblings ...)
  2024-04-26  7:48 ` [PATCH 03/10] vdpa/nfp: fix the logic of reconfiguration Chaoyong He
@ 2024-04-26  7:48 ` Chaoyong He
  2024-04-26  7:48 ` [PATCH 05/10] vdpa/nfp: add the live migration logic Chaoyong He
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Chaoyong He @ 2024-04-26  7:48 UTC (permalink / raw)
  To: dev; +Cc: oss-drivers, Xinying Yu, Chaoyong He, Long Wu, Peng Zhang

From: Xinying Yu <xinying.yu@corigine.com>

To prepare for the new configuration logic of software live
migration, split the datapath update logic into two parts:
queue configuration and VF configuration.

Signed-off-by: Xinying Yu <xinying.yu@corigine.com>
Reviewed-by: Chaoyong He <chaoyong.he@corigine.com>
Reviewed-by: Long Wu <long.wu@corigine.com>
Reviewed-by: Peng Zhang <peng.zhang@corigine.com>
---
 drivers/vdpa/nfp/nfp_vdpa_core.c | 54 +++++++++++++++++++++-----------
 1 file changed, 36 insertions(+), 18 deletions(-)

diff --git a/drivers/vdpa/nfp/nfp_vdpa_core.c b/drivers/vdpa/nfp/nfp_vdpa_core.c
index 6d07356581..79ecd2b4fc 100644
--- a/drivers/vdpa/nfp/nfp_vdpa_core.c
+++ b/drivers/vdpa/nfp/nfp_vdpa_core.c
@@ -105,8 +105,8 @@ nfp_vdpa_check_offloads(void)
 			NFP_NET_CFG_CTRL_IN_ORDER;
 }
 
-int
-nfp_vdpa_hw_start(struct nfp_vdpa_hw *vdpa_hw,
+static int
+nfp_vdpa_vf_config(struct nfp_hw *hw,
 		int vid)
 {
 	int ret;
@@ -114,24 +114,8 @@ nfp_vdpa_hw_start(struct nfp_vdpa_hw *vdpa_hw,
 	uint32_t new_ctrl;
 	uint32_t new_ext_ctrl;
 	struct timespec wait_tst;
-	struct nfp_hw *hw = &vdpa_hw->super;
 	uint8_t mac_addr[RTE_ETHER_ADDR_LEN];
 
-	nn_cfg_writeq(hw, NFP_NET_CFG_TXR_ADDR(0), vdpa_hw->vring[1].desc);
-	nn_cfg_writeb(hw, NFP_NET_CFG_TXR_SZ(0), rte_log2_u32(vdpa_hw->vring[1].size));
-	nn_cfg_writeq(hw, NFP_NET_CFG_TXR_ADDR(1), vdpa_hw->vring[1].avail);
-	nn_cfg_writeq(hw, NFP_NET_CFG_TXR_ADDR(2), vdpa_hw->vring[1].used);
-
-	nn_cfg_writeq(hw, NFP_NET_CFG_RXR_ADDR(0), vdpa_hw->vring[0].desc);
-	nn_cfg_writeb(hw, NFP_NET_CFG_RXR_SZ(0), rte_log2_u32(vdpa_hw->vring[0].size));
-	nn_cfg_writeq(hw, NFP_NET_CFG_RXR_ADDR(1), vdpa_hw->vring[0].avail);
-	nn_cfg_writeq(hw, NFP_NET_CFG_RXR_ADDR(2), vdpa_hw->vring[0].used);
-
-	rte_wmb();
-
-	nfp_disable_queues(hw);
-	nfp_enable_queues(hw, NFP_VDPA_MAX_QUEUES, NFP_VDPA_MAX_QUEUES);
-
 	nn_cfg_writel(hw, NFP_NET_CFG_MTU, 9216);
 	nn_cfg_writel(hw, NFP_NET_CFG_FLBUFSZ, 10240);
 
@@ -177,6 +161,40 @@ nfp_vdpa_hw_start(struct nfp_vdpa_hw *vdpa_hw,
 	return 0;
 }
 
+static void
+nfp_vdpa_queue_config(struct nfp_vdpa_hw *vdpa_hw)
+{
+	struct nfp_hw *hw = &vdpa_hw->super;
+
+	nn_cfg_writeq(hw, NFP_NET_CFG_TXR_ADDR(0), vdpa_hw->vring[1].desc);
+	nn_cfg_writeb(hw, NFP_NET_CFG_TXR_SZ(0),
+			rte_log2_u32(vdpa_hw->vring[1].size));
+	nn_cfg_writeq(hw, NFP_NET_CFG_TXR_ADDR(1), vdpa_hw->vring[1].avail);
+	nn_cfg_writeq(hw, NFP_NET_CFG_TXR_ADDR(2), vdpa_hw->vring[1].used);
+
+	nn_cfg_writeq(hw, NFP_NET_CFG_RXR_ADDR(0), vdpa_hw->vring[0].desc);
+	nn_cfg_writeb(hw, NFP_NET_CFG_RXR_SZ(0),
+			rte_log2_u32(vdpa_hw->vring[0].size));
+	nn_cfg_writeq(hw, NFP_NET_CFG_RXR_ADDR(1), vdpa_hw->vring[0].avail);
+	nn_cfg_writeq(hw, NFP_NET_CFG_RXR_ADDR(2), vdpa_hw->vring[0].used);
+
+	rte_wmb();
+}
+
+int
+nfp_vdpa_hw_start(struct nfp_vdpa_hw *vdpa_hw,
+		int vid)
+{
+	struct nfp_hw *hw = &vdpa_hw->super;
+
+	nfp_vdpa_queue_config(vdpa_hw);
+
+	nfp_disable_queues(hw);
+	nfp_enable_queues(hw, NFP_VDPA_MAX_QUEUES, NFP_VDPA_MAX_QUEUES);
+
+	return nfp_vdpa_vf_config(hw, vid);
+}
+
 void
 nfp_vdpa_hw_stop(struct nfp_vdpa_hw *vdpa_hw)
 {
-- 
2.39.1



* [PATCH 05/10] vdpa/nfp: add the live migration logic
  2024-04-26  7:48 [PATCH 00/10] support software live migration Chaoyong He
                   ` (3 preceding siblings ...)
  2024-04-26  7:48 ` [PATCH 04/10] vdpa/nfp: refactor the logic of datapath update Chaoyong He
@ 2024-04-26  7:48 ` Chaoyong He
  2024-04-26  7:48 ` [PATCH 06/10] vdpa/nfp: add the interrupt logic of vring relay Chaoyong He
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Chaoyong He @ 2024-04-26  7:48 UTC (permalink / raw)
  To: dev; +Cc: oss-drivers, Xinying Yu, Chaoyong He, Long Wu, Peng Zhang

From: Xinying Yu <xinying.yu@corigine.com>

Add the basic logic of software live migration.

If the device supports it, unset the ring notify area to stop the
direct I/O datapath, so that the vring relay can then be set up to
assist the live migration.

Signed-off-by: Xinying Yu <xinying.yu@corigine.com>
Reviewed-by: Chaoyong He <chaoyong.he@corigine.com>
Reviewed-by: Long Wu <long.wu@corigine.com>
Reviewed-by: Peng Zhang <peng.zhang@corigine.com>
---
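Note: the decision taken in nfp_vdpa_set_features() can be sketched as a
pure function, as below. The helper name is hypothetical, and the bit
number 26 for VHOST_F_LOG_ALL is stated here as an assumption about the
vhost headers rather than something shown in this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit number of VHOST_F_LOG_ALL in the negotiated feature set. */
#define SKETCH_VHOST_F_LOG_ALL 26

/* Hypothetical helper: fall back to the software relay only when the
 * front end negotiated dirty page logging and the device advertises
 * software live migration (the sw_lm flag in the patch). */
static bool
sketch_need_sw_fallback(uint64_t negotiated_features, bool sw_lm)
{
	if ((negotiated_features & (1ULL << SKETCH_VHOST_F_LOG_ALL)) == 0)
		return false;

	return sw_lm;
}
```

Without the logging feature bit the direct I/O datapath is left untouched,
which matches the early return in the patch.
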
 drivers/vdpa/nfp/nfp_vdpa.c      | 66 +++++++++++++++++++++++++++++++-
 drivers/vdpa/nfp/nfp_vdpa_core.c |  3 ++
 drivers/vdpa/nfp/nfp_vdpa_core.h |  4 ++
 3 files changed, 71 insertions(+), 2 deletions(-)

diff --git a/drivers/vdpa/nfp/nfp_vdpa.c b/drivers/vdpa/nfp/nfp_vdpa.c
index cef80b5476..45092cb0af 100644
--- a/drivers/vdpa/nfp/nfp_vdpa.c
+++ b/drivers/vdpa/nfp/nfp_vdpa.c
@@ -603,6 +603,30 @@ update_datapath(struct nfp_vdpa_dev *device)
 	return ret;
 }
 
+static int
+nfp_vdpa_sw_fallback(struct nfp_vdpa_dev *device)
+{
+	int ret;
+	int vid = device->vid;
+
+	/* Stop the direct IO data path */
+	nfp_vdpa_unset_notify_relay(device);
+	nfp_vdpa_disable_vfio_intr(device);
+
+	ret = rte_vhost_host_notifier_ctrl(vid, RTE_VHOST_QUEUE_ALL, false);
+	if ((ret != 0) && (ret != -ENOTSUP)) {
+		DRV_VDPA_LOG(ERR, "Unset the host notifier failed.");
+		goto error;
+	}
+
+	device->hw.sw_fallback_running = true;
+
+	return 0;
+
+error:
+	return ret;
+}
+
 static int
 nfp_vdpa_dev_config(int vid)
 {
@@ -646,8 +670,18 @@ nfp_vdpa_dev_close(int vid)
 	}
 
 	device = node->device;
-	rte_atomic_store_explicit(&device->dev_attached, 0, rte_memory_order_relaxed);
-	update_datapath(device);
+	if (device->hw.sw_fallback_running) {
+		device->hw.sw_fallback_running = false;
+
+		rte_atomic_store_explicit(&device->dev_attached, 0,
+				rte_memory_order_relaxed);
+		rte_atomic_store_explicit(&device->running, 0,
+				rte_memory_order_relaxed);
+	} else {
+		rte_atomic_store_explicit(&device->dev_attached, 0,
+				rte_memory_order_relaxed);
+		update_datapath(device);
+	}
 
 	return 0;
 }
@@ -770,7 +804,35 @@ nfp_vdpa_get_protocol_features(struct rte_vdpa_device *vdev __rte_unused,
 static int
 nfp_vdpa_set_features(int32_t vid)
 {
+	int ret;
+	uint64_t features = 0;
+	struct nfp_vdpa_dev *device;
+	struct rte_vdpa_device *vdev;
+	struct nfp_vdpa_dev_node *node;
+
 	DRV_VDPA_LOG(DEBUG, "Start vid=%d", vid);
+
+	vdev = rte_vhost_get_vdpa_device(vid);
+	node = nfp_vdpa_find_node_by_vdev(vdev);
+	if (node == NULL) {
+		DRV_VDPA_LOG(ERR, "Invalid vDPA device: %p", vdev);
+		return -ENODEV;
+	}
+
+	rte_vhost_get_negotiated_features(vid, &features);
+
+	if (RTE_VHOST_NEED_LOG(features) == 0)
+		return 0;
+
+	device = node->device;
+	if (device->hw.sw_lm) {
+		ret = nfp_vdpa_sw_fallback(device);
+		if (ret != 0) {
+			DRV_VDPA_LOG(ERR, "Software fallback start failed");
+			return -1;
+		}
+	}
+
 	return 0;
 }
 
diff --git a/drivers/vdpa/nfp/nfp_vdpa_core.c b/drivers/vdpa/nfp/nfp_vdpa_core.c
index 79ecd2b4fc..82a323a6d0 100644
--- a/drivers/vdpa/nfp/nfp_vdpa_core.c
+++ b/drivers/vdpa/nfp/nfp_vdpa_core.c
@@ -91,8 +91,11 @@ nfp_vdpa_hw_init(struct nfp_vdpa_hw *vdpa_hw,
 	tx_bar = (uint8_t *)pci_dev->mem_resource[2].addr + tx_bar_off;
 	hw->qcp_cfg = tx_bar + NFP_QCP_QUEUE_ADDR_SZ;
 
+	vdpa_hw->sw_lm = true;
+
 	vdpa_hw->features = (1ULL << VIRTIO_F_VERSION_1) |
 			(1ULL << VIRTIO_F_IN_ORDER) |
+			(1ULL << VHOST_F_LOG_ALL) |
 			(1ULL << VHOST_USER_F_PROTOCOL_FEATURES);
 
 	return 0;
diff --git a/drivers/vdpa/nfp/nfp_vdpa_core.h b/drivers/vdpa/nfp/nfp_vdpa_core.h
index a8e0d6dd70..0f880fc0c6 100644
--- a/drivers/vdpa/nfp/nfp_vdpa_core.h
+++ b/drivers/vdpa/nfp/nfp_vdpa_core.h
@@ -36,6 +36,10 @@ struct nfp_vdpa_hw {
 	uint8_t mac_addr[RTE_ETHER_ADDR_LEN];
 	uint8_t notify_region;
 	uint8_t nr_vring;
+
+	/** Software Live Migration */
+	bool sw_lm;
+	bool sw_fallback_running;
 };
 
 int nfp_vdpa_hw_init(struct nfp_vdpa_hw *vdpa_hw, struct rte_pci_device *dev);
-- 
2.39.1



* [PATCH 06/10] vdpa/nfp: add the interrupt logic of vring relay
  2024-04-26  7:48 [PATCH 00/10] support software live migration Chaoyong He
                   ` (4 preceding siblings ...)
  2024-04-26  7:48 ` [PATCH 05/10] vdpa/nfp: add the live migration logic Chaoyong He
@ 2024-04-26  7:48 ` Chaoyong He
  2024-04-26  7:48 ` [PATCH 07/10] vdpa/nfp: setup the VF configure Chaoyong He
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Chaoyong He @ 2024-04-26  7:48 UTC (permalink / raw)
  To: dev; +Cc: oss-drivers, Xinying Yu, Chaoyong He, Long Wu, Peng Zhang

From: Xinying Yu <xinying.yu@corigine.com>

Add the interrupt setup logic of vring relay.

The epoll fd is provided here so that the host can receive interrupts
from the device in the Rx direction; all other operations on the vring
relay build on this.

Signed-off-by: Xinying Yu <xinying.yu@corigine.com>
Reviewed-by: Chaoyong He <chaoyong.he@corigine.com>
Reviewed-by: Long Wu <long.wu@corigine.com>
Reviewed-by: Peng Zhang <peng.zhang@corigine.com>
---
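Note: the per-Rx-vring eventfd setup can be sketched on its own as below.
The helper name and the fds[] array are hypothetical. Unlike the loop in
this patch, the sketch also closes already opened descriptors when a later
eventfd() call fails; that cleanup is an addition for illustration, not
something the patch does.

```c
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical helper: create one eventfd per Rx vring (even index),
 * leaving the Tx slots (odd index) at -1. */
static int
sketch_relay_eventfds(int *fds, uint16_t nr_vring)
{
	uint16_t i;

	for (i = 0; i < nr_vring; i++)
		fds[i] = -1;

	for (i = 0; i < nr_vring; i += 2) {
		fds[i] = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
		if (fds[i] < 0)
			goto rollback;
	}

	return 0;

rollback:
	/* Close the descriptors opened for earlier Rx vrings. */
	while (i >= 2) {
		i -= 2;
		close(fds[i]);
		fds[i] = -1;
	}

	return -1;
}
```

In the driver the resulting descriptors are what gets written into the
VFIO IRQ set in place of the guest call fd.
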
 drivers/vdpa/nfp/nfp_vdpa.c | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/drivers/vdpa/nfp/nfp_vdpa.c b/drivers/vdpa/nfp/nfp_vdpa.c
index 45092cb0af..1643ebbb8c 100644
--- a/drivers/vdpa/nfp/nfp_vdpa.c
+++ b/drivers/vdpa/nfp/nfp_vdpa.c
@@ -336,8 +336,10 @@ nfp_vdpa_stop(struct nfp_vdpa_dev *device)
 }
 
 static int
-nfp_vdpa_enable_vfio_intr(struct nfp_vdpa_dev *device)
+nfp_vdpa_enable_vfio_intr(struct nfp_vdpa_dev *device,
+		bool relay)
 {
+	int fd;
 	int ret;
 	uint16_t i;
 	int *fd_ptr;
@@ -366,6 +368,19 @@ nfp_vdpa_enable_vfio_intr(struct nfp_vdpa_dev *device)
 		fd_ptr[RTE_INTR_VEC_RXTX_OFFSET + i] = vring.callfd;
 	}
 
+	if (relay) {
+		for (i = 0; i < nr_vring; i += 2) {
+			fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
+			if (fd < 0) {
+				DRV_VDPA_LOG(ERR, "Can't setup eventfd");
+				return -EINVAL;
+			}
+
+			device->intr_fd[i] = fd;
+			fd_ptr[RTE_INTR_VEC_RXTX_OFFSET + i] = fd;
+		}
+	}
+
 	ret = ioctl(device->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
 	if (ret != 0) {
 		DRV_VDPA_LOG(ERR, "Error enabling MSI-X interrupts.");
@@ -556,7 +571,7 @@ update_datapath(struct nfp_vdpa_dev *device)
 		if (ret != 0)
 			goto unlock_exit;
 
-		ret = nfp_vdpa_enable_vfio_intr(device);
+		ret = nfp_vdpa_enable_vfio_intr(device, false);
 		if (ret != 0)
 			goto dma_map_rollback;
 
@@ -619,6 +634,11 @@ nfp_vdpa_sw_fallback(struct nfp_vdpa_dev *device)
 		goto error;
 	}
 
+	/* Setup interrupt for vring relay */
+	ret = nfp_vdpa_enable_vfio_intr(device, true);
+	if (ret != 0)
+		goto error;
+
 	device->hw.sw_fallback_running = true;
 
 	return 0;
-- 
2.39.1



* [PATCH 07/10] vdpa/nfp: setup the VF configure
  2024-04-26  7:48 [PATCH 00/10] support software live migration Chaoyong He
                   ` (5 preceding siblings ...)
  2024-04-26  7:48 ` [PATCH 06/10] vdpa/nfp: add the interrupt logic of vring relay Chaoyong He
@ 2024-04-26  7:48 ` Chaoyong He
  2024-04-26  7:48 ` [PATCH 08/10] vdpa/nfp: recover the ring index on new host Chaoyong He
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Chaoyong He @ 2024-04-26  7:48 UTC (permalink / raw)
  To: dev; +Cc: oss-drivers, Xinying Yu, Chaoyong He, Long Wu, Peng Zhang

From: Xinying Yu <xinying.yu@corigine.com>

Create the relay vring on the host and then write the address of the
Rx used ring to the VF config BAR, so that the device DMAs the used
ring information to the host rather than directly to the VM.

Use 'NFP_NET_CFG_CTRL_LM_RELAY' to notify the device side, and
enable the MSI-X interrupt on the device.

The Tx ring address does not need to change, since the relay vring
only assists the Rx ring with dirty page logging.

Signed-off-by: Xinying Yu <xinying.yu@corigine.com>
Reviewed-by: Chaoyong He <chaoyong.he@corigine.com>
Reviewed-by: Long Wu <long.wu@corigine.com>
Reviewed-by: Peng Zhang <peng.zhang@corigine.com>
---
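Note: the IOVA programmed for the relayed Rx used ring is the relay base
plus the offset of the used ring inside the split vring layout. The sketch
below redoes that arithmetic with hypothetical sketch_* helpers, assuming
the usual split-ring layout (16-byte descriptors, 2-byte available
entries, 8-byte used entries) rather than calling vring_size() or
vring_init() themselves.

```c
#include <stdint.h>

/* Round v up to a multiple of a; a must be a power of two. */
static uint64_t
sketch_align_up(uint64_t v, uint64_t a)
{
	return (v + a - 1) & ~(a - 1);
}

/* Offset of the used ring from the start of the vring: descriptor table,
 * then available ring (flags, idx, num entries, used_event), then
 * padding up to the alignment. */
static uint64_t
sketch_used_offset(uint32_t num, uint64_t align)
{
	return sketch_align_up(16ULL * num + 2ULL * (3 + num), align);
}

/* Total vring size: used ring adds flags, idx, avail_event and num
 * 8-byte elements. */
static uint64_t
sketch_vring_size(uint32_t num, uint64_t align)
{
	return sketch_used_offset(num, align) + 2ULL * 3 + 8ULL * num;
}

/* IOVA step between consecutive relay vrings: page-rounded vring size. */
static uint64_t
sketch_relay_stride(uint32_t num, uint64_t page_size)
{
	return sketch_align_up(sketch_vring_size(num, page_size), page_size);
}

/* IOVA of the used ring of the relay vring mapped at vring_iova. */
static uint64_t
sketch_relay_used_iova(uint64_t vring_iova, uint32_t num, uint64_t page_size)
{
	return vring_iova + sketch_used_offset(num, page_size);
}
```

For a 256-entry ring and 4 KiB pages this places the used ring 0x2000
bytes into the mapping and advances the IOVA by 0x3000 per vring.
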
 drivers/common/nfp/nfp_common_ctrl.h |   3 +
 drivers/vdpa/nfp/nfp_vdpa.c          | 202 ++++++++++++++++++++++++---
 drivers/vdpa/nfp/nfp_vdpa_core.c     |  55 ++++++--
 drivers/vdpa/nfp/nfp_vdpa_core.h     |   8 ++
 4 files changed, 238 insertions(+), 30 deletions(-)

diff --git a/drivers/common/nfp/nfp_common_ctrl.h b/drivers/common/nfp/nfp_common_ctrl.h
index a0e62b063d..9311d01590 100644
--- a/drivers/common/nfp/nfp_common_ctrl.h
+++ b/drivers/common/nfp/nfp_common_ctrl.h
@@ -186,6 +186,9 @@ struct nfp_net_fw_ver {
 #define NFP_NET_CFG_CTRL_FLOW_STEER       (0x1 << 8) /**< Flow Steering */
 #define NFP_NET_CFG_CTRL_VIRTIO           (0x1 << 10) /**< Virtio offload */
 #define NFP_NET_CFG_CTRL_IN_ORDER         (0x1 << 11) /**< Virtio in-order flag */
+#define NFP_NET_CFG_CTRL_LM_RELAY         (0x1 << 12) /**< Virtio live migration relay start */
+#define NFP_NET_CFG_CTRL_NOTIFY_DATA      (0x1 << 13) /**< Virtio notification data flag */
+#define NFP_NET_CFG_CTRL_SWLM             (0x1 << 14) /**< Virtio SW live migration enable */
 #define NFP_NET_CFG_CTRL_USO              (0x1 << 16) /**< UDP segmentation offload */
 
 #define NFP_NET_CFG_CAP_WORD1           0x00a4
diff --git a/drivers/vdpa/nfp/nfp_vdpa.c b/drivers/vdpa/nfp/nfp_vdpa.c
index 1643ebbb8c..65f7144671 100644
--- a/drivers/vdpa/nfp/nfp_vdpa.c
+++ b/drivers/vdpa/nfp/nfp_vdpa.c
@@ -11,6 +11,8 @@
 #include <nfp_common_pci.h>
 #include <nfp_dev.h>
 #include <rte_vfio.h>
+#include <rte_eal_paging.h>
+#include <rte_malloc.h>
 #include <vdpa_driver.h>
 
 #include "nfp_vdpa_core.h"
@@ -21,6 +23,9 @@
 #define MSIX_IRQ_SET_BUF_LEN (sizeof(struct vfio_irq_set) + \
 		sizeof(int) * (NFP_VDPA_MAX_QUEUES * 2 + 1))
 
+#define NFP_VDPA_USED_RING_LEN(size) \
+		((size) * sizeof(struct vring_used_elem) + sizeof(struct vring_used))
+
 struct nfp_vdpa_dev {
 	struct rte_pci_device *pci_dev;
 	struct rte_vdpa_device *vdev;
@@ -261,15 +266,85 @@ nfp_vdpa_qva_to_gpa(int vid,
 	return gpa;
 }
 
+static void
+nfp_vdpa_relay_vring_free(struct nfp_vdpa_dev *device,
+		uint16_t vring_index)
+{
+	uint16_t i;
+	uint64_t size;
+	struct rte_vhost_vring vring;
+	uint64_t m_vring_iova = NFP_VDPA_RELAY_VRING;
+
+	for (i = 0; i < vring_index; i++) {
+		rte_vhost_get_vhost_vring(device->vid, i, &vring);
+
+		size = RTE_ALIGN_CEIL(vring_size(vring.size, rte_mem_page_size()),
+				rte_mem_page_size());
+		rte_vfio_container_dma_unmap(device->vfio_container_fd,
+				(uint64_t)(uintptr_t)device->hw.m_vring[i].desc,
+				m_vring_iova, size);
+
+		rte_free(device->hw.m_vring[i].desc);
+		m_vring_iova += size;
+	}
+}
+
 static int
-nfp_vdpa_start(struct nfp_vdpa_dev *device)
+nfp_vdpa_relay_vring_alloc(struct nfp_vdpa_dev *device)
+{
+	int ret;
+	uint16_t i;
+	uint64_t size;
+	void *vring_buf;
+	uint64_t page_size;
+	struct rte_vhost_vring vring;
+	struct nfp_vdpa_hw *vdpa_hw = &device->hw;
+	uint64_t m_vring_iova = NFP_VDPA_RELAY_VRING;
+
+	page_size = rte_mem_page_size();
+
+	for (i = 0; i < vdpa_hw->nr_vring; i++) {
+		rte_vhost_get_vhost_vring(device->vid, i, &vring);
+
+		size = RTE_ALIGN_CEIL(vring_size(vring.size, page_size), page_size);
+		vring_buf = rte_zmalloc("nfp_vdpa_relay", size, page_size);
+		if (vring_buf == NULL)
+			goto vring_free_all;
+
+		vring_init(&vdpa_hw->m_vring[i], vring.size, vring_buf, page_size);
+
+		ret = rte_vfio_container_dma_map(device->vfio_container_fd,
+				(uint64_t)(uintptr_t)vring_buf, m_vring_iova, size);
+		if (ret != 0) {
+			DRV_VDPA_LOG(ERR, "vDPA vring relay dma map failed.");
+			goto vring_free_one;
+		}
+
+		m_vring_iova += size;
+	}
+
+	return 0;
+
+vring_free_one:
+	rte_free(device->hw.m_vring[i].desc);
+vring_free_all:
+	nfp_vdpa_relay_vring_free(device, i);
+
+	return -ENOSPC;
+}
+
+static int
+nfp_vdpa_start(struct nfp_vdpa_dev *device,
+		bool relay)
 {
 	int ret;
 	int vid;
 	uint16_t i;
 	uint64_t gpa;
+	uint64_t size;
 	struct rte_vhost_vring vring;
 	struct nfp_vdpa_hw *vdpa_hw = &device->hw;
+	uint64_t m_vring_iova = NFP_VDPA_RELAY_VRING;
 
 	vid = device->vid;
 	vdpa_hw->nr_vring = rte_vhost_get_vring_num(vid);
@@ -278,15 +353,21 @@ nfp_vdpa_start(struct nfp_vdpa_dev *device)
 	if (ret != 0)
 		return ret;
 
+	if (relay) {
+		ret = nfp_vdpa_relay_vring_alloc(device);
+		if (ret != 0)
+			return ret;
+	}
+
 	for (i = 0; i < vdpa_hw->nr_vring; i++) {
 		ret = rte_vhost_get_vhost_vring(vid, i, &vring);
 		if (ret != 0)
-			return ret;
+			goto relay_vring_free;
 
 		gpa = nfp_vdpa_qva_to_gpa(vid, (uint64_t)(uintptr_t)vring.desc);
 		if (gpa == 0) {
 			DRV_VDPA_LOG(ERR, "Fail to get GPA for descriptor ring.");
-			return -1;
+			goto relay_vring_free;
 		}
 
 		vdpa_hw->vring[i].desc = gpa;
@@ -294,45 +375,122 @@ nfp_vdpa_start(struct nfp_vdpa_dev *device)
 		gpa = nfp_vdpa_qva_to_gpa(vid, (uint64_t)(uintptr_t)vring.avail);
 		if (gpa == 0) {
 			DRV_VDPA_LOG(ERR, "Fail to get GPA for available ring.");
-			return -1;
+			goto relay_vring_free;
 		}
 
 		vdpa_hw->vring[i].avail = gpa;
 
-		gpa = nfp_vdpa_qva_to_gpa(vid, (uint64_t)(uintptr_t)vring.used);
-		if (gpa == 0) {
-			DRV_VDPA_LOG(ERR, "Fail to get GPA for used ring.");
-			return -1;
-		}
+		/* Direct I/O for Tx queue, relay for Rx queue */
+		if (relay && ((i & 1) == 0)) {
+			vdpa_hw->vring[i].used = m_vring_iova +
+					(char *)vdpa_hw->m_vring[i].used -
+					(char *)vdpa_hw->m_vring[i].desc;
+
+			ret = rte_vhost_get_vring_base(vid, i,
+					&vdpa_hw->m_vring[i].avail->idx,
+					&vdpa_hw->m_vring[i].used->idx);
+			if (ret != 0)
+				goto relay_vring_free;
+		} else {
+			gpa = nfp_vdpa_qva_to_gpa(vid, (uint64_t)(uintptr_t)vring.used);
+			if (gpa == 0) {
+				DRV_VDPA_LOG(ERR, "Fail to get GPA for used ring.");
+				goto relay_vring_free;
+			}
 
-		vdpa_hw->vring[i].used = gpa;
+			vdpa_hw->vring[i].used = gpa;
+		}
 
 		vdpa_hw->vring[i].size = vring.size;
 
+		if (relay) {
+			size = RTE_ALIGN_CEIL(vring_size(vring.size,
+					rte_mem_page_size()), rte_mem_page_size());
+			m_vring_iova += size;
+		}
+
 		ret = rte_vhost_get_vring_base(vid, i,
 				&vdpa_hw->vring[i].last_avail_idx,
 				&vdpa_hw->vring[i].last_used_idx);
 		if (ret != 0)
-			return ret;
+			goto relay_vring_free;
 	}
 
-	return nfp_vdpa_hw_start(&device->hw, vid);
+	if (relay)
+		return nfp_vdpa_relay_hw_start(&device->hw, vid);
+	else
+		return nfp_vdpa_hw_start(&device->hw, vid);
+
+relay_vring_free:
+	if (relay)
+		nfp_vdpa_relay_vring_free(device, vdpa_hw->nr_vring);
+
+	return -EFAULT;
+}
+
+static void
+nfp_vdpa_update_used_ring(struct nfp_vdpa_dev *dev,
+		uint16_t qid)
+{
+	rte_vdpa_relay_vring_used(dev->vid, qid, &dev->hw.m_vring[qid]);
+	rte_vhost_vring_call(dev->vid, qid);
 }
 
 static void
-nfp_vdpa_stop(struct nfp_vdpa_dev *device)
+nfp_vdpa_relay_stop(struct nfp_vdpa_dev *device)
 {
 	int vid;
 	uint32_t i;
+	uint64_t len;
+	struct rte_vhost_vring vring;
 	struct nfp_vdpa_hw *vdpa_hw = &device->hw;
 
 	nfp_vdpa_hw_stop(vdpa_hw);
 
 	vid = device->vid;
-	for (i = 0; i < vdpa_hw->nr_vring; i++)
+	for (i = 0; i < vdpa_hw->nr_vring; i++) {
+		/* Synchronize remaining new used entries if any */
+		if ((i & 1) == 0)
+			nfp_vdpa_update_used_ring(device, i);
+
+		rte_vhost_get_vhost_vring(vid, i, &vring);
+		len = NFP_VDPA_USED_RING_LEN(vring.size);
+		vdpa_hw->vring[i].last_avail_idx = vring.avail->idx;
+		vdpa_hw->vring[i].last_used_idx = vring.used->idx;
+
 		rte_vhost_set_vring_base(vid, i,
 				vdpa_hw->vring[i].last_avail_idx,
 				vdpa_hw->vring[i].last_used_idx);
+
+		rte_vhost_log_used_vring(vid, i, 0, len);
+
+		if (vring.used->idx != vring.avail->idx)
+			rte_atomic_store_explicit(&vring.used->idx, vring.avail->idx,
+					rte_memory_order_relaxed);
+	}
+
+	nfp_vdpa_relay_vring_free(device, vdpa_hw->nr_vring);
+}
+
+static void
+nfp_vdpa_stop(struct nfp_vdpa_dev *device,
+		bool relay)
+{
+	int vid;
+	uint32_t i;
+	struct nfp_vdpa_hw *vdpa_hw = &device->hw;
+
+	nfp_vdpa_hw_stop(vdpa_hw);
+
+	vid = device->vid;
+	if (relay)
+		nfp_vdpa_relay_stop(device);
+	else
+		for (i = 0; i < vdpa_hw->nr_vring; i++)
+			rte_vhost_set_vring_base(vid, i,
+					vdpa_hw->vring[i].last_avail_idx,
+					vdpa_hw->vring[i].last_used_idx);
+
 }
 
 static int
@@ -575,7 +733,7 @@ update_datapath(struct nfp_vdpa_dev *device)
 		if (ret != 0)
 			goto dma_map_rollback;
 
-		ret = nfp_vdpa_start(device);
+		ret = nfp_vdpa_start(device, false);
 		if (ret != 0)
 			goto disable_vfio_intr;
 
@@ -591,7 +749,7 @@ update_datapath(struct nfp_vdpa_dev *device)
 					rte_memory_order_relaxed) != 0))) {
 		nfp_vdpa_unset_notify_relay(device);
 
-		nfp_vdpa_stop(device);
+		nfp_vdpa_stop(device, false);
 
 		ret = nfp_vdpa_disable_vfio_intr(device);
 		if (ret != 0)
@@ -608,7 +766,7 @@ update_datapath(struct nfp_vdpa_dev *device)
 	return 0;
 
 vdpa_stop:
-	nfp_vdpa_stop(device);
+	nfp_vdpa_stop(device, false);
 disable_vfio_intr:
 	nfp_vdpa_disable_vfio_intr(device);
 dma_map_rollback:
@@ -639,10 +797,17 @@ nfp_vdpa_sw_fallback(struct nfp_vdpa_dev *device)
 	if (ret != 0)
 		goto error;
 
+	/* Config the VF */
+	ret = nfp_vdpa_start(device, true);
+	if (ret != 0)
+		goto unset_intr;
+
 	device->hw.sw_fallback_running = true;
 
 	return 0;
 
+unset_intr:
+	nfp_vdpa_disable_vfio_intr(device);
 error:
 	return ret;
 }
@@ -691,6 +856,9 @@ nfp_vdpa_dev_close(int vid)
 
 	device = node->device;
 	if (device->hw.sw_fallback_running) {
+		/* Reset VF */
+		nfp_vdpa_stop(device, true);
+
 		device->hw.sw_fallback_running = false;
 
 		rte_atomic_store_explicit(&device->dev_attached, 0,
diff --git a/drivers/vdpa/nfp/nfp_vdpa_core.c b/drivers/vdpa/nfp/nfp_vdpa_core.c
index 82a323a6d0..63a69aaf36 100644
--- a/drivers/vdpa/nfp/nfp_vdpa_core.c
+++ b/drivers/vdpa/nfp/nfp_vdpa_core.c
@@ -110,7 +110,8 @@ nfp_vdpa_check_offloads(void)
 
 static int
 nfp_vdpa_vf_config(struct nfp_hw *hw,
-		int vid)
+		int vid,
+		bool relay)
 {
 	int ret;
 	uint32_t update;
@@ -134,6 +135,10 @@ nfp_vdpa_vf_config(struct nfp_hw *hw,
 	nfp_write_mac(hw, (uint8_t *)mac_addr);
 
 	new_ext_ctrl = nfp_vdpa_check_offloads();
+	if (relay)
+		new_ext_ctrl |= NFP_NET_CFG_CTRL_LM_RELAY;
+	else
+		new_ext_ctrl |= NFP_NET_CFG_CTRL_SWLM;
 
 	update = NFP_NET_CFG_UPDATE_GEN;
 	ret = nfp_ext_reconfig(hw, new_ext_ctrl, update);
@@ -150,6 +155,15 @@ nfp_vdpa_vf_config(struct nfp_hw *hw,
 			NFP_NET_CFG_UPDATE_GEN |
 			NFP_NET_CFG_UPDATE_RING;
 
+	if (relay) {
+		update |= NFP_NET_CFG_UPDATE_MSIX;
+
+		/* Enable MSI-X interrupt for vDPA relay */
+		new_ctrl |= NFP_NET_CFG_CTRL_MSIX_TX_OFF;
+
+		nn_cfg_writeb(hw, NFP_NET_CFG_RXR_VEC(0), 1);
+	}
+
 	ret = nfp_reconfig(hw, new_ctrl, update);
 	if (ret < 0)
 		return -EIO;
@@ -165,20 +179,24 @@ nfp_vdpa_vf_config(struct nfp_hw *hw,
 }
 
 static void
-nfp_vdpa_queue_config(struct nfp_vdpa_hw *vdpa_hw)
+nfp_vdpa_queue_config(struct nfp_vdpa_hw *vdpa_hw,
+		bool relay)
 {
 	struct nfp_hw *hw = &vdpa_hw->super;
 
-	nn_cfg_writeq(hw, NFP_NET_CFG_TXR_ADDR(0), vdpa_hw->vring[1].desc);
-	nn_cfg_writeb(hw, NFP_NET_CFG_TXR_SZ(0),
-			rte_log2_u32(vdpa_hw->vring[1].size));
-	nn_cfg_writeq(hw, NFP_NET_CFG_TXR_ADDR(1), vdpa_hw->vring[1].avail);
-	nn_cfg_writeq(hw, NFP_NET_CFG_TXR_ADDR(2), vdpa_hw->vring[1].used);
+	if (!relay) {
+		nn_cfg_writeq(hw, NFP_NET_CFG_TXR_ADDR(0), vdpa_hw->vring[1].desc);
+		nn_cfg_writeb(hw, NFP_NET_CFG_TXR_SZ(0),
+				rte_log2_u32(vdpa_hw->vring[1].size));
+		nn_cfg_writeq(hw, NFP_NET_CFG_TXR_ADDR(1), vdpa_hw->vring[1].avail);
+		nn_cfg_writeq(hw, NFP_NET_CFG_TXR_ADDR(2), vdpa_hw->vring[1].used);
+
+		nn_cfg_writeq(hw, NFP_NET_CFG_RXR_ADDR(0), vdpa_hw->vring[0].desc);
+		nn_cfg_writeb(hw, NFP_NET_CFG_RXR_SZ(0),
+				rte_log2_u32(vdpa_hw->vring[0].size));
+		nn_cfg_writeq(hw, NFP_NET_CFG_RXR_ADDR(1), vdpa_hw->vring[0].avail);
+	}
 
-	nn_cfg_writeq(hw, NFP_NET_CFG_RXR_ADDR(0), vdpa_hw->vring[0].desc);
-	nn_cfg_writeb(hw, NFP_NET_CFG_RXR_SZ(0),
-			rte_log2_u32(vdpa_hw->vring[0].size));
-	nn_cfg_writeq(hw, NFP_NET_CFG_RXR_ADDR(1), vdpa_hw->vring[0].avail);
 	nn_cfg_writeq(hw, NFP_NET_CFG_RXR_ADDR(2), vdpa_hw->vring[0].used);
 
 	rte_wmb();
@@ -190,12 +208,23 @@ nfp_vdpa_hw_start(struct nfp_vdpa_hw *vdpa_hw,
 {
 	struct nfp_hw *hw = &vdpa_hw->super;
 
-	nfp_vdpa_queue_config(vdpa_hw);
+	nfp_vdpa_queue_config(vdpa_hw, false);
 
 	nfp_disable_queues(hw);
 	nfp_enable_queues(hw, NFP_VDPA_MAX_QUEUES, NFP_VDPA_MAX_QUEUES);
 
-	return nfp_vdpa_vf_config(hw, vid);
+	return nfp_vdpa_vf_config(hw, vid, false);
+}
+
+int
+nfp_vdpa_relay_hw_start(struct nfp_vdpa_hw *vdpa_hw,
+		int vid)
+{
+	struct nfp_hw *hw = &vdpa_hw->super;
+
+	nfp_vdpa_queue_config(vdpa_hw, true);
+
+	return nfp_vdpa_vf_config(hw, vid, true);
 }
 
 void
diff --git a/drivers/vdpa/nfp/nfp_vdpa_core.h b/drivers/vdpa/nfp/nfp_vdpa_core.h
index 0f880fc0c6..a339ace601 100644
--- a/drivers/vdpa/nfp/nfp_vdpa_core.h
+++ b/drivers/vdpa/nfp/nfp_vdpa_core.h
@@ -9,12 +9,15 @@
 #include <bus_pci_driver.h>
 #include <nfp_common.h>
 #include <rte_ether.h>
+#include <rte_vhost.h>
 
 #define NFP_VDPA_MAX_QUEUES         1
 
 #define NFP_VDPA_NOTIFY_ADDR_BASE        0x4000
 #define NFP_VDPA_NOTIFY_ADDR_INTERVAL    0x1000
 
+#define NFP_VDPA_RELAY_VRING             0xd0000000
+
 struct nfp_vdpa_vring {
 	uint64_t desc;
 	uint64_t avail;
@@ -40,12 +43,17 @@ struct nfp_vdpa_hw {
 	/** Software Live Migration */
 	bool sw_lm;
 	bool sw_fallback_running;
+
+	/** Mediated vring for SW fallback */
+	struct vring m_vring[NFP_VDPA_MAX_QUEUES * 2];
 };
 
 int nfp_vdpa_hw_init(struct nfp_vdpa_hw *vdpa_hw, struct rte_pci_device *dev);
 
 int nfp_vdpa_hw_start(struct nfp_vdpa_hw *vdpa_hw, int vid);
 
+int nfp_vdpa_relay_hw_start(struct nfp_vdpa_hw *vdpa_hw, int vid);
+
 void nfp_vdpa_hw_stop(struct nfp_vdpa_hw *vdpa_hw);
 
 void nfp_vdpa_notify_queue(struct nfp_vdpa_hw *vdpa_hw, uint16_t qid);
-- 
2.39.1



* [PATCH 08/10] vdpa/nfp: recover the ring index on new host
  2024-04-26  7:48 [PATCH 00/10] support software live migration Chaoyong He
                   ` (6 preceding siblings ...)
  2024-04-26  7:48 ` [PATCH 07/10] vdpa/nfp: setup the VF configure Chaoyong He
@ 2024-04-26  7:48 ` Chaoyong He
  2024-04-26  7:48 ` [PATCH 09/10] vdpa/nfp: setup vring relay thread Chaoyong He
  2024-04-26  7:48 ` [PATCH 10/10] doc: update nfp document Chaoyong He
  9 siblings, 0 replies; 12+ messages in thread
From: Chaoyong He @ 2024-04-26  7:48 UTC (permalink / raw)
  To: dev; +Cc: oss-drivers, Xinying Yu, Chaoyong He, Long Wu, Peng Zhang

From: Xinying Yu <xinying.yu@corigine.com>

After migrating to the new host, the vring information is
recovered from the values at offsets 'NFP_NET_CFG_TX_USED_INDEX'
and 'NFP_NET_CFG_RX_USED_INDEX'.

Signed-off-by: Xinying Yu <xinying.yu@corigine.com>
Reviewed-by: Chaoyong He <chaoyong.he@corigine.com>
Reviewed-by: Long Wu <long.wu@corigine.com>
Reviewed-by: Peng Zhang <peng.zhang@corigine.com>
---
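To illustrate the recovery step described above, here is a minimal
standalone sketch of writing the saved used indexes into the two new
config offsets. Only the two offset values come from this patch. The
byte array standing in for the config BAR, the fake_* names, the 16-bit
index type, and the little-endian register layout are assumptions made
for illustration; the driver itself goes through nn_cfg_writel().

```c
#include <stdint.h>

/* Offsets taken from this patch. */
#define NFP_NET_CFG_TX_USED_INDEX       0x00b0
#define NFP_NET_CFG_RX_USED_INDEX       0x00b4

/* Hypothetical stand-ins for the driver structures. */
struct fake_vring {
	uint16_t last_used_idx; /* assumed to be a 16-bit ring index */
};

struct fake_hw {
	uint8_t cfg[0x100];         /* plain memory standing in for the config BAR */
	struct fake_vring vring[2]; /* [0] = RX, [1] = TX, as in the patch */
};

/* Store a 32-bit value at 'off', least significant byte first (assumed layout). */
static void
fake_cfg_writel(struct fake_hw *hw, uint32_t off, uint32_t val)
{
	int i;

	for (i = 0; i < 4; i++)
		hw->cfg[off + i] = (uint8_t)(val >> (8 * i));
}

/* Read back a 32-bit value written by fake_cfg_writel(). */
static uint32_t
fake_cfg_readl(const struct fake_hw *hw, uint32_t off)
{
	uint32_t val = 0;
	int i;

	for (i = 0; i < 4; i++)
		val |= (uint32_t)hw->cfg[off + i] << (8 * i);

	return val;
}

/* Hand the saved ring positions to the (simulated) firmware. */
static void
fake_hw_queue_init(struct fake_hw *hw)
{
	fake_cfg_writel(hw, NFP_NET_CFG_TX_USED_INDEX, hw->vring[1].last_used_idx);
	fake_cfg_writel(hw, NFP_NET_CFG_RX_USED_INDEX, hw->vring[0].last_used_idx);
}
```

The sketch leaves out the write barrier; in the patch the caller issues
rte_wmb() after the register writes.
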
 drivers/common/nfp/nfp_common_ctrl.h |  7 +++++--
 drivers/vdpa/nfp/nfp_vdpa_core.c     | 13 +++++++++++++
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/drivers/common/nfp/nfp_common_ctrl.h b/drivers/common/nfp/nfp_common_ctrl.h
index 9311d01590..4b273372a8 100644
--- a/drivers/common/nfp/nfp_common_ctrl.h
+++ b/drivers/common/nfp/nfp_common_ctrl.h
@@ -193,8 +193,11 @@ struct nfp_net_fw_ver {
 
 #define NFP_NET_CFG_CAP_WORD1           0x00a4
 
-/* 16B reserved for future use (0x00b0 - 0x00c0). */
-#define NFP_NET_CFG_RESERVED            0x00b0
+#define NFP_NET_CFG_TX_USED_INDEX       0x00b0
+#define NFP_NET_CFG_RX_USED_INDEX       0x00b4
+
+/* 16B reserved for future use (0x00b8 - 0x0010). */
+#define NFP_NET_CFG_RESERVED            0x00b8
 #define NFP_NET_CFG_RESERVED_SZ         0x0010
 
 /*
diff --git a/drivers/vdpa/nfp/nfp_vdpa_core.c b/drivers/vdpa/nfp/nfp_vdpa_core.c
index 63a69aaf36..8f9aba9519 100644
--- a/drivers/vdpa/nfp/nfp_vdpa_core.c
+++ b/drivers/vdpa/nfp/nfp_vdpa_core.c
@@ -101,6 +101,16 @@ nfp_vdpa_hw_init(struct nfp_vdpa_hw *vdpa_hw,
 	return 0;
 }
 
+static void
+nfp_vdpa_hw_queue_init(struct nfp_vdpa_hw *vdpa_hw)
+{
+	/* Distribute ring information to firmware */
+	nn_cfg_writel(&vdpa_hw->super, NFP_NET_CFG_TX_USED_INDEX,
+			vdpa_hw->vring[1].last_used_idx);
+	nn_cfg_writel(&vdpa_hw->super, NFP_NET_CFG_RX_USED_INDEX,
+			vdpa_hw->vring[0].last_used_idx);
+}
+
 static uint32_t
 nfp_vdpa_check_offloads(void)
 {
@@ -199,6 +209,9 @@ nfp_vdpa_queue_config(struct nfp_vdpa_hw *vdpa_hw,
 
 	nn_cfg_writeq(hw, NFP_NET_CFG_RXR_ADDR(2), vdpa_hw->vring[0].used);
 
+	if (!relay)
+		nfp_vdpa_hw_queue_init(vdpa_hw);
+
 	rte_wmb();
 }
 
-- 
2.39.1



* [PATCH 09/10] vdpa/nfp: setup vring relay thread
  2024-04-26  7:48 [PATCH 00/10] support software live migration Chaoyong He
                   ` (7 preceding siblings ...)
  2024-04-26  7:48 ` [PATCH 08/10] vdpa/nfp: recover the ring index on new host Chaoyong He
@ 2024-04-26  7:48 ` Chaoyong He
  2024-04-26  7:48 ` [PATCH 10/10] doc: update nfp document Chaoyong He
  9 siblings, 0 replies; 12+ messages in thread
From: Chaoyong He @ 2024-04-26  7:48 UTC (permalink / raw)
  To: dev; +Cc: oss-drivers, Xinying Yu, Chaoyong He, Long Wu, Peng Zhang

From: Xinying Yu <xinying.yu@corigine.com>

Set up the vring relay thread to monitor interrupts from the
device, and either log dirty pages or notify the device according
to the event data.

Signed-off-by: Xinying Yu <xinying.yu@corigine.com>
Reviewed-by: Chaoyong He <chaoyong.he@corigine.com>
Reviewed-by: Long Wu <long.wu@corigine.com>
Reviewed-by: Peng Zhang <peng.zhang@corigine.com>
---
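The relay thread packs three things into the 64-bit epoll user data:
an interrupt flag in bit 0, the queue id shifted left by one, and the
file descriptor in the upper 32 bits. The following standalone sketch
shows that encoding and its inverse. The struct and the relay_event_*
helper names are hypothetical and are not part of the patch. The sketch
also extracts the low word with an explicit mask, whereas the patch
reads it through the data.u32 union member.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same flag value as in the patch. */
#define EPOLL_DATA_INTR        1u

/* Hypothetical decoded view of one epoll event. */
struct relay_event {
	uint32_t qid;  /* queue id */
	int fd;        /* kickfd or interrupt fd */
	bool is_intr;  /* true for a device interrupt, false for a guest kick */
};

/* Pack flag, queue id and fd into one 64-bit value. */
static uint64_t
relay_event_pack(uint32_t qid, int fd, bool is_intr)
{
	uint32_t low = (qid << 1) | (is_intr ? EPOLL_DATA_INTR : 0u);

	return (uint64_t)low | ((uint64_t)(uint32_t)fd << 32);
}

/* Recover the three fields from a packed value. */
static struct relay_event
relay_event_unpack(uint64_t data)
{
	struct relay_event ev;
	uint32_t low = (uint32_t)(data & 0xffffffffu);

	ev.qid = low >> 1;
	ev.is_intr = (low & EPOLL_DATA_INTR) != 0;
	ev.fd = (int)(uint32_t)(data >> 32);

	return ev;
}
```

Masking the low word explicitly keeps the decoding independent of how
the union members of epoll_data_t overlap on a given platform.
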
 drivers/vdpa/nfp/nfp_vdpa.c      | 148 +++++++++++++++++++++++++++++++
 drivers/vdpa/nfp/nfp_vdpa_core.c |   9 ++
 drivers/vdpa/nfp/nfp_vdpa_core.h |   2 +
 3 files changed, 159 insertions(+)

diff --git a/drivers/vdpa/nfp/nfp_vdpa.c b/drivers/vdpa/nfp/nfp_vdpa.c
index 65f7144671..e57765eb1a 100644
--- a/drivers/vdpa/nfp/nfp_vdpa.c
+++ b/drivers/vdpa/nfp/nfp_vdpa.c
@@ -26,6 +26,8 @@
 #define NFP_VDPA_USED_RING_LEN(size) \
 		((size) * sizeof(struct vring_used_elem) + sizeof(struct vring_used))
 
+#define EPOLL_DATA_INTR        1
+
 struct nfp_vdpa_dev {
 	struct rte_pci_device *pci_dev;
 	struct rte_vdpa_device *vdev;
@@ -776,6 +778,139 @@ update_datapath(struct nfp_vdpa_dev *device)
 	return ret;
 }
 
+static int
+nfp_vdpa_vring_epoll_ctl(uint32_t queue_num,
+		struct nfp_vdpa_dev *device)
+{
+	int ret;
+	uint32_t qid;
+	struct epoll_event ev;
+	struct rte_vhost_vring vring;
+
+	for (qid = 0; qid < queue_num; qid++) {
+		ev.events = EPOLLIN | EPOLLPRI;
+		rte_vhost_get_vhost_vring(device->vid, qid, &vring);
+		ev.data.u64 = qid << 1 | (uint64_t)vring.kickfd << 32;
+		ret = epoll_ctl(device->epoll_fd, EPOLL_CTL_ADD, vring.kickfd, &ev);
+		if (ret < 0) {
+			DRV_VDPA_LOG(ERR, "Epoll add error for queue %u", qid);
+			return ret;
+		}
+	}
+
+	/* vDPA driver interrupt */
+	for (qid = 0; qid < queue_num; qid += 2) {
+		ev.events = EPOLLIN | EPOLLPRI;
+		/* Leave a flag to mark it's for interrupt */
+		ev.data.u64 = EPOLL_DATA_INTR | qid << 1 |
+				(uint64_t)device->intr_fd[qid] << 32;
+		ret = epoll_ctl(device->epoll_fd, EPOLL_CTL_ADD,
+				device->intr_fd[qid], &ev);
+		if (ret < 0) {
+			DRV_VDPA_LOG(ERR, "Epoll add error for queue %u", qid);
+			return ret;
+		}
+
+		nfp_vdpa_update_used_ring(device, qid);
+	}
+
+	return 0;
+}
+
+static int
+nfp_vdpa_vring_epoll_wait(uint32_t queue_num,
+		struct nfp_vdpa_dev *device)
+{
+	int i;
+	int fds;
+	int kickfd;
+	uint32_t qid;
+	struct epoll_event events[NFP_VDPA_MAX_QUEUES * 2];
+
+	for (;;) {
+		fds = epoll_wait(device->epoll_fd, events, queue_num * 2, -1);
+		if (fds < 0) {
+			if (errno == EINTR)
+				continue;
+
+			DRV_VDPA_LOG(ERR, "Epoll wait fail");
+			return -EACCES;
+		}
+
+		for (i = 0; i < fds; i++) {
+			qid = events[i].data.u32 >> 1;
+			kickfd = (uint32_t)(events[i].data.u64 >> 32);
+
+			nfp_vdpa_read_kickfd(kickfd);
+			if ((events[i].data.u32 & EPOLL_DATA_INTR) != 0) {
+				nfp_vdpa_update_used_ring(device, qid);
+				nfp_vdpa_irq_unmask(&device->hw);
+			} else {
+				nfp_vdpa_notify_queue(&device->hw, qid);
+			}
+		}
+	}
+
+	return 0;
+}
+
+static uint32_t
+nfp_vdpa_vring_relay(void *arg)
+{
+	int ret;
+	int epoll_fd;
+	uint16_t queue_id;
+	uint32_t queue_num;
+	struct nfp_vdpa_dev *device = arg;
+
+	epoll_fd = epoll_create(NFP_VDPA_MAX_QUEUES * 2);
+	if (epoll_fd < 0) {
+		DRV_VDPA_LOG(ERR, "failed to create epoll instance.");
+		return 1;
+	}
+
+	device->epoll_fd = epoll_fd;
+
+	queue_num = rte_vhost_get_vring_num(device->vid);
+
+	ret = nfp_vdpa_vring_epoll_ctl(queue_num, device);
+	if (ret != 0)
+		goto notify_exit;
+
+	/* Start relay with a first kick */
+	for (queue_id = 0; queue_id < queue_num; queue_id++)
+		nfp_vdpa_notify_queue(&device->hw, queue_id);
+
+	ret = nfp_vdpa_vring_epoll_wait(queue_num, device);
+	if (ret != 0)
+		goto notify_exit;
+
+	return 0;
+
+notify_exit:
+	close(device->epoll_fd);
+	device->epoll_fd = -1;
+
+	return 1;
+}
+
+static int
+nfp_vdpa_setup_vring_relay(struct nfp_vdpa_dev *device)
+{
+	int ret;
+	char name[RTE_THREAD_INTERNAL_NAME_SIZE];
+
+	snprintf(name, sizeof(name), "nfp_vring%d", device->vid);
+	ret = rte_thread_create_internal_control(&device->tid, name,
+			nfp_vdpa_vring_relay, (void *)device);
+	if (ret != 0) {
+		DRV_VDPA_LOG(ERR, "Failed to create vring relay pthread.");
+		return -EPERM;
+	}
+
+	return 0;
+}
+
 static int
 nfp_vdpa_sw_fallback(struct nfp_vdpa_dev *device)
 {
@@ -802,10 +937,17 @@ nfp_vdpa_sw_fallback(struct nfp_vdpa_dev *device)
 	if (ret != 0)
 		goto unset_intr;
 
+	/* Setup vring relay thread */
+	ret = nfp_vdpa_setup_vring_relay(device);
+	if (ret != 0)
+		goto stop_vf;
+
 	device->hw.sw_fallback_running = true;
 
 	return 0;
 
+stop_vf:
+	nfp_vdpa_stop(device, true);
 unset_intr:
 	nfp_vdpa_disable_vfio_intr(device);
 error:
@@ -859,6 +1001,12 @@ nfp_vdpa_dev_close(int vid)
 		/* Reset VF */
 		nfp_vdpa_stop(device, true);
 
+		/* Remove interrupt setting */
+		nfp_vdpa_disable_vfio_intr(device);
+
+		/* Unset DMA map for guest memory */
+		nfp_vdpa_dma_map(device, false);
+
 		device->hw.sw_fallback_running = false;
 
 		rte_atomic_store_explicit(&device->dev_attached, 0,
diff --git a/drivers/vdpa/nfp/nfp_vdpa_core.c b/drivers/vdpa/nfp/nfp_vdpa_core.c
index 8f9aba9519..70aeb4a3ac 100644
--- a/drivers/vdpa/nfp/nfp_vdpa_core.c
+++ b/drivers/vdpa/nfp/nfp_vdpa_core.c
@@ -271,3 +271,12 @@ nfp_vdpa_notify_queue(struct nfp_vdpa_hw *vdpa_hw,
 	nfp_qcp_notify_ptr_add(vdpa_hw->notify_addr[qid],
 			NFP_QCP_NOTIFY_WRITE_PTR, qid);
 }
+
+void nfp_vdpa_irq_unmask(struct nfp_vdpa_hw *vdpa_hw)
+{
+	struct nfp_hw *hw = &vdpa_hw->super;
+
+	/* Make sure all updates are written before un-masking */
+	rte_wmb();
+	nn_cfg_writeb(hw, NFP_NET_CFG_ICR(1), NFP_NET_CFG_ICR_UNMASKED);
+}
diff --git a/drivers/vdpa/nfp/nfp_vdpa_core.h b/drivers/vdpa/nfp/nfp_vdpa_core.h
index a339ace601..bc4db556a2 100644
--- a/drivers/vdpa/nfp/nfp_vdpa_core.h
+++ b/drivers/vdpa/nfp/nfp_vdpa_core.h
@@ -60,4 +60,6 @@ void nfp_vdpa_notify_queue(struct nfp_vdpa_hw *vdpa_hw, uint16_t qid);
 
 uint64_t nfp_vdpa_get_queue_notify_offset(struct nfp_vdpa_hw *vdpa_hw, int qid);
 
+void nfp_vdpa_irq_unmask(struct nfp_vdpa_hw *vdpa_hw);
+
 #endif /* __NFP_VDPA_CORE_H__ */
-- 
2.39.1



* [PATCH 10/10] doc: update nfp document
  2024-04-26  7:48 [PATCH 00/10] support software live migration Chaoyong He
                   ` (8 preceding siblings ...)
  2024-04-26  7:48 ` [PATCH 09/10] vdpa/nfp: setup vring relay thread Chaoyong He
@ 2024-04-26  7:48 ` Chaoyong He
  2024-04-26 21:31   ` Patrick Robb
  9 siblings, 1 reply; 12+ messages in thread
From: Chaoyong He @ 2024-04-26  7:48 UTC (permalink / raw)
  To: dev; +Cc: oss-drivers, Xinying Yu, Chaoyong He, Long Wu, Peng Zhang

From: Xinying Yu <xinying.yu@corigine.com>

Add the software-assisted vDPA live migration feature
to the NFP document.

Signed-off-by: Xinying Yu <xinying.yu@corigine.com>
Reviewed-by: Chaoyong He <chaoyong.he@corigine.com>
Reviewed-by: Long Wu <long.wu@corigine.com>
Reviewed-by: Peng Zhang <peng.zhang@corigine.com>
---
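For readers unfamiliar with dirty page logging, the following is a
minimal sketch of what "logging dirty pages" means in general: one bit
per guest page is set in a bitmap whenever the device writes to that
page. The 4 KiB page size, the bit order, and the dirty_log_mark() name
are assumptions for illustration. The sketch is not the code used by
the relay thread, and a real logger would set the bits atomically.

```c
#include <stddef.h>
#include <stdint.h>

#define LOG_PAGE_SHIFT 12 /* assumed 4 KiB logging granularity */

/*
 * Mark every page touched by [addr, addr + len) as dirty.
 * Pages beyond the end of the bitmap are ignored.
 */
static void
dirty_log_mark(uint8_t *bitmap, size_t bitmap_bytes, uint64_t addr, uint64_t len)
{
	uint64_t page;
	uint64_t last;

	if (len == 0)
		return;

	page = addr >> LOG_PAGE_SHIFT;
	last = (addr + len - 1) >> LOG_PAGE_SHIFT;

	for (; page <= last; page++) {
		if (page / 8 >= bitmap_bytes)
			break;
		bitmap[page / 8] |= (uint8_t)(1u << (page % 8));
	}
}
```

The cost of the software mode comes from running this kind of
bookkeeping on a CPU thread for every used-ring update instead of in
hardware.
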
 doc/guides/vdpadevs/nfp.rst | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/doc/guides/vdpadevs/nfp.rst b/doc/guides/vdpadevs/nfp.rst
index dc9e94dbc8..e4736d9f61 100644
--- a/doc/guides/vdpadevs/nfp.rst
+++ b/doc/guides/vdpadevs/nfp.rst
@@ -19,6 +19,15 @@ device will be probed by net/nfp driver and will used as a VF net device.
 
 This PMD uses (common/nfp) code to access the device firmware.
 
+Software Live Migration
+~~~~~~~~~~~~~~~~~~~~~~~
+
+The NFP vDPA driver currently supports only software-assisted live migration.
+In this mode, the driver sets up a software relay thread when live migration
+starts, and this thread helps the device log dirty pages. Although this mode
+does not require hardware to implement a dirty page logging function block, it
+consumes some CPU resources, in proportion to the network throughput.
+
 Per-Device Parameters
 ~~~~~~~~~~~~~~~~~~~~~
 
-- 
2.39.1



* Re: [PATCH 10/10] doc: update nfp document
  2024-04-26  7:48 ` [PATCH 10/10] doc: update nfp document Chaoyong He
@ 2024-04-26 21:31   ` Patrick Robb
  0 siblings, 0 replies; 12+ messages in thread
From: Patrick Robb @ 2024-04-26 21:31 UTC (permalink / raw)
  To: Chaoyong He; +Cc: dev, oss-drivers, Xinying Yu, Long Wu, Peng Zhang


Recheck-request: iol-compile-amd64-testing

The DPDK Community Lab updated to the latest Alpine image yesterday, which
resulted in all Alpine builds failing. The failure is unrelated to your
patch, and this recheck should remove the fail on Patchwork, as we have
disabled Alpine testing for now.



