DPDK patches and discussions
From: Mingjin Ye <mingjinx.ye@intel.com>
To: dev@dpdk.org
Cc: qiming.yang@intel.com, yidingx.zhou@intel.com,
	Mingjin Ye <mingjinx.ye@intel.com>, Simei Su <simei.su@intel.com>,
	Wenjun Wu <wenjun1.wu@intel.com>,
	Yuying Zhang <Yuying.Zhang@intel.com>,
	Beilei Xing <beilei.xing@intel.com>,
	Jingjing Wu <jingjing.wu@intel.com>
Subject: [PATCH v3] net/iavf: support no data path polling mode
Date: Fri, 13 Oct 2023 01:27:37 +0000
Message-ID: <20231013012737.2794441-1-mingjinx.ye@intel.com>
In-Reply-To: <20230926075647.2196381-1-mingjinx.ye@intel.com>

Currently, when the PF resets a VF (for example, after the trust
setting of the VF is changed), a DPDK application running on the
iavf PMD loses connectivity, and the only way to recover is to
restart the DPDK application.

Instead of forcing the DPDK application to restart in order to
restore connectivity, the iavf PMD handles the PF to VF reset event
and performs the steps needed to bring the VF back online.

To minimize downtime, a devarg "no-poll-on-link-down" is introduced
in the iavf PMD. When it is set, the PMD switches to no-poll mode
while the link is down: the rx/tx burst functions return 0
immediately. When the link comes back up, the PMD resumes normal
rx/tx burst processing. Setting the "auto_reset" devarg also enables
this mode.
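
The no-poll switch described above can be pictured with a small,
self-contained sketch. It is not the driver code: `struct adapter`,
`struct rx_queue`, `real_rx_burst` and `on_link_event` are hypothetical
stand-ins for the iavf structures and handlers touched by this patch,
and only the rx side is shown.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a burst function pointer type. */
typedef uint16_t (*burst_fn)(void *queue, void **pkts, uint16_t nb_pkts);

/* Hypothetical stand-in for the adapter state added by the patch. */
struct adapter {
	bool no_poll;          /* true while the link is down */
	burst_fn rx_pkt_burst; /* real burst function, saved at setup */
};

struct rx_queue {
	struct adapter *adapter;
};

/* Pretend "real" burst: reports every requested packet as received. */
uint16_t
real_rx_burst(void *queue, void **pkts, uint16_t nb_pkts)
{
	(void)queue;
	(void)pkts;
	return nb_pkts;
}

/* Wrapper installed in place of the real burst function. */
uint16_t
rx_burst_no_poll(void *queue, void **pkts, uint16_t nb_pkts)
{
	struct rx_queue *rxq = queue;

	if (rxq->adapter == NULL || rxq->adapter->no_poll)
		return 0; /* link down: skip the data path */

	return rxq->adapter->rx_pkt_burst(queue, pkts, nb_pkts);
}

/* Link event: link down turns no-poll on, link up turns it off. */
void
on_link_event(struct adapter *ad, bool link_up)
{
	ad->no_poll = !link_up;
}
```

The wrapper adds one pointer chase and one branch per burst call,
which is the cost paid for not having to swap the burst function
pointers while the data path may be running.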

NOTE: The DPDK application must handle the
RTE_ETH_EVENT_INTR_RESET event posted by the iavf PMD and reset
the VF when it receives this event.
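
One possible way for an application to act on the reset event is to
record it in the event callback and perform the reset later from its
own main loop. The sketch below is self-contained and uses
hypothetical stand-ins: `port_reset` takes the place of
rte_eth_dev_reset(), and `on_intr_reset` is a simplified form of a
callback that would be registered for RTE_ETH_EVENT_INTR_RESET with
rte_eth_dev_callback_register(). Deferring the reset to the main loop
is an assumption of this sketch, not something the patch requires.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Counts resets so the behaviour of the sketch can be observed. */
int reset_count;

/* Hypothetical stand-in for the real port reset call. */
int
port_reset(uint16_t port_id)
{
	(void)port_id;
	reset_count++;
	return 0;
}

/* Set by the event callback, consumed by the main loop. */
static atomic_bool reset_pending;

/* Simplified event callback: only records that a reset is needed. */
int
on_intr_reset(uint16_t port_id, void *cb_arg)
{
	(void)port_id;
	(void)cb_arg;
	atomic_store(&reset_pending, true);
	return 0;
}

/*
 * Called from the application's main loop. Returns true if a reset
 * was pending and completed. A real application would typically also
 * reconfigure and restart the port afterwards.
 */
bool
service_reset(uint16_t port_id)
{
	if (!atomic_exchange(&reset_pending, false))
		return false;

	return port_reset(port_id) == 0;
}
```

Because the flag is cleared with an atomic exchange, several reset
events arriving before the main loop runs result in a single reset.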

Signed-off-by: Mingjin Ye <mingjinx.ye@intel.com>
---
V3: Remove redundant code.
---
 doc/guides/nics/intel_vf.rst   |  3 ++
 drivers/net/iavf/iavf.h        |  4 +++
 drivers/net/iavf/iavf_ethdev.c | 16 +++++++++-
 drivers/net/iavf/iavf_rxtx.c   | 53 ++++++++++++++++++++++++++++++++++
 drivers/net/iavf/iavf_rxtx.h   |  1 +
 drivers/net/iavf/iavf_vchnl.c  | 20 +++++++++++++
 6 files changed, 96 insertions(+), 1 deletion(-)

diff --git a/doc/guides/nics/intel_vf.rst b/doc/guides/nics/intel_vf.rst
index 7613e1c5e5..19c461c3de 100644
--- a/doc/guides/nics/intel_vf.rst
+++ b/doc/guides/nics/intel_vf.rst
@@ -104,6 +104,9 @@ For more detail on SR-IOV, please refer to the following documents:
     Enable vf auto-reset by setting the ``devargs`` parameter like ``-a 18:01.0,auto_reset=1`` when IAVF is backed
     by an Intel® E810 device or an Intel® 700 Series Ethernet device.
 
+    Enable vf no-poll-on-link-down by setting the ``devargs`` parameter like ``-a 18:01.0,no-poll-on-link-down=1`` when IAVF is backed
+    by an Intel® E810 device or an Intel® 700 Series Ethernet device.
+
 The PCIE host-interface of Intel Ethernet Switch FM10000 Series VF infrastructure
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
diff --git a/drivers/net/iavf/iavf.h b/drivers/net/iavf/iavf.h
index 04774ce124..c115f3444e 100644
--- a/drivers/net/iavf/iavf.h
+++ b/drivers/net/iavf/iavf.h
@@ -308,6 +308,7 @@ struct iavf_devargs {
 	uint16_t quanta_size;
 	uint32_t watchdog_period;
 	uint8_t  auto_reset;
+	uint16_t no_poll_on_link_down;
 };
 
 struct iavf_security_ctx;
@@ -326,6 +327,9 @@ struct iavf_adapter {
 	uint32_t ptype_tbl[IAVF_MAX_PKT_TYPE] __rte_cache_min_aligned;
 	bool stopped;
 	bool closed;
+	bool no_poll;
+	eth_rx_burst_t rx_pkt_burst;
+	eth_tx_burst_t tx_pkt_burst;
 	uint16_t fdir_ref_cnt;
 	struct iavf_devargs devargs;
 };
diff --git a/drivers/net/iavf/iavf_ethdev.c b/drivers/net/iavf/iavf_ethdev.c
index 5b2634a4e3..98cc5c8ea8 100644
--- a/drivers/net/iavf/iavf_ethdev.c
+++ b/drivers/net/iavf/iavf_ethdev.c
@@ -38,7 +38,7 @@
 #define IAVF_QUANTA_SIZE_ARG       "quanta_size"
 #define IAVF_RESET_WATCHDOG_ARG    "watchdog_period"
 #define IAVF_ENABLE_AUTO_RESET_ARG "auto_reset"
-
+#define IAVF_NO_POLL_ON_LINK_DOWN_ARG "no-poll-on-link-down"
 uint64_t iavf_timestamp_dynflag;
 int iavf_timestamp_dynfield_offset = -1;
 
@@ -47,6 +47,7 @@ static const char * const iavf_valid_args[] = {
 	IAVF_QUANTA_SIZE_ARG,
 	IAVF_RESET_WATCHDOG_ARG,
 	IAVF_ENABLE_AUTO_RESET_ARG,
+	IAVF_NO_POLL_ON_LINK_DOWN_ARG,
 	NULL
 };
 
@@ -2291,6 +2292,7 @@ static int iavf_parse_devargs(struct rte_eth_dev *dev)
 	struct rte_kvargs *kvlist;
 	int ret;
 	int watchdog_period = -1;
+	uint16_t no_poll_on_link_down = 0;
 
 	if (!devargs)
 		return 0;
@@ -2324,6 +2326,15 @@ static int iavf_parse_devargs(struct rte_eth_dev *dev)
 	else
 		ad->devargs.watchdog_period = watchdog_period;
 
+	ret = rte_kvargs_process(kvlist, IAVF_NO_POLL_ON_LINK_DOWN_ARG,
+				 &parse_u16, &no_poll_on_link_down);
+	if (ret)
+		goto bail;
+	if (no_poll_on_link_down == 0)
+		ad->devargs.no_poll_on_link_down = 0;
+	else
+		ad->devargs.no_poll_on_link_down = 1;
+
 	if (ad->devargs.quanta_size != 0 &&
 	    (ad->devargs.quanta_size < 256 || ad->devargs.quanta_size > 4096 ||
 	     ad->devargs.quanta_size & 0x40)) {
@@ -2337,6 +2348,9 @@ static int iavf_parse_devargs(struct rte_eth_dev *dev)
 	if (ret)
 		goto bail;
 
+	if (ad->devargs.auto_reset != 0)
+		ad->devargs.no_poll_on_link_down = 1;
+
 bail:
 	rte_kvargs_free(kvlist);
 	return ret;
diff --git a/drivers/net/iavf/iavf_rxtx.c b/drivers/net/iavf/iavf_rxtx.c
index 0484988d13..7feadee7d0 100644
--- a/drivers/net/iavf/iavf_rxtx.c
+++ b/drivers/net/iavf/iavf_rxtx.c
@@ -777,6 +777,7 @@ iavf_dev_tx_queue_setup(struct rte_eth_dev *dev,
 		IAVF_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private);
 	struct iavf_info *vf =
 		IAVF_DEV_PRIVATE_TO_VF(dev->data->dev_private);
+	struct iavf_vsi *vsi = &vf->vsi;
 	struct iavf_tx_queue *txq;
 	const struct rte_memzone *mz;
 	uint32_t ring_size;
@@ -850,6 +851,7 @@ iavf_dev_tx_queue_setup(struct rte_eth_dev *dev,
 	txq->port_id = dev->data->port_id;
 	txq->offloads = offloads;
 	txq->tx_deferred_start = tx_conf->tx_deferred_start;
+	txq->vsi = vsi;
 
 	if (iavf_ipsec_crypto_supported(adapter))
 		txq->ipsec_crypto_pkt_md_offset =
@@ -3707,6 +3709,30 @@ iavf_prep_pkts(__rte_unused void *tx_queue, struct rte_mbuf **tx_pkts,
 	return i;
 }
 
+static uint16_t
+iavf_recv_pkts_no_poll(void *rx_queue, struct rte_mbuf **rx_pkts,
+				uint16_t nb_pkts)
+{
+	struct iavf_rx_queue *rxq = rx_queue;
+	if (!rxq->vsi || rxq->vsi->adapter->no_poll)
+		return 0;
+
+	return rxq->vsi->adapter->rx_pkt_burst(rx_queue,
+								rx_pkts, nb_pkts);
+}
+
+static uint16_t
+iavf_xmit_pkts_no_poll(void *tx_queue, struct rte_mbuf **tx_pkts,
+				uint16_t nb_pkts)
+{
+	struct iavf_tx_queue *txq = tx_queue;
+	if (!txq->vsi || txq->vsi->adapter->no_poll)
+		return 0;
+
+	return txq->vsi->adapter->tx_pkt_burst(tx_queue,
+								tx_pkts, nb_pkts);
+}
+
 /* choose rx function*/
 void
 iavf_set_rx_function(struct rte_eth_dev *dev)
@@ -3714,6 +3740,7 @@ iavf_set_rx_function(struct rte_eth_dev *dev)
 	struct iavf_adapter *adapter =
 		IAVF_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private);
 	struct iavf_info *vf = IAVF_DEV_PRIVATE_TO_VF(dev->data->dev_private);
+	int no_poll_on_link_down = adapter->devargs.no_poll_on_link_down;
 	int i;
 	struct iavf_rx_queue *rxq;
 	bool use_flex = true;
@@ -3891,6 +3918,10 @@ iavf_set_rx_function(struct rte_eth_dev *dev)
 			}
 		}
 
+		if (no_poll_on_link_down) {
+			adapter->rx_pkt_burst = dev->rx_pkt_burst;
+			dev->rx_pkt_burst = iavf_recv_pkts_no_poll;
+		}
 		return;
 	}
 #elif defined RTE_ARCH_ARM
@@ -3906,6 +3937,11 @@ iavf_set_rx_function(struct rte_eth_dev *dev)
 			(void)iavf_rxq_vec_setup(rxq);
 		}
 		dev->rx_pkt_burst = iavf_recv_pkts_vec;
+
+		if (no_poll_on_link_down) {
+			adapter->rx_pkt_burst = dev->rx_pkt_burst;
+			dev->rx_pkt_burst = iavf_recv_pkts_no_poll;
+		}
 		return;
 	}
 #endif
@@ -3928,12 +3964,20 @@ iavf_set_rx_function(struct rte_eth_dev *dev)
 		else
 			dev->rx_pkt_burst = iavf_recv_pkts;
 	}
+
+	if (no_poll_on_link_down) {
+		adapter->rx_pkt_burst = dev->rx_pkt_burst;
+		dev->rx_pkt_burst = iavf_recv_pkts_no_poll;
+	}
 }
 
 /* choose tx function*/
 void
 iavf_set_tx_function(struct rte_eth_dev *dev)
 {
+	struct iavf_adapter *adapter =
+		IAVF_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private);
+	int no_poll_on_link_down = adapter->devargs.no_poll_on_link_down;
 #ifdef RTE_ARCH_X86
 	struct iavf_tx_queue *txq;
 	int i;
@@ -4022,6 +4066,10 @@ iavf_set_tx_function(struct rte_eth_dev *dev)
 #endif
 		}
 
+		if (no_poll_on_link_down) {
+			adapter->tx_pkt_burst = dev->tx_pkt_burst;
+			dev->tx_pkt_burst = iavf_xmit_pkts_no_poll;
+		}
 		return;
 	}
 
@@ -4031,6 +4079,11 @@ iavf_set_tx_function(struct rte_eth_dev *dev)
 		    dev->data->port_id);
 	dev->tx_pkt_burst = iavf_xmit_pkts;
 	dev->tx_pkt_prepare = iavf_prep_pkts;
+
+	if (no_poll_on_link_down) {
+		adapter->tx_pkt_burst = dev->tx_pkt_burst;
+		dev->tx_pkt_burst = iavf_xmit_pkts_no_poll;
+	}
 }
 
 static int
diff --git a/drivers/net/iavf/iavf_rxtx.h b/drivers/net/iavf/iavf_rxtx.h
index 605ea3f824..d3324e0e6e 100644
--- a/drivers/net/iavf/iavf_rxtx.h
+++ b/drivers/net/iavf/iavf_rxtx.h
@@ -288,6 +288,7 @@ struct iavf_tx_queue {
 	uint16_t free_thresh;
 	uint16_t rs_thresh;
 	uint8_t rel_mbufs_type;
+	struct iavf_vsi *vsi; /**< the VSI this queue belongs to */
 
 	uint16_t port_id;
 	uint16_t queue_id;
diff --git a/drivers/net/iavf/iavf_vchnl.c b/drivers/net/iavf/iavf_vchnl.c
index 7f49eb2c1e..0a3e1d082c 100644
--- a/drivers/net/iavf/iavf_vchnl.c
+++ b/drivers/net/iavf/iavf_vchnl.c
@@ -272,6 +272,16 @@ iavf_read_msg_from_pf(struct iavf_adapter *adapter, uint16_t buf_len,
 				if (!vf->link_up)
 					iavf_dev_watchdog_enable(adapter);
 			}
+			if (adapter->devargs.no_poll_on_link_down) {
+				if (vf->link_up && adapter->no_poll) {
+					adapter->no_poll = false;
+					PMD_DRV_LOG(DEBUG, "VF no poll turned off");
+				}
+				if (!vf->link_up) {
+					adapter->no_poll = true;
+					PMD_DRV_LOG(DEBUG, "VF no poll turned on");
+				}
+			}
 			PMD_DRV_LOG(INFO, "Link status update:%s",
 					vf->link_up ? "up" : "down");
 			break;
@@ -474,6 +484,16 @@ iavf_handle_pf_event_msg(struct rte_eth_dev *dev, uint8_t *msg,
 			if (!vf->link_up)
 				iavf_dev_watchdog_enable(adapter);
 		}
+		if (adapter->devargs.no_poll_on_link_down) {
+			if (vf->link_up && adapter->no_poll) {
+				adapter->no_poll = false;
+				PMD_DRV_LOG(DEBUG, "VF no poll turned off");
+			}
+			if (!vf->link_up) {
+				adapter->no_poll = true;
+				PMD_DRV_LOG(DEBUG, "VF no poll turned on");
+			}
+		}
 		iavf_dev_event_post(dev, RTE_ETH_EVENT_INTR_LSC, NULL, 0);
 		break;
 	case VIRTCHNL_EVENT_PF_DRIVER_CLOSE:
-- 
2.25.1



Thread overview: 15+ messages
2023-04-20  6:16 [PATCH] net/ice: CVL support double vlan Mingjin Ye
2023-05-06 10:04 ` [PATCH v2] net/ice: " Mingjin Ye
2023-05-26 10:16   ` Xu, Ke1
2023-05-26 11:10     ` Zhang, Qi Z
2023-07-17  9:36 ` [POC] net/iavf: support no data path polling mode Mingjin Ye
2023-07-20 10:08   ` [POC v2] " Mingjin Ye
2023-07-20 15:45     ` Stephen Hemminger
2023-07-21  9:57     ` [POC v3] " Mingjin Ye
2023-08-11  6:27       ` [PATCH] " Mingjin Ye
2023-09-26  7:56         ` [PATCH v2] " Mingjin Ye
2023-10-13  1:27           ` Mingjin Ye [this message]
2023-10-17  1:44             ` [PATCH v4] " Mingjin Ye
2023-10-17  2:19             ` Mingjin Ye
2023-10-19  9:04               ` [PATCH v5] net/iavf: data paths support no-polling mode Mingjin Ye
2023-10-20  0:39                 ` Zhang, Qi Z
