From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id D4611431A8; Thu, 19 Oct 2023 11:14:39 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 587C040279; Thu, 19 Oct 2023 11:14:39 +0200 (CEST) Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.151]) by mails.dpdk.org (Postfix) with ESMTP id A19074021F for ; Thu, 19 Oct 2023 11:14:37 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1697706877; x=1729242877; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=0Mg3zbaKbUCq6VeWZ37SU9vzeDJzI4t+DE6Hzgg0tKI=; b=MxQhoFhfbogSRYxQEVK9rONlBbXoD7bC/C6U5+CdJqxylKuWyl7iNrCu FPmNP8kBAxqTpRGbSVJTPeCw5QAvpQw120s5W0XU122QG3rj06THeRHJ7 onGLlj8sSm3Glbov+lFyh11OSfp1ACtZlSN619OfKdn2tRjHk9hmRZvE6 o0BN240t3N9rKyHRkTh+gwPzu7CBficPMv28643MX7i3GaoR9ew0zAdfB aU5t5KTQwWGQ4Z5dbDWSECRReol4pTOCm5U3oQY0D8LxXRrQKZJUH2xHD 144iH9dloVaUI+Zl/QqZzPGC/Khr1OMp0yICLYbHOA9BWj12aILR/MSgf Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10867"; a="366448135" X-IronPort-AV: E=Sophos;i="6.03,236,1694761200"; d="scan'208";a="366448135" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Oct 2023 02:14:36 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10867"; a="760570870" X-IronPort-AV: E=Sophos;i="6.03,236,1694761200"; d="scan'208";a="760570870" Received: from unknown (HELO localhost.localdomain) ([10.239.252.253]) by fmsmga007-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Oct 2023 02:14:33 -0700 From: Mingjin Ye To: dev@dpdk.org Cc: qiming.yang@intel.com, yidingx.zhou@intel.com, Mingjin Ye , Simei Su , Wenjun Wu , Yuying Zhang , Beilei Xing , Jingjing Wu Subject: [PATCH v5] net/iavf: data paths support no-polling mode Date: 
Thu, 19 Oct 2023 09:04:15 +0000 Message-Id: <20231019090416.2923016-1-mingjinx.ye@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20231017021921.2844680-1-mingjinx.ye@intel.com> References: <20231017021921.2844680-1-mingjinx.ye@intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org In a scenario involving a hot firmware upgrade, the network device on the host side needs to be reset, potentially causing the hardware queues to become unreachable. In a VM, continuing to run VF PMD Rx/Tx during this process can lead to an application crash. The solution is to implement a 'no-polling' Rx and Tx wrapper. This wrapper checks the link status and returns immediately if the link is down. This works because link down events continue to be sent from the PF to the VF during firmware hot upgrades, and the link down event always occurs before the RESET IMPENDING event. The no-polling Rx/Tx mechanism is only active when the devarg "no-poll-on-link-down" is enabled. This devarg is recommended for use in this specific hot upgrade scenario. Ideally, "no-poll-on-link-down" should be used in conjunction with the devarg "auto_reset" to provide a seamless and user-friendly experience within the VM. Signed-off-by: Mingjin Ye --- v3: Remove redundant code. --- v4: Delete the git log note. 
--- v5: Optimize the commit log --- doc/guides/nics/intel_vf.rst | 3 ++ drivers/net/iavf/iavf.h | 4 +++ drivers/net/iavf/iavf_ethdev.c | 16 +++++++++- drivers/net/iavf/iavf_rxtx.c | 53 ++++++++++++++++++++++++++++++++++ drivers/net/iavf/iavf_rxtx.h | 1 + drivers/net/iavf/iavf_vchnl.c | 20 +++++++++++++ 6 files changed, 96 insertions(+), 1 deletion(-) diff --git a/doc/guides/nics/intel_vf.rst b/doc/guides/nics/intel_vf.rst index e06d62a873..df298c6086 100644 --- a/doc/guides/nics/intel_vf.rst +++ b/doc/guides/nics/intel_vf.rst @@ -107,6 +107,9 @@ For more detail on SR-IOV, please refer to the following documents: when IAVF is backed by an Intel\ |reg| E810 device or an Intel\ |reg| 700 Series Ethernet device. + Enable vf no-poll-on-link-down by setting the ``devargs`` parameter like ``-a 18:01.0,no-poll-on-link-down=1`` when IAVF is backed + by an Intel\ |reg| E810 device or an Intel\ |reg| 700 Series Ethernet device. + The PCIE host-interface of Intel Ethernet Switch FM10000 Series VF infrastructure ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ diff --git a/drivers/net/iavf/iavf.h b/drivers/net/iavf/iavf.h index 04774ce124..c115f3444e 100644 --- a/drivers/net/iavf/iavf.h +++ b/drivers/net/iavf/iavf.h @@ -308,6 +308,7 @@ struct iavf_devargs { uint16_t quanta_size; uint32_t watchdog_period; uint8_t auto_reset; + uint16_t no_poll_on_link_down; }; struct iavf_security_ctx; @@ -326,6 +327,9 @@ struct iavf_adapter { uint32_t ptype_tbl[IAVF_MAX_PKT_TYPE] __rte_cache_min_aligned; bool stopped; bool closed; + bool no_poll; + eth_rx_burst_t rx_pkt_burst; + eth_tx_burst_t tx_pkt_burst; uint16_t fdir_ref_cnt; struct iavf_devargs devargs; }; diff --git a/drivers/net/iavf/iavf_ethdev.c b/drivers/net/iavf/iavf_ethdev.c index 5b2634a4e3..98cc5c8ea8 100644 --- a/drivers/net/iavf/iavf_ethdev.c +++ b/drivers/net/iavf/iavf_ethdev.c @@ -38,7 +38,7 @@ #define IAVF_QUANTA_SIZE_ARG "quanta_size" #define IAVF_RESET_WATCHDOG_ARG "watchdog_period" #define 
IAVF_ENABLE_AUTO_RESET_ARG "auto_reset" - +#define IAVF_NO_POLL_ON_LINK_DOWN_ARG "no-poll-on-link-down" uint64_t iavf_timestamp_dynflag; int iavf_timestamp_dynfield_offset = -1; @@ -47,6 +47,7 @@ static const char * const iavf_valid_args[] = { IAVF_QUANTA_SIZE_ARG, IAVF_RESET_WATCHDOG_ARG, IAVF_ENABLE_AUTO_RESET_ARG, + IAVF_NO_POLL_ON_LINK_DOWN_ARG, NULL }; @@ -2291,6 +2292,7 @@ static int iavf_parse_devargs(struct rte_eth_dev *dev) struct rte_kvargs *kvlist; int ret; int watchdog_period = -1; + uint16_t no_poll_on_link_down = 0; if (!devargs) return 0; @@ -2324,6 +2326,15 @@ static int iavf_parse_devargs(struct rte_eth_dev *dev) else ad->devargs.watchdog_period = watchdog_period; + ret = rte_kvargs_process(kvlist, IAVF_NO_POLL_ON_LINK_DOWN_ARG, + &parse_u16, &no_poll_on_link_down); + if (ret) + goto bail; + if (no_poll_on_link_down == 0) + ad->devargs.no_poll_on_link_down = 0; + else + ad->devargs.no_poll_on_link_down = 1; + if (ad->devargs.quanta_size != 0 && (ad->devargs.quanta_size < 256 || ad->devargs.quanta_size > 4096 || ad->devargs.quanta_size & 0x40)) { @@ -2337,6 +2348,9 @@ static int iavf_parse_devargs(struct rte_eth_dev *dev) if (ret) goto bail; + if (ad->devargs.auto_reset != 0) + ad->devargs.no_poll_on_link_down = 1; + bail: rte_kvargs_free(kvlist); return ret; diff --git a/drivers/net/iavf/iavf_rxtx.c b/drivers/net/iavf/iavf_rxtx.c index c6ef6af1d8..72263870a4 100644 --- a/drivers/net/iavf/iavf_rxtx.c +++ b/drivers/net/iavf/iavf_rxtx.c @@ -777,6 +777,7 @@ iavf_dev_tx_queue_setup(struct rte_eth_dev *dev, IAVF_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private); struct iavf_info *vf = IAVF_DEV_PRIVATE_TO_VF(dev->data->dev_private); + struct iavf_vsi *vsi = &vf->vsi; struct iavf_tx_queue *txq; const struct rte_memzone *mz; uint32_t ring_size; @@ -850,6 +851,7 @@ iavf_dev_tx_queue_setup(struct rte_eth_dev *dev, txq->port_id = dev->data->port_id; txq->offloads = offloads; txq->tx_deferred_start = tx_conf->tx_deferred_start; + txq->vsi = vsi; if 
(iavf_ipsec_crypto_supported(adapter)) txq->ipsec_crypto_pkt_md_offset = @@ -3703,6 +3705,30 @@ iavf_prep_pkts(__rte_unused void *tx_queue, struct rte_mbuf **tx_pkts, return i; } +static uint16_t +iavf_recv_pkts_no_poll(void *rx_queue, struct rte_mbuf **rx_pkts, + uint16_t nb_pkts) +{ + struct iavf_rx_queue *rxq = rx_queue; + if (!rxq->vsi || rxq->vsi->adapter->no_poll) + return 0; + + return rxq->vsi->adapter->rx_pkt_burst(rx_queue, + rx_pkts, nb_pkts); +} + +static uint16_t +iavf_xmit_pkts_no_poll(void *tx_queue, struct rte_mbuf **tx_pkts, + uint16_t nb_pkts) +{ + struct iavf_tx_queue *txq = tx_queue; + if (!txq->vsi || txq->vsi->adapter->no_poll) + return 0; + + return txq->vsi->adapter->tx_pkt_burst(tx_queue, + tx_pkts, nb_pkts); +} + /* choose rx function*/ void iavf_set_rx_function(struct rte_eth_dev *dev) @@ -3710,6 +3736,7 @@ iavf_set_rx_function(struct rte_eth_dev *dev) struct iavf_adapter *adapter = IAVF_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private); struct iavf_info *vf = IAVF_DEV_PRIVATE_TO_VF(dev->data->dev_private); + int no_poll_on_link_down = adapter->devargs.no_poll_on_link_down; int i; struct iavf_rx_queue *rxq; bool use_flex = true; @@ -3887,6 +3914,10 @@ iavf_set_rx_function(struct rte_eth_dev *dev) } } + if (no_poll_on_link_down) { + adapter->rx_pkt_burst = dev->rx_pkt_burst; + dev->rx_pkt_burst = iavf_recv_pkts_no_poll; + } return; } #elif defined RTE_ARCH_ARM @@ -3902,6 +3933,11 @@ iavf_set_rx_function(struct rte_eth_dev *dev) (void)iavf_rxq_vec_setup(rxq); } dev->rx_pkt_burst = iavf_recv_pkts_vec; + + if (no_poll_on_link_down) { + adapter->rx_pkt_burst = dev->rx_pkt_burst; + dev->rx_pkt_burst = iavf_recv_pkts_no_poll; + } return; } #endif @@ -3924,12 +3960,20 @@ iavf_set_rx_function(struct rte_eth_dev *dev) else dev->rx_pkt_burst = iavf_recv_pkts; } + + if (no_poll_on_link_down) { + adapter->rx_pkt_burst = dev->rx_pkt_burst; + dev->rx_pkt_burst = iavf_recv_pkts_no_poll; + } } /* choose tx function*/ void iavf_set_tx_function(struct 
rte_eth_dev *dev) { + struct iavf_adapter *adapter = + IAVF_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private); + int no_poll_on_link_down = adapter->devargs.no_poll_on_link_down; #ifdef RTE_ARCH_X86 struct iavf_tx_queue *txq; int i; @@ -4018,6 +4062,10 @@ iavf_set_tx_function(struct rte_eth_dev *dev) #endif } + if (no_poll_on_link_down) { + adapter->tx_pkt_burst = dev->tx_pkt_burst; + dev->tx_pkt_burst = iavf_xmit_pkts_no_poll; + } return; } @@ -4027,6 +4075,11 @@ iavf_set_tx_function(struct rte_eth_dev *dev) dev->data->port_id); dev->tx_pkt_burst = iavf_xmit_pkts; dev->tx_pkt_prepare = iavf_prep_pkts; + + if (no_poll_on_link_down) { + adapter->tx_pkt_burst = dev->tx_pkt_burst; + dev->tx_pkt_burst = iavf_xmit_pkts_no_poll; + } } static int diff --git a/drivers/net/iavf/iavf_rxtx.h b/drivers/net/iavf/iavf_rxtx.h index 605ea3f824..d3324e0e6e 100644 --- a/drivers/net/iavf/iavf_rxtx.h +++ b/drivers/net/iavf/iavf_rxtx.h @@ -288,6 +288,7 @@ struct iavf_tx_queue { uint16_t free_thresh; uint16_t rs_thresh; uint8_t rel_mbufs_type; + struct iavf_vsi *vsi; /**< the VSI this queue belongs to */ uint16_t port_id; uint16_t queue_id; diff --git a/drivers/net/iavf/iavf_vchnl.c b/drivers/net/iavf/iavf_vchnl.c index 7f49eb2c1e..0a3e1d082c 100644 --- a/drivers/net/iavf/iavf_vchnl.c +++ b/drivers/net/iavf/iavf_vchnl.c @@ -272,6 +272,16 @@ iavf_read_msg_from_pf(struct iavf_adapter *adapter, uint16_t buf_len, if (!vf->link_up) iavf_dev_watchdog_enable(adapter); } + if (adapter->devargs.no_poll_on_link_down) { + if (vf->link_up && adapter->no_poll) { + adapter->no_poll = false; + PMD_DRV_LOG(DEBUG, "VF no poll turned off"); + } + if (!vf->link_up) { + adapter->no_poll = true; + PMD_DRV_LOG(DEBUG, "VF no poll turned on"); + } + } PMD_DRV_LOG(INFO, "Link status update:%s", vf->link_up ? 
"up" : "down"); break; @@ -474,6 +484,16 @@ iavf_handle_pf_event_msg(struct rte_eth_dev *dev, uint8_t *msg, if (!vf->link_up) iavf_dev_watchdog_enable(adapter); } + if (adapter->devargs.no_poll_on_link_down) { + if (vf->link_up && adapter->no_poll) { + adapter->no_poll = false; + PMD_DRV_LOG(DEBUG, "VF no poll turned off"); + } + if (!vf->link_up) { + adapter->no_poll = true; + PMD_DRV_LOG(DEBUG, "VF no poll turned on"); + } + } iavf_dev_event_post(dev, RTE_ETH_EVENT_INTR_LSC, NULL, 0); break; case VIRTCHNL_EVENT_PF_DRIVER_CLOSE: -- 2.25.1