DPDK patches and discussions
* [PATCH 0/5] fix race-condition of proactive error handling mode
@ 2023-03-01  3:06 Chengwen Feng
  2023-03-01  3:06 ` [PATCH 1/5] ethdev: " Chengwen Feng
                   ` (7 more replies)
  0 siblings, 8 replies; 85+ messages in thread
From: Chengwen Feng @ 2023-03-01  3:06 UTC (permalink / raw)
  To: thomas, ferruh.yigit, konstantin.ananyev; +Cc: dev

This patch set fixes a race condition in proactive error handling mode;
see the discussion thread [1].

[1] http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-ashok.k.kaladi@intel.com/

Chengwen Feng (5):
  ethdev: fix race-condition of proactive error handling mode
  net/hns3: replace fp ops config function
  net/bnxt: fix race-condition when report error recovery
  net/bnxt: use fp ops setup function
  app/testpmd: add error recovery usage demo

 app/test-pmd/testpmd.c                  | 80 +++++++++++++++++++++++++
 app/test-pmd/testpmd.h                  |  4 +-
 doc/guides/prog_guide/poll_mode_drv.rst | 20 +++----
 drivers/net/bnxt/bnxt_cpr.c             | 18 +++---
 drivers/net/bnxt/bnxt_ethdev.c          |  9 +--
 drivers/net/hns3/hns3_rxtx.c            | 21 +------
 lib/ethdev/ethdev_driver.c              |  8 +++
 lib/ethdev/ethdev_driver.h              | 10 ++++
 lib/ethdev/rte_ethdev.h                 | 32 ++++++----
 lib/ethdev/version.map                  |  1 +
 10 files changed, 143 insertions(+), 60 deletions(-)

-- 
2.17.1



* [PATCH 1/5] ethdev: fix race-condition of proactive error handling mode
  2023-03-01  3:06 [PATCH 0/5] fix race-condition of proactive error handling mode Chengwen Feng
@ 2023-03-01  3:06 ` Chengwen Feng
  2023-03-02 12:08   ` Konstantin Ananyev
  2023-03-02 23:30   ` Honnappa Nagarahalli
  2023-03-01  3:06 ` [PATCH 2/5] net/hns3: replace fp ops config function Chengwen Feng
                   ` (6 subsequent siblings)
  7 siblings, 2 replies; 85+ messages in thread
From: Chengwen Feng @ 2023-03-01  3:06 UTC (permalink / raw)
  To: thomas, ferruh.yigit, konstantin.ananyev, Andrew Rybchenko,
	Kalesh AP, Ajit Khaparde
  Cc: dev

In proactive error handling mode, the PMD sets the data path pointers
to dummy functions and then attempts recovery. During this period the
application may still be invoking the data path API, which introduces a
race condition with the data path that may lead to a crash [1].

Although the PMD added a delay after setting the data path pointers to
cover this race condition, the delay only reduces its probability; it
does not solve the problem.

To solve the race condition fundamentally, the following requirements
are added:
1. The PMD should set the data path pointers to dummy functions after
   reporting the RTE_ETH_EVENT_ERR_RECOVERING event.
2. The application should stop data path API invocation when processing
   the RTE_ETH_EVENT_ERR_RECOVERING event.
3. The PMD should set the data path pointers to valid functions before
   reporting the RTE_ETH_EVENT_RECOVERY_SUCCESS event.
4. The application should enable data path API invocation when
   processing the RTE_ETH_EVENT_RECOVERY_SUCCESS event.

Also, this patch introduces a driver-internal function,
rte_eth_fp_ops_setup(), which is used as a helper function for PMDs.
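
The application side of requirements 2 and 4 can be sketched as below.
This is a minimal model in plain C11, not DPDK code: demo_event,
demo_port, demo_event_cb() and demo_rx_poll() are hypothetical names
used only for illustration. A real application would register its
callback with rte_eth_dev_callback_register() and would probably also
need to wait for bursts already in flight on other lcores before
returning from the callback; that part is not modelled here.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the three ethdev recovery events. */
enum demo_event {
	DEMO_ERR_RECOVERING,
	DEMO_RECOVERY_SUCCESS,
	DEMO_RECOVERY_FAILED,
};

/* Hypothetical per-port state owned by the application. */
struct demo_port {
	atomic_bool datapath_enabled; /* checked by the forwarding loop */
	uint64_t polls;               /* number of bursts actually issued */
};

/* Event callback: gate the data path as the requirements describe. */
static void
demo_event_cb(struct demo_port *port, enum demo_event ev)
{
	switch (ev) {
	case DEMO_ERR_RECOVERING:
		/* Requirement 2: stop data path API invocation. */
		atomic_store_explicit(&port->datapath_enabled, false,
				      memory_order_release);
		break;
	case DEMO_RECOVERY_SUCCESS:
		/* Requirement 4: restore configuration, then resume. */
		atomic_store_explicit(&port->datapath_enabled, true,
				      memory_order_release);
		break;
	case DEMO_RECOVERY_FAILED:
		/* The port is unusable; keep the data path disabled. */
		atomic_store_explicit(&port->datapath_enabled, false,
				      memory_order_release);
		break;
	}
}

/* One forwarding-loop iteration; returns true if a burst was issued. */
static bool
demo_rx_poll(struct demo_port *port)
{
	if (!atomic_load_explicit(&port->datapath_enabled,
				  memory_order_acquire))
		return false;
	port->polls++; /* a real loop would call rte_eth_rx_burst() here */
	return true;
}
```

The flag by itself only stops new bursts from being issued; it is the
combination with requirement 1 (the PMD swaps pointers only after the
callback has returned) that is meant to close the window.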

[1] http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-ashok.k.kaladi@intel.com/

Fixes: eb0d471a8941 ("ethdev: add proactive error handling mode")
Cc: stable@dpdk.org

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
---
 doc/guides/prog_guide/poll_mode_drv.rst | 20 +++++++---------
 lib/ethdev/ethdev_driver.c              |  8 +++++++
 lib/ethdev/ethdev_driver.h              | 10 ++++++++
 lib/ethdev/rte_ethdev.h                 | 32 +++++++++++++++----------
 lib/ethdev/version.map                  |  1 +
 5 files changed, 46 insertions(+), 25 deletions(-)

diff --git a/doc/guides/prog_guide/poll_mode_drv.rst b/doc/guides/prog_guide/poll_mode_drv.rst
index c145a9066c..e380ff135a 100644
--- a/doc/guides/prog_guide/poll_mode_drv.rst
+++ b/doc/guides/prog_guide/poll_mode_drv.rst
@@ -638,14 +638,9 @@ different from the application invokes recovery in PASSIVE mode,
 the PMD automatically recovers from error in PROACTIVE mode,
 and only a small amount of work is required for the application.
 
-During error detection and automatic recovery,
-the PMD sets the data path pointers to dummy functions
-(which will prevent the crash),
-and also make sure the control path operations fail with a return code ``-EBUSY``.
-
-Because the PMD recovers automatically,
-the application can only sense that the data flow is disconnected for a while
-and the control API returns an error in this period.
+During error detection and automatic recovery, the PMD sets the data path
+pointers to dummy functions and also make sure the control path operations
+failed with a return code ``-EBUSY``.
 
 In order to sense the error happening/recovering,
 as well as to restore some additional configuration,
@@ -653,9 +648,9 @@ three events are available:
 
 ``RTE_ETH_EVENT_ERR_RECOVERING``
    Notify the application that an error is detected
-   and the recovery is being started.
+   and the recovery is about to start.
    Upon receiving the event, the application should not invoke
-   any control path function until receiving
+   any control and data path API until receiving
    ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` or ``RTE_ETH_EVENT_RECOVERY_FAILED`` event.
 
 .. note::
@@ -666,8 +661,9 @@ three events are available:
 
 ``RTE_ETH_EVENT_RECOVERY_SUCCESS``
    Notify the application that the recovery from error is successful,
-   the PMD already re-configures the port,
-   and the effect is the same as a restart operation.
+   the PMD already re-configures the port.
+   The application should restore some additional configuration, and then
+   enable data path API invocation.
 
 ``RTE_ETH_EVENT_RECOVERY_FAILED``
    Notify the application that the recovery from error failed,
diff --git a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c
index 0be1e8ca04..f994653fe9 100644
--- a/lib/ethdev/ethdev_driver.c
+++ b/lib/ethdev/ethdev_driver.c
@@ -515,6 +515,14 @@ rte_eth_dma_zone_free(const struct rte_eth_dev *dev, const char *ring_name,
 	return rc;
 }
 
+void
+rte_eth_fp_ops_setup(struct rte_eth_dev *dev)
+{
+	if (dev == NULL)
+		return;
+	eth_dev_fp_ops_setup(rte_eth_fp_ops + dev->data->port_id, dev);
+}
+
 const struct rte_memzone *
 rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char *ring_name,
 			 uint16_t queue_id, size_t size, unsigned int align,
diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
index 2c9d615fb5..0d964d1f67 100644
--- a/lib/ethdev/ethdev_driver.h
+++ b/lib/ethdev/ethdev_driver.h
@@ -1621,6 +1621,16 @@ int
 rte_eth_dma_zone_free(const struct rte_eth_dev *eth_dev, const char *name,
 		 uint16_t queue_id);
 
+/**
+ * @internal
+ * Setup eth fast-path API to ethdev values.
+ *
+ * @param dev
+ *  Pointer to struct rte_eth_dev.
+ */
+__rte_internal
+void rte_eth_fp_ops_setup(struct rte_eth_dev *dev);
+
 /**
  * @internal
  * Atomically set the link status for the specific device.
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index 049641d57c..44ee7229c1 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -3944,25 +3944,28 @@ enum rte_eth_event_type {
 	 */
 	RTE_ETH_EVENT_RX_AVAIL_THRESH,
 	/** Port recovering from a hardware or firmware error.
-	 * If PMD supports proactive error recovery,
-	 * it should trigger this event to notify application
-	 * that it detected an error and the recovery is being started.
-	 * Upon receiving the event, the application should not invoke any control path API
-	 * (such as rte_eth_dev_configure/rte_eth_dev_stop...) until receiving
-	 * RTE_ETH_EVENT_RECOVERY_SUCCESS or RTE_ETH_EVENT_RECOVERY_FAILED event.
-	 * The PMD will set the data path pointers to dummy functions,
-	 * and re-set the data path pointers to non-dummy functions
-	 * before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
-	 * It means that the application cannot send or receive any packets
-	 * during this period.
+	 *
+	 * If PMD supports proactive error recovery, it should trigger this
+	 * event to notify application that it detected an error and the
+	 * recovery is about to start.
+	 *
+	 * Upon receiving the event, the application should not invoke any
+	 * control and data path API until receiving
+	 * RTE_ETH_EVENT_RECOVERY_SUCCESS or RTE_ETH_EVENT_RECOVERY_FAILED
+	 * event.
+	 *
+	 * Once this event is reported, the PMD will set the data path pointers
+	 * to dummy functions, and re-set the data path pointers to valid
+	 * functions before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
+	 *
 	 * @note Before the PMD reports the recovery result,
 	 * the PMD may report the RTE_ETH_EVENT_ERR_RECOVERING event again,
 	 * because a larger error may occur during the recovery.
 	 */
 	RTE_ETH_EVENT_ERR_RECOVERING,
 	/** Port recovers successfully from the error.
-	 * The PMD already re-configured the port,
-	 * and the effect is the same as a restart operation.
+	 *
+	 * The PMD already re-configured the port:
 	 * a) The following operation will be retained: (alphabetically)
 	 *    - DCB configuration
 	 *    - FEC configuration
@@ -3989,6 +3992,9 @@ enum rte_eth_event_type {
 	 *      (@see RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP)
 	 * c) Any other configuration will not be stored
 	 *    and will need to be re-configured.
+	 *
+	 * The application should restore some additional configuration
+	 * (see above case b/c), and then enable data path API invocation.
 	 */
 	RTE_ETH_EVENT_RECOVERY_SUCCESS,
 	/** Port recovery failed.
diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
index 357d1a88c0..c273e0bdae 100644
--- a/lib/ethdev/version.map
+++ b/lib/ethdev/version.map
@@ -320,6 +320,7 @@ INTERNAL {
 	rte_eth_devices;
 	rte_eth_dma_zone_free;
 	rte_eth_dma_zone_reserve;
+	rte_eth_fp_ops_setup;
 	rte_eth_hairpin_queue_peer_bind;
 	rte_eth_hairpin_queue_peer_unbind;
 	rte_eth_hairpin_queue_peer_update;
-- 
2.17.1



* [PATCH 2/5] net/hns3: replace fp ops config function
  2023-03-01  3:06 [PATCH 0/5] fix race-condition of proactive error handling mode Chengwen Feng
  2023-03-01  3:06 ` [PATCH 1/5] ethdev: " Chengwen Feng
@ 2023-03-01  3:06 ` Chengwen Feng
  2023-03-02  6:50   ` Dongdong Liu
  2023-03-01  3:06 ` [PATCH 3/5] net/bnxt: fix race-condition when report error recovery Chengwen Feng
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 85+ messages in thread
From: Chengwen Feng @ 2023-03-01  3:06 UTC (permalink / raw)
  To: thomas, ferruh.yigit, konstantin.ananyev, Dongdong Liu, Yisen Zhuang; +Cc: dev

This patch replaces hns3_eth_dev_fp_ops_config() with
rte_eth_fp_ops_setup().

Cc: stable@dpdk.org

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
---
 drivers/net/hns3/hns3_rxtx.c | 21 +++------------------
 1 file changed, 3 insertions(+), 18 deletions(-)

diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c
index 4065c519c3..6d02b4ee9f 100644
--- a/drivers/net/hns3/hns3_rxtx.c
+++ b/drivers/net/hns3/hns3_rxtx.c
@@ -4382,21 +4382,6 @@ hns3_trace_rxtx_function(struct rte_eth_dev *dev)
 		 rx_mode.info, tx_mode.info);
 }
 
-static void
-hns3_eth_dev_fp_ops_config(const struct rte_eth_dev *dev)
-{
-	struct rte_eth_fp_ops *fpo = rte_eth_fp_ops;
-	uint16_t port_id = dev->data->port_id;
-
-	fpo[port_id].rx_pkt_burst = dev->rx_pkt_burst;
-	fpo[port_id].tx_pkt_burst = dev->tx_pkt_burst;
-	fpo[port_id].tx_pkt_prepare = dev->tx_pkt_prepare;
-	fpo[port_id].rx_descriptor_status = dev->rx_descriptor_status;
-	fpo[port_id].tx_descriptor_status = dev->tx_descriptor_status;
-	fpo[port_id].rxq.data = dev->data->rx_queues;
-	fpo[port_id].txq.data = dev->data->tx_queues;
-}
-
 void
 hns3_set_rxtx_function(struct rte_eth_dev *eth_dev)
 {
@@ -4419,7 +4404,7 @@ hns3_set_rxtx_function(struct rte_eth_dev *eth_dev)
 	}
 
 	hns3_trace_rxtx_function(eth_dev);
-	hns3_eth_dev_fp_ops_config(eth_dev);
+	rte_eth_fp_ops_setup(eth_dev);
 }
 
 void
@@ -4741,7 +4726,7 @@ hns3_stop_tx_datapath(struct rte_eth_dev *dev)
 {
 	dev->tx_pkt_burst = rte_eth_pkt_burst_dummy;
 	dev->tx_pkt_prepare = NULL;
-	hns3_eth_dev_fp_ops_config(dev);
+	rte_eth_fp_ops_setup(dev);
 
 	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
 		return;
@@ -4758,7 +4743,7 @@ hns3_start_tx_datapath(struct rte_eth_dev *dev)
 {
 	dev->tx_pkt_burst = hns3_get_tx_function(dev);
 	dev->tx_pkt_prepare = hns3_get_tx_prepare(dev);
-	hns3_eth_dev_fp_ops_config(dev);
+	rte_eth_fp_ops_setup(dev);
 
 	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
 		return;
-- 
2.17.1



* [PATCH 3/5] net/bnxt: fix race-condition when report error recovery
  2023-03-01  3:06 [PATCH 0/5] fix race-condition of proactive error handling mode Chengwen Feng
  2023-03-01  3:06 ` [PATCH 1/5] ethdev: " Chengwen Feng
  2023-03-01  3:06 ` [PATCH 2/5] net/hns3: replace fp ops config function Chengwen Feng
@ 2023-03-01  3:06 ` Chengwen Feng
  2023-03-02 12:23   ` Konstantin Ananyev
  2023-03-01  3:06 ` [PATCH 4/5] net/bnxt: use fp ops setup function Chengwen Feng
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 85+ messages in thread
From: Chengwen Feng @ 2023-03-01  3:06 UTC (permalink / raw)
  To: thomas, ferruh.yigit, konstantin.ananyev, Ajit Khaparde,
	Somnath Kotur, Kalesh AP
  Cc: dev

If the data path functions are set to dummy functions before the error
recovering event is reported, there may be a race condition with data
path threads. This patch fixes it by setting the data path functions to
dummy functions only after that event has been reported.
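
The ordering this patch enforces can be modelled as follows. This is a
sketch in plain C under stated assumptions, not the bnxt code:
demo_dev, demo_app_cb() and demo_start_recovery() are hypothetical
names, and the event callback is reduced to a single function pointer.
It only shows that the callback runs while the real burst function is
still installed, so the application has a chance to stop its data path
before the pointer is swapped.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t (*demo_burst_fn)(void **pkts, uint16_t n);

/* Stand-in for the driver's real receive function. */
static uint16_t
demo_real_burst(void **pkts, uint16_t n)
{
	(void)pkts;
	return n;
}

/* Stand-in for a dummy burst function: never returns packets. */
static uint16_t
demo_dummy_burst(void **pkts, uint16_t n)
{
	(void)pkts;
	(void)n;
	return 0;
}

/* Hypothetical device with one burst pointer and one event callback. */
struct demo_dev {
	demo_burst_fn rx_burst;
	void (*recovering_cb)(struct demo_dev *dev);
	bool real_burst_seen_in_cb;
};

/* Application callback: records which burst function it observed. */
static void
demo_app_cb(struct demo_dev *dev)
{
	dev->real_burst_seen_in_cb = (dev->rx_burst == demo_real_burst);
}

/* Driver side: report the event first, swap the pointer second. */
static void
demo_start_recovery(struct demo_dev *dev)
{
	if (dev->recovering_cb != NULL)
		dev->recovering_cb(dev);
	dev->rx_burst = demo_dummy_burst;
}
```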

Fixes: e11052f3a46f ("net/bnxt: support proactive error handling mode")
Cc: stable@dpdk.org

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
---
 drivers/net/bnxt/bnxt_cpr.c    | 13 +++++++------
 drivers/net/bnxt/bnxt_ethdev.c |  4 ++--
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/drivers/net/bnxt/bnxt_cpr.c b/drivers/net/bnxt/bnxt_cpr.c
index 5bb376d4d5..3950840600 100644
--- a/drivers/net/bnxt/bnxt_cpr.c
+++ b/drivers/net/bnxt/bnxt_cpr.c
@@ -168,14 +168,9 @@ void bnxt_handle_async_event(struct bnxt *bp,
 		PMD_DRV_LOG(INFO, "Port conn async event\n");
 		break;
 	case HWRM_ASYNC_EVENT_CMPL_EVENT_ID_RESET_NOTIFY:
-		/*
-		 * Avoid any rx/tx packet processing during firmware reset
-		 * operation.
-		 */
-		bnxt_stop_rxtx(bp->eth_dev);
-
 		/* Ignore reset notify async events when stopping the port */
 		if (!bp->eth_dev->data->dev_started) {
+			bnxt_stop_rxtx(bp->eth_dev);
 			bp->flags |= BNXT_FLAG_FATAL_ERROR;
 			return;
 		}
@@ -184,6 +179,12 @@ void bnxt_handle_async_event(struct bnxt *bp,
 					     RTE_ETH_EVENT_ERR_RECOVERING,
 					     NULL);
 
+		/*
+		 * Avoid any rx/tx packet processing during firmware reset
+		 * operation.
+		 */
+		bnxt_stop_rxtx(bp->eth_dev);
+
 		pthread_mutex_lock(&bp->err_recovery_lock);
 		event_data = data1;
 		/* timestamp_lo/hi values are in units of 100ms */
diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index 753e86b4b2..4083a69d02 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -4562,14 +4562,14 @@ static void bnxt_check_fw_health(void *arg)
 	bp->flags |= BNXT_FLAG_FATAL_ERROR;
 	bp->flags |= BNXT_FLAG_FW_RESET;
 
-	bnxt_stop_rxtx(bp->eth_dev);
-
 	PMD_DRV_LOG(ERR, "Detected FW dead condition\n");
 
 	rte_eth_dev_callback_process(bp->eth_dev,
 				     RTE_ETH_EVENT_ERR_RECOVERING,
 				     NULL);
 
+	bnxt_stop_rxtx(bp->eth_dev);
+
 	if (bnxt_is_primary_func(bp))
 		wait_msec = info->primary_func_wait_period;
 	else
-- 
2.17.1



* [PATCH 4/5] net/bnxt: use fp ops setup function
  2023-03-01  3:06 [PATCH 0/5] fix race-condition of proactive error handling mode Chengwen Feng
                   ` (2 preceding siblings ...)
  2023-03-01  3:06 ` [PATCH 3/5] net/bnxt: fix race-condition when report error recovery Chengwen Feng
@ 2023-03-01  3:06 ` Chengwen Feng
  2023-03-02 12:30   ` Konstantin Ananyev
  2023-03-01  3:06 ` [PATCH 5/5] app/testpmd: add error recovery usage demo Chengwen Feng
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 85+ messages in thread
From: Chengwen Feng @ 2023-03-01  3:06 UTC (permalink / raw)
  To: thomas, ferruh.yigit, konstantin.ananyev, Ajit Khaparde, Somnath Kotur
  Cc: dev

Use rte_eth_fp_ops_setup() instead of directly manipulating the
rte_eth_fp_ops variable.
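
The motivation can be illustrated with a small model: one helper that
copies every fast-path field from the device into the per-port table,
so that drivers do not each copy their own subset (the code removed
here copied only rx_pkt_burst and tx_pkt_burst). The sketch below is
plain C with hypothetical names (demo_dev, demo_fp_ops,
demo_fp_ops_setup()) and a reduced field set; it is not the ethdev
implementation.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_PORTS 8

typedef uint16_t (*demo_burst_fn)(void *queue, void **pkts, uint16_t n);

/* Hypothetical device object holding the driver's current callbacks. */
struct demo_dev {
	uint16_t port_id;
	demo_burst_fn rx_burst;
	demo_burst_fn tx_burst;
	void **rx_queues;
	void **tx_queues;
};

/* Hypothetical per-port fast-path table read by the data path. */
struct demo_fp_ops {
	demo_burst_fn rx_burst;
	demo_burst_fn tx_burst;
	void **rxq_data;
	void **txq_data;
};

static struct demo_fp_ops demo_fp_ops_tbl[DEMO_MAX_PORTS];

/* Dummy burst function used while the port is recovering. */
static uint16_t
demo_dummy_burst(void *queue, void **pkts, uint16_t n)
{
	(void)queue;
	(void)pkts;
	(void)n;
	return 0;
}

/* Copy all fast-path fields for one port in a single place. */
static void
demo_fp_ops_setup(const struct demo_dev *dev)
{
	struct demo_fp_ops *fpo;

	if (dev == NULL || dev->port_id >= DEMO_MAX_PORTS)
		return;
	fpo = &demo_fp_ops_tbl[dev->port_id];
	fpo->rx_burst = dev->rx_burst;
	fpo->tx_burst = dev->tx_burst;
	fpo->rxq_data = dev->rx_queues;
	fpo->txq_data = dev->tx_queues;
}
```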

Cc: stable@dpdk.org

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
---
 drivers/net/bnxt/bnxt_cpr.c    | 5 +----
 drivers/net/bnxt/bnxt_ethdev.c | 5 +----
 2 files changed, 2 insertions(+), 8 deletions(-)

diff --git a/drivers/net/bnxt/bnxt_cpr.c b/drivers/net/bnxt/bnxt_cpr.c
index 3950840600..a3f33c24c3 100644
--- a/drivers/net/bnxt/bnxt_cpr.c
+++ b/drivers/net/bnxt/bnxt_cpr.c
@@ -416,10 +416,7 @@ void bnxt_stop_rxtx(struct rte_eth_dev *eth_dev)
 	eth_dev->rx_pkt_burst = rte_eth_pkt_burst_dummy;
 	eth_dev->tx_pkt_burst = rte_eth_pkt_burst_dummy;
 
-	rte_eth_fp_ops[eth_dev->data->port_id].rx_pkt_burst =
-		eth_dev->rx_pkt_burst;
-	rte_eth_fp_ops[eth_dev->data->port_id].tx_pkt_burst =
-		eth_dev->tx_pkt_burst;
+	rte_eth_fp_ops_setup(eth_dev);
 	rte_mb();
 
 	/* Allow time for threads to exit the real burst functions. */
diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index 4083a69d02..d6064ceea4 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -4374,10 +4374,7 @@ static void bnxt_dev_recover(void *arg)
 	if (rc)
 		goto err_start;
 
-	rte_eth_fp_ops[bp->eth_dev->data->port_id].rx_pkt_burst =
-		bp->eth_dev->rx_pkt_burst;
-	rte_eth_fp_ops[bp->eth_dev->data->port_id].tx_pkt_burst =
-		bp->eth_dev->tx_pkt_burst;
+	rte_eth_fp_ops_setup(bp->eth_dev);
 	rte_mb();
 
 	PMD_DRV_LOG(INFO, "Port: %u Recovered from FW reset\n",
-- 
2.17.1



* [PATCH 5/5] app/testpmd: add error recovery usage demo
  2023-03-01  3:06 [PATCH 0/5] fix race-condition of proactive error handling mode Chengwen Feng
                   ` (3 preceding siblings ...)
  2023-03-01  3:06 ` [PATCH 4/5] net/bnxt: use fp ops setup function Chengwen Feng
@ 2023-03-01  3:06 ` Chengwen Feng
  2023-03-02 13:01   ` Konstantin Ananyev
  2023-09-21 11:12 ` [PATCH 0/5] fix race-condition of proactive error handling mode Ferruh Yigit
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 85+ messages in thread
From: Chengwen Feng @ 2023-03-01  3:06 UTC (permalink / raw)
  To: thomas, ferruh.yigit, konstantin.ananyev, Aman Singh, Yuying Zhang; +Cc: dev

This patch adds an error recovery usage demo which will:
1. stop packet forwarding when the RTE_ETH_EVENT_ERR_RECOVERING event
   is received.
2. restart packet forwarding when the RTE_ETH_EVENT_RECOVERY_SUCCESS
   event is received.
3. report the ports that failed to recover and need to be removed when
   the RTE_ETH_EVENT_RECOVERY_FAILED event is received.

In addition, a message is printed asking the user not to execute any
commands during error recovery.
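
The bookkeeping behind these three steps can be sketched independently
of testpmd. The model below is plain C with hypothetical names
(demo_state, demo_on_recovering(), demo_on_success(),
demo_on_failed()) and assumes a fixed, small number of ports; the
point it illustrates is that forwarding is restarted only once no port
is left in recovery.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_PORTS 4

/* Hypothetical application state for the recovery demo. */
struct demo_state {
	bool recovering[DEMO_MAX_PORTS];
	bool failed[DEMO_MAX_PORTS];
	bool forwarding;      /* true while packet forwarding runs */
	bool restart_pending; /* restart forwarding once recovery ends */
};

/* True if at least one port is still in error recovery. */
static bool
demo_any_recovering(const struct demo_state *s)
{
	for (size_t i = 0; i < DEMO_MAX_PORTS; i++)
		if (s->recovering[i])
			return true;
	return false;
}

/* Step 1: a port starts recovering, so stop forwarding if it is running. */
static void
demo_on_recovering(struct demo_state *s, size_t port)
{
	s->recovering[port] = true;
	s->failed[port] = false;
	if (s->forwarding) {
		s->forwarding = false;
		s->restart_pending = true;
	}
}

/* Step 2: restart forwarding once the last port has recovered. */
static void
demo_on_success(struct demo_state *s, size_t port)
{
	s->recovering[port] = false;
	if (demo_any_recovering(s))
		return;
	if (s->restart_pending) {
		s->forwarding = true;
		s->restart_pending = false;
	}
}

/* Step 3: remember the failed port and do not restart forwarding. */
static void
demo_on_failed(struct demo_state *s, size_t port)
{
	s->recovering[port] = false;
	s->failed[port] = true;
	if (demo_any_recovering(s))
		return;
	s->restart_pending = false;
}
```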

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
---
 app/test-pmd/testpmd.c | 80 ++++++++++++++++++++++++++++++++++++++++++
 app/test-pmd/testpmd.h |  4 ++-
 2 files changed, 83 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 0c14325b8d..fdc3ae604b 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -3823,6 +3823,77 @@ rmv_port_callback(void *arg)
 		start_packet_forwarding(0);
 }
 
+static int need_start_when_recovery_over;
+
+static bool
+has_port_in_err_recovering(void)
+{
+	struct rte_port *port;
+	portid_t pid;
+
+	RTE_ETH_FOREACH_DEV(pid) {
+		port = &ports[pid];
+		if (port->err_recovering)
+			return true;
+	}
+
+	return false;
+}
+
+static void
+err_recovering_callback(portid_t port_id)
+{
+	if (!has_port_in_err_recovering())
+		printf("Please stop executing any commands until recovery result events are received!\n");
+
+	ports[port_id].err_recovering = 1;
+	ports[port_id].recover_failed = 0;
+
+	/* To simplify implementation, stop forwarding regardless of whether the port is used. */
+	if (!test_done) {
+		printf("Stop packet forwarding because some ports are in error recovering!\n");
+		stop_packet_forwarding();
+		need_start_when_recovery_over = 1;
+	}
+}
+
+static void
+recover_success_callback(portid_t port_id)
+{
+	ports[port_id].err_recovering = 0;
+	if (has_port_in_err_recovering())
+		return;
+
+	if (need_start_when_recovery_over) {
+		printf("Recovery success! Restart packet forwarding!\n");
+		start_packet_forwarding(0);
+		need_start_when_recovery_over = 0;
+	} else {
+		printf("Recovery success!\n");
+	}
+}
+
+static void
+recover_failed_callback(portid_t port_id)
+{
+	struct rte_port *port;
+	portid_t pid;
+
+	ports[port_id].err_recovering = 0;
+	ports[port_id].recover_failed = 1;
+	if (has_port_in_err_recovering())
+		return;
+
+	need_start_when_recovery_over = 0;
+	printf("The ports:");
+	RTE_ETH_FOREACH_DEV(pid) {
+		port = &ports[pid];
+		if (port->recover_failed)
+			printf(" %u", pid);
+	}
+	printf(" recovery failed! Please remove them!\n");
+}
+
 /* This function is used by the interrupt thread */
 static int
 eth_event_callback(portid_t port_id, enum rte_eth_event_type type, void *param,
@@ -3878,6 +3949,15 @@ eth_event_callback(portid_t port_id, enum rte_eth_event_type type, void *param,
 		}
 		break;
 	}
+	case RTE_ETH_EVENT_ERR_RECOVERING:
+		err_recovering_callback(port_id);
+		break;
+	case RTE_ETH_EVENT_RECOVERY_SUCCESS:
+		recover_success_callback(port_id);
+		break;
+	case RTE_ETH_EVENT_RECOVERY_FAILED:
+		recover_failed_callback(port_id);
+		break;
 	default:
 		break;
 	}
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 329a6378a1..1bbf82a96c 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -323,7 +323,9 @@ struct rte_port {
 	uint8_t                 slave_flag : 1, /**< bonding slave port */
 				bond_flag : 1, /**< port is bond device */
 				fwd_mac_swap : 1, /**< swap packet MAC before forward */
-				update_conf : 1; /**< need to update bonding device configuration */
+				update_conf : 1, /**< need to update bonding device configuration */
+				err_recovering : 1, /**< port is in error recovering */
+				recover_failed : 1; /**< port recover failed */
 	struct port_template    *pattern_templ_list; /**< Pattern templates. */
 	struct port_template    *actions_templ_list; /**< Actions templates. */
 	struct port_table       *table_list; /**< Flow tables. */
-- 
2.17.1



* Re: [PATCH 2/5] net/hns3: replace fp ops config function
  2023-03-01  3:06 ` [PATCH 2/5] net/hns3: replace fp ops config function Chengwen Feng
@ 2023-03-02  6:50   ` Dongdong Liu
  0 siblings, 0 replies; 85+ messages in thread
From: Dongdong Liu @ 2023-03-02  6:50 UTC (permalink / raw)
  To: Chengwen Feng, thomas, ferruh.yigit, konstantin.ananyev, Yisen Zhuang; +Cc: dev


On 2023/3/1 11:06, Chengwen Feng wrote:
> This patch replace hns3_eth_dev_fp_ops_config() with
> rte_eth_fp_ops_setup().
>
> Cc: stable@dpdk.org
>
> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>

Acked-by: Dongdong Liu <liudongdong3@huawei.com>

Thanks,
Dongdong


* RE: [PATCH 1/5] ethdev: fix race-condition of proactive error handling mode
  2023-03-01  3:06 ` [PATCH 1/5] ethdev: " Chengwen Feng
@ 2023-03-02 12:08   ` Konstantin Ananyev
  2023-03-03 16:51     ` Ferruh Yigit
  2023-03-02 23:30   ` Honnappa Nagarahalli
  1 sibling, 1 reply; 85+ messages in thread
From: Konstantin Ananyev @ 2023-03-02 12:08 UTC (permalink / raw)
  To: Fengchengwen, thomas, ferruh.yigit, Andrew Rybchenko, Kalesh AP,
	Ajit Khaparde
  Cc: dev


> In proactive error handling mode, the PMD sets the data path pointers
> to dummy functions and then attempts recovery. During this period the
> application may still be invoking the data path API, which introduces a
> race condition with the data path that may lead to a crash [1].
> 
> Although the PMD added a delay after setting the data path pointers to
> cover this race condition, the delay only reduces its probability; it
> does not solve the problem.
> 
> To solve the race condition fundamentally, the following requirements
> are added:
> 1. The PMD should set the data path pointers to dummy functions after
>    reporting the RTE_ETH_EVENT_ERR_RECOVERING event.
> 2. The application should stop data path API invocation when processing
>    the RTE_ETH_EVENT_ERR_RECOVERING event.
> 3. The PMD should set the data path pointers to valid functions before
>    reporting the RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> 4. The application should enable data path API invocation when
>    processing the RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> 
> Also, this patch introduces a driver-internal function,
> rte_eth_fp_ops_setup(), which is used as a helper function for PMDs.
> 
> [1] http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-ashok.k.kaladi@intel.com/
> 
> Fixes: eb0d471a8941 ("ethdev: add proactive error handling mode")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> ---
>  doc/guides/prog_guide/poll_mode_drv.rst | 20 +++++++---------
>  lib/ethdev/ethdev_driver.c              |  8 +++++++
>  lib/ethdev/ethdev_driver.h              | 10 ++++++++
>  lib/ethdev/rte_ethdev.h                 | 32 +++++++++++++++----------
>  lib/ethdev/version.map                  |  1 +
>  5 files changed, 46 insertions(+), 25 deletions(-)
> 
> diff --git a/doc/guides/prog_guide/poll_mode_drv.rst b/doc/guides/prog_guide/poll_mode_drv.rst
> index c145a9066c..e380ff135a 100644
> --- a/doc/guides/prog_guide/poll_mode_drv.rst
> +++ b/doc/guides/prog_guide/poll_mode_drv.rst
> @@ -638,14 +638,9 @@ different from the application invokes recovery in PASSIVE mode,
>  the PMD automatically recovers from error in PROACTIVE mode,
>  and only a small amount of work is required for the application.
> 
> -During error detection and automatic recovery,
> -the PMD sets the data path pointers to dummy functions
> -(which will prevent the crash),
> -and also make sure the control path operations fail with a return code ``-EBUSY``.
> -
> -Because the PMD recovers automatically,
> -the application can only sense that the data flow is disconnected for a while
> -and the control API returns an error in this period.
> +During error detection and automatic recovery, the PMD sets the data path
> +pointers to dummy functions and also make sure the control path operations
> +failed with a return code ``-EBUSY``.
> 
>  In order to sense the error happening/recovering,
>  as well as to restore some additional configuration,
> @@ -653,9 +648,9 @@ three events are available:
> 
>  ``RTE_ETH_EVENT_ERR_RECOVERING``
>     Notify the application that an error is detected
> -   and the recovery is being started.
> +   and the recovery is about to start.
>     Upon receiving the event, the application should not invoke
> -   any control path function until receiving
> +   any control and data path API until receiving
>     ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` or ``RTE_ETH_EVENT_RECOVERY_FAILED`` event.
> 
>  .. note::
> @@ -666,8 +661,9 @@ three events are available:
> 
>  ``RTE_ETH_EVENT_RECOVERY_SUCCESS``
>     Notify the application that the recovery from error is successful,
> -   the PMD already re-configures the port,
> -   and the effect is the same as a restart operation.
> +   the PMD already re-configures the port.
> +   The application should restore some additional configuration, and then
> +   enable data path API invocation.
> 
>  ``RTE_ETH_EVENT_RECOVERY_FAILED``
>     Notify the application that the recovery from error failed,
> diff --git a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c
> index 0be1e8ca04..f994653fe9 100644
> --- a/lib/ethdev/ethdev_driver.c
> +++ b/lib/ethdev/ethdev_driver.c
> @@ -515,6 +515,14 @@ rte_eth_dma_zone_free(const struct rte_eth_dev *dev, const char *ring_name,
>  	return rc;
>  }
> 
> +void
> +rte_eth_fp_ops_setup(struct rte_eth_dev *dev)
> +{
> +	if (dev == NULL)
> +		return;
> +	eth_dev_fp_ops_setup(rte_eth_fp_ops + dev->data->port_id, dev);
> +}
> +
>  const struct rte_memzone *
>  rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char *ring_name,
>  			 uint16_t queue_id, size_t size, unsigned int align,
> diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
> index 2c9d615fb5..0d964d1f67 100644
> --- a/lib/ethdev/ethdev_driver.h
> +++ b/lib/ethdev/ethdev_driver.h
> @@ -1621,6 +1621,16 @@ int
>  rte_eth_dma_zone_free(const struct rte_eth_dev *eth_dev, const char *name,
>  		 uint16_t queue_id);
> 
> +/**
> + * @internal
> + * Setup eth fast-path API to ethdev values.
> + *
> + * @param dev
> + *  Pointer to struct rte_eth_dev.
> + */
> +__rte_internal
> +void rte_eth_fp_ops_setup(struct rte_eth_dev *dev);
> +
>  /**
>   * @internal
>   * Atomically set the link status for the specific device.
> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> index 049641d57c..44ee7229c1 100644
> --- a/lib/ethdev/rte_ethdev.h
> +++ b/lib/ethdev/rte_ethdev.h
> @@ -3944,25 +3944,28 @@ enum rte_eth_event_type {
>  	 */
>  	RTE_ETH_EVENT_RX_AVAIL_THRESH,
>  	/** Port recovering from a hardware or firmware error.
> -	 * If PMD supports proactive error recovery,
> -	 * it should trigger this event to notify application
> -	 * that it detected an error and the recovery is being started.
> -	 * Upon receiving the event, the application should not invoke any control path API
> -	 * (such as rte_eth_dev_configure/rte_eth_dev_stop...) until receiving
> -	 * RTE_ETH_EVENT_RECOVERY_SUCCESS or RTE_ETH_EVENT_RECOVERY_FAILED event.
> -	 * The PMD will set the data path pointers to dummy functions,
> -	 * and re-set the data path pointers to non-dummy functions
> -	 * before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> -	 * It means that the application cannot send or receive any packets
> -	 * during this period.
> +	 *
> +	 * If PMD supports proactive error recovery, it should trigger this
> +	 * event to notify application that it detected an error and the
> +	 * recovery is about to start.
> +	 *
> +	 * Upon receiving the event, the application should not invoke any
> +	 * control and data path API until receiving
> +	 * RTE_ETH_EVENT_RECOVERY_SUCCESS or RTE_ETH_EVENT_RECOVERY_FAILED
> +	 * event.
> +	 *
> +	 * Once this event is reported, the PMD will set the data path pointers
> +	 * to dummy functions, and re-set the data path pointers to valid
> +	 * functions before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> +	 *
>  	 * @note Before the PMD reports the recovery result,
>  	 * the PMD may report the RTE_ETH_EVENT_ERR_RECOVERING event again,
>  	 * because a larger error may occur during the recovery.
>  	 */
>  	RTE_ETH_EVENT_ERR_RECOVERING,
>  	/** Port recovers successfully from the error.
> -	 * The PMD already re-configured the port,
> -	 * and the effect is the same as a restart operation.
> +	 *
> +	 * The PMD already re-configured the port:
>  	 * a) The following operation will be retained: (alphabetically)
>  	 *    - DCB configuration
>  	 *    - FEC configuration
> @@ -3989,6 +3992,9 @@ enum rte_eth_event_type {
>  	 *      (@see RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP)
>  	 * c) Any other configuration will not be stored
>  	 *    and will need to be re-configured.
> +	 *
> +	 * The application should restore some additional configuration
> +	 * (see above case b/c), and then enable data path API invocation.
>  	 */
>  	RTE_ETH_EVENT_RECOVERY_SUCCESS,
>  	/** Port recovery failed.
> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
> index 357d1a88c0..c273e0bdae 100644
> --- a/lib/ethdev/version.map
> +++ b/lib/ethdev/version.map
> @@ -320,6 +320,7 @@ INTERNAL {
>  	rte_eth_devices;
>  	rte_eth_dma_zone_free;
>  	rte_eth_dma_zone_reserve;
> +	rte_eth_fp_ops_setup;
>  	rte_eth_hairpin_queue_peer_bind;
>  	rte_eth_hairpin_queue_peer_unbind;
>  	rte_eth_hairpin_queue_peer_update;
> --
 
Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>

> 2.17.1



* RE: [PATCH 3/5] net/bnxt: fix race-condition when report error recovery
  2023-03-01  3:06 ` [PATCH 3/5] net/bnxt: fix race-condition when report error recovery Chengwen Feng
@ 2023-03-02 12:23   ` Konstantin Ananyev
  0 siblings, 0 replies; 85+ messages in thread
From: Konstantin Ananyev @ 2023-03-02 12:23 UTC (permalink / raw)
  To: Fengchengwen, thomas, ferruh.yigit, Ajit Khaparde, Somnath Kotur,
	Kalesh AP
  Cc: dev


> If set data path functions to dummy functions before reports error
> recovering event, there maybe a race-condition with data path threads,
> this patch fixes it by setting data path functions to dummy functions
> only after reports such event.
> 
> Fixes: e11052f3a46f ("net/bnxt: support proactive error handling mode")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> ---
>  drivers/net/bnxt/bnxt_cpr.c    | 13 +++++++------
>  drivers/net/bnxt/bnxt_ethdev.c |  4 ++--
>  2 files changed, 9 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/net/bnxt/bnxt_cpr.c b/drivers/net/bnxt/bnxt_cpr.c
> index 5bb376d4d5..3950840600 100644
> --- a/drivers/net/bnxt/bnxt_cpr.c
> +++ b/drivers/net/bnxt/bnxt_cpr.c
> @@ -168,14 +168,9 @@ void bnxt_handle_async_event(struct bnxt *bp,
>  		PMD_DRV_LOG(INFO, "Port conn async event\n");
>  		break;
>  	case HWRM_ASYNC_EVENT_CMPL_EVENT_ID_RESET_NOTIFY:
> -		/*
> -		 * Avoid any rx/tx packet processing during firmware reset
> -		 * operation.
> -		 */
> -		bnxt_stop_rxtx(bp->eth_dev);
> -
>  		/* Ignore reset notify async events when stopping the port */
>  		if (!bp->eth_dev->data->dev_started) {
> +			bnxt_stop_rxtx(bp->eth_dev);
>  			bp->flags |= BNXT_FLAG_FATAL_ERROR;
>  			return;
>  		}
> @@ -184,6 +179,12 @@ void bnxt_handle_async_event(struct bnxt *bp,
>  					     RTE_ETH_EVENT_ERR_RECOVERING,
>  					     NULL);
> 
> +		/*
> +		 * Avoid any rx/tx packet processing during firmware reset
> +		 * operation.
> +		 */
> +		bnxt_stop_rxtx(bp->eth_dev);
> +
>  		pthread_mutex_lock(&bp->err_recovery_lock);
>  		event_data = data1;
>  		/* timestamp_lo/hi values are in units of 100ms */
> diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
> index 753e86b4b2..4083a69d02 100644
> --- a/drivers/net/bnxt/bnxt_ethdev.c
> +++ b/drivers/net/bnxt/bnxt_ethdev.c
> @@ -4562,14 +4562,14 @@ static void bnxt_check_fw_health(void *arg)
>  	bp->flags |= BNXT_FLAG_FATAL_ERROR;
>  	bp->flags |= BNXT_FLAG_FW_RESET;
> 
> -	bnxt_stop_rxtx(bp->eth_dev);
> -
>  	PMD_DRV_LOG(ERR, "Detected FW dead condition\n");
> 
>  	rte_eth_dev_callback_process(bp->eth_dev,
>  				     RTE_ETH_EVENT_ERR_RECOVERING,
>  				     NULL);
> 
> +	bnxt_stop_rxtx(bp->eth_dev);
> +
>  	if (bnxt_is_primary_func(bp))
>  		wait_msec = info->primary_func_wait_period;
>  	else
> --

Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>

> 2.17.1



* RE: [PATCH 4/5] net/bnxt: use fp ops setup function
  2023-03-01  3:06 ` [PATCH 4/5] net/bnxt: use fp ops setup function Chengwen Feng
@ 2023-03-02 12:30   ` Konstantin Ananyev
  2023-03-03  0:01     ` Konstantin Ananyev
  2023-03-03  1:38     ` fengchengwen
  0 siblings, 2 replies; 85+ messages in thread
From: Konstantin Ananyev @ 2023-03-02 12:30 UTC (permalink / raw)
  To: Fengchengwen, thomas, ferruh.yigit, Ajit Khaparde, Somnath Kotur; +Cc: dev


> Use rte_eth_fp_ops_setup() instead of directly manipulating
> rte_eth_fp_ops variable.
> 
> Cc: stable@dpdk.org
> 
> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> ---
>  drivers/net/bnxt/bnxt_cpr.c    | 5 +----
>  drivers/net/bnxt/bnxt_ethdev.c | 5 +----
>  2 files changed, 2 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/net/bnxt/bnxt_cpr.c b/drivers/net/bnxt/bnxt_cpr.c
> index 3950840600..a3f33c24c3 100644
> --- a/drivers/net/bnxt/bnxt_cpr.c
> +++ b/drivers/net/bnxt/bnxt_cpr.c
> @@ -416,10 +416,7 @@ void bnxt_stop_rxtx(struct rte_eth_dev *eth_dev)
>  	eth_dev->rx_pkt_burst = rte_eth_pkt_burst_dummy;
>  	eth_dev->tx_pkt_burst = rte_eth_pkt_burst_dummy;

I am not that familiar with the bnxt driver, but shouldn't we also set the
other optional fp_ops here (descriptor_status, etc.) to some dummy or NULL values?
 
> 
> -	rte_eth_fp_ops[eth_dev->data->port_id].rx_pkt_burst =
> -		eth_dev->rx_pkt_burst;
> -	rte_eth_fp_ops[eth_dev->data->port_id].tx_pkt_burst =
> -		eth_dev->tx_pkt_burst;
> +	rte_eth_fp_ops_setup(eth_dev);
>  	rte_mb();
> 
>  	/* Allow time for threads to exit the real burst functions. */
> diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
> index 4083a69d02..d6064ceea4 100644
> --- a/drivers/net/bnxt/bnxt_ethdev.c
> +++ b/drivers/net/bnxt/bnxt_ethdev.c
> @@ -4374,10 +4374,7 @@ static void bnxt_dev_recover(void *arg)
>  	if (rc)
>  		goto err_start;
> 
> -	rte_eth_fp_ops[bp->eth_dev->data->port_id].rx_pkt_burst =
> -		bp->eth_dev->rx_pkt_burst;
> -	rte_eth_fp_ops[bp->eth_dev->data->port_id].tx_pkt_burst =
> -		bp->eth_dev->tx_pkt_burst;
> +	rte_eth_fp_ops_setup(bp->eth_dev);
>  	rte_mb();
> 
>  	PMD_DRV_LOG(INFO, "Port: %u Recovered from FW reset\n",
> --
> 2.17.1



* RE: [PATCH 5/5] app/testpmd: add error recovery usage demo
  2023-03-01  3:06 ` [PATCH 5/5] app/testpmd: add error recovery usage demo Chengwen Feng
@ 2023-03-02 13:01   ` Konstantin Ananyev
  2023-03-03  1:49     ` fengchengwen
  0 siblings, 1 reply; 85+ messages in thread
From: Konstantin Ananyev @ 2023-03-02 13:01 UTC (permalink / raw)
  To: Fengchengwen, thomas, ferruh.yigit, Aman Singh, Yuying Zhang; +Cc: dev



> 
> This patch adds error recovery usage demo which will:
> 1. stop packet forwarding when the RTE_ETH_EVENT_ERR_RECOVERING event
>    is received.
> 2. restart packet forwarding when the RTE_ETH_EVENT_RECOVERY_SUCCESS
>    event is received.
> 3. prompt the ports that fail to recovery and need to be removed when
>    the RTE_ETH_EVENT_RECOVERY_FAILED event is received.
> 
> In addition, a message is added to the printed information, requiring
> no command to be executed during the error recovery.
> 
> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> ---
>  app/test-pmd/testpmd.c | 80 ++++++++++++++++++++++++++++++++++++++++++
>  app/test-pmd/testpmd.h |  4 ++-
>  2 files changed, 83 insertions(+), 1 deletion(-)
> 
> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
> index 0c14325b8d..fdc3ae604b 100644
> --- a/app/test-pmd/testpmd.c
> +++ b/app/test-pmd/testpmd.c
> @@ -3823,6 +3823,77 @@ rmv_port_callback(void *arg)
>  		start_packet_forwarding(0);
>  }
> 
> +static int need_start_when_recovery_over;
> +
> +static bool
> +has_port_in_err_recovering(void)
> +{
> +	struct rte_port *port;
> +	portid_t pid;
> +
> +	RTE_ETH_FOREACH_DEV(pid) {
> +		port = &ports[pid];
> +		if (port->err_recovering)
> +			return true;
> +	}
> +
> +	return false;
> +}
> +
> +static void
> +err_recovering_callback(portid_t port_id)
> +{
> +	if (!has_port_in_err_recovering())
> +		printf("Please stop executing any commands until recovery result events are received!\n");
> +
> +	ports[port_id].err_recovering = 1;
> +	ports[port_id].recover_failed = 0;
> +
> +	/* To simplify implementation, stop forwarding regardless of whether the port is used. */
> +	if (!test_done) {
> +		printf("Stop packet forwarding because some ports are in error recovering!\n");
> +		stop_packet_forwarding();
> +		need_start_when_recovery_over = 1;
> +	}
> +}

One thought I have: should we somehow stop the user from attempting to restart
RX/TX while recovery is in progress?
But that is probably overkill, and just documenting what is happening is enough.
Do we need to update the testpmd user guide with a short description?
Apart from that, LGTM:
Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>

> +
> +static void
> +recover_success_callback(portid_t port_id)
> +{
> +	ports[port_id].err_recovering = 0;
> +	if (has_port_in_err_recovering())
> +		return;
> +
> +	if (need_start_when_recovery_over) {
> +		printf("Recovery success! Restart packet forwarding!\n");
> +		start_packet_forwarding(0);
> +		need_start_when_recovery_over = 0;
> +	} else {
> +		printf("Recovery success!\n");
> +	}
> +}
> +
> +static void
> +recover_failed_callback(portid_t port_id)
> +{
> +	struct rte_port *port;
> +	portid_t pid;
> +
> +	ports[port_id].err_recovering = 0;
> +	ports[port_id].recover_failed = 1;
> +	if (has_port_in_err_recovering())
> +		return;
> +
> +	need_start_when_recovery_over = 0;
> +	printf("The ports:");
> +	RTE_ETH_FOREACH_DEV(pid) {
> +		port = &ports[pid];
> +		if (port->recover_failed)
> +			printf(" %u", pid);
> +	}
> +	printf(" recovery failed! Please remove them!\n");
> +}
> +
>  /* This function is used by the interrupt thread */
>  static int
>  eth_event_callback(portid_t port_id, enum rte_eth_event_type type, void *param,
> @@ -3878,6 +3949,15 @@ eth_event_callback(portid_t port_id, enum rte_eth_event_type type, void *param,
>  		}
>  		break;
>  	}
> +	case RTE_ETH_EVENT_ERR_RECOVERING:
> +		err_recovering_callback(port_id);
> +		break;
> +	case RTE_ETH_EVENT_RECOVERY_SUCCESS:
> +		recover_success_callback(port_id);
> +		break;
> +	case RTE_ETH_EVENT_RECOVERY_FAILED:
> +		recover_failed_callback(port_id);
> +		break;
>  	default:
>  		break;
>  	}
> diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
> index 329a6378a1..1bbf82a96c 100644
> --- a/app/test-pmd/testpmd.h
> +++ b/app/test-pmd/testpmd.h
> @@ -323,7 +323,9 @@ struct rte_port {
>  	uint8_t                 slave_flag : 1, /**< bonding slave port */
>  				bond_flag : 1, /**< port is bond device */
>  				fwd_mac_swap : 1, /**< swap packet MAC before forward */
> -				update_conf : 1; /**< need to update bonding device configuration */
> +				update_conf : 1, /**< need to update bonding device configuration */
> +				err_recovering : 1, /**< port is in error recovering */
> +				recover_failed : 1; /**< port recover failed */
>  	struct port_template    *pattern_templ_list; /**< Pattern templates. */
>  	struct port_template    *actions_templ_list; /**< Actions templates. */
>  	struct port_table       *table_list; /**< Flow tables. */
> --
> 2.17.1



* RE: [PATCH 1/5] ethdev: fix race-condition of proactive error handling mode
  2023-03-01  3:06 ` [PATCH 1/5] ethdev: " Chengwen Feng
  2023-03-02 12:08   ` Konstantin Ananyev
@ 2023-03-02 23:30   ` Honnappa Nagarahalli
  2023-03-03  0:21     ` Konstantin Ananyev
  1 sibling, 1 reply; 85+ messages in thread
From: Honnappa Nagarahalli @ 2023-03-02 23:30 UTC (permalink / raw)
  To: Chengwen Feng, thomas, ferruh.yigit, konstantin.ananyev,
	Andrew Rybchenko, Kalesh AP,
	Ajit Khaparde (ajit.khaparde@broadcom.com)
  Cc: dev, nd, nd



> -----Original Message-----
> From: Chengwen Feng <fengchengwen@huawei.com>
> Sent: Tuesday, February 28, 2023 9:06 PM
> To: thomas@monjalon.net; ferruh.yigit@amd.com;
> konstantin.ananyev@huawei.com; Andrew Rybchenko
> <andrew.rybchenko@oktetlabs.ru>; Kalesh AP <kalesh-
> anakkur.purayil@broadcom.com>; Ajit Khaparde
> (ajit.khaparde@broadcom.com) <ajit.khaparde@broadcom.com>
> Cc: dev@dpdk.org
> Subject: [PATCH 1/5] ethdev: fix race-condition of proactive error handling
> mode
> 
> In the proactive error handling mode, the PMD will set the data path pointers to
> dummy functions and then try recovery, in this period the application may still
> invoking data path API. This will introduce a race-condition with data path which
> may lead to crash [1].
> 
> Although the PMD added delay after setting data path pointers to cover the
> above race-condition, it reduces the probability, but it doesn't solve the
> problem.
> 
> To solve the race-condition problem fundamentally, the following requirements
> are added:
> 1. The PMD should set the data path pointers to dummy functions after
>    report RTE_ETH_EVENT_ERR_RECOVERING event.
Do you mean to say that the PMD should set the data path pointers after calling the callback function?
The PMD is running in the context of multiple EAL threads. How do these threads synchronize such that only one thread sets these data path pointers?

> 2. The application should stop data path API invocation when process
>    the RTE_ETH_EVENT_ERR_RECOVERING event.
Any thoughts on how an application can do this?

> 3. The PMD should set the data path pointers to valid functions before
>    report RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> 4. The application should enable data path API invocation when process
>    the RTE_ETH_EVENT_RECOVERY_SUCCESS event.
Do you mean to say that the application should not call the datapath APIs while the PMD is running the recovery process?

> 
> Also, this patch introduce a driver internal function rte_eth_fp_ops_setup
> which used as an help function for PMD.
> 
> [1]
> http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-
> ashok.k.kaladi@intel.com/
> 
> Fixes: eb0d471a8941 ("ethdev: add proactive error handling mode")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> ---
>  doc/guides/prog_guide/poll_mode_drv.rst | 20 +++++++---------
>  lib/ethdev/ethdev_driver.c              |  8 +++++++
>  lib/ethdev/ethdev_driver.h              | 10 ++++++++
>  lib/ethdev/rte_ethdev.h                 | 32 +++++++++++++++----------
>  lib/ethdev/version.map                  |  1 +
>  5 files changed, 46 insertions(+), 25 deletions(-)
> 
> diff --git a/doc/guides/prog_guide/poll_mode_drv.rst
> b/doc/guides/prog_guide/poll_mode_drv.rst
> index c145a9066c..e380ff135a 100644
> --- a/doc/guides/prog_guide/poll_mode_drv.rst
> +++ b/doc/guides/prog_guide/poll_mode_drv.rst
> @@ -638,14 +638,9 @@ different from the application invokes recovery in
> PASSIVE mode,  the PMD automatically recovers from error in PROACTIVE
> mode,  and only a small amount of work is required for the application.
> 
> -During error detection and automatic recovery, -the PMD sets the data path
> pointers to dummy functions -(which will prevent the crash), -and also make
> sure the control path operations fail with a return code ``-EBUSY``.
> -
> -Because the PMD recovers automatically, -the application can only sense that
> the data flow is disconnected for a while -and the control API returns an error in
> this period.
> +During error detection and automatic recovery, the PMD sets the data
> +path pointers to dummy functions and also make sure the control path
> +operations failed with a return code ``-EBUSY``.
> 
>  In order to sense the error happening/recovering,  as well as to restore some
> additional configuration, @@ -653,9 +648,9 @@ three events are available:
> 
>  ``RTE_ETH_EVENT_ERR_RECOVERING``
>     Notify the application that an error is detected
> -   and the recovery is being started.
> +   and the recovery is about to start.
>     Upon receiving the event, the application should not invoke
> -   any control path function until receiving
> +   any control and data path API until receiving
>     ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` or
> ``RTE_ETH_EVENT_RECOVERY_FAILED`` event.
> 
>  .. note::
> @@ -666,8 +661,9 @@ three events are available:
> 
>  ``RTE_ETH_EVENT_RECOVERY_SUCCESS``
>     Notify the application that the recovery from error is successful,
> -   the PMD already re-configures the port,
> -   and the effect is the same as a restart operation.
> +   the PMD already re-configures the port.
> +   The application should restore some additional configuration, and then
What is the additional configuration? Is this specific to each NIC/PMD?
I thought this was an automatic recovery process and the application was not required to reconfigure anything. If the application has to restore the configuration, how does auto recovery differ from the typical recovery process?

> +   enable data path API invocation.
> 
>  ``RTE_ETH_EVENT_RECOVERY_FAILED``
>     Notify the application that the recovery from error failed, diff --git
> a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c index
> 0be1e8ca04..f994653fe9 100644
> --- a/lib/ethdev/ethdev_driver.c
> +++ b/lib/ethdev/ethdev_driver.c
> @@ -515,6 +515,14 @@ rte_eth_dma_zone_free(const struct rte_eth_dev
> *dev, const char *ring_name,
>  	return rc;
>  }
> 
> +void
> +rte_eth_fp_ops_setup(struct rte_eth_dev *dev)
> +{
> +	if (dev == NULL)
> +		return;
> +	eth_dev_fp_ops_setup(rte_eth_fp_ops + dev->data->port_id, dev);
> +}
> +
>  const struct rte_memzone *
>  rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char
> *ring_name,
>  			 uint16_t queue_id, size_t size, unsigned int align, diff -
> -git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h index
> 2c9d615fb5..0d964d1f67 100644
> --- a/lib/ethdev/ethdev_driver.h
> +++ b/lib/ethdev/ethdev_driver.h
> @@ -1621,6 +1621,16 @@ int
>  rte_eth_dma_zone_free(const struct rte_eth_dev *eth_dev, const char
> *name,
>  		 uint16_t queue_id);
> 
> +/**
> + * @internal
> + * Setup eth fast-path API to ethdev values.
> + *
> + * @param dev
> + *  Pointer to struct rte_eth_dev.
> + */
> +__rte_internal
> +void rte_eth_fp_ops_setup(struct rte_eth_dev *dev);
> +
>  /**
>   * @internal
>   * Atomically set the link status for the specific device.
> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h index
> 049641d57c..44ee7229c1 100644
> --- a/lib/ethdev/rte_ethdev.h
> +++ b/lib/ethdev/rte_ethdev.h
> @@ -3944,25 +3944,28 @@ enum rte_eth_event_type {
>  	 */
>  	RTE_ETH_EVENT_RX_AVAIL_THRESH,
>  	/** Port recovering from a hardware or firmware error.
> -	 * If PMD supports proactive error recovery,
> -	 * it should trigger this event to notify application
> -	 * that it detected an error and the recovery is being started.
> -	 * Upon receiving the event, the application should not invoke any
> control path API
> -	 * (such as rte_eth_dev_configure/rte_eth_dev_stop...) until receiving
> -	 * RTE_ETH_EVENT_RECOVERY_SUCCESS or
> RTE_ETH_EVENT_RECOVERY_FAILED event.
> -	 * The PMD will set the data path pointers to dummy functions,
> -	 * and re-set the data path pointers to non-dummy functions
> -	 * before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> -	 * It means that the application cannot send or receive any packets
> -	 * during this period.
> +	 *
> +	 * If PMD supports proactive error recovery, it should trigger this
> +	 * event to notify application that it detected an error and the
> +	 * recovery is about to start.
> +	 *
> +	 * Upon receiving the event, the application should not invoke any
> +	 * control and data path API until receiving
> +	 * RTE_ETH_EVENT_RECOVERY_SUCCESS or
> RTE_ETH_EVENT_RECOVERY_FAILED
> +	 * event.
> +	 *
> +	 * Once this event is reported, the PMD will set the data path pointers
> +	 * to dummy functions, and re-set the data path pointers to valid
> +	 * functions before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS
> event.
Why do we need to set the data path pointers to dummy functions if the application is restricted from invoking any control and data path APIs till the recovery process is completed?

> +	 *
>  	 * @note Before the PMD reports the recovery result,
>  	 * the PMD may report the RTE_ETH_EVENT_ERR_RECOVERING event
> again,
>  	 * because a larger error may occur during the recovery.
>  	 */
>  	RTE_ETH_EVENT_ERR_RECOVERING,
I understand this is not a change in this patch. But, just wondering, what is the purpose of this? How is the application supposed to use this?

>  	/** Port recovers successfully from the error.
> -	 * The PMD already re-configured the port,
> -	 * and the effect is the same as a restart operation.
> +	 *
> +	 * The PMD already re-configured the port:
>  	 * a) The following operation will be retained: (alphabetically)
>  	 *    - DCB configuration
>  	 *    - FEC configuration
> @@ -3989,6 +3992,9 @@ enum rte_eth_event_type {
>  	 *      (@see RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP)
>  	 * c) Any other configuration will not be stored
>  	 *    and will need to be re-configured.
> +	 *
> +	 * The application should restore some additional configuration
> +	 * (see above case b/c), and then enable data path API invocation.
>  	 */
>  	RTE_ETH_EVENT_RECOVERY_SUCCESS,
>  	/** Port recovery failed.
> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map index
> 357d1a88c0..c273e0bdae 100644
> --- a/lib/ethdev/version.map
> +++ b/lib/ethdev/version.map
> @@ -320,6 +320,7 @@ INTERNAL {
>  	rte_eth_devices;
>  	rte_eth_dma_zone_free;
>  	rte_eth_dma_zone_reserve;
> +	rte_eth_fp_ops_setup;
>  	rte_eth_hairpin_queue_peer_bind;
>  	rte_eth_hairpin_queue_peer_unbind;
>  	rte_eth_hairpin_queue_peer_update;
> --
> 2.17.1



* Re: [PATCH 4/5] net/bnxt: use fp ops setup function
  2023-03-02 12:30   ` Konstantin Ananyev
@ 2023-03-03  0:01     ` Konstantin Ananyev
  2023-03-03  1:17       ` Ajit Khaparde
  2023-03-03  2:02       ` fengchengwen
  2023-03-03  1:38     ` fengchengwen
  1 sibling, 2 replies; 85+ messages in thread
From: Konstantin Ananyev @ 2023-03-03  0:01 UTC (permalink / raw)
  To: dev

02/03/2023 12:30, Konstantin Ananyev wrote:
> 
>> Use rte_eth_fp_ops_setup() instead of directly manipulating
>> rte_eth_fp_ops variable.
>>
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
>> ---
>>   drivers/net/bnxt/bnxt_cpr.c    | 5 +----
>>   drivers/net/bnxt/bnxt_ethdev.c | 5 +----
>>   2 files changed, 2 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/net/bnxt/bnxt_cpr.c b/drivers/net/bnxt/bnxt_cpr.c
>> index 3950840600..a3f33c24c3 100644
>> --- a/drivers/net/bnxt/bnxt_cpr.c
>> +++ b/drivers/net/bnxt/bnxt_cpr.c
>> @@ -416,10 +416,7 @@ void bnxt_stop_rxtx(struct rte_eth_dev *eth_dev)
>>   	eth_dev->rx_pkt_burst = rte_eth_pkt_burst_dummy;
>>   	eth_dev->tx_pkt_burst = rte_eth_pkt_burst_dummy;
> 
> I am not that familiar with the bnxt driver, but shouldn't we also set the
> other optional fp_ops here (descriptor_status, etc.) to some dummy or NULL values?

On second thought, wouldn't it be better to just call
fp_ops_reset() here?
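
To illustrate the concern, here is a minimal, self-contained model (not the
ethdev code). struct model_fp_ops and every model_* name are hypothetical
stand-ins for a per-port fast-path ops table. The sketch only shows the
difference between swapping the two burst pointers and replacing the whole
table.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a per-port fast-path ops table. */
struct model_fp_ops {
	uint16_t (*rx_pkt_burst)(void *rxq, void **pkts, uint16_t nb_pkts);
	uint16_t (*tx_pkt_burst)(void *txq, void **pkts, uint16_t nb_pkts);
	int (*rx_descriptor_status)(void *rxq, uint16_t offset);
};

/* Stand-ins for the real driver callbacks (they would touch HW queues). */
static uint16_t
model_drv_burst(void *q, void **pkts, uint16_t nb_pkts)
{
	(void)q; (void)pkts;
	return nb_pkts;
}

static int
model_drv_desc_status(void *rxq, uint16_t offset)
{
	(void)rxq; (void)offset;
	return 1;
}

/* Dummies: safe to call at any time, they never touch the device. */
static uint16_t
model_burst_dummy(void *q, void **pkts, uint16_t nb_pkts)
{
	(void)q; (void)pkts; (void)nb_pkts;
	return 0;
}

static int
model_desc_status_dummy(void *rxq, uint16_t offset)
{
	(void)rxq; (void)offset;
	return -ENOTSUP;
}

/* Swap only the two burst pointers; optional ops keep their old targets. */
static void
model_stop_bursts_only(struct model_fp_ops *ops)
{
	assert(ops != NULL);
	ops->rx_pkt_burst = model_burst_dummy;
	ops->tx_pkt_burst = model_burst_dummy;
}

/* Replace every entry, so no op keeps pointing at driver code. */
static void
model_fp_ops_reset(struct model_fp_ops *ops)
{
	static const struct model_fp_ops dummy_ops = {
		.rx_pkt_burst = model_burst_dummy,
		.tx_pkt_burst = model_burst_dummy,
		.rx_descriptor_status = model_desc_status_dummy,
	};

	assert(ops != NULL);
	*ops = dummy_ops;
}
```

In this model, rx_descriptor_status still reaches model_drv_desc_status()
after model_stop_bursts_only(), and only model_fp_ops_reset() closes that
path.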

>   
>>
>> -	rte_eth_fp_ops[eth_dev->data->port_id].rx_pkt_burst =
>> -		eth_dev->rx_pkt_burst;
>> -	rte_eth_fp_ops[eth_dev->data->port_id].tx_pkt_burst =
>> -		eth_dev->tx_pkt_burst;
>> +	rte_eth_fp_ops_setup(eth_dev);
>>   	rte_mb();
>>
>>   	/* Allow time for threads to exit the real burst functions. */
>> diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
>> index 4083a69d02..d6064ceea4 100644
>> --- a/drivers/net/bnxt/bnxt_ethdev.c
>> +++ b/drivers/net/bnxt/bnxt_ethdev.c
>> @@ -4374,10 +4374,7 @@ static void bnxt_dev_recover(void *arg)
>>   	if (rc)
>>   		goto err_start;
>>
>> -	rte_eth_fp_ops[bp->eth_dev->data->port_id].rx_pkt_burst =
>> -		bp->eth_dev->rx_pkt_burst;
>> -	rte_eth_fp_ops[bp->eth_dev->data->port_id].tx_pkt_burst =
>> -		bp->eth_dev->tx_pkt_burst;
>> +	rte_eth_fp_ops_setup(bp->eth_dev);
>>   	rte_mb();
>>
>>   	PMD_DRV_LOG(INFO, "Port: %u Recovered from FW reset\n",
>> --
>> 2.17.1
> 



* Re: [PATCH 1/5] ethdev: fix race-condition of proactive error handling mode
  2023-03-02 23:30   ` Honnappa Nagarahalli
@ 2023-03-03  0:21     ` Konstantin Ananyev
  2023-03-04  5:08       ` Honnappa Nagarahalli
  0 siblings, 1 reply; 85+ messages in thread
From: Konstantin Ananyev @ 2023-03-03  0:21 UTC (permalink / raw)
  To: dev



> 
>> -----Original Message-----
>> From: Chengwen Feng <fengchengwen@huawei.com>
>> Sent: Tuesday, February 28, 2023 9:06 PM
>> To: thomas@monjalon.net; ferruh.yigit@amd.com;
>> konstantin.ananyev@huawei.com; Andrew Rybchenko
>> <andrew.rybchenko@oktetlabs.ru>; Kalesh AP <kalesh-
>> anakkur.purayil@broadcom.com>; Ajit Khaparde
>> (ajit.khaparde@broadcom.com) <ajit.khaparde@broadcom.com>
>> Cc: dev@dpdk.org
>> Subject: [PATCH 1/5] ethdev: fix race-condition of proactive error handling
>> mode
>>
>> In the proactive error handling mode, the PMD will set the data path pointers to
>> dummy functions and then try recovery, in this period the application may still
>> invoking data path API. This will introduce a race-condition with data path which
>> may lead to crash [1].
>>
>> Although the PMD added delay after setting data path pointers to cover the
>> above race-condition, it reduces the probability, but it doesn't solve the
>> problem.
>>
>> To solve the race-condition problem fundamentally, the following requirements
>> are added:
>> 1. The PMD should set the data path pointers to dummy functions after
>>     report RTE_ETH_EVENT_ERR_RECOVERING event.
> Do you mean to say that the PMD should set the data path pointers after calling the callback function?
> The PMD is running in the context of multiple EAL threads. How do these threads synchronize such that only one thread sets these data path pointers?

As I understand it, this event callback is supposed to be called in the
context of the EAL interrupt thread (whoever is more familiar with the
original idea, feel free to correct me if I missed something).
How it signals the data-path threads that they need to stop/suspend
calling the data-path API is, I suppose, left to the application to decide.
In the same way, it is currently the application's responsibility to stop
data-path threads before calling dev_stop()/dev_config()/etc.
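
One possible shape for this, as a minimal sketch only: the application keeps
its own per-port gate, closes it when it processes
RTE_ETH_EVENT_ERR_RECOVERING, and reopens it on
RTE_ETH_EVENT_RECOVERY_SUCCESS. Each worker checks the gate around its burst
calls. All names below (app_port_gate, app_gate_*, APP_MAX_PORTS) are
hypothetical application code, not ethdev API. The busy-wait in
app_gate_pause() assumes workers leave a burst call quickly; an application
that cannot block in the event callback would need a different hand-off.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define APP_MAX_PORTS 32 /* hypothetical limit chosen for this sketch */

/* Hypothetical per-port state owned by the application. */
struct app_port_gate {
	atomic_bool paused;    /* set while recovery is in progress */
	atomic_uint in_flight; /* workers currently inside a burst call */
};

static struct app_port_gate app_gates[APP_MAX_PORTS];

/* Worker side: call before a burst; returns false if the port is paused. */
static bool
app_gate_enter(uint16_t port_id)
{
	struct app_port_gate *g;

	assert(port_id < APP_MAX_PORTS);
	g = &app_gates[port_id];

	/*
	 * Announce ourselves first, then check the flag. With the default
	 * seq_cst ordering, either this worker sees paused == true, or the
	 * pausing thread sees in_flight != 0 and waits for us to leave.
	 */
	atomic_fetch_add(&g->in_flight, 1);
	if (atomic_load(&g->paused)) {
		atomic_fetch_sub(&g->in_flight, 1);
		return false;
	}
	return true;
}

/* Worker side: call after the burst has returned. */
static void
app_gate_leave(uint16_t port_id)
{
	assert(port_id < APP_MAX_PORTS);
	atomic_fetch_sub(&app_gates[port_id].in_flight, 1);
}

/* Event side: close the gate, then wait until no worker is inside. */
static void
app_gate_pause(uint16_t port_id)
{
	struct app_port_gate *g;

	assert(port_id < APP_MAX_PORTS);
	g = &app_gates[port_id];

	atomic_store(&g->paused, true);
	while (atomic_load(&g->in_flight) != 0)
		; /* spin: workers are expected to leave quickly */
}

/* Event side: reopen the gate once the port is usable again. */
static void
app_gate_resume(uint16_t port_id)
{
	assert(port_id < APP_MAX_PORTS);
	atomic_store(&app_gates[port_id].paused, false);
}
```

A worker loop would then wrap each rx/tx burst in
app_gate_enter()/app_gate_leave() and skip the port when
app_gate_enter() returns false.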


> 
>> 2. The application should stop data path API invocation when process
>>     the RTE_ETH_EVENT_ERR_RECOVERING event.
> Any thoughts on how an application can do this?
> 
>> 3. The PMD should set the data path pointers to valid functions before
>>     report RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>> 4. The application should enable data path API invocation when process
>>     the RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> Do you mean to say that the application should not call the datapath APIs while the PMD is running the recovery process?

Yes, I believe that's the intention.

>>
>> Also, this patch introduce a driver internal function rte_eth_fp_ops_setup
>> which used as an help function for PMD.
>>
>> [1]
>> http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-
>> ashok.k.kaladi@intel.com/
>>
>> Fixes: eb0d471a8941 ("ethdev: add proactive error handling mode")
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
>> ---
>>   doc/guides/prog_guide/poll_mode_drv.rst | 20 +++++++---------
>>   lib/ethdev/ethdev_driver.c              |  8 +++++++
>>   lib/ethdev/ethdev_driver.h              | 10 ++++++++
>>   lib/ethdev/rte_ethdev.h                 | 32 +++++++++++++++----------
>>   lib/ethdev/version.map                  |  1 +
>>   5 files changed, 46 insertions(+), 25 deletions(-)
>>
>> diff --git a/doc/guides/prog_guide/poll_mode_drv.rst
>> b/doc/guides/prog_guide/poll_mode_drv.rst
>> index c145a9066c..e380ff135a 100644
>> --- a/doc/guides/prog_guide/poll_mode_drv.rst
>> +++ b/doc/guides/prog_guide/poll_mode_drv.rst
>> @@ -638,14 +638,9 @@ different from the application invokes recovery in
>> PASSIVE mode,  the PMD automatically recovers from error in PROACTIVE
>> mode,  and only a small amount of work is required for the application.
>>
>> -During error detection and automatic recovery, -the PMD sets the data path
>> pointers to dummy functions -(which will prevent the crash), -and also make
>> sure the control path operations fail with a return code ``-EBUSY``.
>> -
>> -Because the PMD recovers automatically, -the application can only sense that
>> the data flow is disconnected for a while -and the control API returns an error in
>> this period.
>> +During error detection and automatic recovery, the PMD sets the data
>> +path pointers to dummy functions and also make sure the control path
>> +operations failed with a return code ``-EBUSY``.
>>
>>   In order to sense the error happening/recovering,  as well as to restore some
>> additional configuration, @@ -653,9 +648,9 @@ three events are available:
>>
>>   ``RTE_ETH_EVENT_ERR_RECOVERING``
>>      Notify the application that an error is detected
>> -   and the recovery is being started.
>> +   and the recovery is about to start.
>>      Upon receiving the event, the application should not invoke
>> -   any control path function until receiving
>> +   any control and data path API until receiving
>>      ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` or
>> ``RTE_ETH_EVENT_RECOVERY_FAILED`` event.
>>
>>   .. note::
>> @@ -666,8 +661,9 @@ three events are available:
>>
>>   ``RTE_ETH_EVENT_RECOVERY_SUCCESS``
>>      Notify the application that the recovery from error is successful,
>> -   the PMD already re-configures the port,
>> -   and the effect is the same as a restart operation.
>> +   the PMD already re-configures the port.
>> +   The application should restore some additional configuration, and then
> What is the additional configuration? Is this specific to each NIC/PMD?
> I thought this was an automatic recovery process and the application was not required to reconfigure anything. If the application has to restore the configuration, how does auto recovery differ from the typical recovery process?
> 
>> +   enable data path API invocation.
>>
>>   ``RTE_ETH_EVENT_RECOVERY_FAILED``
>>      Notify the application that the recovery from error failed, diff --git
>> a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c index
>> 0be1e8ca04..f994653fe9 100644
>> --- a/lib/ethdev/ethdev_driver.c
>> +++ b/lib/ethdev/ethdev_driver.c
>> @@ -515,6 +515,14 @@ rte_eth_dma_zone_free(const struct rte_eth_dev
>> *dev, const char *ring_name,
>>   	return rc;
>>   }
>>
>> +void
>> +rte_eth_fp_ops_setup(struct rte_eth_dev *dev)
>> +{
>> +	if (dev == NULL)
>> +		return;
>> +	eth_dev_fp_ops_setup(rte_eth_fp_ops + dev->data->port_id, dev);
>> +}
>> +
>>   const struct rte_memzone *
>>   rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char
>> *ring_name,
>>   			 uint16_t queue_id, size_t size, unsigned int align, diff -
>> -git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h index
>> 2c9d615fb5..0d964d1f67 100644
>> --- a/lib/ethdev/ethdev_driver.h
>> +++ b/lib/ethdev/ethdev_driver.h
>> @@ -1621,6 +1621,16 @@ int
>>   rte_eth_dma_zone_free(const struct rte_eth_dev *eth_dev, const char
>> *name,
>>   		 uint16_t queue_id);
>>
>> +/**
>> + * @internal
>> + * Setup eth fast-path API to ethdev values.
>> + *
>> + * @param dev
>> + *  Pointer to struct rte_eth_dev.
>> + */
>> +__rte_internal
>> +void rte_eth_fp_ops_setup(struct rte_eth_dev *dev);
>> +
>>   /**
>>    * @internal
>>    * Atomically set the link status for the specific device.
>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h index
>> 049641d57c..44ee7229c1 100644
>> --- a/lib/ethdev/rte_ethdev.h
>> +++ b/lib/ethdev/rte_ethdev.h
>> @@ -3944,25 +3944,28 @@ enum rte_eth_event_type {
>>   	 */
>>   	RTE_ETH_EVENT_RX_AVAIL_THRESH,
>>   	/** Port recovering from a hardware or firmware error.
>> -	 * If PMD supports proactive error recovery,
>> -	 * it should trigger this event to notify application
>> -	 * that it detected an error and the recovery is being started.
>> -	 * Upon receiving the event, the application should not invoke any
>> control path API
>> -	 * (such as rte_eth_dev_configure/rte_eth_dev_stop...) until receiving
>> -	 * RTE_ETH_EVENT_RECOVERY_SUCCESS or
>> RTE_ETH_EVENT_RECOVERY_FAILED event.
>> -	 * The PMD will set the data path pointers to dummy functions,
>> -	 * and re-set the data path pointers to non-dummy functions
>> -	 * before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>> -	 * It means that the application cannot send or receive any packets
>> -	 * during this period.
>> +	 *
>> +	 * If PMD supports proactive error recovery, it should trigger this
>> +	 * event to notify application that it detected an error and the
>> +	 * recovery is about to start.
>> +	 *
>> +	 * Upon receiving the event, the application should not invoke any
>> +	 * control and data path API until receiving
>> +	 * RTE_ETH_EVENT_RECOVERY_SUCCESS or RTE_ETH_EVENT_RECOVERY_FAILED
>> +	 * event.
>> +	 *
>> +	 * Once this event is reported, the PMD will set the data path pointers
>> +	 * to dummy functions, and re-set the data path pointers to valid
>> +	 * functions before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> Why do we need to set the data path pointers to dummy functions if the application is restricted from invoking any control and data path APIs till the recovery process is completed?

You are right, in theory it is not mandatory.
It does, however, help to flag a problem if the user still tries to call them
while recovery is in progress.
Again, this is the same as what we do in dev_stop().
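
To illustrate the point, here is a minimal, self-contained sketch of how a dummy burst function can flag misuse instead of touching hardware that is being recovered. All names here (struct fp_ops, dummy_burst, fp_ops_set_dummy, and so on) are hypothetical stand-ins and not the actual ethdev definitions:

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for illustration; not the DPDK types. */
struct pkt;
typedef uint16_t (*burst_fn)(void *queue, struct pkt **pkts, uint16_t n);

struct fp_ops {
	burst_fn rx_burst;
	burst_fn tx_burst;
};

/* Counts how often the data path was called while it was disabled. */
static unsigned int dummy_calls;

/* Stand-in for a real burst function: pretends every packet was handled. */
static uint16_t
real_burst(void *queue, struct pkt **pkts, uint16_t n)
{
	(void)queue;
	(void)pkts;
	return n;
}

/* Dummy burst: handles nothing, but makes the misuse visible. */
static uint16_t
dummy_burst(void *queue, struct pkt **pkts, uint16_t n)
{
	(void)queue;
	(void)pkts;
	(void)n;
	dummy_calls++;
	fprintf(stderr, "burst called while recovery is in progress\n");
	return 0;
}

/* Point both burst pointers at the dummy (recovery starts). */
static void
fp_ops_set_dummy(struct fp_ops *ops)
{
	ops->rx_burst = dummy_burst;
	ops->tx_burst = dummy_burst;
}

/* Point both burst pointers at the real functions (recovery done). */
static void
fp_ops_set_real(struct fp_ops *ops)
{
	ops->rx_burst = real_burst;
	ops->tx_burst = real_burst;
}
```

Swapping the pointers does not by itself close the race, since a thread may already be inside the real burst function; it only makes later violations observable.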

> 
>> +	 *
>>   	 * @note Before the PMD reports the recovery result,
>>   	 * the PMD may report the RTE_ETH_EVENT_ERR_RECOVERING event again,
>>   	 * because a larger error may occur during the recovery.
>>   	 */
>>   	RTE_ETH_EVENT_ERR_RECOVERING,
> I understand this is not a change in this patch. But, just wondering, what is the purpose of this? How is the application supposed to use this?
> 
>>   	/** Port recovers successfully from the error.
>> -	 * The PMD already re-configured the port,
>> -	 * and the effect is the same as a restart operation.
>> +	 *
>> +	 * The PMD already re-configured the port:
>>   	 * a) The following operation will be retained: (alphabetically)
>>   	 *    - DCB configuration
>>   	 *    - FEC configuration
>> @@ -3989,6 +3992,9 @@ enum rte_eth_event_type {
>>   	 *      (@see RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP)
>>   	 * c) Any other configuration will not be stored
>>   	 *    and will need to be re-configured.
>> +	 *
>> +	 * The application should restore some additional configuration
>> +	 * (see above case b/c), and then enable data path API invocation.
>>   	 */
>>   	RTE_ETH_EVENT_RECOVERY_SUCCESS,
>>   	/** Port recovery failed.
>> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
>> index 357d1a88c0..c273e0bdae 100644
>> --- a/lib/ethdev/version.map
>> +++ b/lib/ethdev/version.map
>> @@ -320,6 +320,7 @@ INTERNAL {
>>   	rte_eth_devices;
>>   	rte_eth_dma_zone_free;
>>   	rte_eth_dma_zone_reserve;
>> +	rte_eth_fp_ops_setup;
>>   	rte_eth_hairpin_queue_peer_bind;
>>   	rte_eth_hairpin_queue_peer_unbind;
>>   	rte_eth_hairpin_queue_peer_update;
>> --
>> 2.17.1
> 


^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 4/5] net/bnxt: use fp ops setup function
  2023-03-03  0:01     ` Konstantin Ananyev
@ 2023-03-03  1:17       ` Ajit Khaparde
  2023-03-03  2:02       ` fengchengwen
  1 sibling, 0 replies; 85+ messages in thread
From: Ajit Khaparde @ 2023-03-03  1:17 UTC (permalink / raw)
  To: Konstantin Ananyev; +Cc: dev

On Thu, Mar 2, 2023 at 4:01 PM Konstantin Ananyev
<konstantin.v.ananyev@yandex.ru> wrote:
>
> 02/03/2023 12:30, Konstantin Ananyev wrote:
> >
> >> Use rte_eth_fp_ops_setup() instead of directly manipulating
> >> rte_eth_fp_ops variable.
> >>
> >> Cc: stable@dpdk.org
> >>
> >> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> >> ---
> >>   drivers/net/bnxt/bnxt_cpr.c    | 5 +----
> >>   drivers/net/bnxt/bnxt_ethdev.c | 5 +----
> >>   2 files changed, 2 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/drivers/net/bnxt/bnxt_cpr.c b/drivers/net/bnxt/bnxt_cpr.c
> >> index 3950840600..a3f33c24c3 100644
> >> --- a/drivers/net/bnxt/bnxt_cpr.c
> >> +++ b/drivers/net/bnxt/bnxt_cpr.c
> >> @@ -416,10 +416,7 @@ void bnxt_stop_rxtx(struct rte_eth_dev *eth_dev)
> >>      eth_dev->rx_pkt_burst = rte_eth_pkt_burst_dummy;
> >>      eth_dev->tx_pkt_burst = rte_eth_pkt_burst_dummy;
> >
> > I am not that familiar with bnxt driver, but shouldn't we set here
> > > other optional fp_ops (descriptor_status, etc.) to some dummy values OR to null values?
>
> After another thought - wouldn't it be better just to call
> fp_ops_reset() here?
Yes. I was actually thinking the same.
A reset equivalent of the setup.

>
> >
> >>
> >> -    rte_eth_fp_ops[eth_dev->data->port_id].rx_pkt_burst =
> >> -            eth_dev->rx_pkt_burst;
> >> -    rte_eth_fp_ops[eth_dev->data->port_id].tx_pkt_burst =
> >> -            eth_dev->tx_pkt_burst;
> >> +    rte_eth_fp_ops_setup(eth_dev);
> >>      rte_mb();
> >>
> >>      /* Allow time for threads to exit the real burst functions. */
> >> diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
> >> index 4083a69d02..d6064ceea4 100644
> >> --- a/drivers/net/bnxt/bnxt_ethdev.c
> >> +++ b/drivers/net/bnxt/bnxt_ethdev.c
> >> @@ -4374,10 +4374,7 @@ static void bnxt_dev_recover(void *arg)
> >>      if (rc)
> >>              goto err_start;
> >>
> >> -    rte_eth_fp_ops[bp->eth_dev->data->port_id].rx_pkt_burst =
> >> -            bp->eth_dev->rx_pkt_burst;
> >> -    rte_eth_fp_ops[bp->eth_dev->data->port_id].tx_pkt_burst =
> >> -            bp->eth_dev->tx_pkt_burst;
> >> +    rte_eth_fp_ops_setup(bp->eth_dev);
> >>      rte_mb();
> >>
> >>      PMD_DRV_LOG(INFO, "Port: %u Recovered from FW reset\n",
> >> --
> >> 2.17.1
> >
>

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 4/5] net/bnxt: use fp ops setup function
  2023-03-02 12:30   ` Konstantin Ananyev
  2023-03-03  0:01     ` Konstantin Ananyev
@ 2023-03-03  1:38     ` fengchengwen
  2023-03-05 15:57       ` Konstantin Ananyev
  1 sibling, 1 reply; 85+ messages in thread
From: fengchengwen @ 2023-03-03  1:38 UTC (permalink / raw)
  To: Konstantin Ananyev, thomas, ferruh.yigit, Ajit Khaparde, Somnath Kotur
  Cc: dev

On 2023/3/2 20:30, Konstantin Ananyev wrote:
> 
>> Use rte_eth_fp_ops_setup() instead of directly manipulating
>> rte_eth_fp_ops variable.
>>
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
>> ---
>>  drivers/net/bnxt/bnxt_cpr.c    | 5 +----
>>  drivers/net/bnxt/bnxt_ethdev.c | 5 +----
>>  2 files changed, 2 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/net/bnxt/bnxt_cpr.c b/drivers/net/bnxt/bnxt_cpr.c
>> index 3950840600..a3f33c24c3 100644
>> --- a/drivers/net/bnxt/bnxt_cpr.c
>> +++ b/drivers/net/bnxt/bnxt_cpr.c
>> @@ -416,10 +416,7 @@ void bnxt_stop_rxtx(struct rte_eth_dev *eth_dev)
>>  	eth_dev->rx_pkt_burst = rte_eth_pkt_burst_dummy;
>>  	eth_dev->tx_pkt_burst = rte_eth_pkt_burst_dummy;
> 
> I am not that familiar with bnxt driver, but shouldn't we set here
> other optional fp_ops (descriptor_status, etc.) to some dummy values OR to null values?

I checked the bnxt PMD code. The other fp_ops (rx_queue_count/rx_descriptor_status/tx_descriptor_status)
all have the following logic at the beginning of the function:

	rc = is_bnxt_in_error(bp);
	if (rc)
		return rc;

So I think it is okay to keep them as they are.
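
As a rough, self-contained sketch of that guard pattern (the names struct dev_priv, is_dev_in_error and rx_queue_count below are hypothetical, and the -EIO return value is an assumption made for the example, not taken from the bnxt code):

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical private device state; not the bnxt structure. */
struct dev_priv {
	bool in_error;   /* set while the device is in an error state */
	int rxq_used;    /* number of used Rx descriptors */
};

/* Returns 0 if the device is usable, a negative errno otherwise.
 * The specific error code is an assumption for this sketch. */
static int
is_dev_in_error(const struct dev_priv *priv)
{
	return priv->in_error ? -EIO : 0;
}

/* Optional fast-path op: bail out early instead of touching the
 * hardware while the device is in error. */
static int
rx_queue_count(const struct dev_priv *priv)
{
	int rc;

	rc = is_dev_in_error(priv);
	if (rc)
		return rc;

	return priv->rxq_used;
}
```

With a check like this at the top of each optional op, the op stays callable during recovery and simply reports an error, which is why replacing the pointer with a dummy is less critical for these callbacks.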

>  
>>
>> -	rte_eth_fp_ops[eth_dev->data->port_id].rx_pkt_burst =
>> -		eth_dev->rx_pkt_burst;
>> -	rte_eth_fp_ops[eth_dev->data->port_id].tx_pkt_burst =
>> -		eth_dev->tx_pkt_burst;
>> +	rte_eth_fp_ops_setup(eth_dev);
>>  	rte_mb();
>>
>>  	/* Allow time for threads to exit the real burst functions. */
>> diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
>> index 4083a69d02..d6064ceea4 100644
>> --- a/drivers/net/bnxt/bnxt_ethdev.c
>> +++ b/drivers/net/bnxt/bnxt_ethdev.c
>> @@ -4374,10 +4374,7 @@ static void bnxt_dev_recover(void *arg)
>>  	if (rc)
>>  		goto err_start;
>>
>> -	rte_eth_fp_ops[bp->eth_dev->data->port_id].rx_pkt_burst =
>> -		bp->eth_dev->rx_pkt_burst;
>> -	rte_eth_fp_ops[bp->eth_dev->data->port_id].tx_pkt_burst =
>> -		bp->eth_dev->tx_pkt_burst;
>> +	rte_eth_fp_ops_setup(bp->eth_dev);
>>  	rte_mb();
>>
>>  	PMD_DRV_LOG(INFO, "Port: %u Recovered from FW reset\n",
>> --
>> 2.17.1
> 
> .
> 

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 5/5] app/testpmd: add error recovery usage demo
  2023-03-02 13:01   ` Konstantin Ananyev
@ 2023-03-03  1:49     ` fengchengwen
  2023-03-03 16:59       ` Ferruh Yigit
  0 siblings, 1 reply; 85+ messages in thread
From: fengchengwen @ 2023-03-03  1:49 UTC (permalink / raw)
  To: Konstantin Ananyev, thomas, ferruh.yigit, Aman Singh, Yuying Zhang; +Cc: dev

On 2023/3/2 21:01, Konstantin Ananyev wrote:
> 
> 
>>
>> This patch adds error recovery usage demo which will:
>> 1. stop packet forwarding when the RTE_ETH_EVENT_ERR_RECOVERING event
>>    is received.
>> 2. restart packet forwarding when the RTE_ETH_EVENT_RECOVERY_SUCCESS
>>    event is received.
>> 3. prompt the ports that fail to recover and need to be removed when
>>    the RTE_ETH_EVENT_RECOVERY_FAILED event is received.
>>
>> In addition, a message is added to the printed information, requiring
>> no command to be executed during the error recovery.
>>
>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
>> ---
>>  app/test-pmd/testpmd.c | 80 ++++++++++++++++++++++++++++++++++++++++++
>>  app/test-pmd/testpmd.h |  4 ++-
>>  2 files changed, 83 insertions(+), 1 deletion(-)
>>
>> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
>> index 0c14325b8d..fdc3ae604b 100644
>> --- a/app/test-pmd/testpmd.c
>> +++ b/app/test-pmd/testpmd.c
>> @@ -3823,6 +3823,77 @@ rmv_port_callback(void *arg)
>>  		start_packet_forwarding(0);
>>  }
>>
>> +static int need_start_when_recovery_over;
>> +
>> +static bool
>> +has_port_in_err_recovering(void)
>> +{
>> +	struct rte_port *port;
>> +	portid_t pid;
>> +
>> +	RTE_ETH_FOREACH_DEV(pid) {
>> +		port = &ports[pid];
>> +		if (port->err_recovering)
>> +			return true;
>> +	}
>> +
>> +	return false;
>> +}
>> +
>> +static void
>> +err_recovering_callback(portid_t port_id)
>> +{
>> +	if (!has_port_in_err_recovering())
>> +		printf("Please stop executing any commands until recovery result events are received!\n");
>> +
>> +	ports[port_id].err_recovering = 1;
>> +	ports[port_id].recover_failed = 0;
>> +
>> +	/* To simplify implementation, stop forwarding regardless of whether the port is used. */
>> +	if (!test_done) {
>> +		printf("Stop packet forwarding because some ports are in error recovering!\n");
>> +		stop_packet_forwarding();
>> +		need_start_when_recovery_over = 1;
>> +	}
>> +}
> 
> One thought I have - should we somehow stop user to attempt restart RX/TX while recovery
> in progress?
> But probably it is an overkill, and just documenting what is happening is enough....

Yes, testpmd is already complicated.
In addition, only a few PMDs support this mode, and it is not commonly triggered.
So I think showing the prompt above is enough.

> Do we need to update testpmd UG with some short description?

It would be better to update the UG, but this behavior is not triggered by a command, so I don't know which chapter to put it in.

@Ferruh could you provide some advice?

> Apart from that, LGTM:
> Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
> 
>> +
>> +static void
>> +recover_success_callback(portid_t port_id)
>> +{
>> +	ports[port_id].err_recovering = 0;
>> +	if (has_port_in_err_recovering())
>> +		return;
>> +
>> +	if (need_start_when_recovery_over) {
>> +		printf("Recovery success! Restart packet forwarding!\n");
>> +		start_packet_forwarding(0);
>> +		need_start_when_recovery_over = 0;
>> +	} else {
>> +		printf("Recovery success!\n");
>> +	}
>> +}
>> +
>> +static void
>> +recover_failed_callback(portid_t port_id)
>> +{
>> +	struct rte_port *port;
>> +	portid_t pid;
>> +
>> +	ports[port_id].err_recovering = 0;
>> +	ports[port_id].recover_failed = 1;
>> +	if (has_port_in_err_recovering())
>> +		return;
>> +
>> +	need_start_when_recovery_over = 0;
>> +	printf("The ports:");
>> +	RTE_ETH_FOREACH_DEV(pid) {
>> +		port = &ports[pid];
>> +		if (port->recover_failed)
>> +			printf(" %u", pid);
>> +	}
>> +	printf(" recovery failed! Please remove them!\n");
>> +}
>> +
>>  /* This function is used by the interrupt thread */
>>  static int
>>  eth_event_callback(portid_t port_id, enum rte_eth_event_type type, void *param,
>> @@ -3878,6 +3949,15 @@ eth_event_callback(portid_t port_id, enum rte_eth_event_type type, void *param,
>>  		}
>>  		break;
>>  	}
>> +	case RTE_ETH_EVENT_ERR_RECOVERING:
>> +		err_recovering_callback(port_id);
>> +		break;
>> +	case RTE_ETH_EVENT_RECOVERY_SUCCESS:
>> +		recover_success_callback(port_id);
>> +		break;
>> +	case RTE_ETH_EVENT_RECOVERY_FAILED:
>> +		recover_failed_callback(port_id);
>> +		break;
>>  	default:
>>  		break;
>>  	}
>> diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
>> index 329a6378a1..1bbf82a96c 100644
>> --- a/app/test-pmd/testpmd.h
>> +++ b/app/test-pmd/testpmd.h
>> @@ -323,7 +323,9 @@ struct rte_port {
>>  	uint8_t                 slave_flag : 1, /**< bonding slave port */
>>  				bond_flag : 1, /**< port is bond device */
>>  				fwd_mac_swap : 1, /**< swap packet MAC before forward */
>> -				update_conf : 1; /**< need to update bonding device configuration */
>> +				update_conf : 1, /**< need to update bonding device configuration */
>> +				err_recovering : 1, /**< port is in error recovering */
>> +				recover_failed : 1; /**< port recover failed */
>>  	struct port_template    *pattern_templ_list; /**< Pattern templates. */
>>  	struct port_template    *actions_templ_list; /**< Actions templates. */
>>  	struct port_table       *table_list; /**< Flow tables. */
>> --
>> 2.17.1
> 
> .
> 

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 4/5] net/bnxt: use fp ops setup function
  2023-03-03  0:01     ` Konstantin Ananyev
  2023-03-03  1:17       ` Ajit Khaparde
@ 2023-03-03  2:02       ` fengchengwen
  1 sibling, 0 replies; 85+ messages in thread
From: fengchengwen @ 2023-03-03  2:02 UTC (permalink / raw)
  To: Konstantin Ananyev, dev

On 2023/3/3 8:01, Konstantin Ananyev wrote:
> 02/03/2023 12:30, Konstantin Ananyev wrote:
>>
>>> Use rte_eth_fp_ops_setup() instead of directly manipulating
>>> rte_eth_fp_ops variable.
>>>
>>> Cc: stable@dpdk.org
>>>
>>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
>>> ---
>>>   drivers/net/bnxt/bnxt_cpr.c    | 5 +----
>>>   drivers/net/bnxt/bnxt_ethdev.c | 5 +----
>>>   2 files changed, 2 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/drivers/net/bnxt/bnxt_cpr.c b/drivers/net/bnxt/bnxt_cpr.c
>>> index 3950840600..a3f33c24c3 100644
>>> --- a/drivers/net/bnxt/bnxt_cpr.c
>>> +++ b/drivers/net/bnxt/bnxt_cpr.c
>>> @@ -416,10 +416,7 @@ void bnxt_stop_rxtx(struct rte_eth_dev *eth_dev)
>>>       eth_dev->rx_pkt_burst = rte_eth_pkt_burst_dummy;
>>>       eth_dev->tx_pkt_burst = rte_eth_pkt_burst_dummy;
>>
>> I am not that familiar with bnxt driver, but shouldn't we set here
>> other optional fp_ops (descriptor_status, etc.) to some dummy values OR to null values?
> 
> After another thought - wouldn't it be better just to call fp_ops_reset() here?

fp_ops_reset targets callers who violate the invocation rules, so it contains an error log and a stack dump.

It's not suitable for this error recovery scenario.

I've also considered extending it (e.g. adding an extra parameter to fp_ops_reset), but other
callbacks (e.g. rx_queue_count) would also need adjusting, which would make things more complicated rather than simpler.

> 
>>  
>>>
>>> -    rte_eth_fp_ops[eth_dev->data->port_id].rx_pkt_burst =
>>> -        eth_dev->rx_pkt_burst;
>>> -    rte_eth_fp_ops[eth_dev->data->port_id].tx_pkt_burst =
>>> -        eth_dev->tx_pkt_burst;
>>> +    rte_eth_fp_ops_setup(eth_dev);
>>>       rte_mb();
>>>
>>>       /* Allow time for threads to exit the real burst functions. */
>>> diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
>>> index 4083a69d02..d6064ceea4 100644
>>> --- a/drivers/net/bnxt/bnxt_ethdev.c
>>> +++ b/drivers/net/bnxt/bnxt_ethdev.c
>>> @@ -4374,10 +4374,7 @@ static void bnxt_dev_recover(void *arg)
>>>       if (rc)
>>>           goto err_start;
>>>
>>> -    rte_eth_fp_ops[bp->eth_dev->data->port_id].rx_pkt_burst =
>>> -        bp->eth_dev->rx_pkt_burst;
>>> -    rte_eth_fp_ops[bp->eth_dev->data->port_id].tx_pkt_burst =
>>> -        bp->eth_dev->tx_pkt_burst;
>>> +    rte_eth_fp_ops_setup(bp->eth_dev);
>>>       rte_mb();
>>>
>>>       PMD_DRV_LOG(INFO, "Port: %u Recovered from FW reset\n",
>>> -- 
>>> 2.17.1
>>
> 
> .

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 1/5] ethdev: fix race-condition of proactive error handling mode
  2023-03-02 12:08   ` Konstantin Ananyev
@ 2023-03-03 16:51     ` Ferruh Yigit
  2023-03-05 14:53       ` Konstantin Ananyev
  2023-03-06  1:41       ` fengchengwen
  0 siblings, 2 replies; 85+ messages in thread
From: Ferruh Yigit @ 2023-03-03 16:51 UTC (permalink / raw)
  To: Konstantin Ananyev, Fengchengwen, thomas, Andrew Rybchenko,
	Kalesh AP, Ajit Khaparde
  Cc: dev

On 3/2/2023 12:08 PM, Konstantin Ananyev wrote:
> 
>> In the proactive error handling mode, the PMD will set the data path
>> pointers to dummy functions and then try recovery; in this period the
>> application may still be invoking the data path API. This introduces a
>> race condition with the data path which may lead to a crash [1].
>>
>> Although the PMD added a delay after setting the data path pointers to cover
>> the above race condition, this only reduces the probability; it doesn't
>> solve the problem.
>>
>> To solve the race-condition problem fundamentally, the following
>> requirements are added:
>> 1. The PMD should set the data path pointers to dummy functions after
>>    reporting the RTE_ETH_EVENT_ERR_RECOVERING event.
>> 2. The application should stop data path API invocation when processing
>>    the RTE_ETH_EVENT_ERR_RECOVERING event.
>> 3. The PMD should set the data path pointers to valid functions before
>>    reporting the RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>> 4. The application should enable data path API invocation when processing
>>    the RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>

How does this solve the race condition? By pushing the responsibility to
stop the data path onto the application?

What if the application is not interested in recovery modes at all and has not
registered any callback for the recovery?

I think the driver should not rely on the application for this, unless the
application explicitly tells the driver that it is handling recovery;
right now there is no way for the driver to know this.
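
To make the question concrete, the application side of the four requirements quoted above could look roughly like the following self-contained sketch. The enum, the flag and the function names are hypothetical stand-ins rather than the ethdev API; the sketch also shows why an application that registers no such callback would keep polling during recovery:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the three recovery events. */
enum recovery_event {
	EV_ERR_RECOVERING,
	EV_RECOVERY_SUCCESS,
	EV_RECOVERY_FAILED,
};

/* Gate checked by the forwarding loop before every burst call. */
static atomic_bool datapath_enabled = true;

/* Event callback the application would have to register. */
static void
app_event_cb(enum recovery_event ev)
{
	switch (ev) {
	case EV_ERR_RECOVERING:
		/* Requirement 2: stop invoking the data path API. */
		atomic_store(&datapath_enabled, false);
		break;
	case EV_RECOVERY_SUCCESS:
		/* Requirement 4: restore configuration (omitted here),
		 * then re-enable the data path. */
		atomic_store(&datapath_enabled, true);
		break;
	case EV_RECOVERY_FAILED:
		/* Port is unusable; keep the data path disabled. */
		atomic_store(&datapath_enabled, false);
		break;
	}
}

/* The forwarding loop polls only while this returns true. */
static bool
app_may_poll(void)
{
	return atomic_load(&datapath_enabled);
}
```

A flag alone is not sufficient in a real application: the callback would also have to wait until every forwarding thread has observed the flag and left the burst functions before returning, and that synchronization is omitted here.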


>> Also, this patch introduces a driver-internal function,
>> rte_eth_fp_ops_setup, which is used as a helper function for PMDs.
>>
>> [1] http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-ashok.k.kaladi@intel.com/
>>
>> Fixes: eb0d471a8941 ("ethdev: add proactive error handling mode")
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
>> ---
>>  doc/guides/prog_guide/poll_mode_drv.rst | 20 +++++++---------
>>  lib/ethdev/ethdev_driver.c              |  8 +++++++
>>  lib/ethdev/ethdev_driver.h              | 10 ++++++++
>>  lib/ethdev/rte_ethdev.h                 | 32 +++++++++++++++----------
>>  lib/ethdev/version.map                  |  1 +
>>  5 files changed, 46 insertions(+), 25 deletions(-)
>>
>> diff --git a/doc/guides/prog_guide/poll_mode_drv.rst b/doc/guides/prog_guide/poll_mode_drv.rst
>> index c145a9066c..e380ff135a 100644
>> --- a/doc/guides/prog_guide/poll_mode_drv.rst
>> +++ b/doc/guides/prog_guide/poll_mode_drv.rst
>> @@ -638,14 +638,9 @@ different from the application invokes recovery in PASSIVE mode,
>>  the PMD automatically recovers from error in PROACTIVE mode,
>>  and only a small amount of work is required for the application.
>>
>> -During error detection and automatic recovery,
>> -the PMD sets the data path pointers to dummy functions
>> -(which will prevent the crash),
>> -and also make sure the control path operations fail with a return code ``-EBUSY``.
>> -
>> -Because the PMD recovers automatically,
>> -the application can only sense that the data flow is disconnected for a while
>> -and the control API returns an error in this period.
>> +During error detection and automatic recovery, the PMD sets the data path
>> +pointers to dummy functions and also make sure the control path operations
>> +failed with a return code ``-EBUSY``.
>>
>>  In order to sense the error happening/recovering,
>>  as well as to restore some additional configuration,
>> @@ -653,9 +648,9 @@ three events are available:
>>
>>  ``RTE_ETH_EVENT_ERR_RECOVERING``
>>     Notify the application that an error is detected
>> -   and the recovery is being started.
>> +   and the recovery is about to start.
>>     Upon receiving the event, the application should not invoke
>> -   any control path function until receiving
>> +   any control and data path API until receiving
>>     ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` or ``RTE_ETH_EVENT_RECOVERY_FAILED`` event.
>>
>>  .. note::
>> @@ -666,8 +661,9 @@ three events are available:
>>
>>  ``RTE_ETH_EVENT_RECOVERY_SUCCESS``
>>     Notify the application that the recovery from error is successful,
>> -   the PMD already re-configures the port,
>> -   and the effect is the same as a restart operation.
>> +   the PMD already re-configures the port.
>> +   The application should restore some additional configuration, and then
>> +   enable data path API invocation.
>>
>>  ``RTE_ETH_EVENT_RECOVERY_FAILED``
>>     Notify the application that the recovery from error failed,
>> diff --git a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c
>> index 0be1e8ca04..f994653fe9 100644
>> --- a/lib/ethdev/ethdev_driver.c
>> +++ b/lib/ethdev/ethdev_driver.c
>> @@ -515,6 +515,14 @@ rte_eth_dma_zone_free(const struct rte_eth_dev *dev, const char *ring_name,
>>  	return rc;
>>  }
>>
>> +void
>> +rte_eth_fp_ops_setup(struct rte_eth_dev *dev)
>> +{
>> +	if (dev == NULL)
>> +		return;
>> +	eth_dev_fp_ops_setup(rte_eth_fp_ops + dev->data->port_id, dev);
>> +}
>> +
>>  const struct rte_memzone *
>>  rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char *ring_name,
>>  			 uint16_t queue_id, size_t size, unsigned int align,
>> diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
>> index 2c9d615fb5..0d964d1f67 100644
>> --- a/lib/ethdev/ethdev_driver.h
>> +++ b/lib/ethdev/ethdev_driver.h
>> @@ -1621,6 +1621,16 @@ int
>>  rte_eth_dma_zone_free(const struct rte_eth_dev *eth_dev, const char *name,
>>  		 uint16_t queue_id);
>>
>> +/**
>> + * @internal
>> + * Setup eth fast-path API to ethdev values.
>> + *
>> + * @param dev
>> + *  Pointer to struct rte_eth_dev.
>> + */
>> +__rte_internal
>> +void rte_eth_fp_ops_setup(struct rte_eth_dev *dev);
>> +
>>  /**
>>   * @internal
>>   * Atomically set the link status for the specific device.
>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
>> index 049641d57c..44ee7229c1 100644
>> --- a/lib/ethdev/rte_ethdev.h
>> +++ b/lib/ethdev/rte_ethdev.h
>> @@ -3944,25 +3944,28 @@ enum rte_eth_event_type {
>>  	 */
>>  	RTE_ETH_EVENT_RX_AVAIL_THRESH,
>>  	/** Port recovering from a hardware or firmware error.
>> -	 * If PMD supports proactive error recovery,
>> -	 * it should trigger this event to notify application
>> -	 * that it detected an error and the recovery is being started.
>> -	 * Upon receiving the event, the application should not invoke any control path API
>> -	 * (such as rte_eth_dev_configure/rte_eth_dev_stop...) until receiving
>> -	 * RTE_ETH_EVENT_RECOVERY_SUCCESS or RTE_ETH_EVENT_RECOVERY_FAILED event.
>> -	 * The PMD will set the data path pointers to dummy functions,
>> -	 * and re-set the data path pointers to non-dummy functions
>> -	 * before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>> -	 * It means that the application cannot send or receive any packets
>> -	 * during this period.
>> +	 *
>> +	 * If PMD supports proactive error recovery, it should trigger this
>> +	 * event to notify application that it detected an error and the
>> +	 * recovery is about to start.
>> +	 *
>> +	 * Upon receiving the event, the application should not invoke any
>> +	 * control and data path API until receiving
>> +	 * RTE_ETH_EVENT_RECOVERY_SUCCESS or RTE_ETH_EVENT_RECOVERY_FAILED
>> +	 * event.
>> +	 *
>> +	 * Once this event is reported, the PMD will set the data path pointers
>> +	 * to dummy functions, and re-set the data path pointers to valid
>> +	 * functions before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>> +	 *
>>  	 * @note Before the PMD reports the recovery result,
>>  	 * the PMD may report the RTE_ETH_EVENT_ERR_RECOVERING event again,
>>  	 * because a larger error may occur during the recovery.
>>  	 */
>>  	RTE_ETH_EVENT_ERR_RECOVERING,
>>  	/** Port recovers successfully from the error.
>> -	 * The PMD already re-configured the port,
>> -	 * and the effect is the same as a restart operation.
>> +	 *
>> +	 * The PMD already re-configured the port:
>>  	 * a) The following operation will be retained: (alphabetically)
>>  	 *    - DCB configuration
>>  	 *    - FEC configuration
>> @@ -3989,6 +3992,9 @@ enum rte_eth_event_type {
>>  	 *      (@see RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP)
>>  	 * c) Any other configuration will not be stored
>>  	 *    and will need to be re-configured.
>> +	 *
>> +	 * The application should restore some additional configuration
>> +	 * (see above case b/c), and then enable data path API invocation.
>>  	 */
>>  	RTE_ETH_EVENT_RECOVERY_SUCCESS,
>>  	/** Port recovery failed.
>> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
>> index 357d1a88c0..c273e0bdae 100644
>> --- a/lib/ethdev/version.map
>> +++ b/lib/ethdev/version.map
>> @@ -320,6 +320,7 @@ INTERNAL {
>>  	rte_eth_devices;
>>  	rte_eth_dma_zone_free;
>>  	rte_eth_dma_zone_reserve;
>> +	rte_eth_fp_ops_setup;
>>  	rte_eth_hairpin_queue_peer_bind;
>>  	rte_eth_hairpin_queue_peer_unbind;
>>  	rte_eth_hairpin_queue_peer_update;
>> --
>  
> Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
> 
>> 2.17.1
> 


^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 5/5] app/testpmd: add error recovery usage demo
  2023-03-03  1:49     ` fengchengwen
@ 2023-03-03 16:59       ` Ferruh Yigit
  0 siblings, 0 replies; 85+ messages in thread
From: Ferruh Yigit @ 2023-03-03 16:59 UTC (permalink / raw)
  To: fengchengwen, Konstantin Ananyev, thomas, Aman Singh, Yuying Zhang; +Cc: dev

On 3/3/2023 1:49 AM, fengchengwen wrote:
> On 2023/3/2 21:01, Konstantin Ananyev wrote:
>>
>>
>>>
>>> This patch adds error recovery usage demo which will:
>>> 1. stop packet forwarding when the RTE_ETH_EVENT_ERR_RECOVERING event
>>>    is received.
>>> 2. restart packet forwarding when the RTE_ETH_EVENT_RECOVERY_SUCCESS
>>>    event is received.
>>> 3. prompt the ports that fail to recover and need to be removed when
>>>    the RTE_ETH_EVENT_RECOVERY_FAILED event is received.
>>>
>>> In addition, a message is added to the printed information, requiring
>>> no command to be executed during the error recovery.
>>>
>>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
>>> ---
>>>  app/test-pmd/testpmd.c | 80 ++++++++++++++++++++++++++++++++++++++++++
>>>  app/test-pmd/testpmd.h |  4 ++-
>>>  2 files changed, 83 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
>>> index 0c14325b8d..fdc3ae604b 100644
>>> --- a/app/test-pmd/testpmd.c
>>> +++ b/app/test-pmd/testpmd.c
>>> @@ -3823,6 +3823,77 @@ rmv_port_callback(void *arg)
>>>  		start_packet_forwarding(0);
>>>  }
>>>
>>> +static int need_start_when_recovery_over;
>>> +
>>> +static bool
>>> +has_port_in_err_recovering(void)
>>> +{
>>> +	struct rte_port *port;
>>> +	portid_t pid;
>>> +
>>> +	RTE_ETH_FOREACH_DEV(pid) {
>>> +		port = &ports[pid];
>>> +		if (port->err_recovering)
>>> +			return true;
>>> +	}
>>> +
>>> +	return false;
>>> +}
>>> +
>>> +static void
>>> +err_recovering_callback(portid_t port_id)
>>> +{
>>> +	if (!has_port_in_err_recovering())
>>> +		printf("Please stop executing any commands until recovery result events are received!\n");
>>> +
>>> +	ports[port_id].err_recovering = 1;
>>> +	ports[port_id].recover_failed = 0;
>>> +
>>> +	/* To simplify implementation, stop forwarding regardless of whether the port is used. */
>>> +	if (!test_done) {
>>> +		printf("Stop packet forwarding because some ports are in error recovering!\n");
>>> +		stop_packet_forwarding();
>>> +		need_start_when_recovery_over = 1;
>>> +	}
>>> +}
>>
>> One thought I have - should we somehow stop user to attempt restart RX/TX while recovery
>> in progress?
>> But probably it is an overkill, and just documenting what is happening is enough....
> 
> Yes, testpmd is already complicated.
> In addition, only a few PMDs support this mode, and it is not commonly triggered.
> So I think showing the prompt above is enough.
> 
>> Do we need to update testpmd UG with some short description?
> 
> It would be better to update the UG, but this behavior is not triggered by a command, so I don't know which chapter to put it in.
> 
> @Ferruh could you provide some advice?
> 

I think it is better to extract event handling into a new .c file, something
like 'event.c', and to make the handling of the various events optional,
controlled by a testpmd parameter.

Right now, by default, all events are just printed (unless the command line
explicitly requests otherwise via --mask-event); that is very
basic and I think sufficient as default behavior.

And in the documentation, it would be nice to have a section like "event
handling" that documents which option enables which event handling, how it
is used, what the expected behavior is, etc.


btw, overall I agree with implementing the recovery events in testpmd; it is good
to give some examples of how to handle these events in an application. I am
just not sure about enabling it by default.
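
One possible shape for such opt-in handling, purely as a sketch: a per-event bitmask that a (hypothetical) command-line option would fill in, with printing as the default. None of the names below exist in testpmd; they are assumptions made for the example:

```c
#include <stdint.h>

/* Hypothetical event identifiers for the sketch. */
enum ev_type {
	EV_ERR_RECOVERING,
	EV_RECOVERY_SUCCESS,
	EV_RECOVERY_FAILED,
	EV_MAX,
};

/* Bit N set means event N is actively handled; by default no bit is
 * set, so events are only printed. */
static uint32_t handle_mask;

/* Per-event counter of how often the active handler ran. */
static unsigned int handled[EV_MAX];

/* Would be called while parsing a hypothetical opt-in parameter. */
static void
enable_event_handling(enum ev_type type)
{
	handle_mask |= UINT32_C(1) << type;
}

/* Returns 1 if the event was actively handled, 0 if only the default
 * (print-only) behavior applies. */
static int
on_event(enum ev_type type)
{
	if (!(handle_mask & (UINT32_C(1) << type)))
		return 0;

	handled[type]++;
	return 1;
}
```

Keeping the default mask empty preserves the current print-only behavior, so the recovery handling would only change testpmd's behavior for users who ask for it.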

>> Apart from that, LGTM:
>> Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
>>
>>> +
>>> +static void
>>> +recover_success_callback(portid_t port_id)
>>> +{
>>> +	ports[port_id].err_recovering = 0;
>>> +	if (has_port_in_err_recovering())
>>> +		return;
>>> +
>>> +	if (need_start_when_recovery_over) {
>>> +		printf("Recovery success! Restart packet forwarding!\n");
>>> +		start_packet_forwarding(0);
>>> +		need_start_when_recovery_over = 0;
>>> +	} else {
>>> +		printf("Recovery success!\n");
>>> +	}
>>> +}
>>> +
>>> +static void
>>> +recover_failed_callback(portid_t port_id)
>>> +{
>>> +	struct rte_port *port;
>>> +	portid_t pid;
>>> +
>>> +	ports[port_id].err_recovering = 0;
>>> +	ports[port_id].recover_failed = 1;
>>> +	if (has_port_in_err_recovering())
>>> +		return;
>>> +
>>> +	need_start_when_recovery_over = 0;
>>> +	printf("The ports:");
>>> +	RTE_ETH_FOREACH_DEV(pid) {
>>> +		port = &ports[pid];
>>> +		if (port->recover_failed)
>>> +			printf(" %u", pid);
>>> +	}
>>> +	printf(" recovery failed! Please remove them!\n");
>>> +}
>>> +
>>>  /* This function is used by the interrupt thread */
>>>  static int
>>>  eth_event_callback(portid_t port_id, enum rte_eth_event_type type, void *param,
>>> @@ -3878,6 +3949,15 @@ eth_event_callback(portid_t port_id, enum rte_eth_event_type type, void *param,
>>>  		}
>>>  		break;
>>>  	}
>>> +	case RTE_ETH_EVENT_ERR_RECOVERING:
>>> +		err_recovering_callback(port_id);
>>> +		break;
>>> +	case RTE_ETH_EVENT_RECOVERY_SUCCESS:
>>> +		recover_success_callback(port_id);
>>> +		break;
>>> +	case RTE_ETH_EVENT_RECOVERY_FAILED:
>>> +		recover_failed_callback(port_id);
>>> +		break;
>>>  	default:
>>>  		break;
>>>  	}
>>> diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
>>> index 329a6378a1..1bbf82a96c 100644
>>> --- a/app/test-pmd/testpmd.h
>>> +++ b/app/test-pmd/testpmd.h
>>> @@ -323,7 +323,9 @@ struct rte_port {
>>>  	uint8_t                 slave_flag : 1, /**< bonding slave port */
>>>  				bond_flag : 1, /**< port is bond device */
>>>  				fwd_mac_swap : 1, /**< swap packet MAC before forward */
>>> -				update_conf : 1; /**< need to update bonding device configuration */
>>> +				update_conf : 1, /**< need to update bonding device configuration */
>>> +				err_recovering : 1, /**< port is in error recovering */
>>> +				recover_failed : 1; /**< port recover failed */
>>>  	struct port_template    *pattern_templ_list; /**< Pattern templates. */
>>>  	struct port_template    *actions_templ_list; /**< Actions templates. */
>>>  	struct port_table       *table_list; /**< Flow tables. */
>>> --
>>> 2.17.1
>>
>> .
>>


^ permalink raw reply	[flat|nested] 85+ messages in thread

* RE: [PATCH 1/5] ethdev: fix race-condition of proactive error handling mode
  2023-03-03  0:21     ` Konstantin Ananyev
@ 2023-03-04  5:08       ` Honnappa Nagarahalli
  2023-03-05 15:23         ` Konstantin Ananyev
  0 siblings, 1 reply; 85+ messages in thread
From: Honnappa Nagarahalli @ 2023-03-04  5:08 UTC (permalink / raw)
  To: Konstantin Ananyev, dev, Chengwen Feng, thomas, Ferruh Yigit,
	Andrew Rybchenko, Kalesh AP,
	Ajit Khaparde (ajit.khaparde@broadcom.com)
  Cc: nd, nd



> -----Original Message-----
> From: Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>
> Sent: Thursday, March 2, 2023 6:22 PM
> To: dev@dpdk.org
> Subject: Re: [PATCH 1/5] ethdev: fix race-condition of proactive error handling
> mode
> 
> 
> 
> >
> >> -----Original Message-----
> >> From: Chengwen Feng <fengchengwen@huawei.com>
> >> Sent: Tuesday, February 28, 2023 9:06 PM
> >> To: thomas@monjalon.net; ferruh.yigit@amd.com;
> >> konstantin.ananyev@huawei.com; Andrew Rybchenko
> >> <andrew.rybchenko@oktetlabs.ru>; Kalesh AP <kalesh-
> >> anakkur.purayil@broadcom.com>; Ajit Khaparde
> >> (ajit.khaparde@broadcom.com) <ajit.khaparde@broadcom.com>
> >> Cc: dev@dpdk.org
> >> Subject: [PATCH 1/5] ethdev: fix race-condition of proactive error
> >> handling mode
> >>
> >> In the proactive error handling mode, the PMD will set the data path
> >> pointers to dummy functions and then try recovery, in this period the
> >> application may still invoking data path API. This will introduce a
> >> race-condition with data path which may lead to crash [1].
> >>
> >> Although the PMD added delay after setting data path pointers to
> >> cover the above race-condition, it reduces the probability, but it
> >> doesn't solve the problem.
> >>
> >> To solve the race-condition problem fundamentally, the following
> >> requirements are added:
> >> 1. The PMD should set the data path pointers to dummy functions after
> >>     report RTE_ETH_EVENT_ERR_RECOVERING event.
> > Do you mean to say, PMD should set the data path pointers after calling the
> call back function?
> > The PMD is running in the context of multiple EAL threads. How do these
> threads synchronize such that only one thread sets these data pointers?
> 
> As I understand this event callback supposed to be called in the context of EAL
> interrupt thread (whoever is more familiar with original idea, feel free to correct
> me if I missed something).
I could not confirm this. It looks like it is called from the data plane thread context.
I also describe an alternative design at the end; I would appreciate it if you could take a look.
 
> How it is going to signal data-path threads that they need to stop/suspend
> calling data-path API - that's I suppose is left to application to decide...
> Same as right now it is application responsibility to stop data-path threads
> before doing dev_stop()/dev/_config()/etc.
Ok, good, this expectation is not new. The application must have a mechanism already.
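
For reference, one way an application might implement such a mechanism is a pause flag plus a per-worker acknowledgement. This is only a minimal sketch under stated assumptions, not testpmd code or anything from the patch: the names `request_pause`, `worker_poll_once` and `all_workers_parked` are hypothetical, and the real rx/tx burst calls are replaced by a comment. The point is that the event handler must wait until every worker has actually left the burst calls before recovery may proceed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_WORKERS 4 /* hypothetical upper bound for this sketch */

/* Set by the event handler, read by every worker before each burst. */
static atomic_bool datapath_paused;
/* One flag per worker: true once that worker has stopped calling bursts. */
static atomic_bool worker_parked[MAX_WORKERS];

/* Would be called when a "recovery is starting" event is handled. */
static void request_pause(unsigned int n_workers)
{
	assert(n_workers <= MAX_WORKERS);
	/* Drop stale acknowledgements from an earlier pause first. */
	for (unsigned int i = 0; i < n_workers; i++)
		atomic_store(&worker_parked[i], false);
	atomic_store(&datapath_paused, true);
}

/* Would be called when a "recovery succeeded" event is handled. */
static void request_resume(void)
{
	atomic_store(&datapath_paused, false);
}

/* One iteration of a worker loop; returns true if a burst was attempted. */
static bool worker_poll_once(unsigned int id)
{
	assert(id < MAX_WORKERS);
	if (atomic_load(&datapath_paused)) {
		/* Acknowledge the pause and skip the data path entirely. */
		atomic_store(&worker_parked[id], true);
		return false;
	}
	atomic_store(&worker_parked[id], false);
	/* The rx/tx burst calls would go here. */
	return true;
}

/* True once each of the first n_workers workers acknowledged the pause. */
static bool all_workers_parked(unsigned int n_workers)
{
	assert(n_workers <= MAX_WORKERS);
	for (unsigned int i = 0; i < n_workers; i++)
		if (!atomic_load(&worker_parked[i]))
			return false;
	return true;
}
```

In a real application the event handler would call `request_pause()` and then poll `all_workers_parked()` before returning, so that the PMD only continues once no lcore is inside a burst call.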

> 
> 
> >
> >> 2. The application should stop data path API invocation when process
> >>     the RTE_ETH_EVENT_ERR_RECOVERING event.
> > Any thoughts on how an application can do this?
We can ignore this question, as a similar expectation already exists for earlier functionality.

> >
> >> 3. The PMD should set the data path pointers to valid functions before
> >>     report RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> >> 4. The application should enable data path API invocation when process
> >>     the RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> > Do you mean to say that the application should not call the datapath APIs
> while the PMD is running the recovery process?
> 
> Yes, I believe that's the intention.
Ok, this is good and makes sense.

> 
> >>
> >> Also, this patch introduce a driver internal function
> >> rte_eth_fp_ops_setup which used as an help function for PMD.
> >>
> >> [1]
> >>
> http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2
> >> -
> >> ashok.k.kaladi@intel.com/
> >>
> >> Fixes: eb0d471a8941 ("ethdev: add proactive error handling mode")
> >> Cc: stable@dpdk.org
> >>
> >> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> >> ---
> >>   doc/guides/prog_guide/poll_mode_drv.rst | 20 +++++++---------
> >>   lib/ethdev/ethdev_driver.c              |  8 +++++++
> >>   lib/ethdev/ethdev_driver.h              | 10 ++++++++
> >>   lib/ethdev/rte_ethdev.h                 | 32 +++++++++++++++----------
> >>   lib/ethdev/version.map                  |  1 +
> >>   5 files changed, 46 insertions(+), 25 deletions(-)
> >>
> >> diff --git a/doc/guides/prog_guide/poll_mode_drv.rst
> >> b/doc/guides/prog_guide/poll_mode_drv.rst
> >> index c145a9066c..e380ff135a 100644
> >> --- a/doc/guides/prog_guide/poll_mode_drv.rst
> >> +++ b/doc/guides/prog_guide/poll_mode_drv.rst
> >> @@ -638,14 +638,9 @@ different from the application invokes recovery
> >> in PASSIVE mode,  the PMD automatically recovers from error in
> >> PROACTIVE mode,  and only a small amount of work is required for the
> application.
> >>
> >> -During error detection and automatic recovery, -the PMD sets the
> >> data path pointers to dummy functions -(which will prevent the
> >> crash), -and also make sure the control path operations fail with a return
> code ``-EBUSY``.
> >> -
> >> -Because the PMD recovers automatically, -the application can only
> >> sense that the data flow is disconnected for a while -and the control
> >> API returns an error in this period.
> >> +During error detection and automatic recovery, the PMD sets the data
> >> +path pointers to dummy functions and also make sure the control path
> >> +operations failed with a return code ``-EBUSY``.
> >>
> >>   In order to sense the error happening/recovering,  as well as to
> >> restore some additional configuration, @@ -653,9 +648,9 @@ three events
> are available:
> >>
> >>   ``RTE_ETH_EVENT_ERR_RECOVERING``
> >>      Notify the application that an error is detected
> >> -   and the recovery is being started.
> >> +   and the recovery is about to start.
> >>      Upon receiving the event, the application should not invoke
> >> -   any control path function until receiving
> >> +   any control and data path API until receiving
> >>      ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` or
> >> ``RTE_ETH_EVENT_RECOVERY_FAILED`` event.
> >>
> >>   .. note::
> >> @@ -666,8 +661,9 @@ three events are available:
> >>
> >>   ``RTE_ETH_EVENT_RECOVERY_SUCCESS``
> >>      Notify the application that the recovery from error is successful,
> >> -   the PMD already re-configures the port,
> >> -   and the effect is the same as a restart operation.
> >> +   the PMD already re-configures the port.
> >> +   The application should restore some additional configuration, and
> >> + then
> > What is the additional configuration? Is this specific to each NIC/PMD?
> > I thought, this is an auto recovery process and the application does not require
> to reconfigure anything. If the application has to restore the configuration, how
> does auto recovery differ from typical recovery process?
> >
> >> +   enable data path API invocation.
> >>
> >>   ``RTE_ETH_EVENT_RECOVERY_FAILED``
> >>      Notify the application that the recovery from error failed, diff
> >> --git a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c index
> >> 0be1e8ca04..f994653fe9 100644
> >> --- a/lib/ethdev/ethdev_driver.c
> >> +++ b/lib/ethdev/ethdev_driver.c
> >> @@ -515,6 +515,14 @@ rte_eth_dma_zone_free(const struct rte_eth_dev
> >> *dev, const char *ring_name,
> >>   	return rc;
> >>   }
> >>
> >> +void
> >> +rte_eth_fp_ops_setup(struct rte_eth_dev *dev) {
> >> +	if (dev == NULL)
> >> +		return;
> >> +	eth_dev_fp_ops_setup(rte_eth_fp_ops + dev->data->port_id, dev); }
> >> +
> >>   const struct rte_memzone *
> >>   rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char
> >> *ring_name,
> >>   			 uint16_t queue_id, size_t size, unsigned int align, diff -
> -git
> >> a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h index
> >> 2c9d615fb5..0d964d1f67 100644
> >> --- a/lib/ethdev/ethdev_driver.h
> >> +++ b/lib/ethdev/ethdev_driver.h
> >> @@ -1621,6 +1621,16 @@ int
> >>   rte_eth_dma_zone_free(const struct rte_eth_dev *eth_dev, const char
> >> *name,
> >>   		 uint16_t queue_id);
> >>
> >> +/**
> >> + * @internal
> >> + * Setup eth fast-path API to ethdev values.
> >> + *
> >> + * @param dev
> >> + *  Pointer to struct rte_eth_dev.
> >> + */
> >> +__rte_internal
> >> +void rte_eth_fp_ops_setup(struct rte_eth_dev *dev);
> >> +
> >>   /**
> >>    * @internal
> >>    * Atomically set the link status for the specific device.
> >> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h index
> >> 049641d57c..44ee7229c1 100644
> >> --- a/lib/ethdev/rte_ethdev.h
> >> +++ b/lib/ethdev/rte_ethdev.h
> >> @@ -3944,25 +3944,28 @@ enum rte_eth_event_type {
> >>   	 */
> >>   	RTE_ETH_EVENT_RX_AVAIL_THRESH,
> >>   	/** Port recovering from a hardware or firmware error.
> >> -	 * If PMD supports proactive error recovery,
> >> -	 * it should trigger this event to notify application
> >> -	 * that it detected an error and the recovery is being started.
> >> -	 * Upon receiving the event, the application should not invoke any
> >> control path API
> >> -	 * (such as rte_eth_dev_configure/rte_eth_dev_stop...) until receiving
> >> -	 * RTE_ETH_EVENT_RECOVERY_SUCCESS or
> >> RTE_ETH_EVENT_RECOVERY_FAILED event.
> >> -	 * The PMD will set the data path pointers to dummy functions,
> >> -	 * and re-set the data path pointers to non-dummy functions
> >> -	 * before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> >> -	 * It means that the application cannot send or receive any packets
> >> -	 * during this period.
> >> +	 *
> >> +	 * If PMD supports proactive error recovery, it should trigger this
> >> +	 * event to notify application that it detected an error and the
> >> +	 * recovery is about to start.
> >> +	 *
> >> +	 * Upon receiving the event, the application should not invoke any
> >> +	 * control and data path API until receiving
> >> +	 * RTE_ETH_EVENT_RECOVERY_SUCCESS or
> >> RTE_ETH_EVENT_RECOVERY_FAILED
> >> +	 * event.
> >> +	 *
> >> +	 * Once this event is reported, the PMD will set the data path pointers
> >> +	 * to dummy functions, and re-set the data path pointers to valid
> >> +	 * functions before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS
> >> event.
> > Why do we need to set the data path pointers to dummy functions if the
> application is restricted from invoking any control and data path APIs till the
> recovery process is completed?
> 
> You are right, in theory it is not mandatory.
> Though it helps to flag a problem if user will still try to call them while recovery is
> in progress.
Ok, maybe in debug mode.
I mean, we have already set the expectation that the application should not call these APIs, and the application has implemented a mechanism to ensure that. Why do we need to complicate this?
If the application calls the APIs, it is a programming error.
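
To make the "flag a problem" role concrete, here is a rough sketch of what swapping in a dummy burst function amounts to. It is a simplified stand-in, not the ethdev implementation: `struct fp_ops`, `enter_recovery`, `leave_recovery` and `misuse_count` are hypothetical names, and the "real" burst function just pretends to fill every slot. The dummy returns zero packets and counts the call, which is the kind of check that could be limited to debug builds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified burst signature; the real fast-path signatures differ. */
typedef uint16_t (*rx_burst_fn)(void *rxq, void **pkts, uint16_t n);

/* Hypothetical per-port fast-path table (not struct rte_eth_fp_ops). */
struct fp_ops {
	rx_burst_fn rx_burst;
	rx_burst_fn saved_rx_burst; /* valid pointer kept during recovery */
};

/* Number of burst calls made while recovery was in progress. */
static unsigned int misuse_count;

/* Stand-in for a driver burst function: pretends all slots were filled. */
static uint16_t real_rx_burst(void *rxq, void **pkts, uint16_t n)
{
	(void)rxq;
	(void)pkts;
	return n;
}

/* Dummy installed during recovery: receives nothing, records the misuse. */
static uint16_t dummy_rx_burst(void *rxq, void **pkts, uint16_t n)
{
	(void)rxq;
	(void)pkts;
	(void)n;
	misuse_count++;
	return 0;
}

/* Swap in the dummy after the "recovering" event has been reported. */
static void enter_recovery(struct fp_ops *ops)
{
	assert(ops != NULL);
	ops->saved_rx_burst = ops->rx_burst;
	ops->rx_burst = dummy_rx_burst;
}

/* Restore the valid pointer before the "success" event is reported. */
static void leave_recovery(struct fp_ops *ops)
{
	assert(ops != NULL && ops->saved_rx_burst != NULL);
	ops->rx_burst = ops->saved_rx_burst;
	ops->saved_rx_burst = NULL;
}
```

With this shape, a burst call during recovery is harmless but visible in `misuse_count`, whereas dropping the dummy would leave such a call racing with the recovery itself.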

> Again, same as we doing in dev_stop().

> 
> >
> >> +	 *
> >>   	 * @note Before the PMD reports the recovery result,
> >>   	 * the PMD may report the RTE_ETH_EVENT_ERR_RECOVERING event
> >> again,
> >>   	 * because a larger error may occur during the recovery.
> >>   	 */
> >>   	RTE_ETH_EVENT_ERR_RECOVERING,
> > I understand this is not a change in this patch. But, just wondering, what is the
> purpose of this? How is the application supposed to use this?
> >
> >>   	/** Port recovers successfully from the error.
> >> -	 * The PMD already re-configured the port,
> >> -	 * and the effect is the same as a restart operation.
> >> +	 *
> >> +	 * The PMD already re-configured the port:
> >>   	 * a) The following operation will be retained: (alphabetically)
> >>   	 *    - DCB configuration
> >>   	 *    - FEC configuration
> >> @@ -3989,6 +3992,9 @@ enum rte_eth_event_type {
> >>   	 *      (@see RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP)
> >>   	 * c) Any other configuration will not be stored
> >>   	 *    and will need to be re-configured.
> >> +	 *
> >> +	 * The application should restore some additional configuration
> >> +	 * (see above case b/c), and then enable data path API invocation.
> >>   	 */
> >>   	RTE_ETH_EVENT_RECOVERY_SUCCESS,
> >>   	/** Port recovery failed.
> >> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map index
> >> 357d1a88c0..c273e0bdae 100644
> >> --- a/lib/ethdev/version.map
> >> +++ b/lib/ethdev/version.map
> >> @@ -320,6 +320,7 @@ INTERNAL {
> >>   	rte_eth_devices;
> >>   	rte_eth_dma_zone_free;
> >>   	rte_eth_dma_zone_reserve;
> >> +	rte_eth_fp_ops_setup;
> >>   	rte_eth_hairpin_queue_peer_bind;
> >>   	rte_eth_hairpin_queue_peer_unbind;
> >>   	rte_eth_hairpin_queue_peer_update;
> >> --
> >> 2.17.1
> >

Is there any reason not to design this in the same way as 'rte_eth_dev_reset'? Why does the PMD have to recover by itself?
We could have a similar API 'rte_eth_dev_recover' to do the recovery functionality.
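
As a rough illustration of that alternative, the application-driven flow might look like the sketch below. Everything here is hypothetical: 'rte_eth_dev_recover' does not exist, so it is represented by `mock_dev_recover`, and `struct mock_port` stands in for whatever state the application keeps per port. The sketch only shows the ordering: the application quiesces its own data path, triggers recovery synchronously, and re-enables the data path on success.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum port_state { PORT_RUNNING, PORT_ERROR, PORT_FAILED };

/* Hypothetical application-side view of one port. */
struct mock_port {
	enum port_state state;
	bool datapath_enabled;
	int recover_rc; /* result the mocked recovery call will return */
};

/* Stand-in for the proposed, not yet existing, recovery API. */
static int mock_dev_recover(struct mock_port *port)
{
	return port->recover_rc;
}

/* Application-driven recovery: stop, recover, then resume or give up. */
static int app_handle_error(struct mock_port *port)
{
	int rc;

	assert(port != NULL);

	/* Step 1: the application stops its own data path first. */
	port->datapath_enabled = false;
	port->state = PORT_ERROR;

	/* Step 2: recovery runs in the caller's context, synchronously. */
	rc = mock_dev_recover(port);
	if (rc != 0) {
		port->state = PORT_FAILED;
		return rc;
	}

	/* Step 3: resume only after recovery has completed. */
	port->state = PORT_RUNNING;
	port->datapath_enabled = true;
	return 0;
}
```

Because the application makes the call itself, there is no window in which the PMD changes data path state behind the application's back, which is the same property 'rte_eth_dev_reset' relies on.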

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 1/5] ethdev: fix race-condition of proactive error handling mode
  2023-03-03 16:51     ` Ferruh Yigit
@ 2023-03-05 14:53       ` Konstantin Ananyev
  2023-03-06  8:55         ` Ferruh Yigit
  2023-03-06  1:41       ` fengchengwen
  1 sibling, 1 reply; 85+ messages in thread
From: Konstantin Ananyev @ 2023-03-05 14:53 UTC (permalink / raw)
  To: dev

03/03/2023 16:51, Ferruh Yigit wrote:
> On 3/2/2023 12:08 PM, Konstantin Ananyev wrote:
>>
>>> In the proactive error handling mode, the PMD will set the data path
>>> pointers to dummy functions and then try recovery, in this period the
>>> application may still invoking data path API. This will introduce a
>>> race-condition with data path which may lead to crash [1].
>>>
>>> Although the PMD added delay after setting data path pointers to cover
>>> the above race-condition, it reduces the probability, but it doesn't
>>> solve the problem.
>>>
>>> To solve the race-condition problem fundamentally, the following
>>> requirements are added:
>>> 1. The PMD should set the data path pointers to dummy functions after
>>>     report RTE_ETH_EVENT_ERR_RECOVERING event.
>>> 2. The application should stop data path API invocation when process
>>>     the RTE_ETH_EVENT_ERR_RECOVERING event.
>>> 3. The PMD should set the data path pointers to valid functions before
>>>     report RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>> 4. The application should enable data path API invocation when process
>>>     the RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>>
> 
> How this is solving the race-condition, by pushing responsibility to
> stop data path to application?

Exactly, it becomes the application's responsibility to make sure the data path is
stopped/suspended before recovery continues.

> 
> What if application is not interested in recovery modes at all and not
> registered any callback for the recovery?


Are you saying there is no way for the application to disable
automatic recovery in the PMD if it is not interested
(or can't fulfil the prerequisites for it)?
If so, then yes, it is a problem and we need to fix it.
I assumed that such a mechanism to disable unwanted events already exists,
but I can't find anything.
I wonder what the easiest way would be here: can the PMD make a decision
based on the callback return value, or do we need a new API to
enable/disable callbacks, or ...?
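
To make the first option concrete, here is a minimal sketch of a decision based on callback return values. It is purely illustrative and assumes a convention that does not exist today: a callback returns 0 to accept automatic recovery and non-zero to refuse it. The names `recovery_cb`, `struct cb_entry`, `cb_accept`, `cb_refuse` and `app_accepts_recovery` are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: 0 means "I have stopped my data path". */
typedef int (*recovery_cb)(uint16_t port_id, void *arg);

struct cb_entry {
	recovery_cb fn;
	void *arg;
};

/* Example callback of an application that handles recovery events. */
static int cb_accept(uint16_t port_id, void *arg)
{
	(void)port_id;
	(void)arg;
	return 0;
}

/* Example callback of an application that cannot quiesce its data path. */
static int cb_refuse(uint16_t port_id, void *arg)
{
	(void)port_id;
	(void)arg;
	return -1;
}

/*
 * Driver-side decision: automatic recovery is safe only if at least one
 * callback is registered and none of the callbacks refuses.
 */
static bool app_accepts_recovery(const struct cb_entry *cbs, size_t n_cbs,
				 uint16_t port_id)
{
	if (n_cbs == 0)
		return false; /* nobody is listening, so nobody stopped */
	assert(cbs != NULL);
	for (size_t i = 0; i < n_cbs; i++)
		if (cbs[i].fn(port_id, cbs[i].arg) != 0)
			return false;
	return true;
}
```

A PMD using something like this would fall back to a non-automatic path whenever `app_accepts_recovery()` returns false, which also covers the case of an application that never registered a callback.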


> I think driver should not rely on application for this, unless
> application explicitly says (to driver) that it is handling recovery,
> right now there is no way for driver to know this.

I think it is vice versa:
the application should not enable auto-recovery if it can't meet
the prerequisites for it (provide an appropriate callback).


> 
>>> Also, this patch introduce a driver internal function
>>> rte_eth_fp_ops_setup which used as an help function for PMD.
>>>
>>> [1] http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-ashok.k.kaladi@intel.com/
>>>
>>> Fixes: eb0d471a8941 ("ethdev: add proactive error handling mode")
>>> Cc: stable@dpdk.org
>>>
>>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
>>> ---
>>>   doc/guides/prog_guide/poll_mode_drv.rst | 20 +++++++---------
>>>   lib/ethdev/ethdev_driver.c              |  8 +++++++
>>>   lib/ethdev/ethdev_driver.h              | 10 ++++++++
>>>   lib/ethdev/rte_ethdev.h                 | 32 +++++++++++++++----------
>>>   lib/ethdev/version.map                  |  1 +
>>>   5 files changed, 46 insertions(+), 25 deletions(-)
>>>
>>> diff --git a/doc/guides/prog_guide/poll_mode_drv.rst b/doc/guides/prog_guide/poll_mode_drv.rst
>>> index c145a9066c..e380ff135a 100644
>>> --- a/doc/guides/prog_guide/poll_mode_drv.rst
>>> +++ b/doc/guides/prog_guide/poll_mode_drv.rst
>>> @@ -638,14 +638,9 @@ different from the application invokes recovery in PASSIVE mode,
>>>   the PMD automatically recovers from error in PROACTIVE mode,
>>>   and only a small amount of work is required for the application.
>>>
>>> -During error detection and automatic recovery,
>>> -the PMD sets the data path pointers to dummy functions
>>> -(which will prevent the crash),
>>> -and also make sure the control path operations fail with a return code ``-EBUSY``.
>>> -
>>> -Because the PMD recovers automatically,
>>> -the application can only sense that the data flow is disconnected for a while
>>> -and the control API returns an error in this period.
>>> +During error detection and automatic recovery, the PMD sets the data path
>>> +pointers to dummy functions and also make sure the control path operations
>>> +failed with a return code ``-EBUSY``.
>>>
>>>   In order to sense the error happening/recovering,
>>>   as well as to restore some additional configuration,
>>> @@ -653,9 +648,9 @@ three events are available:
>>>
>>>   ``RTE_ETH_EVENT_ERR_RECOVERING``
>>>      Notify the application that an error is detected
>>> -   and the recovery is being started.
>>> +   and the recovery is about to start.
>>>      Upon receiving the event, the application should not invoke
>>> -   any control path function until receiving
>>> +   any control and data path API until receiving
>>>      ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` or ``RTE_ETH_EVENT_RECOVERY_FAILED`` event.
>>>
>>>   .. note::
>>> @@ -666,8 +661,9 @@ three events are available:
>>>
>>>   ``RTE_ETH_EVENT_RECOVERY_SUCCESS``
>>>      Notify the application that the recovery from error is successful,
>>> -   the PMD already re-configures the port,
>>> -   and the effect is the same as a restart operation.
>>> +   the PMD already re-configures the port.
>>> +   The application should restore some additional configuration, and then
>>> +   enable data path API invocation.
>>>
>>>   ``RTE_ETH_EVENT_RECOVERY_FAILED``
>>>      Notify the application that the recovery from error failed,
>>> diff --git a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c
>>> index 0be1e8ca04..f994653fe9 100644
>>> --- a/lib/ethdev/ethdev_driver.c
>>> +++ b/lib/ethdev/ethdev_driver.c
>>> @@ -515,6 +515,14 @@ rte_eth_dma_zone_free(const struct rte_eth_dev *dev, const char *ring_name,
>>>   	return rc;
>>>   }
>>>
>>> +void
>>> +rte_eth_fp_ops_setup(struct rte_eth_dev *dev)
>>> +{
>>> +	if (dev == NULL)
>>> +		return;
>>> +	eth_dev_fp_ops_setup(rte_eth_fp_ops + dev->data->port_id, dev);
>>> +}
>>> +
>>>   const struct rte_memzone *
>>>   rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char *ring_name,
>>>   			 uint16_t queue_id, size_t size, unsigned int align,
>>> diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
>>> index 2c9d615fb5..0d964d1f67 100644
>>> --- a/lib/ethdev/ethdev_driver.h
>>> +++ b/lib/ethdev/ethdev_driver.h
>>> @@ -1621,6 +1621,16 @@ int
>>>   rte_eth_dma_zone_free(const struct rte_eth_dev *eth_dev, const char *name,
>>>   		 uint16_t queue_id);
>>>
>>> +/**
>>> + * @internal
>>> + * Setup eth fast-path API to ethdev values.
>>> + *
>>> + * @param dev
>>> + *  Pointer to struct rte_eth_dev.
>>> + */
>>> +__rte_internal
>>> +void rte_eth_fp_ops_setup(struct rte_eth_dev *dev);
>>> +
>>>   /**
>>>    * @internal
>>>    * Atomically set the link status for the specific device.
>>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
>>> index 049641d57c..44ee7229c1 100644
>>> --- a/lib/ethdev/rte_ethdev.h
>>> +++ b/lib/ethdev/rte_ethdev.h
>>> @@ -3944,25 +3944,28 @@ enum rte_eth_event_type {
>>>   	 */
>>>   	RTE_ETH_EVENT_RX_AVAIL_THRESH,
>>>   	/** Port recovering from a hardware or firmware error.
>>> -	 * If PMD supports proactive error recovery,
>>> -	 * it should trigger this event to notify application
>>> -	 * that it detected an error and the recovery is being started.
>>> -	 * Upon receiving the event, the application should not invoke any control path API
>>> -	 * (such as rte_eth_dev_configure/rte_eth_dev_stop...) until receiving
>>> -	 * RTE_ETH_EVENT_RECOVERY_SUCCESS or RTE_ETH_EVENT_RECOVERY_FAILED event.
>>> -	 * The PMD will set the data path pointers to dummy functions,
>>> -	 * and re-set the data path pointers to non-dummy functions
>>> -	 * before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>> -	 * It means that the application cannot send or receive any packets
>>> -	 * during this period.
>>> +	 *
>>> +	 * If PMD supports proactive error recovery, it should trigger this
>>> +	 * event to notify application that it detected an error and the
>>> +	 * recovery is about to start.
>>> +	 *
>>> +	 * Upon receiving the event, the application should not invoke any
>>> +	 * control and data path API until receiving
>>> +	 * RTE_ETH_EVENT_RECOVERY_SUCCESS or RTE_ETH_EVENT_RECOVERY_FAILED
>>> +	 * event.
>>> +	 *
>>> +	 * Once this event is reported, the PMD will set the data path pointers
>>> +	 * to dummy functions, and re-set the data path pointers to valid
>>> +	 * functions before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>> +	 *
>>>   	 * @note Before the PMD reports the recovery result,
>>>   	 * the PMD may report the RTE_ETH_EVENT_ERR_RECOVERING event again,
>>>   	 * because a larger error may occur during the recovery.
>>>   	 */
>>>   	RTE_ETH_EVENT_ERR_RECOVERING,
>>>   	/** Port recovers successfully from the error.
>>> -	 * The PMD already re-configured the port,
>>> -	 * and the effect is the same as a restart operation.
>>> +	 *
>>> +	 * The PMD already re-configured the port:
>>>   	 * a) The following operation will be retained: (alphabetically)
>>>   	 *    - DCB configuration
>>>   	 *    - FEC configuration
>>> @@ -3989,6 +3992,9 @@ enum rte_eth_event_type {
>>>   	 *      (@see RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP)
>>>   	 * c) Any other configuration will not be stored
>>>   	 *    and will need to be re-configured.
>>> +	 *
>>> +	 * The application should restore some additional configuration
>>> +	 * (see above case b/c), and then enable data path API invocation.
>>>   	 */
>>>   	RTE_ETH_EVENT_RECOVERY_SUCCESS,
>>>   	/** Port recovery failed.
>>> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
>>> index 357d1a88c0..c273e0bdae 100644
>>> --- a/lib/ethdev/version.map
>>> +++ b/lib/ethdev/version.map
>>> @@ -320,6 +320,7 @@ INTERNAL {
>>>   	rte_eth_devices;
>>>   	rte_eth_dma_zone_free;
>>>   	rte_eth_dma_zone_reserve;
>>> +	rte_eth_fp_ops_setup;
>>>   	rte_eth_hairpin_queue_peer_bind;
>>>   	rte_eth_hairpin_queue_peer_unbind;
>>>   	rte_eth_hairpin_queue_peer_update;
>>> --
>>   
>> Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
>>
>>> 2.17.1
>>
> 


^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 1/5] ethdev: fix race-condition of proactive error handling mode
  2023-03-04  5:08       ` Honnappa Nagarahalli
@ 2023-03-05 15:23         ` Konstantin Ananyev
  2023-03-07  5:34           ` Honnappa Nagarahalli
  0 siblings, 1 reply; 85+ messages in thread
From: Konstantin Ananyev @ 2023-03-05 15:23 UTC (permalink / raw)
  To: Honnappa Nagarahalli, dev, Chengwen Feng, thomas, Ferruh Yigit,
	Andrew Rybchenko, Kalesh AP,
	Ajit Khaparde (ajit.khaparde@broadcom.com)
  Cc: nd


>>>>
>>>> In the proactive error handling mode, the PMD will set the data path
>>>> pointers to dummy functions and then try recovery, in this period the
>>>> application may still invoking data path API. This will introduce a
>>>> race-condition with data path which may lead to crash [1].
>>>>
>>>> Although the PMD added delay after setting data path pointers to
>>>> cover the above race-condition, it reduces the probability, but it
>>>> doesn't solve the problem.
>>>>
>>>> To solve the race-condition problem fundamentally, the following
>>>> requirements are added:
>>>> 1. The PMD should set the data path pointers to dummy functions after
>>>>      report RTE_ETH_EVENT_ERR_RECOVERING event.
>>> Do you mean to say, PMD should set the data path pointers after calling the
>> call back function?
>>> The PMD is running in the context of multiple EAL threads. How do these
>> threads synchronize such that only one thread sets these data pointers?
>>
>> As I understand this event callback supposed to be called in the context of EAL
>> interrupt thread (whoever is more familiar with original idea, feel free to correct
>> me if I missed something).
> I could not figure this out. It looks to be called from the data plane thread context.
> I also have a thought on alternate design at the end, appreciate if you can take a look.
>   
>> How it is going to signal data-path threads that they need to stop/suspend
>> calling data-path API - that's I suppose is left to application to decide...
>> Same as right now it is application responsibility to stop data-path threads
>> before doing dev_stop()/dev/_config()/etc.
> Ok, good, this expectation is not new. The application must have a mechanism already.
> 
>>
>>
>>>
>>>> 2. The application should stop data path API invocation when process
>>>>      the RTE_ETH_EVENT_ERR_RECOVERING event.
>>> Any thoughts on how an application can do this?
> We can ignore this question as there is already similar expectation set for earlier functionalities.
> 
>>>
>>>> 3. The PMD should set the data path pointers to valid functions before
>>>>      report RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>>> 4. The application should enable data path API invocation when process
>>>>      the RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>> Do you mean to say that the application should not call the datapath APIs
>> while the PMD is running the recovery process?
>>
>> Yes, I believe that's the intention.
> Ok, this is good and makes sense.
> 
>>
>>>>
>>>> Also, this patch introduce a driver internal function
>>>> rte_eth_fp_ops_setup which used as an help function for PMD.
>>>>
>>>> [1]
>>>>
>> http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2
>>>> -
>>>> ashok.k.kaladi@intel.com/
>>>>
>>>> Fixes: eb0d471a8941 ("ethdev: add proactive error handling mode")
>>>> Cc: stable@dpdk.org
>>>>
>>>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
>>>> ---
>>>>    doc/guides/prog_guide/poll_mode_drv.rst | 20 +++++++---------
>>>>    lib/ethdev/ethdev_driver.c              |  8 +++++++
>>>>    lib/ethdev/ethdev_driver.h              | 10 ++++++++
>>>>    lib/ethdev/rte_ethdev.h                 | 32 +++++++++++++++----------
>>>>    lib/ethdev/version.map                  |  1 +
>>>>    5 files changed, 46 insertions(+), 25 deletions(-)
>>>>
>>>> diff --git a/doc/guides/prog_guide/poll_mode_drv.rst
>>>> b/doc/guides/prog_guide/poll_mode_drv.rst
>>>> index c145a9066c..e380ff135a 100644
>>>> --- a/doc/guides/prog_guide/poll_mode_drv.rst
>>>> +++ b/doc/guides/prog_guide/poll_mode_drv.rst
>>>> @@ -638,14 +638,9 @@ different from the application invokes recovery
>>>> in PASSIVE mode,  the PMD automatically recovers from error in
>>>> PROACTIVE mode,  and only a small amount of work is required for the
>> application.
>>>>
>>>> -During error detection and automatic recovery, -the PMD sets the
>>>> data path pointers to dummy functions -(which will prevent the
>>>> crash), -and also make sure the control path operations fail with a return
>> code ``-EBUSY``.
>>>> -
>>>> -Because the PMD recovers automatically, -the application can only
>>>> sense that the data flow is disconnected for a while -and the control
>>>> API returns an error in this period.
>>>> +During error detection and automatic recovery, the PMD sets the data
>>>> +path pointers to dummy functions and also make sure the control path
>>>> +operations failed with a return code ``-EBUSY``.
>>>>
>>>>    In order to sense the error happening/recovering,  as well as to
>>>> restore some additional configuration, @@ -653,9 +648,9 @@ three events
>> are available:
>>>>
>>>>    ``RTE_ETH_EVENT_ERR_RECOVERING``
>>>>       Notify the application that an error is detected
>>>> -   and the recovery is being started.
>>>> +   and the recovery is about to start.
>>>>       Upon receiving the event, the application should not invoke
>>>> -   any control path function until receiving
>>>> +   any control and data path API until receiving
>>>>       ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` or
>>>> ``RTE_ETH_EVENT_RECOVERY_FAILED`` event.
>>>>
>>>>    .. note::
>>>> @@ -666,8 +661,9 @@ three events are available:
>>>>
>>>>    ``RTE_ETH_EVENT_RECOVERY_SUCCESS``
>>>>       Notify the application that the recovery from error is successful,
>>>> -   the PMD already re-configures the port,
>>>> -   and the effect is the same as a restart operation.
>>>> +   the PMD already re-configures the port.
>>>> +   The application should restore some additional configuration, and
>>>> + then
>>> What is the additional configuration? Is this specific to each NIC/PMD?
>>> I thought, this is an auto recovery process and the application does not require
>> to reconfigure anything. If the application has to restore the configuration, how
>> does auto recovery differ from typical recovery process?
>>>
>>>> +   enable data path API invocation.
>>>>
>>>>    ``RTE_ETH_EVENT_RECOVERY_FAILED``
>>>>       Notify the application that the recovery from error failed, diff
>>>> --git a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c index
>>>> 0be1e8ca04..f994653fe9 100644
>>>> --- a/lib/ethdev/ethdev_driver.c
>>>> +++ b/lib/ethdev/ethdev_driver.c
>>>> @@ -515,6 +515,14 @@ rte_eth_dma_zone_free(const struct rte_eth_dev
>>>> *dev, const char *ring_name,
>>>>    	return rc;
>>>>    }
>>>>
>>>> +void
>>>> +rte_eth_fp_ops_setup(struct rte_eth_dev *dev) {
>>>> +	if (dev == NULL)
>>>> +		return;
>>>> +	eth_dev_fp_ops_setup(rte_eth_fp_ops + dev->data->port_id, dev); }
>>>> +
>>>>    const struct rte_memzone *
>>>>    rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char
>>>> *ring_name,
>>>>    			 uint16_t queue_id, size_t size, unsigned int align, diff -
>> -git
>>>> a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h index
>>>> 2c9d615fb5..0d964d1f67 100644
>>>> --- a/lib/ethdev/ethdev_driver.h
>>>> +++ b/lib/ethdev/ethdev_driver.h
>>>> @@ -1621,6 +1621,16 @@ int
>>>>    rte_eth_dma_zone_free(const struct rte_eth_dev *eth_dev, const char
>>>> *name,
>>>>    		 uint16_t queue_id);
>>>>
>>>> +/**
>>>> + * @internal
>>>> + * Setup eth fast-path API to ethdev values.
>>>> + *
>>>> + * @param dev
>>>> + *  Pointer to struct rte_eth_dev.
>>>> + */
>>>> +__rte_internal
>>>> +void rte_eth_fp_ops_setup(struct rte_eth_dev *dev);
>>>> +
>>>>    /**
>>>>     * @internal
>>>>     * Atomically set the link status for the specific device.
>>>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h index
>>>> 049641d57c..44ee7229c1 100644
>>>> --- a/lib/ethdev/rte_ethdev.h
>>>> +++ b/lib/ethdev/rte_ethdev.h
>>>> @@ -3944,25 +3944,28 @@ enum rte_eth_event_type {
>>>>    	 */
>>>>    	RTE_ETH_EVENT_RX_AVAIL_THRESH,
>>>>    	/** Port recovering from a hardware or firmware error.
>>>> -	 * If PMD supports proactive error recovery,
>>>> -	 * it should trigger this event to notify application
>>>> -	 * that it detected an error and the recovery is being started.
>>>> -	 * Upon receiving the event, the application should not invoke any
>>>> control path API
>>>> -	 * (such as rte_eth_dev_configure/rte_eth_dev_stop...) until receiving
>>>> -	 * RTE_ETH_EVENT_RECOVERY_SUCCESS or
>>>> RTE_ETH_EVENT_RECOVERY_FAILED event.
>>>> -	 * The PMD will set the data path pointers to dummy functions,
>>>> -	 * and re-set the data path pointers to non-dummy functions
>>>> -	 * before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>>> -	 * It means that the application cannot send or receive any packets
>>>> -	 * during this period.
>>>> +	 *
>>>> +	 * If PMD supports proactive error recovery, it should trigger this
>>>> +	 * event to notify application that it detected an error and the
>>>> +	 * recovery is about to start.
>>>> +	 *
>>>> +	 * Upon receiving the event, the application should not invoke any
>>>> +	 * control and data path API until receiving
>>>> +	 * RTE_ETH_EVENT_RECOVERY_SUCCESS or
>>>> RTE_ETH_EVENT_RECOVERY_FAILED
>>>> +	 * event.
>>>> +	 *
>>>> +	 * Once this event is reported, the PMD will set the data path pointers
>>>> +	 * to dummy functions, and re-set the data path pointers to valid
>>>> +	 * functions before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS
>>>> event.
>>> Why do we need to set the data path pointers to dummy functions if the
>>> application is restricted from invoking any control and data path APIs till the
>>> recovery process is completed?
>>
>> You are right, in theory it is not mandatory.
>> Though it helps to flag a problem if user will still try to call them while recovery is
>> in progress.
> Ok, may be in debug mode.
> I mean, we have already set an expectation to the application that it should not call and the application has implemented a method to do the same. Why do we need to complicate this?
> If the application calls the APIs, it is a programming error.


My preference would be to keep it this way for both debug and non-debug
mode.
It doesn't cost us anything in terms of performance, but it helps to
catch problems with a misbehaving app.
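
To illustrate the point above, here is a minimal sketch of why swapping in a dummy burst function costs nothing on the fast path while still flagging a misbehaving application. It is not the ethdev implementation: `struct fp_ops`, `real_rx_burst`, `dummy_rx_burst`, `app_rx_burst` and `misuse_count` are all hypothetical names, and the logging is only one possible way to surface the misuse.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a per-port fast-path ops entry. */
typedef uint16_t (*burst_fn)(void *queue, void **pkts, uint16_t nb_pkts);

struct fp_ops {
	burst_fn rx_pkt_burst;
};

/* Number of burst calls made while recovery was in progress. */
static unsigned int misuse_count;

/* Stands in for the PMD's real receive routine. */
static uint16_t
real_rx_burst(void *queue, void **pkts, uint16_t nb_pkts)
{
	(void)queue;
	(void)pkts;
	return nb_pkts;
}

/* Installed during recovery: touches no hardware and records the misuse. */
static uint16_t
dummy_rx_burst(void *queue, void **pkts, uint16_t nb_pkts)
{
	(void)queue;
	(void)pkts;
	(void)nb_pkts;
	misuse_count++;
	fprintf(stderr, "rx burst called while port is recovering\n");
	return 0;
}

/* The caller always makes the same indirect call, so the normal path
 * pays no extra branch for the recovery check. */
static uint16_t
app_rx_burst(struct fp_ops *ops, void **pkts, uint16_t nb_pkts)
{
	return ops->rx_pkt_burst(NULL, pkts, nb_pkts);
}
```

The check lives entirely in which pointer is installed, which is why it can stay enabled in non-debug builds.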

> 
>> Again, same as we doing in dev_stop().
> 
>>
>>>
>>>> +	 *
>>>>    	 * @note Before the PMD reports the recovery result,
>>>>    	 * the PMD may report the RTE_ETH_EVENT_ERR_RECOVERING event
>>>> again,
>>>>    	 * because a larger error may occur during the recovery.
>>>>    	 */
>>>>    	RTE_ETH_EVENT_ERR_RECOVERING,
>>> I understand this is not a change in this patch. But, just wondering, what is the
>>> purpose of this? How is the application supposed to use this?
>>>
>>>>    	/** Port recovers successfully from the error.
>>>> -	 * The PMD already re-configured the port,
>>>> -	 * and the effect is the same as a restart operation.
>>>> +	 *
>>>> +	 * The PMD already re-configured the port:
>>>>    	 * a) The following operation will be retained: (alphabetically)
>>>>    	 *    - DCB configuration
>>>>    	 *    - FEC configuration
>>>> @@ -3989,6 +3992,9 @@ enum rte_eth_event_type {
>>>>    	 *      (@see RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP)
>>>>    	 * c) Any other configuration will not be stored
>>>>    	 *    and will need to be re-configured.
>>>> +	 *
>>>> +	 * The application should restore some additional configuration
>>>> +	 * (see above case b/c), and then enable data path API invocation.
>>>>    	 */
>>>>    	RTE_ETH_EVENT_RECOVERY_SUCCESS,
>>>>    	/** Port recovery failed.
>>>> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map index
>>>> 357d1a88c0..c273e0bdae 100644
>>>> --- a/lib/ethdev/version.map
>>>> +++ b/lib/ethdev/version.map
>>>> @@ -320,6 +320,7 @@ INTERNAL {
>>>>    	rte_eth_devices;
>>>>    	rte_eth_dma_zone_free;
>>>>    	rte_eth_dma_zone_reserve;
>>>> +	rte_eth_fp_ops_setup;
>>>>    	rte_eth_hairpin_queue_peer_bind;
>>>>    	rte_eth_hairpin_queue_peer_unbind;
>>>>    	rte_eth_hairpin_queue_peer_update;
>>>> --
>>>> 2.17.1
>>>
> 
> Is there any reason not to design this in the same way as 'rte_eth_dev_reset'? Why does the PMD have to recover by itself?

I suppose it is a question for the authors of original patch...

> We could have a similar API 'rte_eth_dev_recover' to do the recovery functionality.

I suppose such an approach is also possible.
Personally I am fine with both ways: either the existing one or what you
propose, as long as we fix the existing race condition.
The good thing about what you suggest is that we probably don't need to
worry about how to let the user enable/disable auto-recovery inside the PMD.

Konstantin




* Re: [PATCH 4/5] net/bnxt: use fp ops setup function
  2023-03-03  1:38     ` fengchengwen
@ 2023-03-05 15:57       ` Konstantin Ananyev
  2023-03-06  2:47         ` Ajit Khaparde
  0 siblings, 1 reply; 85+ messages in thread
From: Konstantin Ananyev @ 2023-03-05 15:57 UTC (permalink / raw)
  To: fengchengwen, Konstantin Ananyev, thomas, ferruh.yigit,
	Ajit Khaparde, Somnath Kotur
  Cc: dev

03/03/2023 01:38, fengchengwen wrote:
> On 2023/3/2 20:30, Konstantin Ananyev wrote:
>>
>>> Use rte_eth_fp_ops_setup() instead of directly manipulating
>>> rte_eth_fp_ops variable.
>>>
>>> Cc: stable@dpdk.org
>>>
>>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
>>> ---
>>>   drivers/net/bnxt/bnxt_cpr.c    | 5 +----
>>>   drivers/net/bnxt/bnxt_ethdev.c | 5 +----
>>>   2 files changed, 2 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/drivers/net/bnxt/bnxt_cpr.c b/drivers/net/bnxt/bnxt_cpr.c
>>> index 3950840600..a3f33c24c3 100644
>>> --- a/drivers/net/bnxt/bnxt_cpr.c
>>> +++ b/drivers/net/bnxt/bnxt_cpr.c
>>> @@ -416,10 +416,7 @@ void bnxt_stop_rxtx(struct rte_eth_dev *eth_dev)
>>>   	eth_dev->rx_pkt_burst = rte_eth_pkt_burst_dummy;
>>>   	eth_dev->tx_pkt_burst = rte_eth_pkt_burst_dummy;
>>
>> I am not that familiar with bnxt driver, but shouldn't we set here
>> other optional fp_ops (descripto_status, etc.) to some dummy values OR to null values?
> 
> I checked the bnxt PMD code, the other fp_ops (rx_queue_count/rx_descriptor_status/tx_descriptor_status)
> both add following logic at the beginning of function:
> 
> 	rc = is_bnxt_in_error(bp);
> 	if (rc)
> 		return rc;
> 
> So I think it okey to keep it.

I still think it is much safer/cleaner to update all fp_ops in one go
(use fp_ops_reset()) here.
But as you believe it would work either way, I'll leave it to bnxt
maintainers to decide.
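
A minimal sketch of the "update all fp_ops in one go" idea, for illustration only. The structure below is a hypothetical, simplified stand-in for the real fast-path ops table, and `fp_ops_reset_all`, the `real_*` and the `dummy_*` helpers are assumed names, not the ethdev internals. The point it shows is that a whole-struct reset leaves no optional callback pointing at driver code, whereas patching only the two burst pointers relies on every other callback checking the error state by itself.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical, simplified fast-path ops table. */
struct fp_ops {
	uint16_t (*rx_pkt_burst)(void *q, void **pkts, uint16_t n);
	uint16_t (*tx_pkt_burst)(void *q, void **pkts, uint16_t n);
	int (*rx_queue_count)(void *q);
	int (*rx_descriptor_status)(void *q, uint16_t offset);
	int (*tx_descriptor_status)(void *q, uint16_t offset);
};

/* Stand-ins for driver routines that would touch hardware. */
static uint16_t
real_burst(void *q, void **pkts, uint16_t n)
{
	(void)q;
	(void)pkts;
	return n;
}

static int
real_queue_count(void *q)
{
	(void)q;
	return 7;
}

/* Dummies: never touch hardware, report "nothing done" or "unsupported". */
static uint16_t
dummy_burst(void *q, void **pkts, uint16_t n)
{
	(void)q;
	(void)pkts;
	(void)n;
	return 0;
}

static int
dummy_queue_count(void *q)
{
	(void)q;
	return -ENOTSUP;
}

static int
dummy_desc_status(void *q, uint16_t offset)
{
	(void)q;
	(void)offset;
	return -ENOTSUP;
}

/* Reset every callback with a single struct assignment. */
static void
fp_ops_reset_all(struct fp_ops *ops)
{
	static const struct fp_ops dummy = {
		.rx_pkt_burst = dummy_burst,
		.tx_pkt_burst = dummy_burst,
		.rx_queue_count = dummy_queue_count,
		.rx_descriptor_status = dummy_desc_status,
		.tx_descriptor_status = dummy_desc_status,
	};

	*ops = dummy;
}
```

With this shape, adding a new optional callback later only requires extending the dummy initializer, not every place that enters recovery.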


> 
>>   
>>>
>>> -	rte_eth_fp_ops[eth_dev->data->port_id].rx_pkt_burst =
>>> -		eth_dev->rx_pkt_burst;
>>> -	rte_eth_fp_ops[eth_dev->data->port_id].tx_pkt_burst =
>>> -		eth_dev->tx_pkt_burst;
>>> +	rte_eth_fp_ops_setup(eth_dev);
>>>   	rte_mb();
>>>
>>>   	/* Allow time for threads to exit the real burst functions. */
>>> diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
>>> index 4083a69d02..d6064ceea4 100644
>>> --- a/drivers/net/bnxt/bnxt_ethdev.c
>>> +++ b/drivers/net/bnxt/bnxt_ethdev.c
>>> @@ -4374,10 +4374,7 @@ static void bnxt_dev_recover(void *arg)
>>>   	if (rc)
>>>   		goto err_start;
>>>
>>> -	rte_eth_fp_ops[bp->eth_dev->data->port_id].rx_pkt_burst =
>>> -		bp->eth_dev->rx_pkt_burst;
>>> -	rte_eth_fp_ops[bp->eth_dev->data->port_id].tx_pkt_burst =
>>> -		bp->eth_dev->tx_pkt_burst;
>>> +	rte_eth_fp_ops_setup(bp->eth_dev);
>>>   	rte_mb();
>>>
>>>   	PMD_DRV_LOG(INFO, "Port: %u Recovered from FW reset\n",
>>> --
>>> 2.17.1
>>
>> .
>>



* Re: [PATCH 1/5] ethdev: fix race-condition of proactive error handling mode
  2023-03-03 16:51     ` Ferruh Yigit
  2023-03-05 14:53       ` Konstantin Ananyev
@ 2023-03-06  1:41       ` fengchengwen
  2023-03-06  8:57         ` Ferruh Yigit
  2023-03-06  9:10         ` Ferruh Yigit
  1 sibling, 2 replies; 85+ messages in thread
From: fengchengwen @ 2023-03-06  1:41 UTC (permalink / raw)
  To: Ferruh Yigit, Konstantin Ananyev, thomas, Andrew Rybchenko,
	Kalesh AP, Ajit Khaparde
  Cc: dev

On 2023/3/4 0:51, Ferruh Yigit wrote:
> On 3/2/2023 12:08 PM, Konstantin Ananyev wrote:
>>
>>> In the proactive error handling mode, the PMD will set the data path
>>> pointers to dummy functions and then try recovery, in this period the
>>> application may still invoking data path API. This will introduce a
>>> race-condition with data path which may lead to crash [1].
>>>
>>> Although the PMD added delay after setting data path pointers to cover
>>> the above race-condition, it reduces the probability, but it doesn't
>>> solve the problem.
>>>
>>> To solve the race-condition problem fundamentally, the following
>>> requirements are added:
>>> 1. The PMD should set the data path pointers to dummy functions after
>>>    report RTE_ETH_EVENT_ERR_RECOVERING event.
>>> 2. The application should stop data path API invocation when process
>>>    the RTE_ETH_EVENT_ERR_RECOVERING event.
>>> 3. The PMD should set the data path pointers to valid functions before
>>>    report RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>> 4. The application should enable data path API invocation when process
>>>    the RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>>
> 
> How this is solving the race-condition, by pushing responsibility to
> stop data path to application?

Yes, I think it's more practical to collaborate with the application.

The application controls all API invocation (both control path and data path);
from the DPDK SDK's perspective, it is the one with the global view.
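
A minimal sketch of what this collaboration could look like on the application side, assuming the application gates its polling loop on a flag. Everything here is hypothetical: the `port_event` enum mirrors the three recovery events by name only, and `recovery_event_cb`, `worker_poll_once` and `datapath_enabled` are illustrative names, not DPDK APIs. In a real application the callback would be registered through the ethdev event callback mechanism and the worker would call the burst functions where the comment indicates.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of the three recovery events discussed here. */
enum port_event {
	EVT_ERR_RECOVERING,
	EVT_RECOVERY_SUCCESS,
	EVT_RECOVERY_FAILED,
};

/* Gate checked by worker threads before any data path call. */
static atomic_bool datapath_enabled = true;

/* Application event handler (assumed shape, not the ethdev prototype). */
static int
recovery_event_cb(uint16_t port_id, enum port_event ev)
{
	(void)port_id;

	switch (ev) {
	case EVT_ERR_RECOVERING:
		/* Requirement 2: stop data path API invocation. */
		atomic_store_explicit(&datapath_enabled, false,
				      memory_order_release);
		break;
	case EVT_RECOVERY_SUCCESS:
		/* Restore any non-retained configuration here, then
		 * requirement 4: enable data path API invocation. */
		atomic_store_explicit(&datapath_enabled, true,
				      memory_order_release);
		break;
	case EVT_RECOVERY_FAILED:
		/* Leave the data path disabled; the port is unusable. */
		break;
	}
	return 0;
}

/* One iteration of a worker loop; returns true if it polled the port. */
static bool
worker_poll_once(uint16_t port_id)
{
	(void)port_id;

	if (!atomic_load_explicit(&datapath_enabled, memory_order_acquire))
		return false;
	/* The rx/tx burst calls would go here. */
	return true;
}
```

Note that the flag alone does not prove that a worker already inside a burst call has returned. An application following this pattern would also need the callback to wait until every worker has observed the flag before it lets the recovery proceed.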

> 
> What if application is not interested in recovery modes at all and not
> registered any callback for the recovery?

Then there is probably still a race condition which may lead to a crash. But
because DPDK worker threads run a busy loop on isolated cores, and the PMDs
also add a delay, the actual probability of occurrence is very low; at least
for the hns3 PMD it has not been hit in at least four years.

> 
> I think driver should not rely on application for this, unless
> application explicitly says (to driver) that it is handling recovery,

If the application registers the event callback, the PMD can deduce that the application is aware of the recovery.
If the application does not register one, the PMD will recover by itself and the race condition may remain.

> right now there is no way for driver to know this.
> 
> 
>>> Also, this patch introduce a driver internal function
>>> rte_eth_fp_ops_setup which used as an help function for PMD.
>>>
>>> [1] http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-ashok.k.kaladi@intel.com/
>>>
>>> Fixes: eb0d471a8941 ("ethdev: add proactive error handling mode")
>>> Cc: stable@dpdk.org
>>>
>>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
>>> ---
>>>  doc/guides/prog_guide/poll_mode_drv.rst | 20 +++++++---------
>>>  lib/ethdev/ethdev_driver.c              |  8 +++++++
>>>  lib/ethdev/ethdev_driver.h              | 10 ++++++++
>>>  lib/ethdev/rte_ethdev.h                 | 32 +++++++++++++++----------
>>>  lib/ethdev/version.map                  |  1 +
>>>  5 files changed, 46 insertions(+), 25 deletions(-)
>>>
>>> diff --git a/doc/guides/prog_guide/poll_mode_drv.rst b/doc/guides/prog_guide/poll_mode_drv.rst
>>> index c145a9066c..e380ff135a 100644
>>> --- a/doc/guides/prog_guide/poll_mode_drv.rst
>>> +++ b/doc/guides/prog_guide/poll_mode_drv.rst
>>> @@ -638,14 +638,9 @@ different from the application invokes recovery in PASSIVE mode,
>>>  the PMD automatically recovers from error in PROACTIVE mode,
>>>  and only a small amount of work is required for the application.
>>>
>>> -During error detection and automatic recovery,
>>> -the PMD sets the data path pointers to dummy functions
>>> -(which will prevent the crash),
>>> -and also make sure the control path operations fail with a return code ``-EBUSY``.
>>> -
>>> -Because the PMD recovers automatically,
>>> -the application can only sense that the data flow is disconnected for a while
>>> -and the control API returns an error in this period.
>>> +During error detection and automatic recovery, the PMD sets the data path
>>> +pointers to dummy functions and also make sure the control path operations
>>> +failed with a return code ``-EBUSY``.
>>>
>>>  In order to sense the error happening/recovering,
>>>  as well as to restore some additional configuration,
>>> @@ -653,9 +648,9 @@ three events are available:
>>>
>>>  ``RTE_ETH_EVENT_ERR_RECOVERING``
>>>     Notify the application that an error is detected
>>> -   and the recovery is being started.
>>> +   and the recovery is about to start.
>>>     Upon receiving the event, the application should not invoke
>>> -   any control path function until receiving
>>> +   any control and data path API until receiving
>>>     ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` or ``RTE_ETH_EVENT_RECOVERY_FAILED`` event.
>>>
>>>  .. note::
>>> @@ -666,8 +661,9 @@ three events are available:
>>>
>>>  ``RTE_ETH_EVENT_RECOVERY_SUCCESS``
>>>     Notify the application that the recovery from error is successful,
>>> -   the PMD already re-configures the port,
>>> -   and the effect is the same as a restart operation.
>>> +   the PMD already re-configures the port.
>>> +   The application should restore some additional configuration, and then
>>> +   enable data path API invocation.
>>>
>>>  ``RTE_ETH_EVENT_RECOVERY_FAILED``
>>>     Notify the application that the recovery from error failed,
>>> diff --git a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c
>>> index 0be1e8ca04..f994653fe9 100644
>>> --- a/lib/ethdev/ethdev_driver.c
>>> +++ b/lib/ethdev/ethdev_driver.c
>>> @@ -515,6 +515,14 @@ rte_eth_dma_zone_free(const struct rte_eth_dev *dev, const char *ring_name,
>>>  	return rc;
>>>  }
>>>
>>> +void
>>> +rte_eth_fp_ops_setup(struct rte_eth_dev *dev)
>>> +{
>>> +	if (dev == NULL)
>>> +		return;
>>> +	eth_dev_fp_ops_setup(rte_eth_fp_ops + dev->data->port_id, dev);
>>> +}
>>> +
>>>  const struct rte_memzone *
>>>  rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char *ring_name,
>>>  			 uint16_t queue_id, size_t size, unsigned int align,
>>> diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
>>> index 2c9d615fb5..0d964d1f67 100644
>>> --- a/lib/ethdev/ethdev_driver.h
>>> +++ b/lib/ethdev/ethdev_driver.h
>>> @@ -1621,6 +1621,16 @@ int
>>>  rte_eth_dma_zone_free(const struct rte_eth_dev *eth_dev, const char *name,
>>>  		 uint16_t queue_id);
>>>
>>> +/**
>>> + * @internal
>>> + * Setup eth fast-path API to ethdev values.
>>> + *
>>> + * @param dev
>>> + *  Pointer to struct rte_eth_dev.
>>> + */
>>> +__rte_internal
>>> +void rte_eth_fp_ops_setup(struct rte_eth_dev *dev);
>>> +
>>>  /**
>>>   * @internal
>>>   * Atomically set the link status for the specific device.
>>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
>>> index 049641d57c..44ee7229c1 100644
>>> --- a/lib/ethdev/rte_ethdev.h
>>> +++ b/lib/ethdev/rte_ethdev.h
>>> @@ -3944,25 +3944,28 @@ enum rte_eth_event_type {
>>>  	 */
>>>  	RTE_ETH_EVENT_RX_AVAIL_THRESH,
>>>  	/** Port recovering from a hardware or firmware error.
>>> -	 * If PMD supports proactive error recovery,
>>> -	 * it should trigger this event to notify application
>>> -	 * that it detected an error and the recovery is being started.
>>> -	 * Upon receiving the event, the application should not invoke any control path API
>>> -	 * (such as rte_eth_dev_configure/rte_eth_dev_stop...) until receiving
>>> -	 * RTE_ETH_EVENT_RECOVERY_SUCCESS or RTE_ETH_EVENT_RECOVERY_FAILED event.
>>> -	 * The PMD will set the data path pointers to dummy functions,
>>> -	 * and re-set the data path pointers to non-dummy functions
>>> -	 * before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>> -	 * It means that the application cannot send or receive any packets
>>> -	 * during this period.
>>> +	 *
>>> +	 * If PMD supports proactive error recovery, it should trigger this
>>> +	 * event to notify application that it detected an error and the
>>> +	 * recovery is about to start.
>>> +	 *
>>> +	 * Upon receiving the event, the application should not invoke any
>>> +	 * control and data path API until receiving
>>> +	 * RTE_ETH_EVENT_RECOVERY_SUCCESS or RTE_ETH_EVENT_RECOVERY_FAILED
>>> +	 * event.
>>> +	 *
>>> +	 * Once this event is reported, the PMD will set the data path pointers
>>> +	 * to dummy functions, and re-set the data path pointers to valid
>>> +	 * functions before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>> +	 *
>>>  	 * @note Before the PMD reports the recovery result,
>>>  	 * the PMD may report the RTE_ETH_EVENT_ERR_RECOVERING event again,
>>>  	 * because a larger error may occur during the recovery.
>>>  	 */
>>>  	RTE_ETH_EVENT_ERR_RECOVERING,
>>>  	/** Port recovers successfully from the error.
>>> -	 * The PMD already re-configured the port,
>>> -	 * and the effect is the same as a restart operation.
>>> +	 *
>>> +	 * The PMD already re-configured the port:
>>>  	 * a) The following operation will be retained: (alphabetically)
>>>  	 *    - DCB configuration
>>>  	 *    - FEC configuration
>>> @@ -3989,6 +3992,9 @@ enum rte_eth_event_type {
>>>  	 *      (@see RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP)
>>>  	 * c) Any other configuration will not be stored
>>>  	 *    and will need to be re-configured.
>>> +	 *
>>> +	 * The application should restore some additional configuration
>>> +	 * (see above case b/c), and then enable data path API invocation.
>>>  	 */
>>>  	RTE_ETH_EVENT_RECOVERY_SUCCESS,
>>>  	/** Port recovery failed.
>>> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
>>> index 357d1a88c0..c273e0bdae 100644
>>> --- a/lib/ethdev/version.map
>>> +++ b/lib/ethdev/version.map
>>> @@ -320,6 +320,7 @@ INTERNAL {
>>>  	rte_eth_devices;
>>>  	rte_eth_dma_zone_free;
>>>  	rte_eth_dma_zone_reserve;
>>> +	rte_eth_fp_ops_setup;
>>>  	rte_eth_hairpin_queue_peer_bind;
>>>  	rte_eth_hairpin_queue_peer_unbind;
>>>  	rte_eth_hairpin_queue_peer_update;
>>> --
>>  
>> Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
>>
>>> 2.17.1
>>
> 
> .
> 


* Re: [PATCH 4/5] net/bnxt: use fp ops setup function
  2023-03-05 15:57       ` Konstantin Ananyev
@ 2023-03-06  2:47         ` Ajit Khaparde
  0 siblings, 0 replies; 85+ messages in thread
From: Ajit Khaparde @ 2023-03-06  2:47 UTC (permalink / raw)
  To: Konstantin Ananyev
  Cc: fengchengwen, Konstantin Ananyev, thomas, ferruh.yigit,
	Somnath Kotur, dev


On Sun, Mar 5, 2023 at 7:58 AM Konstantin Ananyev
<konstantin.v.ananyev@yandex.ru> wrote:
>
> 03/03/2023 01:38, fengchengwen wrote:
> > On 2023/3/2 20:30, Konstantin Ananyev wrote:
> >>
> >>> Use rte_eth_fp_ops_setup() instead of directly manipulating
> >>> rte_eth_fp_ops variable.
> >>>
> >>> Cc: stable@dpdk.org
> >>>
> >>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> >>> ---
> >>>   drivers/net/bnxt/bnxt_cpr.c    | 5 +----
> >>>   drivers/net/bnxt/bnxt_ethdev.c | 5 +----
> >>>   2 files changed, 2 insertions(+), 8 deletions(-)
> >>>
> >>> diff --git a/drivers/net/bnxt/bnxt_cpr.c b/drivers/net/bnxt/bnxt_cpr.c
> >>> index 3950840600..a3f33c24c3 100644
> >>> --- a/drivers/net/bnxt/bnxt_cpr.c
> >>> +++ b/drivers/net/bnxt/bnxt_cpr.c
> >>> @@ -416,10 +416,7 @@ void bnxt_stop_rxtx(struct rte_eth_dev *eth_dev)
> >>>     eth_dev->rx_pkt_burst = rte_eth_pkt_burst_dummy;
> >>>     eth_dev->tx_pkt_burst = rte_eth_pkt_burst_dummy;
> >>
> >> I am not that familiar with bnxt driver, but shouldn't we set here
> >> other optional fp_ops (descripto_status, etc.) to some dummy values OR to null values?
> >
> > I checked the bnxt PMD code, the other fp_ops (rx_queue_count/rx_descriptor_status/tx_descriptor_status)
> > both add following logic at the beginning of function:
> >
> >       rc = is_bnxt_in_error(bp);
> >       if (rc)
> >               return rc;
> >
> > So I think it okey to keep it.
>
> I still think it is much safer/cleaner to update all fp_ops in one go
> (use fp_ops_reset()) here.
> But as you believe it would work either way, I'll leave it to bnxt
> maintainers to decide.

We have been operating without the application being aware of the
underlying functionality for some time now.
Each step here is an improvement.
I think it is okay to keep it simple and update them separately.

Thanks
Ajit

>
>
> >
> >>
> >>>
> >>> -   rte_eth_fp_ops[eth_dev->data->port_id].rx_pkt_burst =
> >>> -           eth_dev->rx_pkt_burst;
> >>> -   rte_eth_fp_ops[eth_dev->data->port_id].tx_pkt_burst =
> >>> -           eth_dev->tx_pkt_burst;
> >>> +   rte_eth_fp_ops_setup(eth_dev);
> >>>     rte_mb();
> >>>
> >>>     /* Allow time for threads to exit the real burst functions. */
> >>> diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
> >>> index 4083a69d02..d6064ceea4 100644
> >>> --- a/drivers/net/bnxt/bnxt_ethdev.c
> >>> +++ b/drivers/net/bnxt/bnxt_ethdev.c
> >>> @@ -4374,10 +4374,7 @@ static void bnxt_dev_recover(void *arg)
> >>>     if (rc)
> >>>             goto err_start;
> >>>
> >>> -   rte_eth_fp_ops[bp->eth_dev->data->port_id].rx_pkt_burst =
> >>> -           bp->eth_dev->rx_pkt_burst;
> >>> -   rte_eth_fp_ops[bp->eth_dev->data->port_id].tx_pkt_burst =
> >>> -           bp->eth_dev->tx_pkt_burst;
> >>> +   rte_eth_fp_ops_setup(bp->eth_dev);
> >>>     rte_mb();
> >>>
> >>>     PMD_DRV_LOG(INFO, "Port: %u Recovered from FW reset\n",
> >>> --
> >>> 2.17.1
> >>
> >> .
> >>
>


* Re: [PATCH 1/5] ethdev: fix race-condition of proactive error handling mode
  2023-03-05 14:53       ` Konstantin Ananyev
@ 2023-03-06  8:55         ` Ferruh Yigit
  2023-03-06 10:22           ` Konstantin Ananyev
  0 siblings, 1 reply; 85+ messages in thread
From: Ferruh Yigit @ 2023-03-06  8:55 UTC (permalink / raw)
  To: Konstantin Ananyev, dev

On 3/5/2023 2:53 PM, Konstantin Ananyev wrote:
> 03/03/2023 16:51, Ferruh Yigit wrote:
>> On 3/2/2023 12:08 PM, Konstantin Ananyev wrote:
>>>
>>>> In the proactive error handling mode, the PMD will set the data path
>>>> pointers to dummy functions and then try recovery, in this period the
>>>> application may still invoking data path API. This will introduce a
>>>> race-condition with data path which may lead to crash [1].
>>>>
>>>> Although the PMD added delay after setting data path pointers to cover
>>>> the above race-condition, it reduces the probability, but it doesn't
>>>> solve the problem.
>>>>
>>>> To solve the race-condition problem fundamentally, the following
>>>> requirements are added:
>>>> 1. The PMD should set the data path pointers to dummy functions after
>>>>     report RTE_ETH_EVENT_ERR_RECOVERING event.
>>>> 2. The application should stop data path API invocation when process
>>>>     the RTE_ETH_EVENT_ERR_RECOVERING event.
>>>> 3. The PMD should set the data path pointers to valid functions before
>>>>     report RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>>> 4. The application should enable data path API invocation when process
>>>>     the RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>>>
>>
>> How this is solving the race-condition, by pushing responsibility to
>> stop data path to application?
> 
> Exactly, it becomes application responsibility to make sure data-path is
> stopped/suspended before recovery will continue.
> 

From documentation of the feature:

``
Because the PMD recovers automatically,
the application can only sense that the data flow is disconnected for a
while and the control API returns an error in this period.

In order to sense the error happening/recovering, as well as to restore
some additional configuration, three events are available:
``

It looks like the initial design uses events mainly to inform the application
about what happened, and mainly for re-configuration.

Although I don't disagree with involving the application, I am not sure
that is part of the current design.

>>
>> What if application is not interested in recovery modes at all and not
>> registered any callback for the recovery?
> 
> 
> Are you saying there is no way for application to disable
> automatic recovery in PMD if it is not interested
> (or can't fulfill the prerequisites for it)?
> If so, then yes it is a problem and we need to fix it.
> I assumed that such mechanism to disable unwanted events already exists,
> but I can't find anything.
> Wonder what would be the easiest way here - can PMD make a decision
> based on callback return value, or do we need a new API to
> enable/disable callbacks, or ...?
> 
> 

As far as I can see, automatic recovery is not configurable by the app.

But that is not all: the PMD sends events to the application, but the PMD
can't know whether the application is handling them or not, so with the
current design the PMD can't rely on the app.


>> I think driver should not rely on application for this, unless
>> application explicitly says (to driver) that it is handling recovery,
>> right now there is no way for driver to know this.
> 
> I think it is vice versa:
> application should not enable auto-recovery if it can't meet
> the prerequisites for it (provide appropriate callback).
> 

I agree with the above; we are saying similar things from different perspectives.

> 
>>
>>>> Also, this patch introduce a driver internal function
>>>> rte_eth_fp_ops_setup which used as an help function for PMD.
>>>>
>>>> [1]
>>>> http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-ashok.k.kaladi@intel.com/
>>>>
>>>> Fixes: eb0d471a8941 ("ethdev: add proactive error handling mode")
>>>> Cc: stable@dpdk.org
>>>>
>>>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
>>>> ---
>>>>   doc/guides/prog_guide/poll_mode_drv.rst | 20 +++++++---------
>>>>   lib/ethdev/ethdev_driver.c              |  8 +++++++
>>>>   lib/ethdev/ethdev_driver.h              | 10 ++++++++
>>>>   lib/ethdev/rte_ethdev.h                 | 32
>>>> +++++++++++++++----------
>>>>   lib/ethdev/version.map                  |  1 +
>>>>   5 files changed, 46 insertions(+), 25 deletions(-)
>>>>
>>>> diff --git a/doc/guides/prog_guide/poll_mode_drv.rst
>>>> b/doc/guides/prog_guide/poll_mode_drv.rst
>>>> index c145a9066c..e380ff135a 100644
>>>> --- a/doc/guides/prog_guide/poll_mode_drv.rst
>>>> +++ b/doc/guides/prog_guide/poll_mode_drv.rst
>>>> @@ -638,14 +638,9 @@ different from the application invokes recovery
>>>> in PASSIVE mode,
>>>>   the PMD automatically recovers from error in PROACTIVE mode,
>>>>   and only a small amount of work is required for the application.
>>>>
>>>> -During error detection and automatic recovery,
>>>> -the PMD sets the data path pointers to dummy functions
>>>> -(which will prevent the crash),
>>>> -and also make sure the control path operations fail with a return
>>>> code ``-EBUSY``.
>>>> -
>>>> -Because the PMD recovers automatically,
>>>> -the application can only sense that the data flow is disconnected
>>>> for a while
>>>> -and the control API returns an error in this period.
>>>> +During error detection and automatic recovery, the PMD sets the
>>>> data path
>>>> +pointers to dummy functions and also make sure the control path
>>>> operations
>>>> +failed with a return code ``-EBUSY``.
>>>>
>>>>   In order to sense the error happening/recovering,
>>>>   as well as to restore some additional configuration,
>>>> @@ -653,9 +648,9 @@ three events are available:
>>>>
>>>>   ``RTE_ETH_EVENT_ERR_RECOVERING``
>>>>      Notify the application that an error is detected
>>>> -   and the recovery is being started.
>>>> +   and the recovery is about to start.
>>>>      Upon receiving the event, the application should not invoke
>>>> -   any control path function until receiving
>>>> +   any control and data path API until receiving
>>>>      ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` or
>>>> ``RTE_ETH_EVENT_RECOVERY_FAILED`` event.
>>>>
>>>>   .. note::
>>>> @@ -666,8 +661,9 @@ three events are available:
>>>>
>>>>   ``RTE_ETH_EVENT_RECOVERY_SUCCESS``
>>>>      Notify the application that the recovery from error is successful,
>>>> -   the PMD already re-configures the port,
>>>> -   and the effect is the same as a restart operation.
>>>> +   the PMD already re-configures the port.
>>>> +   The application should restore some additional configuration,
>>>> and then
>>>> +   enable data path API invocation.
>>>>
>>>>   ``RTE_ETH_EVENT_RECOVERY_FAILED``
>>>>      Notify the application that the recovery from error failed,
>>>> diff --git a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c
>>>> index 0be1e8ca04..f994653fe9 100644
>>>> --- a/lib/ethdev/ethdev_driver.c
>>>> +++ b/lib/ethdev/ethdev_driver.c
>>>> @@ -515,6 +515,14 @@ rte_eth_dma_zone_free(const struct rte_eth_dev
>>>> *dev, const char *ring_name,
>>>>       return rc;
>>>>   }
>>>>
>>>> +void
>>>> +rte_eth_fp_ops_setup(struct rte_eth_dev *dev)
>>>> +{
>>>> +    if (dev == NULL)
>>>> +        return;
>>>> +    eth_dev_fp_ops_setup(rte_eth_fp_ops + dev->data->port_id, dev);
>>>> +}
>>>> +
>>>>   const struct rte_memzone *
>>>>   rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char
>>>> *ring_name,
>>>>                uint16_t queue_id, size_t size, unsigned int align,
>>>> diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
>>>> index 2c9d615fb5..0d964d1f67 100644
>>>> --- a/lib/ethdev/ethdev_driver.h
>>>> +++ b/lib/ethdev/ethdev_driver.h
>>>> @@ -1621,6 +1621,16 @@ int
>>>>   rte_eth_dma_zone_free(const struct rte_eth_dev *eth_dev, const
>>>> char *name,
>>>>            uint16_t queue_id);
>>>>
>>>> +/**
>>>> + * @internal
>>>> + * Setup eth fast-path API to ethdev values.
>>>> + *
>>>> + * @param dev
>>>> + *  Pointer to struct rte_eth_dev.
>>>> + */
>>>> +__rte_internal
>>>> +void rte_eth_fp_ops_setup(struct rte_eth_dev *dev);
>>>> +
>>>>   /**
>>>>    * @internal
>>>>    * Atomically set the link status for the specific device.
>>>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
>>>> index 049641d57c..44ee7229c1 100644
>>>> --- a/lib/ethdev/rte_ethdev.h
>>>> +++ b/lib/ethdev/rte_ethdev.h
>>>> @@ -3944,25 +3944,28 @@ enum rte_eth_event_type {
>>>>        */
>>>>       RTE_ETH_EVENT_RX_AVAIL_THRESH,
>>>>       /** Port recovering from a hardware or firmware error.
>>>> -     * If PMD supports proactive error recovery,
>>>> -     * it should trigger this event to notify application
>>>> -     * that it detected an error and the recovery is being started.
>>>> -     * Upon receiving the event, the application should not invoke
>>>> any control path API
>>>> -     * (such as rte_eth_dev_configure/rte_eth_dev_stop...) until
>>>> receiving
>>>> -     * RTE_ETH_EVENT_RECOVERY_SUCCESS or
>>>> RTE_ETH_EVENT_RECOVERY_FAILED event.
>>>> -     * The PMD will set the data path pointers to dummy functions,
>>>> -     * and re-set the data path pointers to non-dummy functions
>>>> -     * before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>>> -     * It means that the application cannot send or receive any
>>>> packets
>>>> -     * during this period.
>>>> +     *
>>>> +     * If PMD supports proactive error recovery, it should trigger
>>>> this
>>>> +     * event to notify application that it detected an error and the
>>>> +     * recovery is about to start.
>>>> +     *
>>>> +     * Upon receiving the event, the application should not invoke any
>>>> +     * control and data path API until receiving
>>>> +     * RTE_ETH_EVENT_RECOVERY_SUCCESS or RTE_ETH_EVENT_RECOVERY_FAILED
>>>> +     * event.
>>>> +     *
>>>> +     * Once this event is reported, the PMD will set the data path
>>>> pointers
>>>> +     * to dummy functions, and re-set the data path pointers to valid
>>>> +     * functions before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS
>>>> event.
>>>> +     *
>>>>        * @note Before the PMD reports the recovery result,
>>>>        * the PMD may report the RTE_ETH_EVENT_ERR_RECOVERING event
>>>> again,
>>>>        * because a larger error may occur during the recovery.
>>>>        */
>>>>       RTE_ETH_EVENT_ERR_RECOVERING,
>>>>       /** Port recovers successfully from the error.
>>>> -     * The PMD already re-configured the port,
>>>> -     * and the effect is the same as a restart operation.
>>>> +     *
>>>> +     * The PMD already re-configured the port:
>>>>        * a) The following operation will be retained: (alphabetically)
>>>>        *    - DCB configuration
>>>>        *    - FEC configuration
>>>> @@ -3989,6 +3992,9 @@ enum rte_eth_event_type {
>>>>        *      (@see RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP)
>>>>        * c) Any other configuration will not be stored
>>>>        *    and will need to be re-configured.
>>>> +     *
>>>> +     * The application should restore some additional configuration
>>>> +     * (see above case b/c), and then enable data path API invocation.
>>>>        */
>>>>       RTE_ETH_EVENT_RECOVERY_SUCCESS,
>>>>       /** Port recovery failed.
>>>> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
>>>> index 357d1a88c0..c273e0bdae 100644
>>>> --- a/lib/ethdev/version.map
>>>> +++ b/lib/ethdev/version.map
>>>> @@ -320,6 +320,7 @@ INTERNAL {
>>>>       rte_eth_devices;
>>>>       rte_eth_dma_zone_free;
>>>>       rte_eth_dma_zone_reserve;
>>>> +    rte_eth_fp_ops_setup;
>>>>       rte_eth_hairpin_queue_peer_bind;
>>>>       rte_eth_hairpin_queue_peer_unbind;
>>>>       rte_eth_hairpin_queue_peer_update;
>>>> -- 
>>>   Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
>>>
>>>> 2.17.1
>>>
>>
> 



* Re: [PATCH 1/5] ethdev: fix race-condition of proactive error handling mode
  2023-03-06  1:41       ` fengchengwen
@ 2023-03-06  8:57         ` Ferruh Yigit
  2023-03-06  9:10         ` Ferruh Yigit
  1 sibling, 0 replies; 85+ messages in thread
From: Ferruh Yigit @ 2023-03-06  8:57 UTC (permalink / raw)
  To: fengchengwen, Konstantin Ananyev, thomas, Andrew Rybchenko,
	Kalesh AP, Ajit Khaparde
  Cc: dev

On 3/6/2023 1:41 AM, fengchengwen wrote:
> On 2023/3/4 0:51, Ferruh Yigit wrote:
>> On 3/2/2023 12:08 PM, Konstantin Ananyev wrote:
>>>
>>>> In the proactive error handling mode, the PMD will set the data path
>>>> pointers to dummy functions and then try recovery, in this period the
>>>> application may still invoking data path API. This will introduce a
>>>> race-condition with data path which may lead to crash [1].
>>>>
>>>> Although the PMD added delay after setting data path pointers to cover
>>>> the above race-condition, it reduces the probability, but it doesn't
>>>> solve the problem.
>>>>
>>>> To solve the race-condition problem fundamentally, the following
>>>> requirements are added:
>>>> 1. The PMD should set the data path pointers to dummy functions after
>>>>    report RTE_ETH_EVENT_ERR_RECOVERING event.
>>>> 2. The application should stop data path API invocation when process
>>>>    the RTE_ETH_EVENT_ERR_RECOVERING event.
>>>> 3. The PMD should set the data path pointers to valid functions before
>>>>    report RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>>> 4. The application should enable data path API invocation when process
>>>>    the RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>>>
>>
>> How this is solving the race-condition, by pushing responsibility to
>> stop data path to application?
> 
> Yes, I think it's more practical to collaborate with application.
> 
> The application will control API invocation (including control and data path),
> From a DPDK SDK perspective, it has a God perspective.
> 
>>
>> What if application is not interested in recovery modes at all and not
>> registered any callback for the recovery?
> 
> There's probably race-condition which may lead to crash, because DPDK worker
> threads runs busyloop and located on isolated core, and also PMDs add delay time,
> the actual probability of occurence is very very low, at least for HNS3 pmd it
> has not run out for at least four years.
> 
>>
>> I think driver should not rely on application for this, unless
>> application explicitly says (to driver) that it is handling recovery,
> 
> If application register the event callback, the PMD could deduce that application will know this.
> If application not register, then PMD will recovery itself and maybe race-condition.
> 

If application support is required (which makes sense, as you mentioned),
then I think the application should explicitly enable this feature.

>> right now there is no way for driver to know this.
>>
>>
>>>> Also, this patch introduce a driver internal function
>>>> rte_eth_fp_ops_setup which used as an help function for PMD.
>>>>
>>>> [1] http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-ashok.k.kaladi@intel.com/
>>>>
>>>> Fixes: eb0d471a8941 ("ethdev: add proactive error handling mode")
>>>> Cc: stable@dpdk.org
>>>>
>>>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
>>>> ---
>>>>  doc/guides/prog_guide/poll_mode_drv.rst | 20 +++++++---------
>>>>  lib/ethdev/ethdev_driver.c              |  8 +++++++
>>>>  lib/ethdev/ethdev_driver.h              | 10 ++++++++
>>>>  lib/ethdev/rte_ethdev.h                 | 32 +++++++++++++++----------
>>>>  lib/ethdev/version.map                  |  1 +
>>>>  5 files changed, 46 insertions(+), 25 deletions(-)
>>>>
>>>> diff --git a/doc/guides/prog_guide/poll_mode_drv.rst b/doc/guides/prog_guide/poll_mode_drv.rst
>>>> index c145a9066c..e380ff135a 100644
>>>> --- a/doc/guides/prog_guide/poll_mode_drv.rst
>>>> +++ b/doc/guides/prog_guide/poll_mode_drv.rst
>>>> @@ -638,14 +638,9 @@ different from the application invokes recovery in PASSIVE mode,
>>>>  the PMD automatically recovers from error in PROACTIVE mode,
>>>>  and only a small amount of work is required for the application.
>>>>
>>>> -During error detection and automatic recovery,
>>>> -the PMD sets the data path pointers to dummy functions
>>>> -(which will prevent the crash),
>>>> -and also make sure the control path operations fail with a return code ``-EBUSY``.
>>>> -
>>>> -Because the PMD recovers automatically,
>>>> -the application can only sense that the data flow is disconnected for a while
>>>> -and the control API returns an error in this period.
>>>> +During error detection and automatic recovery, the PMD sets the data path
>>>> +pointers to dummy functions and also make sure the control path operations
>>>> +failed with a return code ``-EBUSY``.
>>>>
>>>>  In order to sense the error happening/recovering,
>>>>  as well as to restore some additional configuration,
>>>> @@ -653,9 +648,9 @@ three events are available:
>>>>
>>>>  ``RTE_ETH_EVENT_ERR_RECOVERING``
>>>>     Notify the application that an error is detected
>>>> -   and the recovery is being started.
>>>> +   and the recovery is about to start.
>>>>     Upon receiving the event, the application should not invoke
>>>> -   any control path function until receiving
>>>> +   any control and data path API until receiving
>>>>     ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` or ``RTE_ETH_EVENT_RECOVERY_FAILED`` event.
>>>>
>>>>  .. note::
>>>> @@ -666,8 +661,9 @@ three events are available:
>>>>
>>>>  ``RTE_ETH_EVENT_RECOVERY_SUCCESS``
>>>>     Notify the application that the recovery from error is successful,
>>>> -   the PMD already re-configures the port,
>>>> -   and the effect is the same as a restart operation.
>>>> +   the PMD already re-configures the port.
>>>> +   The application should restore some additional configuration, and then
>>>> +   enable data path API invocation.
>>>>
>>>>  ``RTE_ETH_EVENT_RECOVERY_FAILED``
>>>>     Notify the application that the recovery from error failed,
>>>> diff --git a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c
>>>> index 0be1e8ca04..f994653fe9 100644
>>>> --- a/lib/ethdev/ethdev_driver.c
>>>> +++ b/lib/ethdev/ethdev_driver.c
>>>> @@ -515,6 +515,14 @@ rte_eth_dma_zone_free(const struct rte_eth_dev *dev, const char *ring_name,
>>>>  	return rc;
>>>>  }
>>>>
>>>> +void
>>>> +rte_eth_fp_ops_setup(struct rte_eth_dev *dev)
>>>> +{
>>>> +	if (dev == NULL)
>>>> +		return;
>>>> +	eth_dev_fp_ops_setup(rte_eth_fp_ops + dev->data->port_id, dev);
>>>> +}
>>>> +
>>>>  const struct rte_memzone *
>>>>  rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char *ring_name,
>>>>  			 uint16_t queue_id, size_t size, unsigned int align,
>>>> diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
>>>> index 2c9d615fb5..0d964d1f67 100644
>>>> --- a/lib/ethdev/ethdev_driver.h
>>>> +++ b/lib/ethdev/ethdev_driver.h
>>>> @@ -1621,6 +1621,16 @@ int
>>>>  rte_eth_dma_zone_free(const struct rte_eth_dev *eth_dev, const char *name,
>>>>  		 uint16_t queue_id);
>>>>
>>>> +/**
>>>> + * @internal
>>>> + * Setup eth fast-path API to ethdev values.
>>>> + *
>>>> + * @param dev
>>>> + *  Pointer to struct rte_eth_dev.
>>>> + */
>>>> +__rte_internal
>>>> +void rte_eth_fp_ops_setup(struct rte_eth_dev *dev);
>>>> +
>>>>  /**
>>>>   * @internal
>>>>   * Atomically set the link status for the specific device.
>>>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
>>>> index 049641d57c..44ee7229c1 100644
>>>> --- a/lib/ethdev/rte_ethdev.h
>>>> +++ b/lib/ethdev/rte_ethdev.h
>>>> @@ -3944,25 +3944,28 @@ enum rte_eth_event_type {
>>>>  	 */
>>>>  	RTE_ETH_EVENT_RX_AVAIL_THRESH,
>>>>  	/** Port recovering from a hardware or firmware error.
>>>> -	 * If PMD supports proactive error recovery,
>>>> -	 * it should trigger this event to notify application
>>>> -	 * that it detected an error and the recovery is being started.
>>>> -	 * Upon receiving the event, the application should not invoke any control path API
>>>> -	 * (such as rte_eth_dev_configure/rte_eth_dev_stop...) until receiving
>>>> -	 * RTE_ETH_EVENT_RECOVERY_SUCCESS or RTE_ETH_EVENT_RECOVERY_FAILED event.
>>>> -	 * The PMD will set the data path pointers to dummy functions,
>>>> -	 * and re-set the data path pointers to non-dummy functions
>>>> -	 * before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>>> -	 * It means that the application cannot send or receive any packets
>>>> -	 * during this period.
>>>> +	 *
>>>> +	 * If PMD supports proactive error recovery, it should trigger this
>>>> +	 * event to notify application that it detected an error and the
>>>> +	 * recovery is about to start.
>>>> +	 *
>>>> +	 * Upon receiving the event, the application should not invoke any
>>>> +	 * control and data path API until receiving
>>>> +	 * RTE_ETH_EVENT_RECOVERY_SUCCESS or RTE_ETH_EVENT_RECOVERY_FAILED
>>>> +	 * event.
>>>> +	 *
>>>> +	 * Once this event is reported, the PMD will set the data path pointers
>>>> +	 * to dummy functions, and re-set the data path pointers to valid
>>>> +	 * functions before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>>> +	 *
>>>>  	 * @note Before the PMD reports the recovery result,
>>>>  	 * the PMD may report the RTE_ETH_EVENT_ERR_RECOVERING event again,
>>>>  	 * because a larger error may occur during the recovery.
>>>>  	 */
>>>>  	RTE_ETH_EVENT_ERR_RECOVERING,
>>>>  	/** Port recovers successfully from the error.
>>>> -	 * The PMD already re-configured the port,
>>>> -	 * and the effect is the same as a restart operation.
>>>> +	 *
>>>> +	 * The PMD already re-configured the port:
>>>>  	 * a) The following operation will be retained: (alphabetically)
>>>>  	 *    - DCB configuration
>>>>  	 *    - FEC configuration
>>>> @@ -3989,6 +3992,9 @@ enum rte_eth_event_type {
>>>>  	 *      (@see RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP)
>>>>  	 * c) Any other configuration will not be stored
>>>>  	 *    and will need to be re-configured.
>>>> +	 *
>>>> +	 * The application should restore some additional configuration
>>>> +	 * (see above case b/c), and then enable data path API invocation.
>>>>  	 */
>>>>  	RTE_ETH_EVENT_RECOVERY_SUCCESS,
>>>>  	/** Port recovery failed.
>>>> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
>>>> index 357d1a88c0..c273e0bdae 100644
>>>> --- a/lib/ethdev/version.map
>>>> +++ b/lib/ethdev/version.map
>>>> @@ -320,6 +320,7 @@ INTERNAL {
>>>>  	rte_eth_devices;
>>>>  	rte_eth_dma_zone_free;
>>>>  	rte_eth_dma_zone_reserve;
>>>> +	rte_eth_fp_ops_setup;
>>>>  	rte_eth_hairpin_queue_peer_bind;
>>>>  	rte_eth_hairpin_queue_peer_unbind;
>>>>  	rte_eth_hairpin_queue_peer_update;
>>>> --
>>>  
>>> Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
>>>
>>>> 2.17.1
>>>
>>
>> .
>>



* Re: [PATCH 1/5] ethdev: fix race-condition of proactive error handling mode
  2023-03-06  1:41       ` fengchengwen
  2023-03-06  8:57         ` Ferruh Yigit
@ 2023-03-06  9:10         ` Ferruh Yigit
  1 sibling, 0 replies; 85+ messages in thread
From: Ferruh Yigit @ 2023-03-06  9:10 UTC (permalink / raw)
  To: fengchengwen, Konstantin Ananyev, thomas, Andrew Rybchenko,
	Kalesh AP, Ajit Khaparde
  Cc: dev

On 3/6/2023 1:41 AM, fengchengwen wrote:
>> What if application is not interested in recovery modes at all and not
>> registered any callback for the recovery?
> 
> There's probably race-condition which may lead to crash, because DPDK worker
> threads runs busyloop and located on isolated core, and also PMDs add delay time,
> the actual probability of occurence is very very low, at least for HNS3 pmd it
> has not run out for at least four years.
> 

I understand the problem and why the application needs to be involved, but
the question is what will happen if the application is not aware of this and
does not handle this event, or was ported from a different NIC, etc.
Do you want to make handling this event mandatory for every DPDK application?


Btw, what about my suggestion [1] to use a different version of the burst ops
update function in PMDs to prevent the crash?

[1]
https://inbox.dpdk.org/dev/20230220060839.1267349-1-ashok.k.kaladi@intel.com/T/#m876b5c5312391557c952198561e6823473bce151



* RE: [PATCH 1/5] ethdev: fix race-condition of proactive error handling mode
  2023-03-06  8:55         ` Ferruh Yigit
@ 2023-03-06 10:22           ` Konstantin Ananyev
  2023-03-06 11:00             ` Ferruh Yigit
  0 siblings, 1 reply; 85+ messages in thread
From: Konstantin Ananyev @ 2023-03-06 10:22 UTC (permalink / raw)
  To: Ferruh Yigit, Konstantin Ananyev, dev



> >>>> In the proactive error handling mode, the PMD will set the data path
> >>>> pointers to dummy functions and then try recovery, in this period the
> >>>> application may still invoking data path API. This will introduce a
> >>>> race-condition with data path which may lead to crash [1].
> >>>>
> >>>> Although the PMD added delay after setting data path pointers to cover
> >>>> the above race-condition, it reduces the probability, but it doesn't
> >>>> solve the problem.
> >>>>
> >>>> To solve the race-condition problem fundamentally, the following
> >>>> requirements are added:
> >>>> 1. The PMD should set the data path pointers to dummy functions after
> >>>>     report RTE_ETH_EVENT_ERR_RECOVERING event.
> >>>> 2. The application should stop data path API invocation when process
> >>>>     the RTE_ETH_EVENT_ERR_RECOVERING event.
> >>>> 3. The PMD should set the data path pointers to valid functions before
> >>>>     report RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> >>>> 4. The application should enable data path API invocation when process
> >>>>     the RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> >>>>
> >>
> >> How this is solving the race-condition, by pushing responsibility to
> >> stop data path to application?
> >
> > Exactly, it becomes application responsibility to make sure data-path is
> > stopped/suspended before recovery will continue.
> >
> 
> From documentation of the feature:
> 
> ``
> Because the PMD recovers automatically,
> the application can only sense that the data flow is disconnected for a
> while and the control API returns an error in this period.
> 
> In order to sense the error happening/recovering, as well as to restore
> some additional configuration, three events are available:
> ``
> 
> It looks like initial design is to use events mainly inform application
> about what happened and mainly for re-configuration.
> 
> Although I am don't disagree to involve the application, I am not sure
> that is part of current design.

I thought we all agreed that the initial design contains some fallacies that
need to be fixed, no?
The statement that, with the current rte_ethdev design, error recovery can be
done without interaction with the app (to stop/suspend the data/control path)
is the main one, I think.
It needs some interaction with the app layer, one way or another.
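
For illustration only, the application-side interaction discussed here could
be sketched as a small per-port gate: the event callback closes the gate on
the "recovering" event and waits for any in-flight burst to finish, and it
reopens the gate on "recovery success". Everything below is an assumption
made for the sketch (the enum, struct port_gate, and the gate_* and
on_recovery_event names are hypothetical and are not part of ethdev); it
models a single worker per port and does not call any DPDK API.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the three recovery events (not the ethdev enum). */
enum recovery_event {
	EV_ERR_RECOVERING,
	EV_RECOVERY_SUCCESS,
	EV_RECOVERY_FAILED,
};

/* Per-port gate shared by the event callback and one worker thread. */
struct port_gate {
	atomic_bool datapath_enabled; /* cleared while recovery is in progress */
	atomic_bool worker_in_burst;  /* set by the worker around each burst */
};

static void gate_init(struct port_gate *g)
{
	assert(g != NULL);
	atomic_init(&g->datapath_enabled, true);
	atomic_init(&g->worker_in_burst, false);
}

/*
 * Worker side: announce the burst first, then check the gate. With
 * sequentially consistent atomics, either the worker sees the gate closed
 * or the callback sees the burst in flight, so the two cannot both miss.
 */
static bool gate_enter(struct port_gate *g)
{
	atomic_store(&g->worker_in_burst, true);
	if (!atomic_load(&g->datapath_enabled)) {
		atomic_store(&g->worker_in_burst, false);
		return false; /* skip RX/TX for this iteration */
	}
	return true;
}

static void gate_leave(struct port_gate *g)
{
	atomic_store(&g->worker_in_burst, false);
}

/* Event side: what an application callback might do for each event. */
static void on_recovery_event(struct port_gate *g, enum recovery_event ev)
{
	switch (ev) {
	case EV_ERR_RECOVERING:
		atomic_store(&g->datapath_enabled, false);
		while (atomic_load(&g->worker_in_burst))
			; /* wait until the in-flight burst has returned */
		break;
	case EV_RECOVERY_SUCCESS:
		/* the application would restore lost configuration here */
		atomic_store(&g->datapath_enabled, true);
		break;
	case EV_RECOVERY_FAILED:
		/* keep the data path disabled; the port is not usable */
		break;
	}
}
```

A worker loop would wrap each burst in gate_enter()/gate_leave() and simply
skip the burst when gate_enter() returns false. The extra atomic store and
load per burst is the cost of this approach, which is presumably part of why
the responsibility question matters.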

> >>
> >> What if application is not interested in recovery modes at all and not
> >> registered any callback for the recovery?
> >
> >
> > Are you saying there is no way for application to disable
> > automatic recovery in PMD if it is not interested
> > (or can't full-fill per-requesties for it)?
> > If so, then yes it is a problem and we need to fix it.
> > I assumed that such mechanism to disable unwanted events already exists,
> > but I can't find anything.
> > Wonder what would be the easiest way here - can PMD make a decision
> > based on callback return value, or do we need a new API to
> > enable/disable callbacks, or ...?
> >
> >
> 
> As far as I can see automatic recovery is not configurable by app.
> 
> But that is not all, PMD sends events to application but PMD can't know
> if application is handling them or not, so with current design PMD can't
> rely on to app.

Well, the PMD invokes a user-provided callback.
One way to fix that problem: if no callback is provided, or the callback
returns an error code, the PMD can assume that recovery should not be done.
That is probably not the best design choice, but at least it will allow us
to fix the problem without too many changes and without introducing a new API.
That could be a sort of 'quick fix'.
In the meantime we can think about a new/better approach for that.
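
As a rough sketch of that 'quick fix' idea, the decision could look like the
code below. This is only a model under stated assumptions: recovering_cb_t,
decide_recovery() and the two example callbacks are hypothetical names
invented for the illustration, and nothing here reflects how ethdev actually
dispatches events or what a PMD does today.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical callback type: returns 0 when the application has stopped
 * its data path and accepts the recovery, non-zero otherwise.
 */
typedef int (*recovering_cb_t)(void *arg);

enum recovery_action {
	RECOVERY_SKIP,    /* no callback, or the callback reported an error */
	RECOVERY_PROCEED, /* application acknowledged, safe to recover */
};

/* Driver-side decision: recover only if the application acknowledged. */
static enum recovery_action
decide_recovery(recovering_cb_t cb, void *arg)
{
	if (cb == NULL)
		return RECOVERY_SKIP;
	if (cb(arg) != 0)
		return RECOVERY_SKIP;
	return RECOVERY_PROCEED;
}

/* Example callback: records that the data path was paused and accepts. */
static int app_quiesces(void *arg)
{
	assert(arg != NULL);
	*(bool *)arg = true;
	return 0;
}

/* Example callback: the application cannot meet the prerequisites. */
static int app_declines(void *arg)
{
	(void)arg;
	return -1;
}
```

With these definitions, decide_recovery(NULL, NULL) and
decide_recovery(app_declines, NULL) both yield RECOVERY_SKIP, while
decide_recovery(app_quiesces, &paused) yields RECOVERY_PROCEED after setting
the flag. What the driver should do in the skip case is left open here, as
it is in the discussion.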

> 
> >> I think driver should not rely on application for this, unless
> >> application explicitly says (to driver) that it is handling recovery,
> >> right now there is no way for driver to know this.
> >
> > I think it is visa-versa:
> > application should not enable auto-recovery if it can't meet
> > per-requeststies for it (provide appropriate callback).
> >
> 
> I agree on above, we are saying similar thing in different perspective.

Ok, it's good that we are on the same page.


> 
> >
> >>
> >>>> Also, this patch introduce a driver internal function
> >>>> rte_eth_fp_ops_setup which used as an help function for PMD.
> >>>>
> >>>> [1]
> >>>> http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-ashok.k.kaladi@intel.com/
> >>>>
> >>>> Fixes: eb0d471a8941 ("ethdev: add proactive error handling mode")
> >>>> Cc: stable@dpdk.org
> >>>>
> >>>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> >>>> ---
> >>>>   doc/guides/prog_guide/poll_mode_drv.rst | 20 +++++++---------
> >>>>   lib/ethdev/ethdev_driver.c              |  8 +++++++
> >>>>   lib/ethdev/ethdev_driver.h              | 10 ++++++++
> >>>>   lib/ethdev/rte_ethdev.h                 | 32
> >>>> +++++++++++++++----------
> >>>>   lib/ethdev/version.map                  |  1 +
> >>>>   5 files changed, 46 insertions(+), 25 deletions(-)
> >>>>
> >>>> diff --git a/doc/guides/prog_guide/poll_mode_drv.rst
> >>>> b/doc/guides/prog_guide/poll_mode_drv.rst
> >>>> index c145a9066c..e380ff135a 100644
> >>>> --- a/doc/guides/prog_guide/poll_mode_drv.rst
> >>>> +++ b/doc/guides/prog_guide/poll_mode_drv.rst
> >>>> @@ -638,14 +638,9 @@ different from the application invokes recovery
> >>>> in PASSIVE mode,
> >>>>   the PMD automatically recovers from error in PROACTIVE mode,
> >>>>   and only a small amount of work is required for the application.
> >>>>
> >>>> -During error detection and automatic recovery,
> >>>> -the PMD sets the data path pointers to dummy functions
> >>>> -(which will prevent the crash),
> >>>> -and also make sure the control path operations fail with a return
> >>>> code ``-EBUSY``.
> >>>> -
> >>>> -Because the PMD recovers automatically,
> >>>> -the application can only sense that the data flow is disconnected
> >>>> for a while
> >>>> -and the control API returns an error in this period.
> >>>> +During error detection and automatic recovery, the PMD sets the
> >>>> data path
> >>>> +pointers to dummy functions and also make sure the control path
> >>>> operations
> >>>> +failed with a return code ``-EBUSY``.
> >>>>
> >>>>   In order to sense the error happening/recovering,
> >>>>   as well as to restore some additional configuration,
> >>>> @@ -653,9 +648,9 @@ three events are available:
> >>>>
> >>>>   ``RTE_ETH_EVENT_ERR_RECOVERING``
> >>>>      Notify the application that an error is detected
> >>>> -   and the recovery is being started.
> >>>> +   and the recovery is about to start.
> >>>>      Upon receiving the event, the application should not invoke
> >>>> -   any control path function until receiving
> >>>> +   any control and data path API until receiving
> >>>>      ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` or
> >>>> ``RTE_ETH_EVENT_RECOVERY_FAILED`` event.
> >>>>
> >>>>   .. note::
> >>>> @@ -666,8 +661,9 @@ three events are available:
> >>>>
> >>>>   ``RTE_ETH_EVENT_RECOVERY_SUCCESS``
> >>>>      Notify the application that the recovery from error is successful,
> >>>> -   the PMD already re-configures the port,
> >>>> -   and the effect is the same as a restart operation.
> >>>> +   the PMD already re-configures the port.
> >>>> +   The application should restore some additional configuration,
> >>>> and then
> >>>> +   enable data path API invocation.
> >>>>
> >>>>   ``RTE_ETH_EVENT_RECOVERY_FAILED``
> >>>>      Notify the application that the recovery from error failed,
> >>>> diff --git a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c
> >>>> index 0be1e8ca04..f994653fe9 100644
> >>>> --- a/lib/ethdev/ethdev_driver.c
> >>>> +++ b/lib/ethdev/ethdev_driver.c
> >>>> @@ -515,6 +515,14 @@ rte_eth_dma_zone_free(const struct rte_eth_dev
> >>>> *dev, const char *ring_name,
> >>>>       return rc;
> >>>>   }
> >>>>
> >>>> +void
> >>>> +rte_eth_fp_ops_setup(struct rte_eth_dev *dev)
> >>>> +{
> >>>> +    if (dev == NULL)
> >>>> +        return;
> >>>> +    eth_dev_fp_ops_setup(rte_eth_fp_ops + dev->data->port_id, dev);
> >>>> +}
> >>>> +
> >>>>   const struct rte_memzone *
> >>>>   rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char
> >>>> *ring_name,
> >>>>                uint16_t queue_id, size_t size, unsigned int align,
> >>>> diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
> >>>> index 2c9d615fb5..0d964d1f67 100644
> >>>> --- a/lib/ethdev/ethdev_driver.h
> >>>> +++ b/lib/ethdev/ethdev_driver.h
> >>>> @@ -1621,6 +1621,16 @@ int
> >>>>   rte_eth_dma_zone_free(const struct rte_eth_dev *eth_dev, const
> >>>> char *name,
> >>>>            uint16_t queue_id);
> >>>>
> >>>> +/**
> >>>> + * @internal
> >>>> + * Setup eth fast-path API to ethdev values.
> >>>> + *
> >>>> + * @param dev
> >>>> + *  Pointer to struct rte_eth_dev.
> >>>> + */
> >>>> +__rte_internal
> >>>> +void rte_eth_fp_ops_setup(struct rte_eth_dev *dev);
> >>>> +
> >>>>   /**
> >>>>    * @internal
> >>>>    * Atomically set the link status for the specific device.
> >>>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> >>>> index 049641d57c..44ee7229c1 100644
> >>>> --- a/lib/ethdev/rte_ethdev.h
> >>>> +++ b/lib/ethdev/rte_ethdev.h
> >>>> @@ -3944,25 +3944,28 @@ enum rte_eth_event_type {
> >>>>        */
> >>>>       RTE_ETH_EVENT_RX_AVAIL_THRESH,
> >>>>       /** Port recovering from a hardware or firmware error.
> >>>> -     * If PMD supports proactive error recovery,
> >>>> -     * it should trigger this event to notify application
> >>>> -     * that it detected an error and the recovery is being started.
> >>>> -     * Upon receiving the event, the application should not invoke
> >>>> any control path API
> >>>> -     * (such as rte_eth_dev_configure/rte_eth_dev_stop...) until
> >>>> receiving
> >>>> -     * RTE_ETH_EVENT_RECOVERY_SUCCESS or
> >>>> RTE_ETH_EVENT_RECOVERY_FAILED event.
> >>>> -     * The PMD will set the data path pointers to dummy functions,
> >>>> -     * and re-set the data path pointers to non-dummy functions
> >>>> -     * before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> >>>> -     * It means that the application cannot send or receive any
> >>>> packets
> >>>> -     * during this period.
> >>>> +     *
> >>>> +     * If PMD supports proactive error recovery, it should trigger
> >>>> this
> >>>> +     * event to notify application that it detected an error and the
> >>>> +     * recovery is about to start.
> >>>> +     *
> >>>> +     * Upon receiving the event, the application should not invoke any
> >>>> +     * control and data path API until receiving
> >>>> +     * RTE_ETH_EVENT_RECOVERY_SUCCESS or RTE_ETH_EVENT_RECOVERY_FAILED
> >>>> +     * event.
> >>>> +     *
> >>>> +     * Once this event is reported, the PMD will set the data path
> >>>> pointers
> >>>> +     * to dummy functions, and re-set the data path pointers to valid
> >>>> +     * functions before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS
> >>>> event.
> >>>> +     *
> >>>>        * @note Before the PMD reports the recovery result,
> >>>>        * the PMD may report the RTE_ETH_EVENT_ERR_RECOVERING event
> >>>> again,
> >>>>        * because a larger error may occur during the recovery.
> >>>>        */
> >>>>       RTE_ETH_EVENT_ERR_RECOVERING,
> >>>>       /** Port recovers successfully from the error.
> >>>> -     * The PMD already re-configured the port,
> >>>> -     * and the effect is the same as a restart operation.
> >>>> +     *
> >>>> +     * The PMD already re-configured the port:
> >>>>        * a) The following operation will be retained: (alphabetically)
> >>>>        *    - DCB configuration
> >>>>        *    - FEC configuration
> >>>> @@ -3989,6 +3992,9 @@ enum rte_eth_event_type {
> >>>>        *      (@see RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP)
> >>>>        * c) Any other configuration will not be stored
> >>>>        *    and will need to be re-configured.
> >>>> +     *
> >>>> +     * The application should restore some additional configuration
> >>>> +     * (see above case b/c), and then enable data path API invocation.
> >>>>        */
> >>>>       RTE_ETH_EVENT_RECOVERY_SUCCESS,
> >>>>       /** Port recovery failed.
> >>>> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
> >>>> index 357d1a88c0..c273e0bdae 100644
> >>>> --- a/lib/ethdev/version.map
> >>>> +++ b/lib/ethdev/version.map
> >>>> @@ -320,6 +320,7 @@ INTERNAL {
> >>>>       rte_eth_devices;
> >>>>       rte_eth_dma_zone_free;
> >>>>       rte_eth_dma_zone_reserve;
> >>>> +    rte_eth_fp_ops_setup;
> >>>>       rte_eth_hairpin_queue_peer_bind;
> >>>>       rte_eth_hairpin_queue_peer_unbind;
> >>>>       rte_eth_hairpin_queue_peer_update;
> >>>> --
> >>>   Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
> >>>
> >>>> 2.17.1
> >>>
> >>
> >



* Re: [PATCH 1/5] ethdev: fix race-condition of proactive error handling mode
  2023-03-06 10:22           ` Konstantin Ananyev
@ 2023-03-06 11:00             ` Ferruh Yigit
  2023-03-06 11:05               ` Ajit Khaparde
  0 siblings, 1 reply; 85+ messages in thread
From: Ferruh Yigit @ 2023-03-06 11:00 UTC (permalink / raw)
  To: Konstantin Ananyev, Thomas Monjalon, Andrew Rybchenko; +Cc: dev, fengchengwen

On 3/6/2023 10:22 AM, Konstantin Ananyev wrote:
> 
> 
>>>>>> In the proactive error handling mode, the PMD will set the data path
>>>>>> pointers to dummy functions and then try recovery, in this period the
>>>>>> application may still invoking data path API. This will introduce a
>>>>>> race-condition with data path which may lead to crash [1].
>>>>>>
>>>>>> Although the PMD added delay after setting data path pointers to cover
>>>>>> the above race-condition, it reduces the probability, but it doesn't
>>>>>> solve the problem.
>>>>>>
>>>>>> To solve the race-condition problem fundamentally, the following
>>>>>> requirements are added:
>>>>>> 1. The PMD should set the data path pointers to dummy functions after
>>>>>>     report RTE_ETH_EVENT_ERR_RECOVERING event.
>>>>>> 2. The application should stop data path API invocation when process
>>>>>>     the RTE_ETH_EVENT_ERR_RECOVERING event.
>>>>>> 3. The PMD should set the data path pointers to valid functions before
>>>>>>     report RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>>>>> 4. The application should enable data path API invocation when process
>>>>>>     the RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>>>>>
>>>>
>>>> How this is solving the race-condition, by pushing responsibility to
>>>> stop data path to application?
>>>
>>> Exactly, it becomes application responsibility to make sure data-path is
>>> stopped/suspended before recovery will continue.
>>>
>>
>> From documentation of the feature:
>>
>> ``
>> Because the PMD recovers automatically,
>> the application can only sense that the data flow is disconnected for a
>> while and the control API returns an error in this period.
>>
>> In order to sense the error happening/recovering, as well as to restore
>> some additional configuration, three events are available:
>> ``
>>
>> It looks like initial design is to use events mainly inform application
>> about what happened and mainly for re-configuration.
>>
>> Although I am don't disagree to involve the application, I am not sure
>> that is part of current design.
> 
> I thought we all agreed that initial design contain some fallacies that
> need to fixed, no?
> Statement that with current rte_ethdev design error recovery can be done
> without interaction with the app (to stop/suspend data/control path)
> is the main one I think.
> It needs some interaction with app layer, one way or another. 
> 
>>>>
>>>> What if application is not interested in recovery modes at all and not
>>>> registered any callback for the recovery?
>>>
>>>
>>> Are you saying there is no way for application to disable
>>> automatic recovery in PMD if it is not interested
>>> (or can't full-fill per-requesties for it)?
>>> If so, then yes it is a problem and we need to fix it.
>>> I assumed that such mechanism to disable unwanted events already exists,
>>> but I can't find anything.
>>> Wonder what would be the easiest way here - can PMD make a decision
>>> based on callback return value, or do we need a new API to
>>> enable/disable callbacks, or ...?
>>>
>>>
>>
>> As far as I can see automatic recovery is not configurable by app.
>>
>> But that is not all, PMD sends events to application but PMD can't know
>> if application is handling them or not, so with current design PMD can't
>> rely on to app.
> 
> Well, PMD invokes user provided callback.
> One way to fix that problem - if there is no callback provided,
> or callback returns an error code - PMD can assume that recovery
> should not be done.
> That is probably not the best design choice, but at least it will allow
> to fix the problem without too many changes and introducing new API.
> That could be sort of a 'quick fix'.
> In a meanwhile we can think about new/better approach for that.    
> 

-rc2 for 23.03 is a few days away.

What do you think about a 'quick fix' for this release that modifies how the
driver updates burst ops to prevent the race condition?

And then plan a design update for the next release?


>>
>>>> I think driver should not rely on application for this, unless
>>>> application explicitly says (to driver) that it is handling recovery,
>>>> right now there is no way for driver to know this.
>>>
>>> I think it is visa-versa:
>>> application should not enable auto-recovery if it can't meet
>>> per-requeststies for it (provide appropriate callback).
>>>
>>
>> I agree on above, we are saying similar thing in different perspective.
> 
> Ok, that's good we are on the same page.
>  
> 
>>
>>>
>>>>
>>>>>> Also, this patch introduce a driver internal function
>>>>>> rte_eth_fp_ops_setup which used as an help function for PMD.
>>>>>>
>>>>>> [1]
>>>>>> http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-ashok.k.kaladi@intel.com/
>>>>>>
>>>>>> Fixes: eb0d471a8941 ("ethdev: add proactive error handling mode")
>>>>>> Cc: stable@dpdk.org
>>>>>>
>>>>>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
>>>>>> ---
>>>>>>   doc/guides/prog_guide/poll_mode_drv.rst | 20 +++++++---------
>>>>>>   lib/ethdev/ethdev_driver.c              |  8 +++++++
>>>>>>   lib/ethdev/ethdev_driver.h              | 10 ++++++++
>>>>>>   lib/ethdev/rte_ethdev.h                 | 32
>>>>>> +++++++++++++++----------
>>>>>>   lib/ethdev/version.map                  |  1 +
>>>>>>   5 files changed, 46 insertions(+), 25 deletions(-)
>>>>>>
>>>>>> diff --git a/doc/guides/prog_guide/poll_mode_drv.rst
>>>>>> b/doc/guides/prog_guide/poll_mode_drv.rst
>>>>>> index c145a9066c..e380ff135a 100644
>>>>>> --- a/doc/guides/prog_guide/poll_mode_drv.rst
>>>>>> +++ b/doc/guides/prog_guide/poll_mode_drv.rst
>>>>>> @@ -638,14 +638,9 @@ different from the application invokes recovery
>>>>>> in PASSIVE mode,
>>>>>>   the PMD automatically recovers from error in PROACTIVE mode,
>>>>>>   and only a small amount of work is required for the application.
>>>>>>
>>>>>> -During error detection and automatic recovery,
>>>>>> -the PMD sets the data path pointers to dummy functions
>>>>>> -(which will prevent the crash),
>>>>>> -and also make sure the control path operations fail with a return
>>>>>> code ``-EBUSY``.
>>>>>> -
>>>>>> -Because the PMD recovers automatically,
>>>>>> -the application can only sense that the data flow is disconnected
>>>>>> for a while
>>>>>> -and the control API returns an error in this period.
>>>>>> +During error detection and automatic recovery, the PMD sets the
>>>>>> data path
>>>>>> +pointers to dummy functions and also make sure the control path
>>>>>> operations
>>>>>> +failed with a return code ``-EBUSY``.
>>>>>>
>>>>>>   In order to sense the error happening/recovering,
>>>>>>   as well as to restore some additional configuration,
>>>>>> @@ -653,9 +648,9 @@ three events are available:
>>>>>>
>>>>>>   ``RTE_ETH_EVENT_ERR_RECOVERING``
>>>>>>      Notify the application that an error is detected
>>>>>> -   and the recovery is being started.
>>>>>> +   and the recovery is about to start.
>>>>>>      Upon receiving the event, the application should not invoke
>>>>>> -   any control path function until receiving
>>>>>> +   any control and data path API until receiving
>>>>>>      ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` or
>>>>>> ``RTE_ETH_EVENT_RECOVERY_FAILED`` event.
>>>>>>
>>>>>>   .. note::
>>>>>> @@ -666,8 +661,9 @@ three events are available:
>>>>>>
>>>>>>   ``RTE_ETH_EVENT_RECOVERY_SUCCESS``
>>>>>>      Notify the application that the recovery from error is successful,
>>>>>> -   the PMD already re-configures the port,
>>>>>> -   and the effect is the same as a restart operation.
>>>>>> +   the PMD already re-configures the port.
>>>>>> +   The application should restore some additional configuration,
>>>>>> and then
>>>>>> +   enable data path API invocation.
>>>>>>
>>>>>>   ``RTE_ETH_EVENT_RECOVERY_FAILED``
>>>>>>      Notify the application that the recovery from error failed,
>>>>>> diff --git a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c
>>>>>> index 0be1e8ca04..f994653fe9 100644
>>>>>> --- a/lib/ethdev/ethdev_driver.c
>>>>>> +++ b/lib/ethdev/ethdev_driver.c
>>>>>> @@ -515,6 +515,14 @@ rte_eth_dma_zone_free(const struct rte_eth_dev
>>>>>> *dev, const char *ring_name,
>>>>>>       return rc;
>>>>>>   }
>>>>>>
>>>>>> +void
>>>>>> +rte_eth_fp_ops_setup(struct rte_eth_dev *dev)
>>>>>> +{
>>>>>> +    if (dev == NULL)
>>>>>> +        return;
>>>>>> +    eth_dev_fp_ops_setup(rte_eth_fp_ops + dev->data->port_id, dev);
>>>>>> +}
>>>>>> +
>>>>>>   const struct rte_memzone *
>>>>>>   rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char
>>>>>> *ring_name,
>>>>>>                uint16_t queue_id, size_t size, unsigned int align,
>>>>>> diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
>>>>>> index 2c9d615fb5..0d964d1f67 100644
>>>>>> --- a/lib/ethdev/ethdev_driver.h
>>>>>> +++ b/lib/ethdev/ethdev_driver.h
>>>>>> @@ -1621,6 +1621,16 @@ int
>>>>>>   rte_eth_dma_zone_free(const struct rte_eth_dev *eth_dev, const
>>>>>> char *name,
>>>>>>            uint16_t queue_id);
>>>>>>
>>>>>> +/**
>>>>>> + * @internal
>>>>>> + * Setup eth fast-path API to ethdev values.
>>>>>> + *
>>>>>> + * @param dev
>>>>>> + *  Pointer to struct rte_eth_dev.
>>>>>> + */
>>>>>> +__rte_internal
>>>>>> +void rte_eth_fp_ops_setup(struct rte_eth_dev *dev);
>>>>>> +
>>>>>>   /**
>>>>>>    * @internal
>>>>>>    * Atomically set the link status for the specific device.
>>>>>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
>>>>>> index 049641d57c..44ee7229c1 100644
>>>>>> --- a/lib/ethdev/rte_ethdev.h
>>>>>> +++ b/lib/ethdev/rte_ethdev.h
>>>>>> @@ -3944,25 +3944,28 @@ enum rte_eth_event_type {
>>>>>>        */
>>>>>>       RTE_ETH_EVENT_RX_AVAIL_THRESH,
>>>>>>       /** Port recovering from a hardware or firmware error.
>>>>>> -     * If PMD supports proactive error recovery,
>>>>>> -     * it should trigger this event to notify application
>>>>>> -     * that it detected an error and the recovery is being started.
>>>>>> -     * Upon receiving the event, the application should not invoke
>>>>>> any control path API
>>>>>> -     * (such as rte_eth_dev_configure/rte_eth_dev_stop...) until
>>>>>> receiving
>>>>>> -     * RTE_ETH_EVENT_RECOVERY_SUCCESS or
>>>>>> RTE_ETH_EVENT_RECOVERY_FAILED event.
>>>>>> -     * The PMD will set the data path pointers to dummy functions,
>>>>>> -     * and re-set the data path pointers to non-dummy functions
>>>>>> -     * before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>>>>> -     * It means that the application cannot send or receive any
>>>>>> packets
>>>>>> -     * during this period.
>>>>>> +     *
>>>>>> +     * If PMD supports proactive error recovery, it should trigger
>>>>>> this
>>>>>> +     * event to notify application that it detected an error and the
>>>>>> +     * recovery is about to start.
>>>>>> +     *
>>>>>> +     * Upon receiving the event, the application should not invoke any
>>>>>> +     * control and data path API until receiving
>>>>>> +     * RTE_ETH_EVENT_RECOVERY_SUCCESS or RTE_ETH_EVENT_RECOVERY_FAILED
>>>>>> +     * event.
>>>>>> +     *
>>>>>> +     * Once this event is reported, the PMD will set the data path
>>>>>> pointers
>>>>>> +     * to dummy functions, and re-set the data path pointers to valid
>>>>>> +     * functions before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS
>>>>>> event.
>>>>>> +     *
>>>>>>        * @note Before the PMD reports the recovery result,
>>>>>>        * the PMD may report the RTE_ETH_EVENT_ERR_RECOVERING event
>>>>>> again,
>>>>>>        * because a larger error may occur during the recovery.
>>>>>>        */
>>>>>>       RTE_ETH_EVENT_ERR_RECOVERING,
>>>>>>       /** Port recovers successfully from the error.
>>>>>> -     * The PMD already re-configured the port,
>>>>>> -     * and the effect is the same as a restart operation.
>>>>>> +     *
>>>>>> +     * The PMD already re-configured the port:
>>>>>>        * a) The following operation will be retained: (alphabetically)
>>>>>>        *    - DCB configuration
>>>>>>        *    - FEC configuration
>>>>>> @@ -3989,6 +3992,9 @@ enum rte_eth_event_type {
>>>>>>        *      (@see RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP)
>>>>>>        * c) Any other configuration will not be stored
>>>>>>        *    and will need to be re-configured.
>>>>>> +     *
>>>>>> +     * The application should restore some additional configuration
>>>>>> +     * (see above case b/c), and then enable data path API invocation.
>>>>>>        */
>>>>>>       RTE_ETH_EVENT_RECOVERY_SUCCESS,
>>>>>>       /** Port recovery failed.
>>>>>> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
>>>>>> index 357d1a88c0..c273e0bdae 100644
>>>>>> --- a/lib/ethdev/version.map
>>>>>> +++ b/lib/ethdev/version.map
>>>>>> @@ -320,6 +320,7 @@ INTERNAL {
>>>>>>       rte_eth_devices;
>>>>>>       rte_eth_dma_zone_free;
>>>>>>       rte_eth_dma_zone_reserve;
>>>>>> +    rte_eth_fp_ops_setup;
>>>>>>       rte_eth_hairpin_queue_peer_bind;
>>>>>>       rte_eth_hairpin_queue_peer_unbind;
>>>>>>       rte_eth_hairpin_queue_peer_update;
>>>>>> --
>>>>>   Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
>>>>>
>>>>>> 2.17.1
>>>>>
>>>>
>>>
> 
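
The callback-based quick fix suggested in the quoted text above (no callback
registered, or the callback returns an error code, means the PMD should not
recover) could be sketched roughly as follows. All names here (demo_port,
demo_may_recover, the DEMO_EVENT_* values) are hypothetical stand-ins rather
than ethdev API, and the decision rule is only the proposal from this thread,
not existing PMD behaviour.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the RTE_ETH_EVENT_* recovery events. */
enum demo_event {
	DEMO_EVENT_ERR_RECOVERING,
	DEMO_EVENT_RECOVERY_SUCCESS,
	DEMO_EVENT_RECOVERY_FAILED,
};

/* Hypothetical callback type: return 0 to accept, negative to decline. */
typedef int (*demo_event_cb)(int port_id, enum demo_event ev, void *arg);

struct demo_port {
	int port_id;
	demo_event_cb cb;	/* NULL when the application registered nothing */
	void *cb_arg;
};

/*
 * Driver-side decision: returns 1 if automatic recovery may start,
 * 0 if no callback is registered or the callback declined.
 */
static int
demo_may_recover(const struct demo_port *port)
{
	assert(port != NULL);
	if (port->cb == NULL)
		return 0;
	return port->cb(port->port_id, DEMO_EVENT_ERR_RECOVERING,
			port->cb_arg) == 0;
}

/* Example application callback that accepts recovery. */
static int
demo_cb_accept(int port_id, enum demo_event ev, void *arg)
{
	(void)port_id; (void)ev; (void)arg;
	return 0;
}

/* Example application callback that declines recovery. */
static int
demo_cb_decline(int port_id, enum demo_event ev, void *arg)
{
	(void)port_id; (void)ev; (void)arg;
	return -1;
}
```

With this rule a port without a callback never enters automatic recovery, which
matches the concern that the PMD cannot otherwise tell whether the application
is handling the events.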


^ permalink raw reply	[flat|nested] 85+ messages in thread
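
The pointer swap that the quoted patch documents (dummy data path functions
during recovery, valid functions restored before RECOVERY_SUCCESS is reported)
can be illustrated with a small self-contained model. This is a sketch only:
demo_fp_ops_tbl, demo_fp_ops_reset and demo_fp_ops_setup are hypothetical names
loosely modelled on the rte_eth_fp_ops_setup helper in the patch, and the burst
signature is simplified.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_PORTS 4

/* Simplified burst signature (hypothetical, not the ethdev one). */
typedef uint16_t (*demo_burst_fn)(void *queue, void **pkts, uint16_t n);

/* Per-port fast-path table that the data path reads on every call. */
struct demo_fp_ops {
	demo_burst_fn rx_burst;
	demo_burst_fn tx_burst;
};

/* Per-device state holding the driver's real burst functions. */
struct demo_dev {
	uint16_t port_id;
	demo_burst_fn rx_burst;
	demo_burst_fn tx_burst;
};

static struct demo_fp_ops demo_fp_ops_tbl[DEMO_MAX_PORTS];

/* Dummy burst: touches no hardware and reports zero packets. */
static uint16_t
demo_dummy_burst(void *queue, void **pkts, uint16_t n)
{
	(void)queue; (void)pkts; (void)n;
	return 0;
}

/* Stand-in for a real burst function: pretends all n packets were handled. */
static uint16_t
demo_real_burst(void *queue, void **pkts, uint16_t n)
{
	(void)queue; (void)pkts;
	return n;
}

/* Point the fast-path table at the dummy functions (start of recovery). */
static void
demo_fp_ops_reset(uint16_t port_id)
{
	assert(port_id < DEMO_MAX_PORTS);
	demo_fp_ops_tbl[port_id].rx_burst = demo_dummy_burst;
	demo_fp_ops_tbl[port_id].tx_burst = demo_dummy_burst;
}

/* Restore the device's own functions (before reporting success). */
static void
demo_fp_ops_setup(const struct demo_dev *dev)
{
	if (dev == NULL)
		return;
	assert(dev->port_id < DEMO_MAX_PORTS);
	demo_fp_ops_tbl[dev->port_id].rx_burst = dev->rx_burst;
	demo_fp_ops_tbl[dev->port_id].tx_burst = dev->tx_burst;
}
```

As the commit message notes, swapping the pointers does not by itself close the
race: a worker that has already loaded the old pointer may still be executing
the real burst function when recovery starts.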

* Re: [PATCH 1/5] ethdev: fix race-condition of proactive error handling mode
  2023-03-06 11:00             ` Ferruh Yigit
@ 2023-03-06 11:05               ` Ajit Khaparde
  2023-03-06 11:13                 ` Konstantin Ananyev
  0 siblings, 1 reply; 85+ messages in thread
From: Ajit Khaparde @ 2023-03-06 11:05 UTC (permalink / raw)
  To: Ferruh Yigit
  Cc: Konstantin Ananyev, Thomas Monjalon, Andrew Rybchenko, dev, fengchengwen

On Mon, Mar 6, 2023 at 3:00 AM Ferruh Yigit <ferruh.yigit@amd.com> wrote:
>
> On 3/6/2023 10:22 AM, Konstantin Ananyev wrote:
> >
> >
> >>>>>> In the proactive error handling mode, the PMD will set the data path
> >>>>>> pointers to dummy functions and then try recovery, in this period the
> >>>>>> application may still invoking data path API. This will introduce a
> >>>>>> race-condition with data path which may lead to crash [1].
> >>>>>>
> >>>>>> Although the PMD added delay after setting data path pointers to cover
> >>>>>> the above race-condition, it reduces the probability, but it doesn't
> >>>>>> solve the problem.
> >>>>>>
> >>>>>> To solve the race-condition problem fundamentally, the following
> >>>>>> requirements are added:
> >>>>>> 1. The PMD should set the data path pointers to dummy functions after
> >>>>>>     report RTE_ETH_EVENT_ERR_RECOVERING event.
> >>>>>> 2. The application should stop data path API invocation when process
> >>>>>>     the RTE_ETH_EVENT_ERR_RECOVERING event.
> >>>>>> 3. The PMD should set the data path pointers to valid functions before
> >>>>>>     report RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> >>>>>> 4. The application should enable data path API invocation when process
> >>>>>>     the RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> >>>>>>
> >>>>
> >>>> How is this solving the race-condition? By pushing the responsibility
> >>>> to stop the data path onto the application?
> >>>
> >>> Exactly, it becomes the application's responsibility to make sure the
> >>> data path is stopped/suspended before recovery continues.
> >>>
> >>
> >> From documentation of the feature:
> >>
> >> ``
> >> Because the PMD recovers automatically,
> >> the application can only sense that the data flow is disconnected for a
> >> while and the control API returns an error in this period.
> >>
> >> In order to sense the error happening/recovering, as well as to restore
> >> some additional configuration, three events are available:
> >> ``
> >>
> >> It looks like the initial design uses events mainly to inform the
> >> application about what happened, and for re-configuration.
> >>
> >> Although I don't disagree with involving the application, I am not sure
> >> that is part of the current design.
> >
> > I thought we all agreed that the initial design contains some fallacies
> > that need to be fixed, no?
> > The statement that, with the current rte_ethdev design, error recovery can
> > be done without interaction with the app (to stop/suspend the data/control
> > path) is the main one, I think.
> > It needs some interaction with the app layer, one way or another.
> >
> >>>>
> >>>> What if the application is not interested in recovery modes at all and
> >>>> has not registered any callback for the recovery?
> >>>
> >>>
> >>> Are you saying there is no way for the application to disable
> >>> automatic recovery in the PMD if it is not interested
> >>> (or can't fulfill the prerequisites for it)?
> >>> If so, then yes it is a problem and we need to fix it.
> >>> I assumed that such mechanism to disable unwanted events already exists,
> >>> but I can't find anything.
> >>> Wonder what would be the easiest way here - can PMD make a decision
> >>> based on callback return value, or do we need a new API to
> >>> enable/disable callbacks, or ...?
> >>>
> >>>
> >>
> >> As far as I can see automatic recovery is not configurable by app.
> >>
> >> But that is not all: the PMD sends events to the application, but it can't
> >> know whether the application is handling them, so with the current design
> >> the PMD can't rely on the app.
> >
> > Well, PMD invokes user provided callback.
> > One way to fix that problem - if there is no callback provided,
> > or callback returns an error code - PMD can assume that recovery
> > should not be done.
> > That is probably not the best design choice, but at least it will allow
> > to fix the problem without too many changes and introducing new API.
> > That could be sort of a 'quick fix'.
> > In the meantime we can think about a new/better approach for that.
> >
>
> -rc2 for 23.03 is a few days away.
>
> What do you think about a 'quick fix' for this release that modifies how the
> driver updates burst ops to prevent the race condition?
>
> And plan a design update for the next release?
+1 on the overall approach.

>
>
> >>
> >>>> I think driver should not rely on application for this, unless
> >>>> application explicitly says (to driver) that it is handling recovery,
> >>>> right now there is no way for driver to know this.
> >>>
> >>> I think it is vice versa:
> >>> the application should not enable auto-recovery if it can't meet
> >>> the prerequisites for it (provide an appropriate callback).
> >>>
> >>
> >> I agree with the above; we are saying a similar thing from different
> >> perspectives.
> >
> > Ok, that's good we are on the same page.
> >
> >
> >>
> >>>
> >>>>
> >>>>>> Also, this patch introduce a driver internal function
> >>>>>> rte_eth_fp_ops_setup which used as an help function for PMD.
> >>>>>>
> >>>>>> [1]
> >>>>>> http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-ashok.k.kaladi@intel.com/
> >>>>>>
> >>>>>> Fixes: eb0d471a8941 ("ethdev: add proactive error handling mode")
> >>>>>> Cc: stable@dpdk.org
> >>>>>>
> >>>>>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> >>>>>> ---
> >>>>>>   doc/guides/prog_guide/poll_mode_drv.rst | 20 +++++++---------
> >>>>>>   lib/ethdev/ethdev_driver.c              |  8 +++++++
> >>>>>>   lib/ethdev/ethdev_driver.h              | 10 ++++++++
> >>>>>>   lib/ethdev/rte_ethdev.h                 | 32
> >>>>>> +++++++++++++++----------
> >>>>>>   lib/ethdev/version.map                  |  1 +
> >>>>>>   5 files changed, 46 insertions(+), 25 deletions(-)
> >>>>>>
> >>>>>> diff --git a/doc/guides/prog_guide/poll_mode_drv.rst
> >>>>>> b/doc/guides/prog_guide/poll_mode_drv.rst
> >>>>>> index c145a9066c..e380ff135a 100644
> >>>>>> --- a/doc/guides/prog_guide/poll_mode_drv.rst
> >>>>>> +++ b/doc/guides/prog_guide/poll_mode_drv.rst
> >>>>>> @@ -638,14 +638,9 @@ different from the application invokes recovery
> >>>>>> in PASSIVE mode,
> >>>>>>   the PMD automatically recovers from error in PROACTIVE mode,
> >>>>>>   and only a small amount of work is required for the application.
> >>>>>>
> >>>>>> -During error detection and automatic recovery,
> >>>>>> -the PMD sets the data path pointers to dummy functions
> >>>>>> -(which will prevent the crash),
> >>>>>> -and also make sure the control path operations fail with a return
> >>>>>> code ``-EBUSY``.
> >>>>>> -
> >>>>>> -Because the PMD recovers automatically,
> >>>>>> -the application can only sense that the data flow is disconnected
> >>>>>> for a while
> >>>>>> -and the control API returns an error in this period.
> >>>>>> +During error detection and automatic recovery, the PMD sets the
> >>>>>> data path
> >>>>>> +pointers to dummy functions and also make sure the control path
> >>>>>> operations
> >>>>>> +failed with a return code ``-EBUSY``.
> >>>>>>
> >>>>>>   In order to sense the error happening/recovering,
> >>>>>>   as well as to restore some additional configuration,
> >>>>>> @@ -653,9 +648,9 @@ three events are available:
> >>>>>>
> >>>>>>   ``RTE_ETH_EVENT_ERR_RECOVERING``
> >>>>>>      Notify the application that an error is detected
> >>>>>> -   and the recovery is being started.
> >>>>>> +   and the recovery is about to start.
> >>>>>>      Upon receiving the event, the application should not invoke
> >>>>>> -   any control path function until receiving
> >>>>>> +   any control and data path API until receiving
> >>>>>>      ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` or
> >>>>>> ``RTE_ETH_EVENT_RECOVERY_FAILED`` event.
> >>>>>>
> >>>>>>   .. note::
> >>>>>> @@ -666,8 +661,9 @@ three events are available:
> >>>>>>
> >>>>>>   ``RTE_ETH_EVENT_RECOVERY_SUCCESS``
> >>>>>>      Notify the application that the recovery from error is successful,
> >>>>>> -   the PMD already re-configures the port,
> >>>>>> -   and the effect is the same as a restart operation.
> >>>>>> +   the PMD already re-configures the port.
> >>>>>> +   The application should restore some additional configuration,
> >>>>>> and then
> >>>>>> +   enable data path API invocation.
> >>>>>>
> >>>>>>   ``RTE_ETH_EVENT_RECOVERY_FAILED``
> >>>>>>      Notify the application that the recovery from error failed,
> >>>>>> diff --git a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c
> >>>>>> index 0be1e8ca04..f994653fe9 100644
> >>>>>> --- a/lib/ethdev/ethdev_driver.c
> >>>>>> +++ b/lib/ethdev/ethdev_driver.c
> >>>>>> @@ -515,6 +515,14 @@ rte_eth_dma_zone_free(const struct rte_eth_dev
> >>>>>> *dev, const char *ring_name,
> >>>>>>       return rc;
> >>>>>>   }
> >>>>>>
> >>>>>> +void
> >>>>>> +rte_eth_fp_ops_setup(struct rte_eth_dev *dev)
> >>>>>> +{
> >>>>>> +    if (dev == NULL)
> >>>>>> +        return;
> >>>>>> +    eth_dev_fp_ops_setup(rte_eth_fp_ops + dev->data->port_id, dev);
> >>>>>> +}
> >>>>>> +
> >>>>>>   const struct rte_memzone *
> >>>>>>   rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char
> >>>>>> *ring_name,
> >>>>>>                uint16_t queue_id, size_t size, unsigned int align,
> >>>>>> diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
> >>>>>> index 2c9d615fb5..0d964d1f67 100644
> >>>>>> --- a/lib/ethdev/ethdev_driver.h
> >>>>>> +++ b/lib/ethdev/ethdev_driver.h
> >>>>>> @@ -1621,6 +1621,16 @@ int
> >>>>>>   rte_eth_dma_zone_free(const struct rte_eth_dev *eth_dev, const
> >>>>>> char *name,
> >>>>>>            uint16_t queue_id);
> >>>>>>
> >>>>>> +/**
> >>>>>> + * @internal
> >>>>>> + * Setup eth fast-path API to ethdev values.
> >>>>>> + *
> >>>>>> + * @param dev
> >>>>>> + *  Pointer to struct rte_eth_dev.
> >>>>>> + */
> >>>>>> +__rte_internal
> >>>>>> +void rte_eth_fp_ops_setup(struct rte_eth_dev *dev);
> >>>>>> +
> >>>>>>   /**
> >>>>>>    * @internal
> >>>>>>    * Atomically set the link status for the specific device.
> >>>>>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> >>>>>> index 049641d57c..44ee7229c1 100644
> >>>>>> --- a/lib/ethdev/rte_ethdev.h
> >>>>>> +++ b/lib/ethdev/rte_ethdev.h
> >>>>>> @@ -3944,25 +3944,28 @@ enum rte_eth_event_type {
> >>>>>>        */
> >>>>>>       RTE_ETH_EVENT_RX_AVAIL_THRESH,
> >>>>>>       /** Port recovering from a hardware or firmware error.
> >>>>>> -     * If PMD supports proactive error recovery,
> >>>>>> -     * it should trigger this event to notify application
> >>>>>> -     * that it detected an error and the recovery is being started.
> >>>>>> -     * Upon receiving the event, the application should not invoke
> >>>>>> any control path API
> >>>>>> -     * (such as rte_eth_dev_configure/rte_eth_dev_stop...) until
> >>>>>> receiving
> >>>>>> -     * RTE_ETH_EVENT_RECOVERY_SUCCESS or
> >>>>>> RTE_ETH_EVENT_RECOVERY_FAILED event.
> >>>>>> -     * The PMD will set the data path pointers to dummy functions,
> >>>>>> -     * and re-set the data path pointers to non-dummy functions
> >>>>>> -     * before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> >>>>>> -     * It means that the application cannot send or receive any
> >>>>>> packets
> >>>>>> -     * during this period.
> >>>>>> +     *
> >>>>>> +     * If PMD supports proactive error recovery, it should trigger
> >>>>>> this
> >>>>>> +     * event to notify application that it detected an error and the
> >>>>>> +     * recovery is about to start.
> >>>>>> +     *
> >>>>>> +     * Upon receiving the event, the application should not invoke any
> >>>>>> +     * control and data path API until receiving
> >>>>>> +     * RTE_ETH_EVENT_RECOVERY_SUCCESS or RTE_ETH_EVENT_RECOVERY_FAILED
> >>>>>> +     * event.
> >>>>>> +     *
> >>>>>> +     * Once this event is reported, the PMD will set the data path
> >>>>>> pointers
> >>>>>> +     * to dummy functions, and re-set the data path pointers to valid
> >>>>>> +     * functions before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS
> >>>>>> event.
> >>>>>> +     *
> >>>>>>        * @note Before the PMD reports the recovery result,
> >>>>>>        * the PMD may report the RTE_ETH_EVENT_ERR_RECOVERING event
> >>>>>> again,
> >>>>>>        * because a larger error may occur during the recovery.
> >>>>>>        */
> >>>>>>       RTE_ETH_EVENT_ERR_RECOVERING,
> >>>>>>       /** Port recovers successfully from the error.
> >>>>>> -     * The PMD already re-configured the port,
> >>>>>> -     * and the effect is the same as a restart operation.
> >>>>>> +     *
> >>>>>> +     * The PMD already re-configured the port:
> >>>>>>        * a) The following operation will be retained: (alphabetically)
> >>>>>>        *    - DCB configuration
> >>>>>>        *    - FEC configuration
> >>>>>> @@ -3989,6 +3992,9 @@ enum rte_eth_event_type {
> >>>>>>        *      (@see RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP)
> >>>>>>        * c) Any other configuration will not be stored
> >>>>>>        *    and will need to be re-configured.
> >>>>>> +     *
> >>>>>> +     * The application should restore some additional configuration
> >>>>>> +     * (see above case b/c), and then enable data path API invocation.
> >>>>>>        */
> >>>>>>       RTE_ETH_EVENT_RECOVERY_SUCCESS,
> >>>>>>       /** Port recovery failed.
> >>>>>> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
> >>>>>> index 357d1a88c0..c273e0bdae 100644
> >>>>>> --- a/lib/ethdev/version.map
> >>>>>> +++ b/lib/ethdev/version.map
> >>>>>> @@ -320,6 +320,7 @@ INTERNAL {
> >>>>>>       rte_eth_devices;
> >>>>>>       rte_eth_dma_zone_free;
> >>>>>>       rte_eth_dma_zone_reserve;
> >>>>>> +    rte_eth_fp_ops_setup;
> >>>>>>       rte_eth_hairpin_queue_peer_bind;
> >>>>>>       rte_eth_hairpin_queue_peer_unbind;
> >>>>>>       rte_eth_hairpin_queue_peer_update;
> >>>>>> --
> >>>>>   Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
> >>>>>
> >>>>>> 2.17.1
> >>>>>
> >>>>
> >>>
> >
>

^ permalink raw reply	[flat|nested] 85+ messages in thread
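
The application side of the four requirements quoted in this thread (stop data
path calls on ERR_RECOVERING, resume them on RECOVERY_SUCCESS) can be sketched
as a small handshake between the event callback and a polling worker. This is a
sketch under stated assumptions, not testpmd or ethdev code: every name is
hypothetical, there is exactly one worker per port, and C11 sequentially
consistent atomics are used so that the callback returns only once the worker
is outside the burst call.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define DEMO_MAX_PORTS 4

/* Hypothetical stand-ins for the RTE_ETH_EVENT_* recovery events. */
enum demo_event {
	DEMO_EVENT_ERR_RECOVERING,
	DEMO_EVENT_RECOVERY_SUCCESS,
	DEMO_EVENT_RECOVERY_FAILED,
};

/* Simplified burst signature (hypothetical): returns packets handled. */
typedef unsigned int (*demo_burst_fn)(int port_id);

struct demo_port_state {
	atomic_bool paused;	/* set by the event callback */
	atomic_bool in_burst;	/* set by the single worker of this port */
};

static struct demo_port_state demo_ports[DEMO_MAX_PORTS];

/* Worker side: announce entry, then call burst only if not paused. */
static unsigned int
demo_poll_once(int port_id, demo_burst_fn burst)
{
	struct demo_port_state *s;
	unsigned int n = 0;

	assert(port_id >= 0 && port_id < DEMO_MAX_PORTS);
	s = &demo_ports[port_id];
	atomic_store(&s->in_burst, true);
	if (!atomic_load(&s->paused))
		n = burst(port_id);
	atomic_store(&s->in_burst, false);
	return n;
}

/* Event callback: pause and wait for the worker, or resume. */
static int
demo_event_cb(int port_id, enum demo_event ev)
{
	struct demo_port_state *s;

	assert(port_id >= 0 && port_id < DEMO_MAX_PORTS);
	s = &demo_ports[port_id];
	switch (ev) {
	case DEMO_EVENT_ERR_RECOVERING:
		atomic_store(&s->paused, true);
		/* Wait until the worker has left any in-flight burst call. */
		while (atomic_load(&s->in_burst))
			;
		break;
	case DEMO_EVENT_RECOVERY_SUCCESS:
		/* Restore any additional configuration here, then resume. */
		atomic_store(&s->paused, false);
		break;
	case DEMO_EVENT_RECOVERY_FAILED:
		/* Leave the port paused; the data path must stay stopped. */
		break;
	}
	return 0;
}

/* Stand-in burst function that pretends to handle 32 packets. */
static unsigned int
demo_fake_burst(int port_id)
{
	(void)port_id;
	return 32;
}
```

The spin in the ERR_RECOVERING case is what distinguishes this from merely
setting a flag: without it, the callback could return while the worker is still
inside the burst function, which is the window the thread is trying to close.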

* RE: [PATCH 1/5] ethdev: fix race-condition of proactive error handling mode
  2023-03-06 11:05               ` Ajit Khaparde
@ 2023-03-06 11:13                 ` Konstantin Ananyev
  2023-03-07  8:25                   ` fengchengwen
  0 siblings, 1 reply; 85+ messages in thread
From: Konstantin Ananyev @ 2023-03-06 11:13 UTC (permalink / raw)
  To: Ajit Khaparde, Ferruh Yigit
  Cc: Konstantin Ananyev, Thomas Monjalon, Andrew Rybchenko, dev, Fengchengwen



> > >>>>>> In the proactive error handling mode, the PMD will set the data path
> > >>>>>> pointers to dummy functions and then try recovery, in this period the
> > >>>>>> application may still invoking data path API. This will introduce a
> > >>>>>> race-condition with data path which may lead to crash [1].
> > >>>>>>
> > >>>>>> Although the PMD added delay after setting data path pointers to cover
> > >>>>>> the above race-condition, it reduces the probability, but it doesn't
> > >>>>>> solve the problem.
> > >>>>>>
> > >>>>>> To solve the race-condition problem fundamentally, the following
> > >>>>>> requirements are added:
> > >>>>>> 1. The PMD should set the data path pointers to dummy functions after
> > >>>>>>     report RTE_ETH_EVENT_ERR_RECOVERING event.
> > >>>>>> 2. The application should stop data path API invocation when process
> > >>>>>>     the RTE_ETH_EVENT_ERR_RECOVERING event.
> > >>>>>> 3. The PMD should set the data path pointers to valid functions before
> > >>>>>>     report RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> > >>>>>> 4. The application should enable data path API invocation when process
> > >>>>>>     the RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> > >>>>>>
> > >>>>
> > >>>> How is this solving the race-condition? By pushing the responsibility
> > >>>> to stop the data path onto the application?
> > >>>
> > >>> Exactly, it becomes the application's responsibility to make sure the
> > >>> data path is stopped/suspended before recovery continues.
> > >>>
> > >>
> > >> From documentation of the feature:
> > >>
> > >> ``
> > >> Because the PMD recovers automatically,
> > >> the application can only sense that the data flow is disconnected for a
> > >> while and the control API returns an error in this period.
> > >>
> > >> In order to sense the error happening/recovering, as well as to restore
> > >> some additional configuration, three events are available:
> > >> ``
> > >>
> > >> It looks like the initial design uses events mainly to inform the
> > >> application about what happened, and for re-configuration.
> > >>
> > >> Although I don't disagree with involving the application, I am not sure
> > >> that is part of the current design.
> > >
> > > I thought we all agreed that the initial design contains some fallacies
> > > that need to be fixed, no?
> > > The statement that, with the current rte_ethdev design, error recovery can
> > > be done without interaction with the app (to stop/suspend the data/control
> > > path) is the main one, I think.
> > > It needs some interaction with the app layer, one way or another.
> > >
> > >>>>
> > >>>> What if the application is not interested in recovery modes at all and
> > >>>> has not registered any callback for the recovery?
> > >>>
> > >>>
> > >>> Are you saying there is no way for the application to disable
> > >>> automatic recovery in the PMD if it is not interested
> > >>> (or can't fulfill the prerequisites for it)?
> > >>> If so, then yes it is a problem and we need to fix it.
> > >>> I assumed that such mechanism to disable unwanted events already exists,
> > >>> but I can't find anything.
> > >>> Wonder what would be the easiest way here - can PMD make a decision
> > >>> based on callback return value, or do we need a new API to
> > >>> enable/disable callbacks, or ...?
> > >>>
> > >>>
> > >>
> > >> As far as I can see automatic recovery is not configurable by app.
> > >>
> > >> But that is not all: the PMD sends events to the application, but it
> > >> can't know whether the application is handling them, so with the current
> > >> design the PMD can't rely on the app.
> > >
> > > Well, PMD invokes user provided callback.
> > > One way to fix that problem - if there is no callback provided,
> > > or callback returns an error code - PMD can assume that recovery
> > > should not be done.
> > > That is probably not the best design choice, but at least it will allow
> > > to fix the problem without too many changes and introducing new API.
> > > That could be sort of a 'quick fix'.
> > > In the meantime we can think about a new/better approach for that.
> > >
> >
> > -rc2 for 23.03 is a few days away.
> >
> > What do you think about a 'quick fix' for this release that modifies how
> > the driver updates burst ops to prevent the race condition?
> >
> > And plan a design update for the next release?
> +1 on the overall approach.

Yep, agree.
 
> 
> >
> >
> > >>
> > >>>> I think driver should not rely on application for this, unless
> > >>>> application explicitly says (to driver) that it is handling recovery,
> > >>>> right now there is no way for driver to know this.
> > >>>
> > >>> I think it is vice versa:
> > >>> the application should not enable auto-recovery if it can't meet
> > >>> the prerequisites for it (provide an appropriate callback).
> > >>>
> > >>
> > >> I agree with the above; we are saying a similar thing from different
> > >> perspectives.
> > >
> > > Ok, that's good we are on the same page.
> > >
> > >
> > >>
> > >>>
> > >>>>
> > >>>>>> Also, this patch introduce a driver internal function
> > >>>>>> rte_eth_fp_ops_setup which used as an help function for PMD.
> > >>>>>>
> > >>>>>> [1]
> > >>>>>> http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-ashok.k.kaladi@intel.com/
> > >>>>>>
> > >>>>>> Fixes: eb0d471a8941 ("ethdev: add proactive error handling mode")
> > >>>>>> Cc: stable@dpdk.org
> > >>>>>>
> > >>>>>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> > >>>>>> ---
> > >>>>>>   doc/guides/prog_guide/poll_mode_drv.rst | 20 +++++++---------
> > >>>>>>   lib/ethdev/ethdev_driver.c              |  8 +++++++
> > >>>>>>   lib/ethdev/ethdev_driver.h              | 10 ++++++++
> > >>>>>>   lib/ethdev/rte_ethdev.h                 | 32
> > >>>>>> +++++++++++++++----------
> > >>>>>>   lib/ethdev/version.map                  |  1 +
> > >>>>>>   5 files changed, 46 insertions(+), 25 deletions(-)
> > >>>>>>
> > >>>>>> diff --git a/doc/guides/prog_guide/poll_mode_drv.rst
> > >>>>>> b/doc/guides/prog_guide/poll_mode_drv.rst
> > >>>>>> index c145a9066c..e380ff135a 100644
> > >>>>>> --- a/doc/guides/prog_guide/poll_mode_drv.rst
> > >>>>>> +++ b/doc/guides/prog_guide/poll_mode_drv.rst
> > >>>>>> @@ -638,14 +638,9 @@ different from the application invokes recovery
> > >>>>>> in PASSIVE mode,
> > >>>>>>   the PMD automatically recovers from error in PROACTIVE mode,
> > >>>>>>   and only a small amount of work is required for the application.
> > >>>>>>
> > >>>>>> -During error detection and automatic recovery,
> > >>>>>> -the PMD sets the data path pointers to dummy functions
> > >>>>>> -(which will prevent the crash),
> > >>>>>> -and also make sure the control path operations fail with a return
> > >>>>>> code ``-EBUSY``.
> > >>>>>> -
> > >>>>>> -Because the PMD recovers automatically,
> > >>>>>> -the application can only sense that the data flow is disconnected
> > >>>>>> for a while
> > >>>>>> -and the control API returns an error in this period.
> > >>>>>> +During error detection and automatic recovery, the PMD sets the
> > >>>>>> data path
> > >>>>>> +pointers to dummy functions and also make sure the control path
> > >>>>>> operations
> > >>>>>> +failed with a return code ``-EBUSY``.
> > >>>>>>
> > >>>>>>   In order to sense the error happening/recovering,
> > >>>>>>   as well as to restore some additional configuration,
> > >>>>>> @@ -653,9 +648,9 @@ three events are available:
> > >>>>>>
> > >>>>>>   ``RTE_ETH_EVENT_ERR_RECOVERING``
> > >>>>>>      Notify the application that an error is detected
> > >>>>>> -   and the recovery is being started.
> > >>>>>> +   and the recovery is about to start.
> > >>>>>>      Upon receiving the event, the application should not invoke
> > >>>>>> -   any control path function until receiving
> > >>>>>> +   any control and data path API until receiving
> > >>>>>>      ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` or
> > >>>>>> ``RTE_ETH_EVENT_RECOVERY_FAILED`` event.
> > >>>>>>
> > >>>>>>   .. note::
> > >>>>>> @@ -666,8 +661,9 @@ three events are available:
> > >>>>>>
> > >>>>>>   ``RTE_ETH_EVENT_RECOVERY_SUCCESS``
> > >>>>>>      Notify the application that the recovery from error is successful,
> > >>>>>> -   the PMD already re-configures the port,
> > >>>>>> -   and the effect is the same as a restart operation.
> > >>>>>> +   the PMD already re-configures the port.
> > >>>>>> +   The application should restore some additional configuration,
> > >>>>>> and then
> > >>>>>> +   enable data path API invocation.
> > >>>>>>
> > >>>>>>   ``RTE_ETH_EVENT_RECOVERY_FAILED``
> > >>>>>>      Notify the application that the recovery from error failed,
> > >>>>>> diff --git a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c
> > >>>>>> index 0be1e8ca04..f994653fe9 100644
> > >>>>>> --- a/lib/ethdev/ethdev_driver.c
> > >>>>>> +++ b/lib/ethdev/ethdev_driver.c
> > >>>>>> @@ -515,6 +515,14 @@ rte_eth_dma_zone_free(const struct rte_eth_dev
> > >>>>>> *dev, const char *ring_name,
> > >>>>>>       return rc;
> > >>>>>>   }
> > >>>>>>
> > >>>>>> +void
> > >>>>>> +rte_eth_fp_ops_setup(struct rte_eth_dev *dev)
> > >>>>>> +{
> > >>>>>> +    if (dev == NULL)
> > >>>>>> +        return;
> > >>>>>> +    eth_dev_fp_ops_setup(rte_eth_fp_ops + dev->data->port_id, dev);
> > >>>>>> +}
> > >>>>>> +
> > >>>>>>   const struct rte_memzone *
> > >>>>>>   rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char
> > >>>>>> *ring_name,
> > >>>>>>                uint16_t queue_id, size_t size, unsigned int align,
> > >>>>>> diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
> > >>>>>> index 2c9d615fb5..0d964d1f67 100644
> > >>>>>> --- a/lib/ethdev/ethdev_driver.h
> > >>>>>> +++ b/lib/ethdev/ethdev_driver.h
> > >>>>>> @@ -1621,6 +1621,16 @@ int
> > >>>>>>   rte_eth_dma_zone_free(const struct rte_eth_dev *eth_dev, const
> > >>>>>> char *name,
> > >>>>>>            uint16_t queue_id);
> > >>>>>>
> > >>>>>> +/**
> > >>>>>> + * @internal
> > >>>>>> + * Setup eth fast-path API to ethdev values.
> > >>>>>> + *
> > >>>>>> + * @param dev
> > >>>>>> + *  Pointer to struct rte_eth_dev.
> > >>>>>> + */
> > >>>>>> +__rte_internal
> > >>>>>> +void rte_eth_fp_ops_setup(struct rte_eth_dev *dev);
> > >>>>>> +
> > >>>>>>   /**
> > >>>>>>    * @internal
> > >>>>>>    * Atomically set the link status for the specific device.
> > >>>>>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> > >>>>>> index 049641d57c..44ee7229c1 100644
> > >>>>>> --- a/lib/ethdev/rte_ethdev.h
> > >>>>>> +++ b/lib/ethdev/rte_ethdev.h
> > >>>>>> @@ -3944,25 +3944,28 @@ enum rte_eth_event_type {
> > >>>>>>        */
> > >>>>>>       RTE_ETH_EVENT_RX_AVAIL_THRESH,
> > >>>>>>       /** Port recovering from a hardware or firmware error.
> > >>>>>> -     * If PMD supports proactive error recovery,
> > >>>>>> -     * it should trigger this event to notify application
> > >>>>>> -     * that it detected an error and the recovery is being started.
> > >>>>>> -     * Upon receiving the event, the application should not invoke
> > >>>>>> any control path API
> > >>>>>> -     * (such as rte_eth_dev_configure/rte_eth_dev_stop...) until
> > >>>>>> receiving
> > >>>>>> -     * RTE_ETH_EVENT_RECOVERY_SUCCESS or
> > >>>>>> RTE_ETH_EVENT_RECOVERY_FAILED event.
> > >>>>>> -     * The PMD will set the data path pointers to dummy functions,
> > >>>>>> -     * and re-set the data path pointers to non-dummy functions
> > >>>>>> -     * before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> > >>>>>> -     * It means that the application cannot send or receive any
> > >>>>>> packets
> > >>>>>> -     * during this period.
> > >>>>>> +     *
> > >>>>>> +     * If PMD supports proactive error recovery, it should trigger
> > >>>>>> this
> > >>>>>> +     * event to notify application that it detected an error and the
> > >>>>>> +     * recovery is about to start.
> > >>>>>> +     *
> > >>>>>> +     * Upon receiving the event, the application should not invoke any
> > >>>>>> +     * control and data path API until receiving
> > >>>>>> +     * RTE_ETH_EVENT_RECOVERY_SUCCESS or RTE_ETH_EVENT_RECOVERY_FAILED
> > >>>>>> +     * event.
> > >>>>>> +     *
> > >>>>>> +     * Once this event is reported, the PMD will set the data path
> > >>>>>> pointers
> > >>>>>> +     * to dummy functions, and re-set the data path pointers to valid
> > >>>>>> +     * functions before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS
> > >>>>>> event.
> > >>>>>> +     *
> > >>>>>>        * @note Before the PMD reports the recovery result,
> > >>>>>>        * the PMD may report the RTE_ETH_EVENT_ERR_RECOVERING event
> > >>>>>> again,
> > >>>>>>        * because a larger error may occur during the recovery.
> > >>>>>>        */
> > >>>>>>       RTE_ETH_EVENT_ERR_RECOVERING,
> > >>>>>>       /** Port recovers successfully from the error.
> > >>>>>> -     * The PMD already re-configured the port,
> > >>>>>> -     * and the effect is the same as a restart operation.
> > >>>>>> +     *
> > >>>>>> +     * The PMD already re-configured the port:
> > >>>>>>        * a) The following operation will be retained: (alphabetically)
> > >>>>>>        *    - DCB configuration
> > >>>>>>        *    - FEC configuration
> > >>>>>> @@ -3989,6 +3992,9 @@ enum rte_eth_event_type {
> > >>>>>>        *      (@see RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP)
> > >>>>>>        * c) Any other configuration will not be stored
> > >>>>>>        *    and will need to be re-configured.
> > >>>>>> +     *
> > >>>>>> +     * The application should restore some additional configuration
> > >>>>>> +     * (see above case b/c), and then enable data path API invocation.
> > >>>>>>        */
> > >>>>>>       RTE_ETH_EVENT_RECOVERY_SUCCESS,
> > >>>>>>       /** Port recovery failed.
> > >>>>>> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
> > >>>>>> index 357d1a88c0..c273e0bdae 100644
> > >>>>>> --- a/lib/ethdev/version.map
> > >>>>>> +++ b/lib/ethdev/version.map
> > >>>>>> @@ -320,6 +320,7 @@ INTERNAL {
> > >>>>>>       rte_eth_devices;
> > >>>>>>       rte_eth_dma_zone_free;
> > >>>>>>       rte_eth_dma_zone_reserve;
> > >>>>>> +    rte_eth_fp_ops_setup;
> > >>>>>>       rte_eth_hairpin_queue_peer_bind;
> > >>>>>>       rte_eth_hairpin_queue_peer_unbind;
> > >>>>>>       rte_eth_hairpin_queue_peer_update;
> > >>>>>> --
> > >>>>>   Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
> > >>>>>
> > >>>>>> 2.17.1
> > >>>>>
> > >>>>
> > >>>
> > >
> >


* RE: [PATCH 1/5] ethdev: fix race-condition of proactive error handling mode
  2023-03-05 15:23         ` Konstantin Ananyev
@ 2023-03-07  5:34           ` Honnappa Nagarahalli
  2023-03-07  8:39             ` fengchengwen
  2023-03-07  9:56             ` Konstantin Ananyev
  0 siblings, 2 replies; 85+ messages in thread
From: Honnappa Nagarahalli @ 2023-03-07  5:34 UTC (permalink / raw)
  To: Konstantin Ananyev, dev, Chengwen Feng, thomas, Ferruh Yigit,
	Andrew Rybchenko, Kalesh AP,
	Ajit Khaparde (ajit.khaparde@broadcom.com)
  Cc: nd, nd



> -----Original Message-----
> From: Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>
> Sent: Sunday, March 5, 2023 9:24 AM
> To: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>;
> dev@dpdk.org; Chengwen Feng <fengchengwen@huawei.com>;
> thomas@monjalon.net; Ferruh Yigit <ferruh.yigit@amd.com>; Andrew
> Rybchenko <andrew.rybchenko@oktetlabs.ru>; Kalesh AP <kalesh-
> anakkur.purayil@broadcom.com>; Ajit Khaparde
> (ajit.khaparde@broadcom.com) <ajit.khaparde@broadcom.com>
> Cc: nd <nd@arm.com>
> Subject: Re: [PATCH 1/5] ethdev: fix race-condition of proactive error handling
> mode
> 
> 
> >>>>
> >>>> In the proactive error handling mode, the PMD will set the data
> >>>> path pointers to dummy functions and then try recovery, in this
> >>>> period the application may still invoking data path API. This will
> >>>> introduce a race-condition with data path which may lead to crash [1].
> >>>>
> >>>> Although the PMD added delay after setting data path pointers to
> >>>> cover the above race-condition, it reduces the probability, but it
> >>>> doesn't solve the problem.
> >>>>
> >>>> To solve the race-condition problem fundamentally, the following
> >>>> requirements are added:
> >>>> 1. The PMD should set the data path pointers to dummy functions after
> >>>>      report RTE_ETH_EVENT_ERR_RECOVERING event.
> >>> Do you mean to say, PMD should set the data path pointers after
> >>> calling the callback function?
> >>> The PMD is running in the context of multiple EAL threads. How do
> >>> these threads synchronize such that only one thread sets these data pointers?
> >>
> >> As I understand this event callback supposed to be called in the
> >> context of EAL interrupt thread (whoever is more familiar with
> >> original idea, feel free to correct me if I missed something).
> > I could not figure this out. It looks to be called from the data plane thread
> > context.
> > I also have a thought on an alternate design at the end; I would appreciate it
> > if you can take a look.
> >
> >> How it is going to signal data-path threads that they need to
> >> stop/suspend calling data-path API - that, I suppose, is left to the
> >> application to decide...
> >> Same as right now, it is the application's responsibility to stop data-path
> >> threads before doing dev_stop()/dev_config()/etc.
> > Ok, good, this expectation is not new. The application must have a
> > mechanism already.
> >
> >>
> >>
> >>>
> >>>> 2. The application should stop data path API invocation when process
> >>>>      the RTE_ETH_EVENT_ERR_RECOVERING event.
> >>> Any thoughts on how an application can do this?
> > We can ignore this question as a similar expectation is already set for
> > earlier functionalities.
> >
> >>>
> >>>> 3. The PMD should set the data path pointers to valid functions before
> >>>>      report RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> >>>> 4. The application should enable data path API invocation when process
> >>>>      the RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> >>> Do you mean to say that the application should not call the datapath
> >>> APIs while the PMD is running the recovery process?
> >>
> >> Yes, I believe that's the intention.
> > Ok, this is good and makes sense.
> >
> >>
> >>>>
> >>>> Also, this patch introduce a driver internal function
> >>>> rte_eth_fp_ops_setup which used as an help function for PMD.
> >>>>
> >>>> [1]
> >>>>
> >>
> http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2
> >>>> -
> >>>> ashok.k.kaladi@intel.com/
> >>>>
> >>>> Fixes: eb0d471a8941 ("ethdev: add proactive error handling mode")
> >>>> Cc: stable@dpdk.org
> >>>>
> >>>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> >>>> ---
> >>>>    doc/guides/prog_guide/poll_mode_drv.rst | 20 +++++++---------
> >>>>    lib/ethdev/ethdev_driver.c              |  8 +++++++
> >>>>    lib/ethdev/ethdev_driver.h              | 10 ++++++++
> >>>>    lib/ethdev/rte_ethdev.h                 | 32 +++++++++++++++----------
> >>>>    lib/ethdev/version.map                  |  1 +
> >>>>    5 files changed, 46 insertions(+), 25 deletions(-)
> >>>>
> >>>> diff --git a/doc/guides/prog_guide/poll_mode_drv.rst
> >>>> b/doc/guides/prog_guide/poll_mode_drv.rst
> >>>> index c145a9066c..e380ff135a 100644
> >>>> --- a/doc/guides/prog_guide/poll_mode_drv.rst
> >>>> +++ b/doc/guides/prog_guide/poll_mode_drv.rst
> >>>> @@ -638,14 +638,9 @@ different from the application invokes
> >>>> recovery in PASSIVE mode,  the PMD automatically recovers from
> >>>> error in PROACTIVE mode,  and only a small amount of work is
> >>>> required for the
> >> application.
> >>>>
> >>>> -During error detection and automatic recovery, -the PMD sets the
> >>>> data path pointers to dummy functions -(which will prevent the
> >>>> crash), -and also make sure the control path operations fail with a
> >>>> return
> >> code ``-EBUSY``.
> >>>> -
> >>>> -Because the PMD recovers automatically, -the application can only
> >>>> sense that the data flow is disconnected for a while -and the
> >>>> control API returns an error in this period.
> >>>> +During error detection and automatic recovery, the PMD sets the
> >>>> +data path pointers to dummy functions and also make sure the
> >>>> +control path operations failed with a return code ``-EBUSY``.
> >>>>
> >>>>    In order to sense the error happening/recovering,  as well as to
> >>>> restore some additional configuration, @@ -653,9 +648,9 @@ three
> >>>> events
> >> are available:
> >>>>
> >>>>    ``RTE_ETH_EVENT_ERR_RECOVERING``
> >>>>       Notify the application that an error is detected
> >>>> -   and the recovery is being started.
> >>>> +   and the recovery is about to start.
> >>>>       Upon receiving the event, the application should not invoke
> >>>> -   any control path function until receiving
> >>>> +   any control and data path API until receiving
> >>>>       ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` or
> >>>> ``RTE_ETH_EVENT_RECOVERY_FAILED`` event.
> >>>>
> >>>>    .. note::
> >>>> @@ -666,8 +661,9 @@ three events are available:
> >>>>
> >>>>    ``RTE_ETH_EVENT_RECOVERY_SUCCESS``
> >>>>       Notify the application that the recovery from error is successful,
> >>>> -   the PMD already re-configures the port,
> >>>> -   and the effect is the same as a restart operation.
> >>>> +   the PMD already re-configures the port.
> >>>> +   The application should restore some additional configuration,
> >>>> + and then
> >>> What is the additional configuration? Is this specific to each NIC/PMD?
> >>> I thought this is an auto recovery process and the application is not
> >>> required to reconfigure anything. If the application has to restore the
> >>> configuration, how does auto recovery differ from the typical recovery
> >>> process?
> >>>
> >>>> +   enable data path API invocation.
> >>>>
> >>>>    ``RTE_ETH_EVENT_RECOVERY_FAILED``
> >>>>       Notify the application that the recovery from error failed,
> >>>> diff --git a/lib/ethdev/ethdev_driver.c
> >>>> b/lib/ethdev/ethdev_driver.c index
> >>>> 0be1e8ca04..f994653fe9 100644
> >>>> --- a/lib/ethdev/ethdev_driver.c
> >>>> +++ b/lib/ethdev/ethdev_driver.c
> >>>> @@ -515,6 +515,14 @@ rte_eth_dma_zone_free(const struct
> rte_eth_dev
> >>>> *dev, const char *ring_name,
> >>>>    	return rc;
> >>>>    }
> >>>>
> >>>> +void
> >>>> +rte_eth_fp_ops_setup(struct rte_eth_dev *dev) {
> >>>> +	if (dev == NULL)
> >>>> +		return;
> >>>> +	eth_dev_fp_ops_setup(rte_eth_fp_ops + dev->data->port_id, dev); }
> >>>> +
> >>>>    const struct rte_memzone *
> >>>>    rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const
> >>>> char *ring_name,
> >>>>    			 uint16_t queue_id, size_t size, unsigned int align, diff
> -
> >> -git
> >>>> a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h index
> >>>> 2c9d615fb5..0d964d1f67 100644
> >>>> --- a/lib/ethdev/ethdev_driver.h
> >>>> +++ b/lib/ethdev/ethdev_driver.h
> >>>> @@ -1621,6 +1621,16 @@ int
> >>>>    rte_eth_dma_zone_free(const struct rte_eth_dev *eth_dev, const
> >>>> char *name,
> >>>>    		 uint16_t queue_id);
> >>>>
> >>>> +/**
> >>>> + * @internal
> >>>> + * Setup eth fast-path API to ethdev values.
> >>>> + *
> >>>> + * @param dev
> >>>> + *  Pointer to struct rte_eth_dev.
> >>>> + */
> >>>> +__rte_internal
> >>>> +void rte_eth_fp_ops_setup(struct rte_eth_dev *dev);
> >>>> +
> >>>>    /**
> >>>>     * @internal
> >>>>     * Atomically set the link status for the specific device.
> >>>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> >>>> index
> >>>> 049641d57c..44ee7229c1 100644
> >>>> --- a/lib/ethdev/rte_ethdev.h
> >>>> +++ b/lib/ethdev/rte_ethdev.h
> >>>> @@ -3944,25 +3944,28 @@ enum rte_eth_event_type {
> >>>>    	 */
> >>>>    	RTE_ETH_EVENT_RX_AVAIL_THRESH,
> >>>>    	/** Port recovering from a hardware or firmware error.
> >>>> -	 * If PMD supports proactive error recovery,
> >>>> -	 * it should trigger this event to notify application
> >>>> -	 * that it detected an error and the recovery is being started.
> >>>> -	 * Upon receiving the event, the application should not invoke any
> >>>> control path API
> >>>> -	 * (such as rte_eth_dev_configure/rte_eth_dev_stop...) until receiving
> >>>> -	 * RTE_ETH_EVENT_RECOVERY_SUCCESS or
> >>>> RTE_ETH_EVENT_RECOVERY_FAILED event.
> >>>> -	 * The PMD will set the data path pointers to dummy functions,
> >>>> -	 * and re-set the data path pointers to non-dummy functions
> >>>> -	 * before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> >>>> -	 * It means that the application cannot send or receive any packets
> >>>> -	 * during this period.
> >>>> +	 *
> >>>> +	 * If PMD supports proactive error recovery, it should trigger this
> >>>> +	 * event to notify application that it detected an error and the
> >>>> +	 * recovery is about to start.
> >>>> +	 *
> >>>> +	 * Upon receiving the event, the application should not invoke any
> >>>> +	 * control and data path API until receiving
> >>>> +	 * RTE_ETH_EVENT_RECOVERY_SUCCESS or
> >>>> RTE_ETH_EVENT_RECOVERY_FAILED
> >>>> +	 * event.
> >>>> +	 *
> >>>> +	 * Once this event is reported, the PMD will set the data path pointers
> >>>> +	 * to dummy functions, and re-set the data path pointers to valid
> >>>> +	 * functions before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS
> >>>> event.
> >>> Why do we need to set the data path pointers to dummy functions if
> >>> the application is restricted from invoking any control and data path
> >>> APIs till the recovery process is completed?
> >>
> >> You are right, in theory it is not mandatory.
> >> Though it helps to flag a problem if user will still try to call them
> >> while recovery is in progress.
> > Ok, maybe in debug mode.
> > I mean, we have already set an expectation for the application that it should
> > not call them, and the application has implemented a method to do the same. Why
> > do we need to complicate this?
> > If the application calls the APIs, it is a programming error.
> 
> 
> My preference would be to keep it this way for both debug and non-debug
> mode.
> It doesn't cost us anything in terms of performance, but helps to catch
> problems with a misbehaving app.

This also causes a synchronization problem: if this is to be done correctly, we need to use proper synchronization mechanisms.
We cannot set the function pointers and assume that data will be visible to other threads/cores in the correct order.
A possible mechanism (though I see some problems with this) could be to use a guard variable, which indicates when it is safe to use the function pointers on the data plane threads. This would require a load-acquire in the data plane threads.
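
As a rough illustration of the guard-variable idea above (a sketch only, not DPDK code): the writer fills in the function pointer and queue pointer, then sets a guard with a store-release; the data plane thread checks the guard with a load-acquire before using them. All names here (port_fp_state, fp_publish, fp_unpublish, fp_rx, sample_rx) are hypothetical and C11 atomics are assumed. One limitation of this sketch: clearing the guard does not wait for a reader that has already passed the check, so a real design would still need some quiescence step before the pointers are rewritten.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical burst handler type; stands in for a PMD Rx burst function. */
typedef uint16_t (*rx_burst_fn)(void *rxq, void **pkts, uint16_t n);

/* Hypothetical per-port fast-path state (not the DPDK fast-path ops struct). */
struct port_fp_state {
    rx_burst_fn rx_burst; /* plain data, published through the guard */
    void *rxq;            /* plain data, published through the guard */
    atomic_bool ready;    /* guard: true when the fields above may be used */
};

/* Stand-in Rx handler for the sketch: pretends every requested packet arrived. */
static uint16_t sample_rx(void *rxq, void **pkts, uint16_t n)
{
    (void)rxq;
    (void)pkts;
    return n;
}

static void fp_init(struct port_fp_state *st)
{
    st->rx_burst = NULL;
    st->rxq = NULL;
    atomic_init(&st->ready, false);
}

/* Writer side: fill in the data first, then set the guard with release
 * semantics so the data writes are visible before the guard reads true. */
static void fp_publish(struct port_fp_state *st, rx_burst_fn fn, void *rxq)
{
    assert(fn != NULL);
    st->rx_burst = fn;
    st->rxq = rxq;
    atomic_store_explicit(&st->ready, true, memory_order_release);
}

/* Writer side: clear the guard. This does NOT wait for readers that
 * already passed the check in fp_rx(); that needs a separate mechanism. */
static void fp_unpublish(struct port_fp_state *st)
{
    atomic_store_explicit(&st->ready, false, memory_order_release);
}

/* Data plane side: load-acquire on the guard pairs with the store-release
 * in fp_publish(), so the pointers read below are the published ones. */
static uint16_t fp_rx(struct port_fp_state *st, void **pkts, uint16_t n)
{
    if (!atomic_load_explicit(&st->ready, memory_order_acquire))
        return 0; /* behaves like a dummy burst function */
    return st->rx_burst(st->rxq, pkts, n);
}
```

The cost on the data plane is one load-acquire per burst call, which is the overhead mentioned above.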

> 
> >
> >> Again, same as we doing in dev_stop().
> >
> >>
> >>>
> >>>> +	 *
> >>>>    	 * @note Before the PMD reports the recovery result,
> >>>>    	 * the PMD may report the RTE_ETH_EVENT_ERR_RECOVERING event
> >>>> again,
> >>>>    	 * because a larger error may occur during the recovery.
> >>>>    	 */
> >>>>    	RTE_ETH_EVENT_ERR_RECOVERING,
> >>> I understand this is not a change in this patch. But, just
> >>> wondering, what is the purpose of this? How is the application
> >>> supposed to use this?
> >>>
> >>>>    	/** Port recovers successfully from the error.
> >>>> -	 * The PMD already re-configured the port,
> >>>> -	 * and the effect is the same as a restart operation.
> >>>> +	 *
> >>>> +	 * The PMD already re-configured the port:
> >>>>    	 * a) The following operation will be retained: (alphabetically)
> >>>>    	 *    - DCB configuration
> >>>>    	 *    - FEC configuration
> >>>> @@ -3989,6 +3992,9 @@ enum rte_eth_event_type {
> >>>>    	 *      (@see RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP)
> >>>>    	 * c) Any other configuration will not be stored
> >>>>    	 *    and will need to be re-configured.
> >>>> +	 *
> >>>> +	 * The application should restore some additional configuration
> >>>> +	 * (see above case b/c), and then enable data path API invocation.
> >>>>    	 */
> >>>>    	RTE_ETH_EVENT_RECOVERY_SUCCESS,
> >>>>    	/** Port recovery failed.
> >>>> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map index
> >>>> 357d1a88c0..c273e0bdae 100644
> >>>> --- a/lib/ethdev/version.map
> >>>> +++ b/lib/ethdev/version.map
> >>>> @@ -320,6 +320,7 @@ INTERNAL {
> >>>>    	rte_eth_devices;
> >>>>    	rte_eth_dma_zone_free;
> >>>>    	rte_eth_dma_zone_reserve;
> >>>> +	rte_eth_fp_ops_setup;
> >>>>    	rte_eth_hairpin_queue_peer_bind;
> >>>>    	rte_eth_hairpin_queue_peer_unbind;
> >>>>    	rte_eth_hairpin_queue_peer_update;
> >>>> --
> >>>> 2.17.1
> >>>
> >
> > Is there any reason not to design this in the same way as
> > 'rte_eth_dev_reset'? Why does the PMD have to recover by itself?
> 
> I suppose it is a question for the authors of original patch...
I would appreciate it if the authors could comment on this.

> 
> > We could have a similar API 'rte_eth_dev_recover' to do the recovery
> > functionality.
> 
> I suppose such approach is also possible.
> Personally I am fine with both ways: either existing one or what you propose,
> as long as we'll fix existing race-condition.
> What is good with what you suggest - that way we probably don't need to
> worry how to allow user to enable/disable auto-recovery inside PMD.
> 
> Konstantin
> 



* Re: [PATCH 1/5] ethdev: fix race-condition of proactive error handling mode
  2023-03-06 11:13                 ` Konstantin Ananyev
@ 2023-03-07  8:25                   ` fengchengwen
  2023-03-07  9:52                     ` Konstantin Ananyev
  2023-03-07 12:07                     ` Ferruh Yigit
  0 siblings, 2 replies; 85+ messages in thread
From: fengchengwen @ 2023-03-07  8:25 UTC (permalink / raw)
  To: Konstantin Ananyev, Ajit Khaparde, Ferruh Yigit
  Cc: Konstantin Ananyev, Thomas Monjalon, Andrew Rybchenko, dev



On 2023/3/6 19:13, Konstantin Ananyev wrote:
> 
> 
>>>>>>>>> In the proactive error handling mode, the PMD will set the data path
>>>>>>>>> pointers to dummy functions and then try recovery, in this period the
>>>>>>>>> application may still invoking data path API. This will introduce a
>>>>>>>>> race-condition with data path which may lead to crash [1].
>>>>>>>>>
>>>>>>>>> Although the PMD added delay after setting data path pointers to cover
>>>>>>>>> the above race-condition, it reduces the probability, but it doesn't
>>>>>>>>> solve the problem.
>>>>>>>>>
>>>>>>>>> To solve the race-condition problem fundamentally, the following
>>>>>>>>> requirements are added:
>>>>>>>>> 1. The PMD should set the data path pointers to dummy functions after
>>>>>>>>>     report RTE_ETH_EVENT_ERR_RECOVERING event.
>>>>>>>>> 2. The application should stop data path API invocation when process
>>>>>>>>>     the RTE_ETH_EVENT_ERR_RECOVERING event.
>>>>>>>>> 3. The PMD should set the data path pointers to valid functions before
>>>>>>>>>     report RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>>>>>>>> 4. The application should enable data path API invocation when process
>>>>>>>>>     the RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>>>>>>>>
>>>>>>>
>>>>>>> How this is solving the race-condition, by pushing responsibility to
>>>>>>> stop data path to application?
>>>>>>
>>>>>> Exactly, it becomes application responsibility to make sure data-path is
>>>>>> stopped/suspended before recovery will continue.
>>>>>>
>>>>>
>>>>> From documentation of the feature:
>>>>>
>>>>> ``
>>>>> Because the PMD recovers automatically,
>>>>> the application can only sense that the data flow is disconnected for a
>>>>> while and the control API returns an error in this period.
>>>>>
>>>>> In order to sense the error happening/recovering, as well as to restore
>>>>> some additional configuration, three events are available:
>>>>> ``
>>>>>
>>>>> It looks like the initial design is to use events mainly to inform the
>>>>> application about what happened, and mainly for re-configuration.
>>>>>
>>>>> Although I don't disagree with involving the application, I am not sure
>>>>> that is part of the current design.
>>>>
>>>> I thought we all agreed that the initial design contains some fallacies that
>>>> need to be fixed, no?
>>>> The statement that, with the current rte_ethdev design, error recovery can be
>>>> done without interaction with the app (to stop/suspend data/control path)
>>>> is the main one, I think.
>>>> It needs some interaction with the app layer, one way or another.
>>>>
>>>>>>>
>>>>>>> What if application is not interested in recovery modes at all and not
>>>>>>> registered any callback for the recovery?
>>>>>>
>>>>>>
>>>>>> Are you saying there is no way for application to disable
>>>>>> automatic recovery in PMD if it is not interested
>>>>>> (or can't fulfill the prerequisites for it)?
>>>>>> If so, then yes it is a problem and we need to fix it.
>>>>>> I assumed that such mechanism to disable unwanted events already exists,
>>>>>> but I can't find anything.
>>>>>> Wonder what would be the easiest way here - can PMD make a decision
>>>>>> based on callback return value, or do we need a new API to
>>>>>> enable/disable callbacks, or ...?
>>>>>>
>>>>>>
>>>>>
>>>>> As far as I can see automatic recovery is not configurable by app.
>>>>>
>>>>> But that is not all, PMD sends events to application but PMD can't know
>>>>> if application is handling them or not, so with current design PMD can't
>>>>> rely on the app.
>>>>
>>>> Well, PMD invokes user provided callback.
>>>> One way to fix that problem - if there is no callback provided,
>>>> or callback returns an error code - PMD can assume that recovery
>>>> should not be done.
>>>> That is probably not the best design choice, but at least it will allow
>>>> to fix the problem without too many changes and introducing new API.
>>>> That could be sort of a 'quick fix'.
>>>> In the meantime, we can think about a new/better approach for that.
>>>>
>>>
>>> -rc2 for 23.03 is a few days away.
>>>
>>> What do you think to have 'quick fix' as modifying how driver updates
>>> burst ops to prevent the race condition, for this release?

By 'quick fix', do you mean updating only the function pointers (without the rxq setting)?
Currently, the PMDs that announce support for "proactive error handling mode" already
do this.

>>>
>>> And plan a design update for the next release?
>> +1 on the overall approach.
> 
> Yep, agree.

I hope for a better solution.
Also, I notice that, among the open-source software based on DPDK, only Open vSwitch
registers an RTE_ETH_EVENT_INTR_RESET callback.

Therefore, I hope we can build a recovery framework at the DPDK SDK level that is compatible
with both the RTE_ETH_EVENT_INTR_RESET and RTE_ETH_EVENT_ERR_RECOVERING mechanisms.
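
For reference, the application-side handling of the three recovery events discussed in this thread could be sketched roughly as below. This is only an illustration under assumptions, not DPDK code: the APP_EVENT_* values are stand-ins for RTE_ETH_EVENT_ERR_RECOVERING, RTE_ETH_EVENT_RECOVERY_SUCCESS and RTE_ETH_EVENT_RECOVERY_FAILED, and app_port, app_port_on_event and app_port_may_poll are hypothetical names. A real application would also have to wait until its worker threads have actually stopped polling before letting the recovery continue, which this sketch leaves out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the RTE_ETH_EVENT_* values named in this thread
 * (hypothetical; this is not the DPDK enum). */
enum app_port_event {
    APP_EVENT_ERR_RECOVERING,
    APP_EVENT_RECOVERY_SUCCESS,
    APP_EVENT_RECOVERY_FAILED,
};

/* Hypothetical application-side view of a port. */
enum app_port_state {
    APP_PORT_RUNNING,
    APP_PORT_RECOVERING,
    APP_PORT_FAILED,
};

struct app_port {
    _Atomic int state; /* one of enum app_port_state */
};

static void app_port_init(struct app_port *p)
{
    assert(p != NULL);
    atomic_init(&p->state, APP_PORT_RUNNING);
}

/* Called from the event callback context. */
static void app_port_on_event(struct app_port *p, enum app_port_event ev)
{
    switch (ev) {
    case APP_EVENT_ERR_RECOVERING:
        /* May be reported again before the result arrives, so this
         * transition must be safe to repeat. */
        atomic_store_explicit(&p->state, APP_PORT_RECOVERING,
                              memory_order_release);
        break;
    case APP_EVENT_RECOVERY_SUCCESS:
        /* Restore any additional configuration here, then re-enable
         * data path API invocation. */
        atomic_store_explicit(&p->state, APP_PORT_RUNNING,
                              memory_order_release);
        break;
    case APP_EVENT_RECOVERY_FAILED:
        /* Port is unusable; keep the data path disabled. */
        atomic_store_explicit(&p->state, APP_PORT_FAILED,
                              memory_order_release);
        break;
    }
}

/* Checked by a worker thread before each data path call on the port. */
static bool app_port_may_poll(struct app_port *p)
{
    return atomic_load_explicit(&p->state, memory_order_acquire)
           == APP_PORT_RUNNING;
}
```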

>  
>>
>>>
>>>
>>>>>
>>>>>>> I think driver should not rely on application for this, unless
>>>>>>> application explicitly says (to driver) that it is handling recovery,
>>>>>>> right now there is no way for driver to know this.
>>>>>>
>>>>>> I think it is vice versa:
>>>>>> application should not enable auto-recovery if it can't meet
>>>>>> the prerequisites for it (provide appropriate callback).
>>>>>>
>>>>>
>>>>> I agree on above, we are saying similar thing in different perspective.
>>>>
>>>> Ok, that's good we are on the same page.
>>>>
>>>>
>>>>>
>>>>>>
>>>>>>>
>>>>>>>>> Also, this patch introduce a driver internal function
>>>>>>>>> rte_eth_fp_ops_setup which used as an help function for PMD.
>>>>>>>>>
>>>>>>>>> [1]
>>>>>>>>> http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-ashok.k.kaladi@intel.com/
>>>>>>>>>
>>>>>>>>> Fixes: eb0d471a8941 ("ethdev: add proactive error handling mode")
>>>>>>>>> Cc: stable@dpdk.org
>>>>>>>>>
>>>>>>>>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
>>>>>>>>> ---
>>>>>>>>>   doc/guides/prog_guide/poll_mode_drv.rst | 20 +++++++---------
>>>>>>>>>   lib/ethdev/ethdev_driver.c              |  8 +++++++
>>>>>>>>>   lib/ethdev/ethdev_driver.h              | 10 ++++++++
>>>>>>>>>   lib/ethdev/rte_ethdev.h                 | 32
>>>>>>>>> +++++++++++++++----------
>>>>>>>>>   lib/ethdev/version.map                  |  1 +
>>>>>>>>>   5 files changed, 46 insertions(+), 25 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/doc/guides/prog_guide/poll_mode_drv.rst
>>>>>>>>> b/doc/guides/prog_guide/poll_mode_drv.rst
>>>>>>>>> index c145a9066c..e380ff135a 100644
>>>>>>>>> --- a/doc/guides/prog_guide/poll_mode_drv.rst
>>>>>>>>> +++ b/doc/guides/prog_guide/poll_mode_drv.rst
>>>>>>>>> @@ -638,14 +638,9 @@ different from the application invokes recovery
>>>>>>>>> in PASSIVE mode,
>>>>>>>>>   the PMD automatically recovers from error in PROACTIVE mode,
>>>>>>>>>   and only a small amount of work is required for the application.
>>>>>>>>>
>>>>>>>>> -During error detection and automatic recovery,
>>>>>>>>> -the PMD sets the data path pointers to dummy functions
>>>>>>>>> -(which will prevent the crash),
>>>>>>>>> -and also make sure the control path operations fail with a return
>>>>>>>>> code ``-EBUSY``.
>>>>>>>>> -
>>>>>>>>> -Because the PMD recovers automatically,
>>>>>>>>> -the application can only sense that the data flow is disconnected
>>>>>>>>> for a while
>>>>>>>>> -and the control API returns an error in this period.
>>>>>>>>> +During error detection and automatic recovery, the PMD sets the
>>>>>>>>> data path
>>>>>>>>> +pointers to dummy functions and also make sure the control path
>>>>>>>>> operations
>>>>>>>>> +failed with a return code ``-EBUSY``.
>>>>>>>>>
>>>>>>>>>   In order to sense the error happening/recovering,
>>>>>>>>>   as well as to restore some additional configuration,
>>>>>>>>> @@ -653,9 +648,9 @@ three events are available:
>>>>>>>>>
>>>>>>>>>   ``RTE_ETH_EVENT_ERR_RECOVERING``
>>>>>>>>>      Notify the application that an error is detected
>>>>>>>>> -   and the recovery is being started.
>>>>>>>>> +   and the recovery is about to start.
>>>>>>>>>      Upon receiving the event, the application should not invoke
>>>>>>>>> -   any control path function until receiving
>>>>>>>>> +   any control and data path API until receiving
>>>>>>>>>      ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` or
>>>>>>>>> ``RTE_ETH_EVENT_RECOVERY_FAILED`` event.
>>>>>>>>>
>>>>>>>>>   .. note::
>>>>>>>>> @@ -666,8 +661,9 @@ three events are available:
>>>>>>>>>
>>>>>>>>>   ``RTE_ETH_EVENT_RECOVERY_SUCCESS``
>>>>>>>>>      Notify the application that the recovery from error is successful,
>>>>>>>>> -   the PMD already re-configures the port,
>>>>>>>>> -   and the effect is the same as a restart operation.
>>>>>>>>> +   the PMD already re-configures the port.
>>>>>>>>> +   The application should restore some additional configuration,
>>>>>>>>> and then
>>>>>>>>> +   enable data path API invocation.
>>>>>>>>>
>>>>>>>>>   ``RTE_ETH_EVENT_RECOVERY_FAILED``
>>>>>>>>>      Notify the application that the recovery from error failed,
>>>>>>>>> diff --git a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c
>>>>>>>>> index 0be1e8ca04..f994653fe9 100644
>>>>>>>>> --- a/lib/ethdev/ethdev_driver.c
>>>>>>>>> +++ b/lib/ethdev/ethdev_driver.c
>>>>>>>>> @@ -515,6 +515,14 @@ rte_eth_dma_zone_free(const struct rte_eth_dev
>>>>>>>>> *dev, const char *ring_name,
>>>>>>>>>       return rc;
>>>>>>>>>   }
>>>>>>>>>
>>>>>>>>> +void
>>>>>>>>> +rte_eth_fp_ops_setup(struct rte_eth_dev *dev)
>>>>>>>>> +{
>>>>>>>>> +    if (dev == NULL)
>>>>>>>>> +        return;
>>>>>>>>> +    eth_dev_fp_ops_setup(rte_eth_fp_ops + dev->data->port_id, dev);
>>>>>>>>> +}
>>>>>>>>> +
>>>>>>>>>   const struct rte_memzone *
>>>>>>>>>   rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char
>>>>>>>>> *ring_name,
>>>>>>>>>                uint16_t queue_id, size_t size, unsigned int align,
>>>>>>>>> diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
>>>>>>>>> index 2c9d615fb5..0d964d1f67 100644
>>>>>>>>> --- a/lib/ethdev/ethdev_driver.h
>>>>>>>>> +++ b/lib/ethdev/ethdev_driver.h
>>>>>>>>> @@ -1621,6 +1621,16 @@ int
>>>>>>>>>   rte_eth_dma_zone_free(const struct rte_eth_dev *eth_dev, const
>>>>>>>>> char *name,
>>>>>>>>>            uint16_t queue_id);
>>>>>>>>>
>>>>>>>>> +/**
>>>>>>>>> + * @internal
>>>>>>>>> + * Setup eth fast-path API to ethdev values.
>>>>>>>>> + *
>>>>>>>>> + * @param dev
>>>>>>>>> + *  Pointer to struct rte_eth_dev.
>>>>>>>>> + */
>>>>>>>>> +__rte_internal
>>>>>>>>> +void rte_eth_fp_ops_setup(struct rte_eth_dev *dev);
>>>>>>>>> +
>>>>>>>>>   /**
>>>>>>>>>    * @internal
>>>>>>>>>    * Atomically set the link status for the specific device.
>>>>>>>>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
>>>>>>>>> index 049641d57c..44ee7229c1 100644
>>>>>>>>> --- a/lib/ethdev/rte_ethdev.h
>>>>>>>>> +++ b/lib/ethdev/rte_ethdev.h
>>>>>>>>> @@ -3944,25 +3944,28 @@ enum rte_eth_event_type {
>>>>>>>>>        */
>>>>>>>>>       RTE_ETH_EVENT_RX_AVAIL_THRESH,
>>>>>>>>>       /** Port recovering from a hardware or firmware error.
>>>>>>>>> -     * If PMD supports proactive error recovery,
>>>>>>>>> -     * it should trigger this event to notify application
>>>>>>>>> -     * that it detected an error and the recovery is being started.
>>>>>>>>> -     * Upon receiving the event, the application should not invoke
>>>>>>>>> any control path API
>>>>>>>>> -     * (such as rte_eth_dev_configure/rte_eth_dev_stop...) until
>>>>>>>>> receiving
>>>>>>>>> -     * RTE_ETH_EVENT_RECOVERY_SUCCESS or
>>>>>>>>> RTE_ETH_EVENT_RECOVERY_FAILED event.
>>>>>>>>> -     * The PMD will set the data path pointers to dummy functions,
>>>>>>>>> -     * and re-set the data path pointers to non-dummy functions
>>>>>>>>> -     * before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>>>>>>>> -     * It means that the application cannot send or receive any
>>>>>>>>> packets
>>>>>>>>> -     * during this period.
>>>>>>>>> +     *
>>>>>>>>> +     * If PMD supports proactive error recovery, it should trigger
>>>>>>>>> this
>>>>>>>>> +     * event to notify application that it detected an error and the
>>>>>>>>> +     * recovery is about to start.
>>>>>>>>> +     *
>>>>>>>>> +     * Upon receiving the event, the application should not invoke any
>>>>>>>>> +     * control and data path API until receiving
>>>>>>>>> +     * RTE_ETH_EVENT_RECOVERY_SUCCESS or RTE_ETH_EVENT_RECOVERY_FAILED
>>>>>>>>> +     * event.
>>>>>>>>> +     *
>>>>>>>>> +     * Once this event is reported, the PMD will set the data path
>>>>>>>>> pointers
>>>>>>>>> +     * to dummy functions, and re-set the data path pointers to valid
>>>>>>>>> +     * functions before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS
>>>>>>>>> event.
>>>>>>>>> +     *
>>>>>>>>>        * @note Before the PMD reports the recovery result,
>>>>>>>>>        * the PMD may report the RTE_ETH_EVENT_ERR_RECOVERING event
>>>>>>>>> again,
>>>>>>>>>        * because a larger error may occur during the recovery.
>>>>>>>>>        */
>>>>>>>>>       RTE_ETH_EVENT_ERR_RECOVERING,
>>>>>>>>>       /** Port recovers successfully from the error.
>>>>>>>>> -     * The PMD already re-configured the port,
>>>>>>>>> -     * and the effect is the same as a restart operation.
>>>>>>>>> +     *
>>>>>>>>> +     * The PMD already re-configured the port:
>>>>>>>>>        * a) The following operation will be retained: (alphabetically)
>>>>>>>>>        *    - DCB configuration
>>>>>>>>>        *    - FEC configuration
>>>>>>>>> @@ -3989,6 +3992,9 @@ enum rte_eth_event_type {
>>>>>>>>>        *      (@see RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP)
>>>>>>>>>        * c) Any other configuration will not be stored
>>>>>>>>>        *    and will need to be re-configured.
>>>>>>>>> +     *
>>>>>>>>> +     * The application should restore some additional configuration
>>>>>>>>> +     * (see above case b/c), and then enable data path API invocation.
>>>>>>>>>        */
>>>>>>>>>       RTE_ETH_EVENT_RECOVERY_SUCCESS,
>>>>>>>>>       /** Port recovery failed.
>>>>>>>>> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
>>>>>>>>> index 357d1a88c0..c273e0bdae 100644
>>>>>>>>> --- a/lib/ethdev/version.map
>>>>>>>>> +++ b/lib/ethdev/version.map
>>>>>>>>> @@ -320,6 +320,7 @@ INTERNAL {
>>>>>>>>>       rte_eth_devices;
>>>>>>>>>       rte_eth_dma_zone_free;
>>>>>>>>>       rte_eth_dma_zone_reserve;
>>>>>>>>> +    rte_eth_fp_ops_setup;
>>>>>>>>>       rte_eth_hairpin_queue_peer_bind;
>>>>>>>>>       rte_eth_hairpin_queue_peer_unbind;
>>>>>>>>>       rte_eth_hairpin_queue_peer_update;
>>>>>>>>> --
>>>>>>>>   Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
>>>>>>>>
>>>>>>>>> 2.17.1
>>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>>

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 1/5] ethdev: fix race-condition of proactive error handling mode
  2023-03-07  5:34           ` Honnappa Nagarahalli
@ 2023-03-07  8:39             ` fengchengwen
  2023-03-08  1:09               ` Honnappa Nagarahalli
  2023-03-07  9:56             ` Konstantin Ananyev
  1 sibling, 1 reply; 85+ messages in thread
From: fengchengwen @ 2023-03-07  8:39 UTC (permalink / raw)
  To: Honnappa Nagarahalli, Konstantin Ananyev, dev, thomas,
	Ferruh Yigit, Andrew Rybchenko, Kalesh AP,
	Ajit Khaparde (ajit.khaparde@broadcom.com)
  Cc: nd



On 2023/3/7 13:34, Honnappa Nagarahalli wrote:
> 
> 
>> -----Original Message-----
>> From: Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>
>> Sent: Sunday, March 5, 2023 9:24 AM
>> To: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>;
>> dev@dpdk.org; Chengwen Feng <fengchengwen@huawei.com>;
>> thomas@monjalon.net; Ferruh Yigit <ferruh.yigit@amd.com>; Andrew
>> Rybchenko <andrew.rybchenko@oktetlabs.ru>; Kalesh AP <kalesh-
>> anakkur.purayil@broadcom.com>; Ajit Khaparde
>> (ajit.khaparde@broadcom.com) <ajit.khaparde@broadcom.com>
>> Cc: nd <nd@arm.com>
>> Subject: Re: [PATCH 1/5] ethdev: fix race-condition of proactive error handling
>> mode
>>
>>
>>>>>>
>>>>>> In the proactive error handling mode, the PMD will set the data
>>>>>> path pointers to dummy functions and then try recovery, in this
>>>>>> period the application may still invoking data path API. This will
>>>>>> introduce a race-condition with data path which may lead to crash [1].
>>>>>>
>>>>>> Although the PMD added delay after setting data path pointers to
>>>>>> cover the above race-condition, it reduces the probability, but it
>>>>>> doesn't solve the problem.
>>>>>>
>>>>>> To solve the race-condition problem fundamentally, the following
>>>>>> requirements are added:
>>>>>> 1. The PMD should set the data path pointers to dummy functions after
>>>>>>      report RTE_ETH_EVENT_ERR_RECOVERING event.
>>>>> Do you mean to say, PMD should set the data path pointers after
>>>>> calling the
>>>> call back function?
>>>>> The PMD is running in the context of multiple EAL threads. How do
>>>>> these
>>>> threads synchronize such that only one thread sets these data pointers?
>>>>
>>>> As I understand this event callback supposed to be called in the
>>>> context of EAL interrupt thread (whoever is more familiar with
>>>> original idea, feel free to correct me if I missed something).
>>> I could not figure this out. It looks to be called from the data plane thread
>> context.
>>> I also have a thought on alternate design at the end, appreciate if you can
>> take a look.
>>>
>>>> How it is going to signal data-path threads that they need to
>>>> stop/suspend calling data-path API - that's I suppose is left to application
>> to decide...
>>>> Same as right now it is application responsibility to stop data-path
>>>> threads before doing dev_stop()/dev/_config()/etc.
>>> Ok, good, this expectation is not new. The application must have a
>> mechanism already.
>>>
>>>>
>>>>
>>>>>
>>>>>> 2. The application should stop data path API invocation when process
>>>>>>      the RTE_ETH_EVENT_ERR_RECOVERING event.
>>>>> Any thoughts on how an application can do this?
>>> We can ignore this question as there is already similar expectation set for
>> earlier functionalities.
>>>
>>>>>
>>>>>> 3. The PMD should set the data path pointers to valid functions before
>>>>>>      report RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>>>>> 4. The application should enable data path API invocation when process
>>>>>>      the RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>>>> Do you mean to say that the application should not call the datapath
>>>>> APIs
>>>> while the PMD is running the recovery process?
>>>>
>>>> Yes, I believe that's the intention.
>>> Ok, this is good and makes sense.
>>>
>>>>
>>>>>>
>>>>>> Also, this patch introduce a driver internal function
>>>>>> rte_eth_fp_ops_setup which used as an help function for PMD.
>>>>>>
>>>>>> [1]
>>>>>>
>>>>
>> http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2
>>>>>> -
>>>>>> ashok.k.kaladi@intel.com/
>>>>>>
>>>>>> Fixes: eb0d471a8941 ("ethdev: add proactive error handling mode")
>>>>>> Cc: stable@dpdk.org
>>>>>>
>>>>>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
>>>>>> ---
>>>>>>    doc/guides/prog_guide/poll_mode_drv.rst | 20 +++++++---------
>>>>>>    lib/ethdev/ethdev_driver.c              |  8 +++++++
>>>>>>    lib/ethdev/ethdev_driver.h              | 10 ++++++++
>>>>>>    lib/ethdev/rte_ethdev.h                 | 32 +++++++++++++++----------
>>>>>>    lib/ethdev/version.map                  |  1 +
>>>>>>    5 files changed, 46 insertions(+), 25 deletions(-)
>>>>>>
>>>>>> diff --git a/doc/guides/prog_guide/poll_mode_drv.rst
>>>>>> b/doc/guides/prog_guide/poll_mode_drv.rst
>>>>>> index c145a9066c..e380ff135a 100644
>>>>>> --- a/doc/guides/prog_guide/poll_mode_drv.rst
>>>>>> +++ b/doc/guides/prog_guide/poll_mode_drv.rst
>>>>>> @@ -638,14 +638,9 @@ different from the application invokes recovery in PASSIVE mode,
>>>>>>  the PMD automatically recovers from error in PROACTIVE mode,
>>>>>>  and only a small amount of work is required for the application.
>>>>>>
>>>>>> -During error detection and automatic recovery,
>>>>>> -the PMD sets the data path pointers to dummy functions
>>>>>> -(which will prevent the crash),
>>>>>> -and also make sure the control path operations fail with a return code ``-EBUSY``.
>>>>>> -
>>>>>> -Because the PMD recovers automatically,
>>>>>> -the application can only sense that the data flow is disconnected for a while
>>>>>> -and the control API returns an error in this period.
>>>>>> +During error detection and automatic recovery, the PMD sets the data path
>>>>>> +pointers to dummy functions and also make sure the control path operations
>>>>>> +failed with a return code ``-EBUSY``.
>>>>>>
>>>>>>    In order to sense the error happening/recovering,
>>>>>>    as well as to restore some additional configuration,
>>>>>> @@ -653,9 +648,9 @@ three events are available:
>>>>>>
>>>>>>    ``RTE_ETH_EVENT_ERR_RECOVERING``
>>>>>>       Notify the application that an error is detected
>>>>>> -   and the recovery is being started.
>>>>>> +   and the recovery is about to start.
>>>>>>       Upon receiving the event, the application should not invoke
>>>>>> -   any control path function until receiving
>>>>>> +   any control and data path API until receiving
>>>>>>       ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` or
>>>>>> ``RTE_ETH_EVENT_RECOVERY_FAILED`` event.
>>>>>>
>>>>>>    .. note::
>>>>>> @@ -666,8 +661,9 @@ three events are available:
>>>>>>
>>>>>>    ``RTE_ETH_EVENT_RECOVERY_SUCCESS``
>>>>>>       Notify the application that the recovery from error is successful,
>>>>>> -   the PMD already re-configures the port,
>>>>>> -   and the effect is the same as a restart operation.
>>>>>> +   the PMD already re-configures the port.
>>>>>> +   The application should restore some additional configuration,
>>>>>> + and then
>>>>> What is the additional configuration? Is this specific to each NIC/PMD?
>>>>> I thought, this is an auto recovery process and the application does
>>>>> not require
>>>> to reconfigure anything. If the application has to restore the
>>>> configuration, how does auto recovery differ from typical recovery
>> process?
>>>>>
>>>>>> +   enable data path API invocation.
>>>>>>
>>>>>>    ``RTE_ETH_EVENT_RECOVERY_FAILED``
>>>>>>       Notify the application that the recovery from error failed,
>>>>>> diff --git a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c
>>>>>> index 0be1e8ca04..f994653fe9 100644
>>>>>> --- a/lib/ethdev/ethdev_driver.c
>>>>>> +++ b/lib/ethdev/ethdev_driver.c
>>>>>> @@ -515,6 +515,14 @@ rte_eth_dma_zone_free(const struct rte_eth_dev
>>>>>> *dev, const char *ring_name,
>>>>>>    	return rc;
>>>>>>    }
>>>>>>
>>>>>> +void
>>>>>> +rte_eth_fp_ops_setup(struct rte_eth_dev *dev)
>>>>>> +{
>>>>>> +	if (dev == NULL)
>>>>>> +		return;
>>>>>> +	eth_dev_fp_ops_setup(rte_eth_fp_ops + dev->data->port_id, dev);
>>>>>> +}
>>>>>> +
>>>>>>    const struct rte_memzone *
>>>>>>    rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const
>>>>>> char *ring_name,
>>>>>>    			 uint16_t queue_id, size_t size, unsigned int align,
>>>>>> diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
>>>>>> index 2c9d615fb5..0d964d1f67 100644
>>>>>> --- a/lib/ethdev/ethdev_driver.h
>>>>>> +++ b/lib/ethdev/ethdev_driver.h
>>>>>> @@ -1621,6 +1621,16 @@ int
>>>>>>    rte_eth_dma_zone_free(const struct rte_eth_dev *eth_dev, const
>>>>>> char *name,
>>>>>>    		 uint16_t queue_id);
>>>>>>
>>>>>> +/**
>>>>>> + * @internal
>>>>>> + * Setup eth fast-path API to ethdev values.
>>>>>> + *
>>>>>> + * @param dev
>>>>>> + *  Pointer to struct rte_eth_dev.
>>>>>> + */
>>>>>> +__rte_internal
>>>>>> +void rte_eth_fp_ops_setup(struct rte_eth_dev *dev);
>>>>>> +
>>>>>>    /**
>>>>>>     * @internal
>>>>>>     * Atomically set the link status for the specific device.
>>>>>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
>>>>>> index 049641d57c..44ee7229c1 100644
>>>>>> --- a/lib/ethdev/rte_ethdev.h
>>>>>> +++ b/lib/ethdev/rte_ethdev.h
>>>>>> @@ -3944,25 +3944,28 @@ enum rte_eth_event_type {
>>>>>>    	 */
>>>>>>    	RTE_ETH_EVENT_RX_AVAIL_THRESH,
>>>>>>    	/** Port recovering from a hardware or firmware error.
>>>>>> -	 * If PMD supports proactive error recovery,
>>>>>> -	 * it should trigger this event to notify application
>>>>>> -	 * that it detected an error and the recovery is being started.
>>>>>> -	 * Upon receiving the event, the application should not invoke any
>>>>>> control path API
>>>>>> -	 * (such as rte_eth_dev_configure/rte_eth_dev_stop...) until receiving
>>>>>> -	 * RTE_ETH_EVENT_RECOVERY_SUCCESS or
>>>>>> RTE_ETH_EVENT_RECOVERY_FAILED event.
>>>>>> -	 * The PMD will set the data path pointers to dummy functions,
>>>>>> -	 * and re-set the data path pointers to non-dummy functions
>>>>>> -	 * before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>>>>> -	 * It means that the application cannot send or receive any packets
>>>>>> -	 * during this period.
>>>>>> +	 *
>>>>>> +	 * If PMD supports proactive error recovery, it should trigger this
>>>>>> +	 * event to notify application that it detected an error and the
>>>>>> +	 * recovery is about to start.
>>>>>> +	 *
>>>>>> +	 * Upon receiving the event, the application should not invoke any
>>>>>> +	 * control and data path API until receiving
>>>>>> +	 * RTE_ETH_EVENT_RECOVERY_SUCCESS or
>>>>>> RTE_ETH_EVENT_RECOVERY_FAILED
>>>>>> +	 * event.
>>>>>> +	 *
>>>>>> +	 * Once this event is reported, the PMD will set the data path pointers
>>>>>> +	 * to dummy functions, and re-set the data path pointers to valid
>>>>>> +	 * functions before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS
>>>>>> event.
>>>>> Why do we need to set the data path pointers to dummy functions if
>>>>> the
>>>> application is restricted from invoking any control and data path
>>>> APIs till the recovery process is completed?
>>>>
>>>> You are right, in theory it is not mandatory.
>>>> Though it helps to flag a problem if user will still try to call them
>>>> while recovery is in progress.
>>> Ok, may be in debug mode.
>>> I mean, we have already set an expectation to the application that it should
>> not call and the application has implemented a method to do the same. Why
>> do we need to complicate this?
>>> If the application calls the APIs, it is a programming error.
>>
>>
>> My preference would be to keep it this way for both debug and non-debug
>> mode.
>> It doesn't cost us anything in terms of performance, but helps to catch
>> problems with a misbehaving app.
> 
> This is also causing a synchronization problem. i.e. if this has to be done correctly, we need to use correct synchronization mechanisms.
> We cannot set the function pointers and assume that data will be visible to other threads/cores in the correct order.
> A possible mechanism (though I see some problems with this) could be to use a guard variable, which indicates when it is safe to use the function pointers on the data plane threads. This would require a load-acquire in the data plane threads.
> 
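A minimal sketch of the guard-variable idea mentioned above, written in plain
C11 rather than against the ethdev API. All names here (struct port_fp,
port_fp_init, port_fp_publish, port_fp_block, port_rx) are hypothetical and
only model the pattern: the control side stores the function pointer and then
sets the guard with release semantics, and the data-plane side reads the guard
with acquire semantics before using the pointer.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical burst function type, loosely modelled on an Rx burst call. */
typedef uint16_t (*burst_fn)(void *queue, void **pkts, uint16_t n);

/* Hypothetical per-port fast-path state; not the real rte_eth_fp_ops. */
struct port_fp {
	burst_fn rx_burst;        /* plain data, published through the guard */
	atomic_bool datapath_ok;  /* guard variable */
};

/* Dummy burst: receives nothing. */
static uint16_t dummy_rx(void *queue, void **pkts, uint16_t n)
{
	(void)queue; (void)pkts; (void)n;
	return 0;
}

/* Stand-in for a real burst: pretends every requested packet arrived. */
static uint16_t real_rx(void *queue, void **pkts, uint16_t n)
{
	(void)queue; (void)pkts;
	return n;
}

/* Start with the dummy function installed and the data path blocked. */
static void port_fp_init(struct port_fp *p)
{
	p->rx_burst = dummy_rx;
	atomic_init(&p->datapath_ok, false);
}

/* Control side: write the pointer first, then open the guard (release). */
static void port_fp_publish(struct port_fp *p, burst_fn fn)
{
	p->rx_burst = fn;
	atomic_store_explicit(&p->datapath_ok, true, memory_order_release);
}

/* Control side: close the guard before recovery starts. */
static void port_fp_block(struct port_fp *p)
{
	atomic_store_explicit(&p->datapath_ok, false, memory_order_release);
}

/* Data-plane side: load-acquire on the guard, then use the pointer. */
static uint16_t port_rx(struct port_fp *p, void *queue, void **pkts, uint16_t n)
{
	if (!atomic_load_explicit(&p->datapath_ok, memory_order_acquire))
		return 0;
	return p->rx_burst(queue, pkts, n);
}
```

Note that this only orders visibility of the pointer. It does not wait for a
data-plane thread that passed the guard check just before port_fp_block() was
called, so the control side would still need some quiescence mechanism before
touching queue state.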
>>
>>>
>>>> Again, same as we doing in dev_stop().
>>>
>>>>
>>>>>
>>>>>> +	 *
>>>>>>    	 * @note Before the PMD reports the recovery result,
>>>>>>    	 * the PMD may report the RTE_ETH_EVENT_ERR_RECOVERING event
>>>>>> again,
>>>>>>    	 * because a larger error may occur during the recovery.
>>>>>>    	 */
>>>>>>    	RTE_ETH_EVENT_ERR_RECOVERING,
>>>>> I understand this is not a change in this patch. But, just
>>>>> wondering, what is the
>>>> purpose of this? How is the application supposed to use this?
>>>>>
>>>>>>    	/** Port recovers successfully from the error.
>>>>>> -	 * The PMD already re-configured the port,
>>>>>> -	 * and the effect is the same as a restart operation.
>>>>>> +	 *
>>>>>> +	 * The PMD already re-configured the port:
>>>>>>    	 * a) The following operation will be retained: (alphabetically)
>>>>>>    	 *    - DCB configuration
>>>>>>    	 *    - FEC configuration
>>>>>> @@ -3989,6 +3992,9 @@ enum rte_eth_event_type {
>>>>>>    	 *      (@see RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP)
>>>>>>    	 * c) Any other configuration will not be stored
>>>>>>    	 *    and will need to be re-configured.
>>>>>> +	 *
>>>>>> +	 * The application should restore some additional configuration
>>>>>> +	 * (see above case b/c), and then enable data path API invocation.
>>>>>>    	 */
>>>>>>    	RTE_ETH_EVENT_RECOVERY_SUCCESS,
>>>>>>    	/** Port recovery failed.
>>>>>> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
>>>>>> index 357d1a88c0..c273e0bdae 100644
>>>>>> --- a/lib/ethdev/version.map
>>>>>> +++ b/lib/ethdev/version.map
>>>>>> @@ -320,6 +320,7 @@ INTERNAL {
>>>>>>    	rte_eth_devices;
>>>>>>    	rte_eth_dma_zone_free;
>>>>>>    	rte_eth_dma_zone_reserve;
>>>>>> +	rte_eth_fp_ops_setup;
>>>>>>    	rte_eth_hairpin_queue_peer_bind;
>>>>>>    	rte_eth_hairpin_queue_peer_unbind;
>>>>>>    	rte_eth_hairpin_queue_peer_update;
>>>>>> --
>>>>>> 2.17.1
>>>>>
>>>
>>> Is there any reason not to design this in the same way as
>> 'rte_eth_dev_reset'? Why does the PMD have to recover by itself?
>>
>> I suppose it is a question for the authors of original patch...
> Appreciate if the authors could comment on this.

The main cause is a hardware implementation limit; I will try to explain it
from the hns3 PMD's point of view.
For a global reset, all the functions need to respond within a certain period
of time, otherwise the reset will fail. The reset also requires a few steps
(all of which may take a long time).

When multiple functions are used by one DPDK application and a global reset is
triggered, rte_eth_dev_reset will not cover this scenario:
1. Each port will report RTE_ETH_EVENT_INTR_RESET in the interrupt thread.
2. The application callback is then invoked for each port, but because the
    callbacks run in the same thread and each port's recovery takes a long
    time, the later ports will fail to reset.
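The effect of handling the ports one after another on a single thread can be
sketched with a small timing model. This is only an illustration: the function
name ports_meeting_deadline and the millisecond values used with it are
hypothetical, not hns3 figures. It counts how many ports finish recovery
before a shared deadline when each recovery starts only after the previous one
has ended.

```c
#include <stddef.h>

/*
 * Hypothetical model: n_ports recoveries run back to back on one thread,
 * each taking recover_ms, and every port must be done within deadline_ms
 * measured from the moment the global reset was triggered.
 * Returns the number of ports that finish in time.
 */
static size_t ports_meeting_deadline(size_t n_ports, unsigned int recover_ms,
				     unsigned int deadline_ms)
{
	unsigned long elapsed = 0;
	size_t ok = 0;

	for (size_t i = 0; i < n_ports; i++) {
		elapsed += recover_ms;  /* port i waits for all earlier ports */
		if (elapsed <= deadline_ms)
			ok++;
	}
	return ok;
}
```

With four ports, 300 ms per recovery and a 1000 ms deadline (made-up numbers),
the fourth port would finish at 1200 ms and miss the deadline, which is the
failure described in step 2.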

> 
>>
>>> We could have a similar API 'rte_eth_dev_recover' to do the recovery
>> functionality.
>>
>> I suppose such approach is also possible.
>> Personally I am fine with both ways: either existing one or what you propose,
>> as long as we'll fix existing race-condition.
>> What is good with what you suggest - that way we probably don't need to
>> worry how to allow user to enable/disable auto-recovery inside PMD.
>>
>> Konstantin
>>
> 

^ permalink raw reply	[flat|nested] 85+ messages in thread

* RE: [PATCH 1/5] ethdev: fix race-condition of proactive error handling mode
  2023-03-07  8:25                   ` fengchengwen
@ 2023-03-07  9:52                     ` Konstantin Ananyev
  2023-03-07 10:11                       ` Konstantin Ananyev
  2023-03-07 12:07                     ` Ferruh Yigit
  1 sibling, 1 reply; 85+ messages in thread
From: Konstantin Ananyev @ 2023-03-07  9:52 UTC (permalink / raw)
  To: Fengchengwen, Ajit Khaparde, Ferruh Yigit
  Cc: Konstantin Ananyev, Thomas Monjalon, Andrew Rybchenko, dev



> >
> >>>>>>>>> In the proactive error handling mode, the PMD will set the data path
> >>>>>>>>> pointers to dummy functions and then try recovery, in this period the
> >>>>>>>>> application may still invoking data path API. This will introduce a
> >>>>>>>>> race-condition with data path which may lead to crash [1].
> >>>>>>>>>
> >>>>>>>>> Although the PMD added delay after setting data path pointers to cover
> >>>>>>>>> the above race-condition, it reduces the probability, but it doesn't
> >>>>>>>>> solve the problem.
> >>>>>>>>>
> >>>>>>>>> To solve the race-condition problem fundamentally, the following
> >>>>>>>>> requirements are added:
> >>>>>>>>> 1. The PMD should set the data path pointers to dummy functions after
> >>>>>>>>>     report RTE_ETH_EVENT_ERR_RECOVERING event.
> >>>>>>>>> 2. The application should stop data path API invocation when process
> >>>>>>>>>     the RTE_ETH_EVENT_ERR_RECOVERING event.
> >>>>>>>>> 3. The PMD should set the data path pointers to valid functions before
> >>>>>>>>>     report RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> >>>>>>>>> 4. The application should enable data path API invocation when process
> >>>>>>>>>     the RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> >>>>>>>>>
> >>>>>>>
> >>>>>>> How this is solving the race-condition, by pushing responsibility to
> >>>>>>> stop data path to application?
> >>>>>>
> >>>>>> Exactly, it becomes application responsibility to make sure data-path is
> >>>>>> stopped/suspended before recovery will continue.
> >>>>>>
> >>>>>
> >>>>> From documentation of the feature:
> >>>>>
> >>>>> ``
> >>>>> Because the PMD recovers automatically,
> >>>>> the application can only sense that the data flow is disconnected for a
> >>>>> while and the control API returns an error in this period.
> >>>>>
> >>>>> In order to sense the error happening/recovering, as well as to restore
> >>>>> some additional configuration, three events are available:
> >>>>> ``
> >>>>>
> >>>>> It looks like initial design is to use events mainly inform application
> >>>>> about what happened and mainly for re-configuration.
> >>>>>
> >>>>> Although I don't disagree with involving the application, I am not sure
> >>>>> that is part of current design.
> >>>>
> >>>> I thought we all agreed that initial design contain some fallacies that
> >>>> need to fixed, no?
> >>>> Statement that with current rte_ethdev design error recovery can be done
> >>>> without interaction with the app (to stop/suspend data/control path)
> >>>> is the main one I think.
> >>>> It needs some interaction with app layer, one way or another.
> >>>>
> >>>>>>>
> >>>>>>> What if application is not interested in recovery modes at all and not
> >>>>>>> registered any callback for the recovery?
> >>>>>>
> >>>>>>
> >>>>>> Are you saying there is no way for application to disable
> >>>>>> automatic recovery in PMD if it is not interested
> >>>>>> (or can't fulfill the prerequisites for it)?
> >>>>>> If so, then yes it is a problem and we need to fix it.
> >>>>>> I assumed that such mechanism to disable unwanted events already exists,
> >>>>>> but I can't find anything.
> >>>>>> Wonder what would be the easiest way here - can PMD make a decision
> >>>>>> based on callback return value, or do we need a new API to
> >>>>>> enable/disable callbacks, or ...?
> >>>>>>
> >>>>>>
> >>>>>
> >>>>> As far as I can see automatic recovery is not configurable by app.
> >>>>>
> >>>>> But that is not all, PMD sends events to application but PMD can't know
> >>>>> if application is handling them or not, so with current design PMD can't
> >>>>> rely on the app.
> >>>>
> >>>> Well, PMD invokes user provided callback.
> >>>> One way to fix that problem - if there is no callback provided,
> >>>> or callback returns an error code - PMD can assume that recovery
> >>>> should not be done.
> >>>> That is probably not the best design choice, but at least it will allow
> >>>> to fix the problem without too many changes and introducing new API.
> >>>> That could be sort of a 'quick fix'.
> >>>> In a meanwhile we can think about new/better approach for that.
> >>>>
> >>>
> >>> -rc2 for 23.03 is a few days away.
> >>>
> >>> What do you think to have 'quick fix' as modifying how driver updates
> >>> burst ops to prevent the race condition, for this release?
> 
> The 'quick fix', do you mean only update function pointer (without rxq setting) ?
> Currently the PMDs which announced support "proactive error handling mode" already
> do this.

Really sorry guys, I was too fast on the keyboard, and didn't read properly what Ferruh suggested.
Reading it once again - no, I do not agree with that.
It wouldn't fix anything, but would just add extra mess to the code.
Sorry again for the wrong reply.
Konstantin


> 
> >>>
> >>> And plan a design update for the next release?
> >> +1 on the overall approach.
> >
> > Yep, agree.
> 
> Hope for better solution.
> And also, I notice only the openvswitch (from all open-source software which based-on DPDK)
> registers RTE_ETH_EVENT_INTR_RESET callback .
> 
> Therefore, hope we build a recovery framework at the DPDK SDK level and be compatible
> with RTE_ETH_EVENT_INTR_RESET and RTE_ETH_EVENT_ERR_RECOVERING mechanism.
> 
> >
> >>
> >>>
> >>>
> >>>>>
> >>>>>>> I think driver should not rely on application for this, unless
> >>>>>>> application explicitly says (to driver) that it is handling recovery,
> >>>>>>> right now there is no way for driver to know this.
> >>>>>>
> >>>>>> I think it is vice versa:
> >>>>>> application should not enable auto-recovery if it can't meet
> >>>>>> the prerequisites for it (provide appropriate callback).
> >>>>>>
> >>>>>
> >>>>> I agree on above, we are saying similar thing in different perspective.
> >>>>
> >>>> Ok, that's good we are on the same page.
> >>>>
> >>>>
> >>>>>
> >>>>>>
> >>>>>>>
> >>>>>>>>> Also, this patch introduce a driver internal function
> >>>>>>>>> rte_eth_fp_ops_setup which used as an help function for PMD.
> >>>>>>>>>
> >>>>>>>>> [1]
> >>>>>>>>> http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-ashok.k.kaladi@intel.com/
> >>>>>>>>>
> >>>>>>>>> Fixes: eb0d471a8941 ("ethdev: add proactive error handling mode")
> >>>>>>>>> Cc: stable@dpdk.org
> >>>>>>>>>
> >>>>>>>>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> >>>>>>>>> ---
> >>>>>>>>>   doc/guides/prog_guide/poll_mode_drv.rst | 20 +++++++---------
> >>>>>>>>>   lib/ethdev/ethdev_driver.c              |  8 +++++++
> >>>>>>>>>   lib/ethdev/ethdev_driver.h              | 10 ++++++++
> >>>>>>>>>   lib/ethdev/rte_ethdev.h                 | 32
> >>>>>>>>> +++++++++++++++----------
> >>>>>>>>>   lib/ethdev/version.map                  |  1 +
> >>>>>>>>>   5 files changed, 46 insertions(+), 25 deletions(-)
> >>>>>>>>>
> >>>>>>>>> diff --git a/doc/guides/prog_guide/poll_mode_drv.rst
> >>>>>>>>> b/doc/guides/prog_guide/poll_mode_drv.rst
> >>>>>>>>> index c145a9066c..e380ff135a 100644
> >>>>>>>>> --- a/doc/guides/prog_guide/poll_mode_drv.rst
> >>>>>>>>> +++ b/doc/guides/prog_guide/poll_mode_drv.rst
> >>>>>>>>> @@ -638,14 +638,9 @@ different from the application invokes recovery
> >>>>>>>>> in PASSIVE mode,
> >>>>>>>>>   the PMD automatically recovers from error in PROACTIVE mode,
> >>>>>>>>>   and only a small amount of work is required for the application.
> >>>>>>>>>
> >>>>>>>>> -During error detection and automatic recovery,
> >>>>>>>>> -the PMD sets the data path pointers to dummy functions
> >>>>>>>>> -(which will prevent the crash),
> >>>>>>>>> -and also make sure the control path operations fail with a return
> >>>>>>>>> code ``-EBUSY``.
> >>>>>>>>> -
> >>>>>>>>> -Because the PMD recovers automatically,
> >>>>>>>>> -the application can only sense that the data flow is disconnected
> >>>>>>>>> for a while
> >>>>>>>>> -and the control API returns an error in this period.
> >>>>>>>>> +During error detection and automatic recovery, the PMD sets the
> >>>>>>>>> data path
> >>>>>>>>> +pointers to dummy functions and also make sure the control path
> >>>>>>>>> operations
> >>>>>>>>> +failed with a return code ``-EBUSY``.
> >>>>>>>>>
> >>>>>>>>>   In order to sense the error happening/recovering,
> >>>>>>>>>   as well as to restore some additional configuration,
> >>>>>>>>> @@ -653,9 +648,9 @@ three events are available:
> >>>>>>>>>
> >>>>>>>>>   ``RTE_ETH_EVENT_ERR_RECOVERING``
> >>>>>>>>>      Notify the application that an error is detected
> >>>>>>>>> -   and the recovery is being started.
> >>>>>>>>> +   and the recovery is about to start.
> >>>>>>>>>      Upon receiving the event, the application should not invoke
> >>>>>>>>> -   any control path function until receiving
> >>>>>>>>> +   any control and data path API until receiving
> >>>>>>>>>      ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` or
> >>>>>>>>> ``RTE_ETH_EVENT_RECOVERY_FAILED`` event.
> >>>>>>>>>
> >>>>>>>>>   .. note::
> >>>>>>>>> @@ -666,8 +661,9 @@ three events are available:
> >>>>>>>>>
> >>>>>>>>>   ``RTE_ETH_EVENT_RECOVERY_SUCCESS``
> >>>>>>>>>      Notify the application that the recovery from error is successful,
> >>>>>>>>> -   the PMD already re-configures the port,
> >>>>>>>>> -   and the effect is the same as a restart operation.
> >>>>>>>>> +   the PMD already re-configures the port.
> >>>>>>>>> +   The application should restore some additional configuration,
> >>>>>>>>> and then
> >>>>>>>>> +   enable data path API invocation.
> >>>>>>>>>
> >>>>>>>>>   ``RTE_ETH_EVENT_RECOVERY_FAILED``
> >>>>>>>>>      Notify the application that the recovery from error failed,
> >>>>>>>>> diff --git a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c
> >>>>>>>>> index 0be1e8ca04..f994653fe9 100644
> >>>>>>>>> --- a/lib/ethdev/ethdev_driver.c
> >>>>>>>>> +++ b/lib/ethdev/ethdev_driver.c
> >>>>>>>>> @@ -515,6 +515,14 @@ rte_eth_dma_zone_free(const struct rte_eth_dev
> >>>>>>>>> *dev, const char *ring_name,
> >>>>>>>>>       return rc;
> >>>>>>>>>   }
> >>>>>>>>>
> >>>>>>>>> +void
> >>>>>>>>> +rte_eth_fp_ops_setup(struct rte_eth_dev *dev)
> >>>>>>>>> +{
> >>>>>>>>> +    if (dev == NULL)
> >>>>>>>>> +        return;
> >>>>>>>>> +    eth_dev_fp_ops_setup(rte_eth_fp_ops + dev->data->port_id, dev);
> >>>>>>>>> +}
> >>>>>>>>> +
> >>>>>>>>>   const struct rte_memzone *
> >>>>>>>>>   rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char
> >>>>>>>>> *ring_name,
> >>>>>>>>>                uint16_t queue_id, size_t size, unsigned int align,
> >>>>>>>>> diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
> >>>>>>>>> index 2c9d615fb5..0d964d1f67 100644
> >>>>>>>>> --- a/lib/ethdev/ethdev_driver.h
> >>>>>>>>> +++ b/lib/ethdev/ethdev_driver.h
> >>>>>>>>> @@ -1621,6 +1621,16 @@ int
> >>>>>>>>>   rte_eth_dma_zone_free(const struct rte_eth_dev *eth_dev, const
> >>>>>>>>> char *name,
> >>>>>>>>>            uint16_t queue_id);
> >>>>>>>>>
> >>>>>>>>> +/**
> >>>>>>>>> + * @internal
> >>>>>>>>> + * Setup eth fast-path API to ethdev values.
> >>>>>>>>> + *
> >>>>>>>>> + * @param dev
> >>>>>>>>> + *  Pointer to struct rte_eth_dev.
> >>>>>>>>> + */
> >>>>>>>>> +__rte_internal
> >>>>>>>>> +void rte_eth_fp_ops_setup(struct rte_eth_dev *dev);
> >>>>>>>>> +
> >>>>>>>>>   /**
> >>>>>>>>>    * @internal
> >>>>>>>>>    * Atomically set the link status for the specific device.
> >>>>>>>>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> >>>>>>>>> index 049641d57c..44ee7229c1 100644
> >>>>>>>>> --- a/lib/ethdev/rte_ethdev.h
> >>>>>>>>> +++ b/lib/ethdev/rte_ethdev.h
> >>>>>>>>> @@ -3944,25 +3944,28 @@ enum rte_eth_event_type {
> >>>>>>>>>        */
> >>>>>>>>>       RTE_ETH_EVENT_RX_AVAIL_THRESH,
> >>>>>>>>>       /** Port recovering from a hardware or firmware error.
> >>>>>>>>> -     * If PMD supports proactive error recovery,
> >>>>>>>>> -     * it should trigger this event to notify application
> >>>>>>>>> -     * that it detected an error and the recovery is being started.
> >>>>>>>>> -     * Upon receiving the event, the application should not invoke
> >>>>>>>>> any control path API
> >>>>>>>>> -     * (such as rte_eth_dev_configure/rte_eth_dev_stop...) until
> >>>>>>>>> receiving
> >>>>>>>>> -     * RTE_ETH_EVENT_RECOVERY_SUCCESS or
> >>>>>>>>> RTE_ETH_EVENT_RECOVERY_FAILED event.
> >>>>>>>>> -     * The PMD will set the data path pointers to dummy functions,
> >>>>>>>>> -     * and re-set the data path pointers to non-dummy functions
> >>>>>>>>> -     * before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> >>>>>>>>> -     * It means that the application cannot send or receive any
> >>>>>>>>> packets
> >>>>>>>>> -     * during this period.
> >>>>>>>>> +     *
> >>>>>>>>> +     * If PMD supports proactive error recovery, it should trigger
> >>>>>>>>> this
> >>>>>>>>> +     * event to notify application that it detected an error and the
> >>>>>>>>> +     * recovery is about to start.
> >>>>>>>>> +     *
> >>>>>>>>> +     * Upon receiving the event, the application should not invoke any
> >>>>>>>>> +     * control and data path API until receiving
> >>>>>>>>> +     * RTE_ETH_EVENT_RECOVERY_SUCCESS or RTE_ETH_EVENT_RECOVERY_FAILED
> >>>>>>>>> +     * event.
> >>>>>>>>> +     *
> >>>>>>>>> +     * Once this event is reported, the PMD will set the data path
> >>>>>>>>> pointers
> >>>>>>>>> +     * to dummy functions, and re-set the data path pointers to valid
> >>>>>>>>> +     * functions before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS
> >>>>>>>>> event.
> >>>>>>>>> +     *
> >>>>>>>>>        * @note Before the PMD reports the recovery result,
> >>>>>>>>>        * the PMD may report the RTE_ETH_EVENT_ERR_RECOVERING event
> >>>>>>>>> again,
> >>>>>>>>>        * because a larger error may occur during the recovery.
> >>>>>>>>>        */
> >>>>>>>>>       RTE_ETH_EVENT_ERR_RECOVERING,
> >>>>>>>>>       /** Port recovers successfully from the error.
> >>>>>>>>> -     * The PMD already re-configured the port,
> >>>>>>>>> -     * and the effect is the same as a restart operation.
> >>>>>>>>> +     *
> >>>>>>>>> +     * The PMD already re-configured the port:
> >>>>>>>>>        * a) The following operation will be retained: (alphabetically)
> >>>>>>>>>        *    - DCB configuration
> >>>>>>>>>        *    - FEC configuration
> >>>>>>>>> @@ -3989,6 +3992,9 @@ enum rte_eth_event_type {
> >>>>>>>>>        *      (@see RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP)
> >>>>>>>>>        * c) Any other configuration will not be stored
> >>>>>>>>>        *    and will need to be re-configured.
> >>>>>>>>> +     *
> >>>>>>>>> +     * The application should restore some additional configuration
> >>>>>>>>> +     * (see above case b/c), and then enable data path API invocation.
> >>>>>>>>>        */
> >>>>>>>>>       RTE_ETH_EVENT_RECOVERY_SUCCESS,
> >>>>>>>>>       /** Port recovery failed.
> >>>>>>>>> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
> >>>>>>>>> index 357d1a88c0..c273e0bdae 100644
> >>>>>>>>> --- a/lib/ethdev/version.map
> >>>>>>>>> +++ b/lib/ethdev/version.map
> >>>>>>>>> @@ -320,6 +320,7 @@ INTERNAL {
> >>>>>>>>>       rte_eth_devices;
> >>>>>>>>>       rte_eth_dma_zone_free;
> >>>>>>>>>       rte_eth_dma_zone_reserve;
> >>>>>>>>> +    rte_eth_fp_ops_setup;
> >>>>>>>>>       rte_eth_hairpin_queue_peer_bind;
> >>>>>>>>>       rte_eth_hairpin_queue_peer_unbind;
> >>>>>>>>>       rte_eth_hairpin_queue_peer_update;
> >>>>>>>>> --
> >>>>>>>>   Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
> >>>>>>>>
> >>>>>>>>> 2.17.1
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>
> >>>


* RE: [PATCH 1/5] ethdev: fix race-condition of proactive error handling mode
  2023-03-07  5:34           ` Honnappa Nagarahalli
  2023-03-07  8:39             ` fengchengwen
@ 2023-03-07  9:56             ` Konstantin Ananyev
  1 sibling, 0 replies; 85+ messages in thread
From: Konstantin Ananyev @ 2023-03-07  9:56 UTC (permalink / raw)
  To: Honnappa Nagarahalli, Konstantin Ananyev, dev, Fengchengwen,
	thomas, Ferruh Yigit, Andrew Rybchenko, Kalesh AP,
	Ajit Khaparde (ajit.khaparde@broadcom.com)
  Cc: nd, nd



> >
> >
> > >>>>
> > >>>> In the proactive error handling mode, the PMD will set the data
> > >>>> path pointers to dummy functions and then try recovery, in this
> > >>>> period the application may still invoking data path API. This will
> > >>>> introduce a race-condition with data path which may lead to crash [1].
> > >>>>
> > >>>> Although the PMD added delay after setting data path pointers to
> > >>>> cover the above race-condition, it reduces the probability, but it
> > >>>> doesn't solve the problem.
> > >>>>
> > >>>> To solve the race-condition problem fundamentally, the following
> > >>>> requirements are added:
> > >>>> 1. The PMD should set the data path pointers to dummy functions after
> > >>>>      report RTE_ETH_EVENT_ERR_RECOVERING event.
> > >>> Do you mean to say, PMD should set the data path pointers after
> > >>> calling the
> > >> call back function?
> > >>> The PMD is running in the context of multiple EAL threads. How do
> > >>> these
> > >> threads synchronize such that only one thread sets these data pointers?
> > >>
> > >> As I understand this event callback supposed to be called in the
> > >> context of EAL interrupt thread (whoever is more familiar with
> > >> original idea, feel free to correct me if I missed something).
> > > I could not figure this out. It looks to be called from the data plane thread
> > context.
> > > I also have a thought on alternate design at the end, appreciate if you can
> > take a look.
> > >
> > >> How it is going to signal data-path threads that they need to
> > >> stop/suspend calling data-path API - that's I suppose is left to application
> > to decide...
> > >> Same as right now it is application responsibility to stop data-path
> > >> threads before doing dev_stop()/dev/_config()/etc.
> > > Ok, good, this expectation is not new. The application must have a
> > mechanism already.
> > >
> > >>
> > >>
> > >>>
> > >>>> 2. The application should stop data path API invocation when process
> > >>>>      the RTE_ETH_EVENT_ERR_RECOVERING event.
> > >>> Any thoughts on how an application can do this?
> > > We can ignore this question as there is already similar expectation set for
> > earlier functionalities.
> > >
> > >>>
> > >>>> 3. The PMD should set the data path pointers to valid functions before
> > >>>>      report RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> > >>>> 4. The application should enable data path API invocation when process
> > >>>>      the RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> > >>> Do you mean to say that the application should not call the datapath
> > >>> APIs
> > >> while the PMD is running the recovery process?
> > >>
> > >> Yes, I believe that's the intention.
> > > Ok, this is good and makes sense.
> > >
> > >>
> > >>>>
> > >>>> Also, this patch introduce a driver internal function
> > >>>> rte_eth_fp_ops_setup which used as an help function for PMD.
> > >>>>
> > >>>> [1]
> > >>>>
> > >>
> > http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2
> > >>>> -
> > >>>> ashok.k.kaladi@intel.com/
> > >>>>
> > >>>> Fixes: eb0d471a8941 ("ethdev: add proactive error handling mode")
> > >>>> Cc: stable@dpdk.org
> > >>>>
> > >>>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> > >>>> ---
> > >>>>    doc/guides/prog_guide/poll_mode_drv.rst | 20 +++++++---------
> > >>>>    lib/ethdev/ethdev_driver.c              |  8 +++++++
> > >>>>    lib/ethdev/ethdev_driver.h              | 10 ++++++++
> > >>>>    lib/ethdev/rte_ethdev.h                 | 32 +++++++++++++++----------
> > >>>>    lib/ethdev/version.map                  |  1 +
> > >>>>    5 files changed, 46 insertions(+), 25 deletions(-)
> > >>>>
> > >>>> diff --git a/doc/guides/prog_guide/poll_mode_drv.rst
> > >>>> b/doc/guides/prog_guide/poll_mode_drv.rst
> > >>>> index c145a9066c..e380ff135a 100644
> > >>>> --- a/doc/guides/prog_guide/poll_mode_drv.rst
> > >>>> +++ b/doc/guides/prog_guide/poll_mode_drv.rst
> > >>>> @@ -638,14 +638,9 @@ different from the application invokes
> > >>>> recovery in PASSIVE mode,  the PMD automatically recovers from
> > >>>> error in PROACTIVE mode,  and only a small amount of work is
> > >>>> required for the
> > >> application.
> > >>>>
> > > >>>> -During error detection and automatic recovery,
> > > >>>> -the PMD sets the data path pointers to dummy functions
> > > >>>> -(which will prevent the crash),
> > > >>>> -and also make sure the control path operations fail with a return code ``-EBUSY``.
> > > >>>> -
> > > >>>> -Because the PMD recovers automatically,
> > > >>>> -the application can only sense that the data flow is disconnected for a while
> > > >>>> -and the control API returns an error in this period.
> > >>>> +During error detection and automatic recovery, the PMD sets the
> > >>>> +data path pointers to dummy functions and also make sure the
> > >>>> +control path operations failed with a return code ``-EBUSY``.
> > >>>>
> > >>>>    In order to sense the error happening/recovering,  as well as to
> > >>>> restore some additional configuration, @@ -653,9 +648,9 @@ three
> > >>>> events
> > >> are available:
> > >>>>
> > >>>>    ``RTE_ETH_EVENT_ERR_RECOVERING``
> > >>>>       Notify the application that an error is detected
> > >>>> -   and the recovery is being started.
> > >>>> +   and the recovery is about to start.
> > >>>>       Upon receiving the event, the application should not invoke
> > >>>> -   any control path function until receiving
> > >>>> +   any control and data path API until receiving
> > >>>>       ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` or
> > >>>> ``RTE_ETH_EVENT_RECOVERY_FAILED`` event.
> > >>>>
> > >>>>    .. note::
> > >>>> @@ -666,8 +661,9 @@ three events are available:
> > >>>>
> > >>>>    ``RTE_ETH_EVENT_RECOVERY_SUCCESS``
> > >>>>       Notify the application that the recovery from error is successful,
> > >>>> -   the PMD already re-configures the port,
> > >>>> -   and the effect is the same as a restart operation.
> > >>>> +   the PMD already re-configures the port.
> > >>>> +   The application should restore some additional configuration,
> > >>>> + and then
> > >>> What is the additional configuration? Is this specific to each NIC/PMD?
> > >>> I thought, this is an auto recovery process and the application does
> > >>> not require
> > >> to reconfigure anything. If the application has to restore the
> > >> configuration, how does auto recovery differ from typical recovery
> > process?
> > >>>
> > >>>> +   enable data path API invocation.
> > >>>>
> > >>>>    ``RTE_ETH_EVENT_RECOVERY_FAILED``
> > >>>>       Notify the application that the recovery from error failed,
> > >>>> diff --git a/lib/ethdev/ethdev_driver.c
> > >>>> b/lib/ethdev/ethdev_driver.c index
> > >>>> 0be1e8ca04..f994653fe9 100644
> > >>>> --- a/lib/ethdev/ethdev_driver.c
> > >>>> +++ b/lib/ethdev/ethdev_driver.c
> > >>>> @@ -515,6 +515,14 @@ rte_eth_dma_zone_free(const struct
> > rte_eth_dev
> > >>>> *dev, const char *ring_name,
> > >>>>    	return rc;
> > >>>>    }
> > >>>>
> > > >>>> +void
> > > >>>> +rte_eth_fp_ops_setup(struct rte_eth_dev *dev)
> > > >>>> +{
> > > >>>> +	if (dev == NULL)
> > > >>>> +		return;
> > > >>>> +	eth_dev_fp_ops_setup(rte_eth_fp_ops + dev->data->port_id, dev);
> > > >>>> +}
> > >>>> +
> > >>>>    const struct rte_memzone *
> > >>>>    rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const
> > >>>> char *ring_name,
> > > >>>>    			 uint16_t queue_id, size_t size, unsigned int align,
> > > >>>> diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
> > > >>>> index 2c9d615fb5..0d964d1f67 100644
> > >>>> --- a/lib/ethdev/ethdev_driver.h
> > >>>> +++ b/lib/ethdev/ethdev_driver.h
> > >>>> @@ -1621,6 +1621,16 @@ int
> > >>>>    rte_eth_dma_zone_free(const struct rte_eth_dev *eth_dev, const
> > >>>> char *name,
> > >>>>    		 uint16_t queue_id);
> > >>>>
> > >>>> +/**
> > >>>> + * @internal
> > >>>> + * Setup eth fast-path API to ethdev values.
> > >>>> + *
> > >>>> + * @param dev
> > >>>> + *  Pointer to struct rte_eth_dev.
> > >>>> + */
> > >>>> +__rte_internal
> > >>>> +void rte_eth_fp_ops_setup(struct rte_eth_dev *dev);
> > >>>> +
> > >>>>    /**
> > >>>>     * @internal
> > >>>>     * Atomically set the link status for the specific device.
> > >>>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> > >>>> index
> > >>>> 049641d57c..44ee7229c1 100644
> > >>>> --- a/lib/ethdev/rte_ethdev.h
> > >>>> +++ b/lib/ethdev/rte_ethdev.h
> > >>>> @@ -3944,25 +3944,28 @@ enum rte_eth_event_type {
> > >>>>    	 */
> > >>>>    	RTE_ETH_EVENT_RX_AVAIL_THRESH,
> > >>>>    	/** Port recovering from a hardware or firmware error.
> > >>>> -	 * If PMD supports proactive error recovery,
> > >>>> -	 * it should trigger this event to notify application
> > >>>> -	 * that it detected an error and the recovery is being started.
> > >>>> -	 * Upon receiving the event, the application should not invoke any
> > >>>> control path API
> > >>>> -	 * (such as rte_eth_dev_configure/rte_eth_dev_stop...) until receiving
> > >>>> -	 * RTE_ETH_EVENT_RECOVERY_SUCCESS or
> > >>>> RTE_ETH_EVENT_RECOVERY_FAILED event.
> > >>>> -	 * The PMD will set the data path pointers to dummy functions,
> > >>>> -	 * and re-set the data path pointers to non-dummy functions
> > >>>> -	 * before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> > >>>> -	 * It means that the application cannot send or receive any packets
> > >>>> -	 * during this period.
> > >>>> +	 *
> > >>>> +	 * If PMD supports proactive error recovery, it should trigger this
> > >>>> +	 * event to notify application that it detected an error and the
> > >>>> +	 * recovery is about to start.
> > >>>> +	 *
> > >>>> +	 * Upon receiving the event, the application should not invoke any
> > >>>> +	 * control and data path API until receiving
> > >>>> +	 * RTE_ETH_EVENT_RECOVERY_SUCCESS or
> > >>>> RTE_ETH_EVENT_RECOVERY_FAILED
> > >>>> +	 * event.
> > >>>> +	 *
> > >>>> +	 * Once this event is reported, the PMD will set the data path pointers
> > >>>> +	 * to dummy functions, and re-set the data path pointers to valid
> > >>>> +	 * functions before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS
> > >>>> event.
> > >>> Why do we need to set the data path pointers to dummy functions if
> > >>> the
> > >> application is restricted from invoking any control and data path
> > >> APIs till the recovery process is completed?
> > >>
> > >> You are right, in theory it is not mandatory.
> > >> Though it helps to flag a problem if user will still try to call them
> > >> while recovery is in progress.
> > > Ok, may be in debug mode.
> > > I mean, we have already set an expectation to the application that it should
> > not call and the application has implemented a method to do the same. Why
> > do we need to complicate this?
> > > If the application calls the APIs, it is a programming error.
> >
> >
> > My preference would be to keep it this way for both debug and non-debug
> > mode.
> > > It doesn't cost us anything in terms of performance, but helps to catch
> > > problems with a misbehaving app.
> 
> This is also causing a synchronization problem. i.e. if this has to be done correctly, we need to use correct synchronization
> mechanisms.
> We cannot set the function pointers and assume that data will be visible to other threads/cores in the correct order.
> A possible mechanism (though I see some problems with this) could be to use a guard variable, which indicates when it is safe to use
> the function pointers on the data plane threads. This would require a load-acquire in the data plane threads.

I do realize that it doesn't provide any synchronization by itself.
It is just a best-effort approach (no guarantee) to flag a possible problem to the app developer/maintainer, nothing more.
But it has already proved useful: as I remember, we caught a few bugs with it for dev_stop, etc.
Plus it costs us nothing in terms of performance, so why not have it?
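
For reference, the guard-variable idea from the quoted text above (a flag read with load-acquire on the data plane threads) could be sketched roughly as below. This is only an illustration in plain C11 atomics, not ethdev code: struct port_fp_guard, guard_publish(), guard_revoke(), guarded_rx() and the two burst stubs are all hypothetical names. It also shows the limitation already mentioned: a thread that passed the check just before guard_revoke() may still be inside the burst function, so the guard alone does not replace quiescing the workers.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a per-port burst function pointer. */
typedef uint16_t (*rx_burst_fn)(void *rxq, void **pkts, uint16_t n);

struct port_fp_guard {
	atomic_bool datapath_ok; /* guard: true when rx may be called */
	rx_burst_fn rx;          /* plain pointer, published through the guard */
};

/* Dummy burst: what the PMD installs while recovery is in progress. */
static uint16_t dummy_rx(void *rxq, void **pkts, uint16_t n)
{
	(void)rxq; (void)pkts; (void)n;
	return 0;
}

/* Stub for the real burst: pretends every requested packet arrived. */
static uint16_t real_rx(void *rxq, void **pkts, uint16_t n)
{
	(void)rxq; (void)pkts;
	return n;
}

/* Control side: write the pointer first, then release the guard, so a
 * reader that observes datapath_ok == true also observes the new pointer. */
static void guard_publish(struct port_fp_guard *g, rx_burst_fn fn)
{
	g->rx = fn;
	atomic_store_explicit(&g->datapath_ok, true, memory_order_release);
}

/* Control side: close the guard. Threads already past the check are NOT
 * waited for here; that still needs an application-level handshake. */
static void guard_revoke(struct port_fp_guard *g)
{
	atomic_store_explicit(&g->datapath_ok, false, memory_order_release);
}

/* Data plane side: load-acquire on the guard before using the pointer. */
static uint16_t guarded_rx(struct port_fp_guard *g, void *rxq,
			   void **pkts, uint16_t n)
{
	if (!atomic_load_explicit(&g->datapath_ok, memory_order_acquire))
		return 0;
	return g->rx(rxq, pkts, n);
}
```

The extra load-acquire on every burst is exactly the data-path cost that the current dummy-function approach avoids.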

> >
> > >
> > >> Again, same as we doing in dev_stop().
> > >
> > >>
> > >>>
> > >>>> +	 *
> > >>>>    	 * @note Before the PMD reports the recovery result,
> > >>>>    	 * the PMD may report the RTE_ETH_EVENT_ERR_RECOVERING event
> > >>>> again,
> > >>>>    	 * because a larger error may occur during the recovery.
> > >>>>    	 */
> > >>>>    	RTE_ETH_EVENT_ERR_RECOVERING,
> > >>> I understand this is not a change in this patch. But, just
> > >>> wondering, what is the
> > >> purpose of this? How is the application supposed to use this?
> > >>>
> > >>>>    	/** Port recovers successfully from the error.
> > >>>> -	 * The PMD already re-configured the port,
> > >>>> -	 * and the effect is the same as a restart operation.
> > >>>> +	 *
> > >>>> +	 * The PMD already re-configured the port:
> > >>>>    	 * a) The following operation will be retained: (alphabetically)
> > >>>>    	 *    - DCB configuration
> > >>>>    	 *    - FEC configuration
> > >>>> @@ -3989,6 +3992,9 @@ enum rte_eth_event_type {
> > >>>>    	 *      (@see RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP)
> > >>>>    	 * c) Any other configuration will not be stored
> > >>>>    	 *    and will need to be re-configured.
> > >>>> +	 *
> > >>>> +	 * The application should restore some additional configuration
> > >>>> +	 * (see above case b/c), and then enable data path API invocation.
> > >>>>    	 */
> > >>>>    	RTE_ETH_EVENT_RECOVERY_SUCCESS,
> > >>>>    	/** Port recovery failed.
> > >>>> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map index
> > >>>> 357d1a88c0..c273e0bdae 100644
> > >>>> --- a/lib/ethdev/version.map
> > >>>> +++ b/lib/ethdev/version.map
> > >>>> @@ -320,6 +320,7 @@ INTERNAL {
> > >>>>    	rte_eth_devices;
> > >>>>    	rte_eth_dma_zone_free;
> > >>>>    	rte_eth_dma_zone_reserve;
> > >>>> +	rte_eth_fp_ops_setup;
> > >>>>    	rte_eth_hairpin_queue_peer_bind;
> > >>>>    	rte_eth_hairpin_queue_peer_unbind;
> > >>>>    	rte_eth_hairpin_queue_peer_update;
> > >>>> --
> > >>>> 2.17.1
> > >>>
> > >
> > > Is there any reason not to design this in the same way as
> > 'rte_eth_dev_reset'? Why does the PMD have to recover by itself?
> >
> > I suppose it is a question for the authors of original patch...
> Appreciate if the authors could comment on this.
> 
> >
> > > We could have a similar API 'rte_eth_dev_recover' to do the recovery
> > functionality.
> >
> > I suppose such approach is also possible.
> > Personally I am fine with both ways: either existing one or what you propose,
> > as long as we'll fix existing race-condition.
> > What is good with what you suggest - that way we probably don't need to
> > worry how to allow user to enable/disable auto-recovery inside PMD.
> >
> > Konstantin
> >



* RE: [PATCH 1/5] ethdev: fix race-condition of proactive error handling mode
  2023-03-07  9:52                     ` Konstantin Ananyev
@ 2023-03-07 10:11                       ` Konstantin Ananyev
  0 siblings, 0 replies; 85+ messages in thread
From: Konstantin Ananyev @ 2023-03-07 10:11 UTC (permalink / raw)
  To: Konstantin Ananyev, Fengchengwen, Ajit Khaparde, Ferruh Yigit
  Cc: Konstantin Ananyev, Thomas Monjalon, Andrew Rybchenko, dev



> > >>>>>>>>> In the proactive error handling mode, the PMD will set the data path
> > >>>>>>>>> pointers to dummy functions and then try recovery, in this period the
> > >>>>>>>>> application may still invoking data path API. This will introduce a
> > >>>>>>>>> race-condition with data path which may lead to crash [1].
> > >>>>>>>>>
> > >>>>>>>>> Although the PMD added delay after setting data path pointers to cover
> > >>>>>>>>> the above race-condition, it reduces the probability, but it doesn't
> > >>>>>>>>> solve the problem.
> > >>>>>>>>>
> > >>>>>>>>> To solve the race-condition problem fundamentally, the following
> > >>>>>>>>> requirements are added:
> > >>>>>>>>> 1. The PMD should set the data path pointers to dummy functions after
> > >>>>>>>>>     report RTE_ETH_EVENT_ERR_RECOVERING event.
> > >>>>>>>>> 2. The application should stop data path API invocation when process
> > >>>>>>>>>     the RTE_ETH_EVENT_ERR_RECOVERING event.
> > >>>>>>>>> 3. The PMD should set the data path pointers to valid functions before
> > >>>>>>>>>     report RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> > >>>>>>>>> 4. The application should enable data path API invocation when process
> > >>>>>>>>>     the RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> > >>>>>>>>>
> > >>>>>>>
> > >>>>>>> How this is solving the race-condition, by pushing responsibility to
> > >>>>>>> stop data path to application?
> > >>>>>>
> > >>>>>> Exactly, it becomes application responsibility to make sure data-path is
> > >>>>>> stopped/suspended before recovery will continue.
> > >>>>>>
> > >>>>>
> > >>>>> From documentation of the feature:
> > >>>>>
> > >>>>> ``
> > >>>>> Because the PMD recovers automatically,
> > >>>>> the application can only sense that the data flow is disconnected for a
> > >>>>> while and the control API returns an error in this period.
> > >>>>>
> > >>>>> In order to sense the error happening/recovering, as well as to restore
> > >>>>> some additional configuration, three events are available:
> > >>>>> ``
> > >>>>>
> > >>>>> It looks like initial design is to use events mainly inform application
> > >>>>> about what happened and mainly for re-configuration.
> > >>>>>
> > >>>>> Although I don't disagree with involving the application, I am not sure
> > >>>>> that is part of the current design.
> > >>>>
> > >>>> I thought we all agreed that the initial design contains some flaws that
> > >>>> need to be fixed, no?
> > >>>> Statement that with current rte_ethdev design error recovery can be done
> > >>>> without interaction with the app (to stop/suspend data/control path)
> > >>>> is the main one I think.
> > >>>> It needs some interaction with app layer, one way or another.
> > >>>>
> > >>>>>>>
> > >>>>>>> What if application is not interested in recovery modes at all and not
> > >>>>>>> registered any callback for the recovery?
> > >>>>>>
> > >>>>>>
> > >>>>>> Are you saying there is no way for application to disable
> > >>>>>> automatic recovery in PMD if it is not interested
> > >>>>>> (or can't fulfill the prerequisites for it)?
> > >>>>>> If so, then yes it is a problem and we need to fix it.
> > >>>>>> I assumed that such mechanism to disable unwanted events already exists,
> > >>>>>> but I can't find anything.
> > >>>>>> Wonder what would be the easiest way here - can PMD make a decision
> > >>>>>> based on callback return value, or do we need a new API to
> > >>>>>> enable/disable callbacks, or ...?
> > >>>>>>
> > >>>>>>
> > >>>>>
> > >>>>> As far as I can see automatic recovery is not configurable by app.
> > >>>>>
> > >>>>> But that is not all, PMD sends events to application but PMD can't know
> > >>>>> if application is handling them or not, so with current design PMD can't
> > >>>>> rely on to app.
> > >>>>
> > >>>> Well, PMD invokes user provided callback.
> > >>>> One way to fix that problem - if there is no callback provided,
> > >>>> or callback returns an error code - PMD can assume that recovery
> > >>>> should not be done.
> > >>>> That is probably not the best design choice, but at least it will allow
> > >>>> to fix the problem without too many changes and introducing new API.
> > >>>> That could be sort of a 'quick fix'.
> > >>>> In a meanwhile we can think about new/better approach for that.
> > >>>>
> > >>>
> > >>> -rc2 for 23.03 is a few days away.
> > >>>
> > >>> What do you think to have 'quick fix' as modifying how driver updates
> > >>> burst ops to prevent the race condition, for this release?
> >
> > The 'quick fix', do you mean only update function pointer (without rxq setting) ?
> > Currently the PMDs which announced support "proactive error handling mode" already
> > do this.
> 
> Really sorry guys, I was too fast on the keyboard, and didn't read properly what Ferruh suggested.
> Reading it once again: no, I do not agree with that.
> It wouldn't fix anything, and would just add extra mess to the code.
> Sorry again for the wrong reply.
> Konstantin
> 

Thinking about the 'quick fix' once again: I think the patches Fengchengwen already provided:
https://patchwork.dpdk.org/project/dpdk/list/?series=27201
are a much better approach.
I believe they should stop the race condition (and the crashes), given a properly written callback.
If we still have time for it, I'd suggest one extra change in the PMD:
check that a recovery callback is installed, and if not, simply do not start recovery at all.
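
To make the suggestion a bit more concrete, here is a rough, self-contained sketch of the intended ordering. It deliberately does not use the ethdev API: struct sim_port, struct app_state, pmd_try_recover() and app_recovery_cb() are hypothetical names, and the enum only mirrors the three recovery events. The assumptions are that the PMD can see whether a callback was registered, that it reports ERR_RECOVERING before switching to dummy burst functions, and that it restores the valid functions before reporting RECOVERY_SUCCESS.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Mirrors the three recovery events; not the real rte_eth_event_type. */
enum recovery_event {
	EV_ERR_RECOVERING,
	EV_RECOVERY_SUCCESS,
	EV_RECOVERY_FAILED,
};

typedef int (*recovery_cb)(enum recovery_event ev, void *arg);

/* Hypothetical, simplified view of a port as seen by the PMD. */
struct sim_port {
	recovery_cb cb;    /* NULL when the app registered nothing */
	void *cb_arg;
	bool fp_ops_dummy; /* true while burst pointers are dummies */
};

/* Hypothetical application state; worker lcores poll this flag. */
struct app_state {
	atomic_bool datapath_enabled;
};

/* Application callback: stop the data path on ERR_RECOVERING and
 * re-enable it only on RECOVERY_SUCCESS. */
static int app_recovery_cb(enum recovery_event ev, void *arg)
{
	struct app_state *app = arg;

	switch (ev) {
	case EV_ERR_RECOVERING:
		atomic_store(&app->datapath_enabled, false);
		/* A real app must also wait here until every worker has
		 * left its current rx/tx burst call before returning. */
		break;
	case EV_RECOVERY_SUCCESS:
		/* Restore any non-retained configuration first, then: */
		atomic_store(&app->datapath_enabled, true);
		break;
	case EV_RECOVERY_FAILED:
		break; /* data path stays disabled */
	}
	return 0;
}

/* PMD side: returns true if recovery was started. */
static bool pmd_try_recover(struct sim_port *p, bool hw_recovers)
{
	if (p->cb == NULL)
		return false; /* no handler installed: do not start recovery */

	p->cb(EV_ERR_RECOVERING, p->cb_arg); /* report first ... */
	p->fp_ops_dummy = true;              /* ... then install dummies */

	if (!hw_recovers) {
		p->cb(EV_RECOVERY_FAILED, p->cb_arg);
		return true;
	}

	p->fp_ops_dummy = false;               /* valid pointers first ... */
	p->cb(EV_RECOVERY_SUCCESS, p->cb_arg); /* ... then report success */
	return true;
}
```

With no callback registered, pmd_try_recover() returns false and leaves the port untouched, which is the "do not start recovery at all" behaviour proposed above.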

> >
> > >>>
> > >>> And plan a design update for the next release?
> > >> +1 on the overall approach.
> > >
> > > Yep, agree.
> >
> > Hope for better solution.
> > And also, I notice only the openvswitch (from all open-source software which based-on DPDK)
> > registers RTE_ETH_EVENT_INTR_RESET callback .
> >
> > Therefore, hope we build a recovery framework at the DPDK SDK level and be compatible
> > with RTE_ETH_EVENT_INTR_RESET and RTE_ETH_EVENT_ERR_RECOVERING mechanism.
> >
> > >
> > >>
> > >>>
> > >>>
> > >>>>>
> > >>>>>>> I think driver should not rely on application for this, unless
> > >>>>>>> application explicitly says (to driver) that it is handling recovery,
> > >>>>>>> right now there is no way for driver to know this.
> > >>>>>>
> > >>>>>> I think it is vice versa:
> > >>>>>> application should not enable auto-recovery if it can't meet
> > >>>>>> the prerequisites for it (provide appropriate callback).
> > >>>>>>
> > >>>>>
> > >>>>> I agree on above, we are saying similar thing in different perspective.
> > >>>>
> > >>>> Ok, that's good we are on the same page.
> > >>>>
> > >>>>
> > >>>>>
> > >>>>>>
> > >>>>>>>
> > >>>>>>>>> Also, this patch introduce a driver internal function
> > >>>>>>>>> rte_eth_fp_ops_setup which used as an help function for PMD.
> > >>>>>>>>>
> > >>>>>>>>> [1]
> > >>>>>>>>> http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-ashok.k.kaladi@intel.com/
> > >>>>>>>>>
> > >>>>>>>>> Fixes: eb0d471a8941 ("ethdev: add proactive error handling mode")
> > >>>>>>>>> Cc: stable@dpdk.org
> > >>>>>>>>>
> > >>>>>>>>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> > >>>>>>>>> ---
> > >>>>>>>>>   doc/guides/prog_guide/poll_mode_drv.rst | 20 +++++++---------
> > >>>>>>>>>   lib/ethdev/ethdev_driver.c              |  8 +++++++
> > >>>>>>>>>   lib/ethdev/ethdev_driver.h              | 10 ++++++++
> > >>>>>>>>>   lib/ethdev/rte_ethdev.h                 | 32
> > >>>>>>>>> +++++++++++++++----------
> > >>>>>>>>>   lib/ethdev/version.map                  |  1 +
> > >>>>>>>>>   5 files changed, 46 insertions(+), 25 deletions(-)
> > >>>>>>>>>
> > >>>>>>>>> diff --git a/doc/guides/prog_guide/poll_mode_drv.rst
> > >>>>>>>>> b/doc/guides/prog_guide/poll_mode_drv.rst
> > >>>>>>>>> index c145a9066c..e380ff135a 100644
> > >>>>>>>>> --- a/doc/guides/prog_guide/poll_mode_drv.rst
> > >>>>>>>>> +++ b/doc/guides/prog_guide/poll_mode_drv.rst
> > >>>>>>>>> @@ -638,14 +638,9 @@ different from the application invokes recovery
> > >>>>>>>>> in PASSIVE mode,
> > >>>>>>>>>   the PMD automatically recovers from error in PROACTIVE mode,
> > >>>>>>>>>   and only a small amount of work is required for the application.
> > >>>>>>>>>
> > >>>>>>>>> -During error detection and automatic recovery,
> > >>>>>>>>> -the PMD sets the data path pointers to dummy functions
> > >>>>>>>>> -(which will prevent the crash),
> > >>>>>>>>> -and also make sure the control path operations fail with a return
> > >>>>>>>>> code ``-EBUSY``.
> > >>>>>>>>> -
> > >>>>>>>>> -Because the PMD recovers automatically,
> > >>>>>>>>> -the application can only sense that the data flow is disconnected
> > >>>>>>>>> for a while
> > >>>>>>>>> -and the control API returns an error in this period.
> > >>>>>>>>> +During error detection and automatic recovery, the PMD sets the
> > >>>>>>>>> data path
> > >>>>>>>>> +pointers to dummy functions and also make sure the control path
> > >>>>>>>>> operations
> > >>>>>>>>> +failed with a return code ``-EBUSY``.
> > >>>>>>>>>
> > >>>>>>>>>   In order to sense the error happening/recovering,
> > >>>>>>>>>   as well as to restore some additional configuration,
> > >>>>>>>>> @@ -653,9 +648,9 @@ three events are available:
> > >>>>>>>>>
> > >>>>>>>>>   ``RTE_ETH_EVENT_ERR_RECOVERING``
> > >>>>>>>>>      Notify the application that an error is detected
> > >>>>>>>>> -   and the recovery is being started.
> > >>>>>>>>> +   and the recovery is about to start.
> > >>>>>>>>>      Upon receiving the event, the application should not invoke
> > >>>>>>>>> -   any control path function until receiving
> > >>>>>>>>> +   any control and data path API until receiving
> > >>>>>>>>>      ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` or
> > >>>>>>>>> ``RTE_ETH_EVENT_RECOVERY_FAILED`` event.
> > >>>>>>>>>
> > >>>>>>>>>   .. note::
> > >>>>>>>>> @@ -666,8 +661,9 @@ three events are available:
> > >>>>>>>>>
> > >>>>>>>>>   ``RTE_ETH_EVENT_RECOVERY_SUCCESS``
> > >>>>>>>>>      Notify the application that the recovery from error is successful,
> > >>>>>>>>> -   the PMD already re-configures the port,
> > >>>>>>>>> -   and the effect is the same as a restart operation.
> > >>>>>>>>> +   the PMD already re-configures the port.
> > >>>>>>>>> +   The application should restore some additional configuration,
> > >>>>>>>>> and then
> > >>>>>>>>> +   enable data path API invocation.
> > >>>>>>>>>
> > >>>>>>>>>   ``RTE_ETH_EVENT_RECOVERY_FAILED``
> > >>>>>>>>>      Notify the application that the recovery from error failed,
> > >>>>>>>>> diff --git a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c
> > >>>>>>>>> index 0be1e8ca04..f994653fe9 100644
> > >>>>>>>>> --- a/lib/ethdev/ethdev_driver.c
> > >>>>>>>>> +++ b/lib/ethdev/ethdev_driver.c
> > >>>>>>>>> @@ -515,6 +515,14 @@ rte_eth_dma_zone_free(const struct rte_eth_dev
> > >>>>>>>>> *dev, const char *ring_name,
> > >>>>>>>>>       return rc;
> > >>>>>>>>>   }
> > >>>>>>>>>
> > >>>>>>>>> +void
> > >>>>>>>>> +rte_eth_fp_ops_setup(struct rte_eth_dev *dev)
> > >>>>>>>>> +{
> > >>>>>>>>> +    if (dev == NULL)
> > >>>>>>>>> +        return;
> > >>>>>>>>> +    eth_dev_fp_ops_setup(rte_eth_fp_ops + dev->data->port_id, dev);
> > >>>>>>>>> +}
> > >>>>>>>>> +
> > >>>>>>>>>   const struct rte_memzone *
> > >>>>>>>>>   rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char
> > >>>>>>>>> *ring_name,
> > >>>>>>>>>                uint16_t queue_id, size_t size, unsigned int align,
> > >>>>>>>>> diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
> > >>>>>>>>> index 2c9d615fb5..0d964d1f67 100644
> > >>>>>>>>> --- a/lib/ethdev/ethdev_driver.h
> > >>>>>>>>> +++ b/lib/ethdev/ethdev_driver.h
> > >>>>>>>>> @@ -1621,6 +1621,16 @@ int
> > >>>>>>>>>   rte_eth_dma_zone_free(const struct rte_eth_dev *eth_dev, const
> > >>>>>>>>> char *name,
> > >>>>>>>>>            uint16_t queue_id);
> > >>>>>>>>>
> > >>>>>>>>> +/**
> > >>>>>>>>> + * @internal
> > >>>>>>>>> + * Setup eth fast-path API to ethdev values.
> > >>>>>>>>> + *
> > >>>>>>>>> + * @param dev
> > >>>>>>>>> + *  Pointer to struct rte_eth_dev.
> > >>>>>>>>> + */
> > >>>>>>>>> +__rte_internal
> > >>>>>>>>> +void rte_eth_fp_ops_setup(struct rte_eth_dev *dev);
> > >>>>>>>>> +
> > >>>>>>>>>   /**
> > >>>>>>>>>    * @internal
> > >>>>>>>>>    * Atomically set the link status for the specific device.
> > >>>>>>>>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> > >>>>>>>>> index 049641d57c..44ee7229c1 100644
> > >>>>>>>>> --- a/lib/ethdev/rte_ethdev.h
> > >>>>>>>>> +++ b/lib/ethdev/rte_ethdev.h
> > >>>>>>>>> @@ -3944,25 +3944,28 @@ enum rte_eth_event_type {
> > >>>>>>>>>        */
> > >>>>>>>>>       RTE_ETH_EVENT_RX_AVAIL_THRESH,
> > >>>>>>>>>       /** Port recovering from a hardware or firmware error.
> > >>>>>>>>> -     * If PMD supports proactive error recovery,
> > >>>>>>>>> -     * it should trigger this event to notify application
> > >>>>>>>>> -     * that it detected an error and the recovery is being started.
> > >>>>>>>>> -     * Upon receiving the event, the application should not invoke
> > >>>>>>>>> any control path API
> > >>>>>>>>> -     * (such as rte_eth_dev_configure/rte_eth_dev_stop...) until
> > >>>>>>>>> receiving
> > >>>>>>>>> -     * RTE_ETH_EVENT_RECOVERY_SUCCESS or
> > >>>>>>>>> RTE_ETH_EVENT_RECOVERY_FAILED event.
> > >>>>>>>>> -     * The PMD will set the data path pointers to dummy functions,
> > >>>>>>>>> -     * and re-set the data path pointers to non-dummy functions
> > >>>>>>>>> -     * before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> > >>>>>>>>> -     * It means that the application cannot send or receive any
> > >>>>>>>>> packets
> > >>>>>>>>> -     * during this period.
> > >>>>>>>>> +     *
> > >>>>>>>>> +     * If PMD supports proactive error recovery, it should trigger
> > >>>>>>>>> this
> > >>>>>>>>> +     * event to notify application that it detected an error and the
> > >>>>>>>>> +     * recovery is about to start.
> > >>>>>>>>> +     *
> > >>>>>>>>> +     * Upon receiving the event, the application should not invoke any
> > >>>>>>>>> +     * control and data path API until receiving
> > >>>>>>>>> +     * RTE_ETH_EVENT_RECOVERY_SUCCESS or RTE_ETH_EVENT_RECOVERY_FAILED
> > >>>>>>>>> +     * event.
> > >>>>>>>>> +     *
> > >>>>>>>>> +     * Once this event is reported, the PMD will set the data path
> > >>>>>>>>> pointers
> > >>>>>>>>> +     * to dummy functions, and re-set the data path pointers to valid
> > >>>>>>>>> +     * functions before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS
> > >>>>>>>>> event.
> > >>>>>>>>> +     *
> > >>>>>>>>>        * @note Before the PMD reports the recovery result,
> > >>>>>>>>>        * the PMD may report the RTE_ETH_EVENT_ERR_RECOVERING event
> > >>>>>>>>> again,
> > >>>>>>>>>        * because a larger error may occur during the recovery.
> > >>>>>>>>>        */
> > >>>>>>>>>       RTE_ETH_EVENT_ERR_RECOVERING,
> > >>>>>>>>>       /** Port recovers successfully from the error.
> > >>>>>>>>> -     * The PMD already re-configured the port,
> > >>>>>>>>> -     * and the effect is the same as a restart operation.
> > >>>>>>>>> +     *
> > >>>>>>>>> +     * The PMD already re-configured the port:
> > >>>>>>>>>        * a) The following operation will be retained: (alphabetically)
> > >>>>>>>>>        *    - DCB configuration
> > >>>>>>>>>        *    - FEC configuration
> > >>>>>>>>> @@ -3989,6 +3992,9 @@ enum rte_eth_event_type {
> > >>>>>>>>>        *      (@see RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP)
> > >>>>>>>>>        * c) Any other configuration will not be stored
> > >>>>>>>>>        *    and will need to be re-configured.
> > >>>>>>>>> +     *
> > >>>>>>>>> +     * The application should restore some additional configuration
> > >>>>>>>>> +     * (see above case b/c), and then enable data path API invocation.
> > >>>>>>>>>        */
> > >>>>>>>>>       RTE_ETH_EVENT_RECOVERY_SUCCESS,
> > >>>>>>>>>       /** Port recovery failed.
> > >>>>>>>>> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
> > >>>>>>>>> index 357d1a88c0..c273e0bdae 100644
> > >>>>>>>>> --- a/lib/ethdev/version.map
> > >>>>>>>>> +++ b/lib/ethdev/version.map
> > >>>>>>>>> @@ -320,6 +320,7 @@ INTERNAL {
> > >>>>>>>>>       rte_eth_devices;
> > >>>>>>>>>       rte_eth_dma_zone_free;
> > >>>>>>>>>       rte_eth_dma_zone_reserve;
> > >>>>>>>>> +    rte_eth_fp_ops_setup;
> > >>>>>>>>>       rte_eth_hairpin_queue_peer_bind;
> > >>>>>>>>>       rte_eth_hairpin_queue_peer_unbind;
> > >>>>>>>>>       rte_eth_hairpin_queue_peer_update;
> > >>>>>>>>> --
> > >>>>>>>>   Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
> > >>>>>>>>
> > >>>>>>>>> 2.17.1
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>
> > >>>


* Re: [PATCH 1/5] ethdev: fix race-condition of proactive error handling mode
  2023-03-07  8:25                   ` fengchengwen
  2023-03-07  9:52                     ` Konstantin Ananyev
@ 2023-03-07 12:07                     ` Ferruh Yigit
  2023-03-07 12:26                       ` fengchengwen
  1 sibling, 1 reply; 85+ messages in thread
From: Ferruh Yigit @ 2023-03-07 12:07 UTC (permalink / raw)
  To: fengchengwen, Konstantin Ananyev, Ajit Khaparde
  Cc: Konstantin Ananyev, Thomas Monjalon, Andrew Rybchenko, dev

On 3/7/2023 8:25 AM, fengchengwen wrote:
> 
> 
> On 2023/3/6 19:13, Konstantin Ananyev wrote:
>>
>>
>>>>>>>>>> In the proactive error handling mode, the PMD will set the data path
>>>>>>>>>> pointers to dummy functions and then try recovery, in this period the
>>>>>>>>>> application may still invoking data path API. This will introduce a
>>>>>>>>>> race-condition with data path which may lead to crash [1].
>>>>>>>>>>
>>>>>>>>>> Although the PMD added delay after setting data path pointers to cover
>>>>>>>>>> the above race-condition, it reduces the probability, but it doesn't
>>>>>>>>>> solve the problem.
>>>>>>>>>>
>>>>>>>>>> To solve the race-condition problem fundamentally, the following
>>>>>>>>>> requirements are added:
>>>>>>>>>> 1. The PMD should set the data path pointers to dummy functions after
>>>>>>>>>>     report RTE_ETH_EVENT_ERR_RECOVERING event.
>>>>>>>>>> 2. The application should stop data path API invocation when process
>>>>>>>>>>     the RTE_ETH_EVENT_ERR_RECOVERING event.
>>>>>>>>>> 3. The PMD should set the data path pointers to valid functions before
>>>>>>>>>>     report RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>>>>>>>>> 4. The application should enable data path API invocation when process
>>>>>>>>>>     the RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>>>>>>>>>
>>>>>>>>
>>>>>>>> How this is solving the race-condition, by pushing responsibility to
>>>>>>>> stop data path to application?
>>>>>>>
>>>>>>> Exactly, it becomes application responsibility to make sure data-path is
>>>>>>> stopped/suspended before recovery will continue.
>>>>>>>
>>>>>>
>>>>>> From documentation of the feature:
>>>>>>
>>>>>> ``
>>>>>> Because the PMD recovers automatically,
>>>>>> the application can only sense that the data flow is disconnected for a
>>>>>> while and the control API returns an error in this period.
>>>>>>
>>>>>> In order to sense the error happening/recovering, as well as to restore
>>>>>> some additional configuration, three events are available:
>>>>>> ``
>>>>>>
>>>>>> It looks like initial design is to use events mainly inform application
>>>>>> about what happened and mainly for re-configuration.
>>>>>>
>>>>>> Although I am don't disagree to involve the application, I am not sure
>>>>>> that is part of current design.
>>>>>
>>>>> I thought we all agreed that initial design contain some fallacies that
>>>>> need to fixed, no?
>>>>> Statement that with current rte_ethdev design error recovery can be done
>>>>> without interaction with the app (to stop/suspend data/control path)
>>>>> is the main one I think.
>>>>> It needs some interaction with app layer, one way or another.
>>>>>
>>>>>>>>
>>>>>>>> What if application is not interested in recovery modes at all and not
>>>>>>>> registered any callback for the recovery?
>>>>>>>
>>>>>>>
>>>>>>> Are you saying there is no way for application to disable
>>>>>>> automatic recovery in PMD if it is not interested
>>>>>>> (or can't full-fill per-requesties for it)?
>>>>>>> If so, then yes it is a problem and we need to fix it.
>>>>>>> I assumed that such mechanism to disable unwanted events already exists,
>>>>>>> but I can't find anything.
>>>>>>> Wonder what would be the easiest way here - can PMD make a decision
>>>>>>> based on callback return value, or do we need a new API to
>>>>>>> enable/disable callbacks, or ...?
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> As far as I can see automatic recovery is not configurable by app.
>>>>>>
>>>>>> But that is not all, PMD sends events to application but PMD can't know
>>>>>> if application is handling them or not, so with current design PMD can't
>>>>>> rely on to app.
>>>>>
>>>>> Well, PMD invokes user provided callback.
>>>>> One way to fix that problem - if there is no callback provided,
>>>>> or callback returns an error code - PMD can assume that recovery
>>>>> should not be done.
>>>>> That is probably not the best design choice, but at least it will allow
>>>>> to fix the problem without too many changes and introducing new API.
>>>>> That could be sort of a 'quick fix'.
>>>>> In a meanwhile we can think about new/better approach for that.
>>>>>
>>>>
>>>> -rc2 for 23.03 is a few days away.
>>>>
>>>> What do you think to have 'quick fix' as modifying how driver updates
>>>> burst ops to prevent the race condition, for this release?
> 
> The 'quick fix', do you mean only update function pointer (without rxq setting) ?
> Currently the PMDs which announced support "proactive error handling mode" already
> do this.
> 

Yes.
I checked hns3 and it does as you said: 'hns3_eth_dev_fp_ops_config()'
writes all fields in 'rte_eth_fp_ops', but only the function pointers
actually change in the driver, so in effect only the function pointers
are updated.
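
To make the observation concrete, here is a minimal sketch of a
"function pointer only" update. It is not the hns3 code: the struct and
every 'sketch_*' name are hypothetical stand-ins, and the reading that
keeping the queue data in place is what avoids the crash is an
assumption drawn from this thread, not something verified in the driver.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one entry of 'rte_eth_fp_ops'. */
struct sketch_fp_ops {
	uint16_t (*rx_burst)(void *rxq, void **pkts, uint16_t n);
	void **rxq_data; /* per-queue data the burst function dereferences */
};

/* "Real" burst function: reads the queue data, so that data must stay valid. */
static uint16_t
sketch_real_rx(void *rxq, void **pkts, uint16_t n)
{
	uint16_t avail = *(uint16_t *)rxq;

	(void)pkts;
	return n < avail ? n : avail;
}

/* Dummy burst function: touches nothing and reports no packets. */
static uint16_t
sketch_dummy_rx(void *rxq, void **pkts, uint16_t n)
{
	(void)rxq;
	(void)pkts;
	(void)n;
	return 0;
}

/* Recovery start: swap the function pointer only, leave rxq_data alone. */
static void
sketch_enter_recovery(struct sketch_fp_ops *ops)
{
	ops->rx_burst = sketch_dummy_rx;
}

/* Recovery done: restore the real function pointer. */
static void
sketch_leave_recovery(struct sketch_fp_ops *ops)
{
	ops->rx_burst = sketch_real_rx;
}
```

In this model a caller that fetched the old 'rx_burst' pointer just
before the swap still finds valid queue data, so it does not fault. It
can still run concurrently with recovery, which is the race that
remains.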

The discussion about the race condition started with patch [1], which
mentions a crash caused by a race condition. Later in the discussion,
the recovery event was given as an example of where the race can occur,
which is why we are here.

But given the above, although there is a race condition and a bigger
update (one that needs application involvement) is required for the
recovery mechanism, there is no crash and NO 'quick fix' is required for
recovery.
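
For reference, the application involvement proposed in the quoted patch
(stop data path calls on ERR_RECOVERING, resume on RECOVERY_SUCCESS)
could look roughly like the sketch below. It does not use the ethdev
API: the event enum, the callback signature and all 'sketch_*' names are
hypothetical, and 'sketch_fake_rx' only stands in for a real burst call.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the three recovery events. */
enum sketch_event {
	SKETCH_ERR_RECOVERING,
	SKETCH_RECOVERY_SUCCESS,
	SKETCH_RECOVERY_FAILED,
};

/* Written by the event callback, read by the polling loop. */
static atomic_bool sketch_datapath_enabled = true;

/* Callback the application would register for recovery events. */
static int
sketch_event_cb(enum sketch_event ev)
{
	switch (ev) {
	case SKETCH_ERR_RECOVERING:
		/* Stop issuing data path calls before recovery proceeds. */
		atomic_store(&sketch_datapath_enabled, false);
		break;
	case SKETCH_RECOVERY_SUCCESS:
		/* Configuration the PMD did not retain would be restored here. */
		atomic_store(&sketch_datapath_enabled, true);
		break;
	case SKETCH_RECOVERY_FAILED:
		/* Port is unusable: leave the data path stopped. */
		break;
	}
	return 0;
}

/* Stand-in for a real burst call; pretends 32 packets were received. */
static uint16_t
sketch_fake_rx(void)
{
	return 32;
}

/* One polling iteration: skip the burst call while recovery is running. */
static uint16_t
sketch_poll_once(uint16_t (*rx)(void))
{
	if (!atomic_load(&sketch_datapath_enabled))
		return 0;
	return rx();
}
```

A flag on its own only stops new calls. A real application would also
have to wait in the callback until every polling lcore has left its
current burst call, and that is the part that needs the design update.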

@Konstantin, @Chengwen, can you please confirm above understanding is
correct?



[1]
https://patches.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-ashok.k.kaladi@intel.com/

>>>>
>>>> And plan a design update for the next release?
>>> +1 on the overall approach.
>>
>> Yep, agree.
> 
> Hope for better solution.
> And also, I notice only the openvswitch (from all open-source software which based-on DPDK)
> registers RTE_ETH_EVENT_INTR_RESET callback .
> 
> Therefore, hope we build a recovery framework at the DPDK SDK level and be compatible
> with RTE_ETH_EVENT_INTR_RESET and RTE_ETH_EVENT_ERR_RECOVERING mechanism.
> 
>>  
>>>
>>>>
>>>>
>>>>>>
>>>>>>>> I think driver should not rely on application for this, unless
>>>>>>>> application explicitly says (to driver) that it is handling recovery,
>>>>>>>> right now there is no way for driver to know this.
>>>>>>>
>>>>>>> I think it is visa-versa:
>>>>>>> application should not enable auto-recovery if it can't meet
>>>>>>> per-requeststies for it (provide appropriate callback).
>>>>>>>
>>>>>>
>>>>>> I agree on above, we are saying similar thing in different perspective.
>>>>>
>>>>> Ok, that's good we are on the same page.
>>>>>
>>>>>
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>>> Also, this patch introduce a driver internal function
>>>>>>>>>> rte_eth_fp_ops_setup which used as an help function for PMD.
>>>>>>>>>>
>>>>>>>>>> [1]
>>>>>>>>>> http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-ashok.k.kaladi@intel.com/
>>>>>>>>>>
>>>>>>>>>> Fixes: eb0d471a8941 ("ethdev: add proactive error handling mode")
>>>>>>>>>> Cc: stable@dpdk.org
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
>>>>>>>>>> ---
>>>>>>>>>>   doc/guides/prog_guide/poll_mode_drv.rst | 20 +++++++---------
>>>>>>>>>>   lib/ethdev/ethdev_driver.c              |  8 +++++++
>>>>>>>>>>   lib/ethdev/ethdev_driver.h              | 10 ++++++++
>>>>>>>>>>   lib/ethdev/rte_ethdev.h                 | 32
>>>>>>>>>> +++++++++++++++----------
>>>>>>>>>>   lib/ethdev/version.map                  |  1 +
>>>>>>>>>>   5 files changed, 46 insertions(+), 25 deletions(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/doc/guides/prog_guide/poll_mode_drv.rst
>>>>>>>>>> b/doc/guides/prog_guide/poll_mode_drv.rst
>>>>>>>>>> index c145a9066c..e380ff135a 100644
>>>>>>>>>> --- a/doc/guides/prog_guide/poll_mode_drv.rst
>>>>>>>>>> +++ b/doc/guides/prog_guide/poll_mode_drv.rst
>>>>>>>>>> @@ -638,14 +638,9 @@ different from the application invokes recovery
>>>>>>>>>> in PASSIVE mode,
>>>>>>>>>>   the PMD automatically recovers from error in PROACTIVE mode,
>>>>>>>>>>   and only a small amount of work is required for the application.
>>>>>>>>>>
>>>>>>>>>> -During error detection and automatic recovery,
>>>>>>>>>> -the PMD sets the data path pointers to dummy functions
>>>>>>>>>> -(which will prevent the crash),
>>>>>>>>>> -and also make sure the control path operations fail with a return
>>>>>>>>>> code ``-EBUSY``.
>>>>>>>>>> -
>>>>>>>>>> -Because the PMD recovers automatically,
>>>>>>>>>> -the application can only sense that the data flow is disconnected
>>>>>>>>>> for a while
>>>>>>>>>> -and the control API returns an error in this period.
>>>>>>>>>> +During error detection and automatic recovery, the PMD sets the
>>>>>>>>>> data path
>>>>>>>>>> +pointers to dummy functions and also make sure the control path
>>>>>>>>>> operations
>>>>>>>>>> +failed with a return code ``-EBUSY``.
>>>>>>>>>>
>>>>>>>>>>   In order to sense the error happening/recovering,
>>>>>>>>>>   as well as to restore some additional configuration,
>>>>>>>>>> @@ -653,9 +648,9 @@ three events are available:
>>>>>>>>>>
>>>>>>>>>>   ``RTE_ETH_EVENT_ERR_RECOVERING``
>>>>>>>>>>      Notify the application that an error is detected
>>>>>>>>>> -   and the recovery is being started.
>>>>>>>>>> +   and the recovery is about to start.
>>>>>>>>>>      Upon receiving the event, the application should not invoke
>>>>>>>>>> -   any control path function until receiving
>>>>>>>>>> +   any control and data path API until receiving
>>>>>>>>>>      ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` or
>>>>>>>>>> ``RTE_ETH_EVENT_RECOVERY_FAILED`` event.
>>>>>>>>>>
>>>>>>>>>>   .. note::
>>>>>>>>>> @@ -666,8 +661,9 @@ three events are available:
>>>>>>>>>>
>>>>>>>>>>   ``RTE_ETH_EVENT_RECOVERY_SUCCESS``
>>>>>>>>>>      Notify the application that the recovery from error is successful,
>>>>>>>>>> -   the PMD already re-configures the port,
>>>>>>>>>> -   and the effect is the same as a restart operation.
>>>>>>>>>> +   the PMD already re-configures the port.
>>>>>>>>>> +   The application should restore some additional configuration,
>>>>>>>>>> and then
>>>>>>>>>> +   enable data path API invocation.
>>>>>>>>>>
>>>>>>>>>>   ``RTE_ETH_EVENT_RECOVERY_FAILED``
>>>>>>>>>>      Notify the application that the recovery from error failed,
>>>>>>>>>> diff --git a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c
>>>>>>>>>> index 0be1e8ca04..f994653fe9 100644
>>>>>>>>>> --- a/lib/ethdev/ethdev_driver.c
>>>>>>>>>> +++ b/lib/ethdev/ethdev_driver.c
>>>>>>>>>> @@ -515,6 +515,14 @@ rte_eth_dma_zone_free(const struct rte_eth_dev
>>>>>>>>>> *dev, const char *ring_name,
>>>>>>>>>>       return rc;
>>>>>>>>>>   }
>>>>>>>>>>
>>>>>>>>>> +void
>>>>>>>>>> +rte_eth_fp_ops_setup(struct rte_eth_dev *dev)
>>>>>>>>>> +{
>>>>>>>>>> +    if (dev == NULL)
>>>>>>>>>> +        return;
>>>>>>>>>> +    eth_dev_fp_ops_setup(rte_eth_fp_ops + dev->data->port_id, dev);
>>>>>>>>>> +}
>>>>>>>>>> +
>>>>>>>>>>   const struct rte_memzone *
>>>>>>>>>>   rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char
>>>>>>>>>> *ring_name,
>>>>>>>>>>                uint16_t queue_id, size_t size, unsigned int align,
>>>>>>>>>> diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
>>>>>>>>>> index 2c9d615fb5..0d964d1f67 100644
>>>>>>>>>> --- a/lib/ethdev/ethdev_driver.h
>>>>>>>>>> +++ b/lib/ethdev/ethdev_driver.h
>>>>>>>>>> @@ -1621,6 +1621,16 @@ int
>>>>>>>>>>   rte_eth_dma_zone_free(const struct rte_eth_dev *eth_dev, const
>>>>>>>>>> char *name,
>>>>>>>>>>            uint16_t queue_id);
>>>>>>>>>>
>>>>>>>>>> +/**
>>>>>>>>>> + * @internal
>>>>>>>>>> + * Setup eth fast-path API to ethdev values.
>>>>>>>>>> + *
>>>>>>>>>> + * @param dev
>>>>>>>>>> + *  Pointer to struct rte_eth_dev.
>>>>>>>>>> + */
>>>>>>>>>> +__rte_internal
>>>>>>>>>> +void rte_eth_fp_ops_setup(struct rte_eth_dev *dev);
>>>>>>>>>> +
>>>>>>>>>>   /**
>>>>>>>>>>    * @internal
>>>>>>>>>>    * Atomically set the link status for the specific device.
>>>>>>>>>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
>>>>>>>>>> index 049641d57c..44ee7229c1 100644
>>>>>>>>>> --- a/lib/ethdev/rte_ethdev.h
>>>>>>>>>> +++ b/lib/ethdev/rte_ethdev.h
>>>>>>>>>> @@ -3944,25 +3944,28 @@ enum rte_eth_event_type {
>>>>>>>>>>        */
>>>>>>>>>>       RTE_ETH_EVENT_RX_AVAIL_THRESH,
>>>>>>>>>>       /** Port recovering from a hardware or firmware error.
>>>>>>>>>> -     * If PMD supports proactive error recovery,
>>>>>>>>>> -     * it should trigger this event to notify application
>>>>>>>>>> -     * that it detected an error and the recovery is being started.
>>>>>>>>>> -     * Upon receiving the event, the application should not invoke
>>>>>>>>>> any control path API
>>>>>>>>>> -     * (such as rte_eth_dev_configure/rte_eth_dev_stop...) until
>>>>>>>>>> receiving
>>>>>>>>>> -     * RTE_ETH_EVENT_RECOVERY_SUCCESS or
>>>>>>>>>> RTE_ETH_EVENT_RECOVERY_FAILED event.
>>>>>>>>>> -     * The PMD will set the data path pointers to dummy functions,
>>>>>>>>>> -     * and re-set the data path pointers to non-dummy functions
>>>>>>>>>> -     * before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>>>>>>>>> -     * It means that the application cannot send or receive any
>>>>>>>>>> packets
>>>>>>>>>> -     * during this period.
>>>>>>>>>> +     *
>>>>>>>>>> +     * If PMD supports proactive error recovery, it should trigger
>>>>>>>>>> this
>>>>>>>>>> +     * event to notify application that it detected an error and the
>>>>>>>>>> +     * recovery is about to start.
>>>>>>>>>> +     *
>>>>>>>>>> +     * Upon receiving the event, the application should not invoke any
>>>>>>>>>> +     * control and data path API until receiving
>>>>>>>>>> +     * RTE_ETH_EVENT_RECOVERY_SUCCESS or RTE_ETH_EVENT_RECOVERY_FAILED
>>>>>>>>>> +     * event.
>>>>>>>>>> +     *
>>>>>>>>>> +     * Once this event is reported, the PMD will set the data path
>>>>>>>>>> pointers
>>>>>>>>>> +     * to dummy functions, and re-set the data path pointers to valid
>>>>>>>>>> +     * functions before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS
>>>>>>>>>> event.
>>>>>>>>>> +     *
>>>>>>>>>>        * @note Before the PMD reports the recovery result,
>>>>>>>>>>        * the PMD may report the RTE_ETH_EVENT_ERR_RECOVERING event
>>>>>>>>>> again,
>>>>>>>>>>        * because a larger error may occur during the recovery.
>>>>>>>>>>        */
>>>>>>>>>>       RTE_ETH_EVENT_ERR_RECOVERING,
>>>>>>>>>>       /** Port recovers successfully from the error.
>>>>>>>>>> -     * The PMD already re-configured the port,
>>>>>>>>>> -     * and the effect is the same as a restart operation.
>>>>>>>>>> +     *
>>>>>>>>>> +     * The PMD already re-configured the port:
>>>>>>>>>>        * a) The following operation will be retained: (alphabetically)
>>>>>>>>>>        *    - DCB configuration
>>>>>>>>>>        *    - FEC configuration
>>>>>>>>>> @@ -3989,6 +3992,9 @@ enum rte_eth_event_type {
>>>>>>>>>>        *      (@see RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP)
>>>>>>>>>>        * c) Any other configuration will not be stored
>>>>>>>>>>        *    and will need to be re-configured.
>>>>>>>>>> +     *
>>>>>>>>>> +     * The application should restore some additional configuration
>>>>>>>>>> +     * (see above case b/c), and then enable data path API invocation.
>>>>>>>>>>        */
>>>>>>>>>>       RTE_ETH_EVENT_RECOVERY_SUCCESS,
>>>>>>>>>>       /** Port recovery failed.
>>>>>>>>>> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
>>>>>>>>>> index 357d1a88c0..c273e0bdae 100644
>>>>>>>>>> --- a/lib/ethdev/version.map
>>>>>>>>>> +++ b/lib/ethdev/version.map
>>>>>>>>>> @@ -320,6 +320,7 @@ INTERNAL {
>>>>>>>>>>       rte_eth_devices;
>>>>>>>>>>       rte_eth_dma_zone_free;
>>>>>>>>>>       rte_eth_dma_zone_reserve;
>>>>>>>>>> +    rte_eth_fp_ops_setup;
>>>>>>>>>>       rte_eth_hairpin_queue_peer_bind;
>>>>>>>>>>       rte_eth_hairpin_queue_peer_unbind;
>>>>>>>>>>       rte_eth_hairpin_queue_peer_update;
>>>>>>>>>> --
>>>>>>>>>   Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
>>>>>>>>>
>>>>>>>>>> 2.17.1
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>
>>>>



* Re: [PATCH 1/5] ethdev: fix race-condition of proactive error handling mode
  2023-03-07 12:07                     ` Ferruh Yigit
@ 2023-03-07 12:26                       ` fengchengwen
  2023-03-07 12:39                         ` Konstantin Ananyev
  0 siblings, 1 reply; 85+ messages in thread
From: fengchengwen @ 2023-03-07 12:26 UTC (permalink / raw)
  To: Ferruh Yigit, Konstantin Ananyev, Ajit Khaparde
  Cc: Konstantin Ananyev, Thomas Monjalon, Andrew Rybchenko, dev

On 2023/3/7 20:07, Ferruh Yigit wrote:
> On 3/7/2023 8:25 AM, fengchengwen wrote:
>>
>>
>> On 2023/3/6 19:13, Konstantin Ananyev wrote:
>>>
>>>
>>>>>>>>>>> In the proactive error handling mode, the PMD will set the data path
>>>>>>>>>>> pointers to dummy functions and then try recovery, in this period the
>>>>>>>>>>> application may still invoking data path API. This will introduce a
>>>>>>>>>>> race-condition with data path which may lead to crash [1].
>>>>>>>>>>>
>>>>>>>>>>> Although the PMD added delay after setting data path pointers to cover
>>>>>>>>>>> the above race-condition, it reduces the probability, but it doesn't
>>>>>>>>>>> solve the problem.
>>>>>>>>>>>
>>>>>>>>>>> To solve the race-condition problem fundamentally, the following
>>>>>>>>>>> requirements are added:
>>>>>>>>>>> 1. The PMD should set the data path pointers to dummy functions after
>>>>>>>>>>>     report RTE_ETH_EVENT_ERR_RECOVERING event.
>>>>>>>>>>> 2. The application should stop data path API invocation when process
>>>>>>>>>>>     the RTE_ETH_EVENT_ERR_RECOVERING event.
>>>>>>>>>>> 3. The PMD should set the data path pointers to valid functions before
>>>>>>>>>>>     report RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>>>>>>>>>> 4. The application should enable data path API invocation when process
>>>>>>>>>>>     the RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> How this is solving the race-condition, by pushing responsibility to
>>>>>>>>> stop data path to application?
>>>>>>>>
>>>>>>>> Exactly, it becomes application responsibility to make sure data-path is
>>>>>>>> stopped/suspended before recovery will continue.
>>>>>>>>
>>>>>>>
>>>>>>> From documentation of the feature:
>>>>>>>
>>>>>>> ``
>>>>>>> Because the PMD recovers automatically,
>>>>>>> the application can only sense that the data flow is disconnected for a
>>>>>>> while and the control API returns an error in this period.
>>>>>>>
>>>>>>> In order to sense the error happening/recovering, as well as to restore
>>>>>>> some additional configuration, three events are available:
>>>>>>> ``
>>>>>>>
>>>>>>> It looks like initial design is to use events mainly inform application
>>>>>>> about what happened and mainly for re-configuration.
>>>>>>>
>>>>>>> Although I am don't disagree to involve the application, I am not sure
>>>>>>> that is part of current design.
>>>>>>
>>>>>> I thought we all agreed that initial design contain some fallacies that
>>>>>> need to fixed, no?
>>>>>> Statement that with current rte_ethdev design error recovery can be done
>>>>>> without interaction with the app (to stop/suspend data/control path)
>>>>>> is the main one I think.
>>>>>> It needs some interaction with app layer, one way or another.
>>>>>>
>>>>>>>>>
>>>>>>>>> What if application is not interested in recovery modes at all and not
>>>>>>>>> registered any callback for the recovery?
>>>>>>>>
>>>>>>>>
>>>>>>>> Are you saying there is no way for application to disable
>>>>>>>> automatic recovery in PMD if it is not interested
>>>>>>>> (or can't full-fill per-requesties for it)?
>>>>>>>> If so, then yes it is a problem and we need to fix it.
>>>>>>>> I assumed that such mechanism to disable unwanted events already exists,
>>>>>>>> but I can't find anything.
>>>>>>>> Wonder what would be the easiest way here - can PMD make a decision
>>>>>>>> based on callback return value, or do we need a new API to
>>>>>>>> enable/disable callbacks, or ...?
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> As far as I can see automatic recovery is not configurable by app.
>>>>>>>
>>>>>>> But that is not all, PMD sends events to application but PMD can't know
>>>>>>> if application is handling them or not, so with current design PMD can't
>>>>>>> rely on to app.
>>>>>>
>>>>>> Well, PMD invokes user provided callback.
>>>>>> One way to fix that problem - if there is no callback provided,
>>>>>> or callback returns an error code - PMD can assume that recovery
>>>>>> should not be done.
>>>>>> That is probably not the best design choice, but at least it will allow
>>>>>> to fix the problem without too many changes and introducing new API.
>>>>>> That could be sort of a 'quick fix'.
>>>>>> In a meanwhile we can think about new/better approach for that.
>>>>>>
>>>>>
>>>>> -rc2 for 23.03 is a few days away.
>>>>>
>>>>> What do you think to have 'quick fix' as modifying how driver updates
>>>>> burst ops to prevent the race condition, for this release?
>>
>> The 'quick fix', do you mean only update function pointer (without rxq setting) ?
>> Currently the PMDs which announced support "proactive error handling mode" already
>> do this.
>>
> 
> Yes.
> I checked hns3, it does as you said, hns3_eth_dev_fp_ops_config()'
> updates all fields in 'rte_eth_fp_ops' but only function pointer seems
> changed in the driver, resulting only function pointers to be updated.
> 
> The discussion about race condition started with patch [1], which
> mentions a crash because of a race condition. Later in discussions,
> recovery event given as a sample for where the race can occur, that is
> why we are here.
> 
> But after above info, although there is race condition and a bigger
> update (that needs application involvement) is required for recovery
> mechanism, there is no crash and NO 'quick fix' is required for recovery.
> 
> @Konstantin, @Chengwen, can you please confirm above understanding is
> correct?

Yes, that's right.

> 
> 
> 
> [1]
> https://patches.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-ashok.k.kaladi@intel.com/
> 
>>>>>
>>>>> And plan a design update for the next release?
>>>> +1 on the overall approach.
>>>
>>> Yep, agree.
>>
>> Hope for better solution.
>> And also, I notice only the openvswitch (from all open-source software which based-on DPDK)
>> registers RTE_ETH_EVENT_INTR_RESET callback .
>>
>> Therefore, hope we build a recovery framework at the DPDK SDK level and be compatible
>> with RTE_ETH_EVENT_INTR_RESET and RTE_ETH_EVENT_ERR_RECOVERING mechanism.
>>
>>>  
>>>>
>>>>>
>>>>>
>>>>>>>
>>>>>>>>> I think driver should not rely on application for this, unless
>>>>>>>>> application explicitly says (to driver) that it is handling recovery,
>>>>>>>>> right now there is no way for driver to know this.
>>>>>>>>
>>>>>>>> I think it is visa-versa:
>>>>>>>> application should not enable auto-recovery if it can't meet
>>>>>>>> per-requeststies for it (provide appropriate callback).
>>>>>>>>
>>>>>>>
>>>>>>> I agree on above, we are saying similar thing in different perspective.
>>>>>>
>>>>>> Ok, that's good we are on the same page.
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>>> Also, this patch introduce a driver internal function
>>>>>>>>>>> rte_eth_fp_ops_setup which used as an help function for PMD.
>>>>>>>>>>>
>>>>>>>>>>> [1]
>>>>>>>>>>> http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-ashok.k.kaladi@intel.com/
>>>>>>>>>>>
>>>>>>>>>>> Fixes: eb0d471a8941 ("ethdev: add proactive error handling mode")
>>>>>>>>>>> Cc: stable@dpdk.org
>>>>>>>>>>>
>>>>>>>>>>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
>>>>>>>>>>> ---
>>>>>>>>>>>   doc/guides/prog_guide/poll_mode_drv.rst | 20 +++++++---------
>>>>>>>>>>>   lib/ethdev/ethdev_driver.c              |  8 +++++++
>>>>>>>>>>>   lib/ethdev/ethdev_driver.h              | 10 ++++++++
>>>>>>>>>>>   lib/ethdev/rte_ethdev.h                 | 32
>>>>>>>>>>> +++++++++++++++----------
>>>>>>>>>>>   lib/ethdev/version.map                  |  1 +
>>>>>>>>>>>   5 files changed, 46 insertions(+), 25 deletions(-)
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/doc/guides/prog_guide/poll_mode_drv.rst
>>>>>>>>>>> b/doc/guides/prog_guide/poll_mode_drv.rst
>>>>>>>>>>> index c145a9066c..e380ff135a 100644
>>>>>>>>>>> --- a/doc/guides/prog_guide/poll_mode_drv.rst
>>>>>>>>>>> +++ b/doc/guides/prog_guide/poll_mode_drv.rst
>>>>>>>>>>> @@ -638,14 +638,9 @@ different from the application invokes recovery
>>>>>>>>>>> in PASSIVE mode,
>>>>>>>>>>>   the PMD automatically recovers from error in PROACTIVE mode,
>>>>>>>>>>>   and only a small amount of work is required for the application.
>>>>>>>>>>>
>>>>>>>>>>> -During error detection and automatic recovery,
>>>>>>>>>>> -the PMD sets the data path pointers to dummy functions
>>>>>>>>>>> -(which will prevent the crash),
>>>>>>>>>>> -and also make sure the control path operations fail with a return
>>>>>>>>>>> code ``-EBUSY``.
>>>>>>>>>>> -
>>>>>>>>>>> -Because the PMD recovers automatically,
>>>>>>>>>>> -the application can only sense that the data flow is disconnected
>>>>>>>>>>> for a while
>>>>>>>>>>> -and the control API returns an error in this period.
>>>>>>>>>>> +During error detection and automatic recovery, the PMD sets the
>>>>>>>>>>> data path
>>>>>>>>>>> +pointers to dummy functions and also make sure the control path
>>>>>>>>>>> operations
>>>>>>>>>>> +failed with a return code ``-EBUSY``.
>>>>>>>>>>>
>>>>>>>>>>>   In order to sense the error happening/recovering,
>>>>>>>>>>>   as well as to restore some additional configuration,
>>>>>>>>>>> @@ -653,9 +648,9 @@ three events are available:
>>>>>>>>>>>
>>>>>>>>>>>   ``RTE_ETH_EVENT_ERR_RECOVERING``
>>>>>>>>>>>      Notify the application that an error is detected
>>>>>>>>>>> -   and the recovery is being started.
>>>>>>>>>>> +   and the recovery is about to start.
>>>>>>>>>>>      Upon receiving the event, the application should not invoke
>>>>>>>>>>> -   any control path function until receiving
>>>>>>>>>>> +   any control and data path API until receiving
>>>>>>>>>>>      ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` or
>>>>>>>>>>> ``RTE_ETH_EVENT_RECOVERY_FAILED`` event.
>>>>>>>>>>>
>>>>>>>>>>>   .. note::
>>>>>>>>>>> @@ -666,8 +661,9 @@ three events are available:
>>>>>>>>>>>
>>>>>>>>>>>   ``RTE_ETH_EVENT_RECOVERY_SUCCESS``
>>>>>>>>>>>      Notify the application that the recovery from error is successful,
>>>>>>>>>>> -   the PMD already re-configures the port,
>>>>>>>>>>> -   and the effect is the same as a restart operation.
>>>>>>>>>>> +   the PMD already re-configures the port.
>>>>>>>>>>> +   The application should restore some additional configuration,
>>>>>>>>>>> and then
>>>>>>>>>>> +   enable data path API invocation.
>>>>>>>>>>>
>>>>>>>>>>>   ``RTE_ETH_EVENT_RECOVERY_FAILED``
>>>>>>>>>>>      Notify the application that the recovery from error failed,
>>>>>>>>>>> diff --git a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c
>>>>>>>>>>> index 0be1e8ca04..f994653fe9 100644
>>>>>>>>>>> --- a/lib/ethdev/ethdev_driver.c
>>>>>>>>>>> +++ b/lib/ethdev/ethdev_driver.c
>>>>>>>>>>> @@ -515,6 +515,14 @@ rte_eth_dma_zone_free(const struct rte_eth_dev
>>>>>>>>>>> *dev, const char *ring_name,
>>>>>>>>>>>       return rc;
>>>>>>>>>>>   }
>>>>>>>>>>>
>>>>>>>>>>> +void
>>>>>>>>>>> +rte_eth_fp_ops_setup(struct rte_eth_dev *dev)
>>>>>>>>>>> +{
>>>>>>>>>>> +    if (dev == NULL)
>>>>>>>>>>> +        return;
>>>>>>>>>>> +    eth_dev_fp_ops_setup(rte_eth_fp_ops + dev->data->port_id, dev);
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>>   const struct rte_memzone *
>>>>>>>>>>>   rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char
>>>>>>>>>>> *ring_name,
>>>>>>>>>>>                uint16_t queue_id, size_t size, unsigned int align,
>>>>>>>>>>> diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
>>>>>>>>>>> index 2c9d615fb5..0d964d1f67 100644
>>>>>>>>>>> --- a/lib/ethdev/ethdev_driver.h
>>>>>>>>>>> +++ b/lib/ethdev/ethdev_driver.h
>>>>>>>>>>> @@ -1621,6 +1621,16 @@ int
>>>>>>>>>>>   rte_eth_dma_zone_free(const struct rte_eth_dev *eth_dev, const
>>>>>>>>>>> char *name,
>>>>>>>>>>>            uint16_t queue_id);
>>>>>>>>>>>
>>>>>>>>>>> +/**
>>>>>>>>>>> + * @internal
>>>>>>>>>>> + * Setup eth fast-path API to ethdev values.
>>>>>>>>>>> + *
>>>>>>>>>>> + * @param dev
>>>>>>>>>>> + *  Pointer to struct rte_eth_dev.
>>>>>>>>>>> + */
>>>>>>>>>>> +__rte_internal
>>>>>>>>>>> +void rte_eth_fp_ops_setup(struct rte_eth_dev *dev);
>>>>>>>>>>> +
>>>>>>>>>>>   /**
>>>>>>>>>>>    * @internal
>>>>>>>>>>>    * Atomically set the link status for the specific device.
>>>>>>>>>>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
>>>>>>>>>>> index 049641d57c..44ee7229c1 100644
>>>>>>>>>>> --- a/lib/ethdev/rte_ethdev.h
>>>>>>>>>>> +++ b/lib/ethdev/rte_ethdev.h
>>>>>>>>>>> @@ -3944,25 +3944,28 @@ enum rte_eth_event_type {
>>>>>>>>>>>        */
>>>>>>>>>>>       RTE_ETH_EVENT_RX_AVAIL_THRESH,
>>>>>>>>>>>       /** Port recovering from a hardware or firmware error.
>>>>>>>>>>> -     * If PMD supports proactive error recovery,
>>>>>>>>>>> -     * it should trigger this event to notify application
>>>>>>>>>>> -     * that it detected an error and the recovery is being started.
>>>>>>>>>>> -     * Upon receiving the event, the application should not invoke
>>>>>>>>>>> any control path API
>>>>>>>>>>> -     * (such as rte_eth_dev_configure/rte_eth_dev_stop...) until
>>>>>>>>>>> receiving
>>>>>>>>>>> -     * RTE_ETH_EVENT_RECOVERY_SUCCESS or
>>>>>>>>>>> RTE_ETH_EVENT_RECOVERY_FAILED event.
>>>>>>>>>>> -     * The PMD will set the data path pointers to dummy functions,
>>>>>>>>>>> -     * and re-set the data path pointers to non-dummy functions
>>>>>>>>>>> -     * before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>>>>>>>>>> -     * It means that the application cannot send or receive any
>>>>>>>>>>> packets
>>>>>>>>>>> -     * during this period.
>>>>>>>>>>> +     *
>>>>>>>>>>> +     * If PMD supports proactive error recovery, it should trigger
>>>>>>>>>>> this
>>>>>>>>>>> +     * event to notify application that it detected an error and the
>>>>>>>>>>> +     * recovery is about to start.
>>>>>>>>>>> +     *
>>>>>>>>>>> +     * Upon receiving the event, the application should not invoke any
>>>>>>>>>>> +     * control and data path API until receiving
>>>>>>>>>>> +     * RTE_ETH_EVENT_RECOVERY_SUCCESS or RTE_ETH_EVENT_RECOVERY_FAILED
>>>>>>>>>>> +     * event.
>>>>>>>>>>> +     *
>>>>>>>>>>> +     * Once this event is reported, the PMD will set the data path
>>>>>>>>>>> pointers
>>>>>>>>>>> +     * to dummy functions, and re-set the data path pointers to valid
>>>>>>>>>>> +     * functions before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS
>>>>>>>>>>> event.
>>>>>>>>>>> +     *
>>>>>>>>>>>        * @note Before the PMD reports the recovery result,
>>>>>>>>>>>        * the PMD may report the RTE_ETH_EVENT_ERR_RECOVERING event
>>>>>>>>>>> again,
>>>>>>>>>>>        * because a larger error may occur during the recovery.
>>>>>>>>>>>        */
>>>>>>>>>>>       RTE_ETH_EVENT_ERR_RECOVERING,
>>>>>>>>>>>       /** Port recovers successfully from the error.
>>>>>>>>>>> -     * The PMD already re-configured the port,
>>>>>>>>>>> -     * and the effect is the same as a restart operation.
>>>>>>>>>>> +     *
>>>>>>>>>>> +     * The PMD already re-configured the port:
>>>>>>>>>>>        * a) The following operation will be retained: (alphabetically)
>>>>>>>>>>>        *    - DCB configuration
>>>>>>>>>>>        *    - FEC configuration
>>>>>>>>>>> @@ -3989,6 +3992,9 @@ enum rte_eth_event_type {
>>>>>>>>>>>        *      (@see RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP)
>>>>>>>>>>>        * c) Any other configuration will not be stored
>>>>>>>>>>>        *    and will need to be re-configured.
>>>>>>>>>>> +     *
>>>>>>>>>>> +     * The application should restore some additional configuration
>>>>>>>>>>> +     * (see above case b/c), and then enable data path API invocation.
>>>>>>>>>>>        */
>>>>>>>>>>>       RTE_ETH_EVENT_RECOVERY_SUCCESS,
>>>>>>>>>>>       /** Port recovery failed.
>>>>>>>>>>> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
>>>>>>>>>>> index 357d1a88c0..c273e0bdae 100644
>>>>>>>>>>> --- a/lib/ethdev/version.map
>>>>>>>>>>> +++ b/lib/ethdev/version.map
>>>>>>>>>>> @@ -320,6 +320,7 @@ INTERNAL {
>>>>>>>>>>>       rte_eth_devices;
>>>>>>>>>>>       rte_eth_dma_zone_free;
>>>>>>>>>>>       rte_eth_dma_zone_reserve;
>>>>>>>>>>> +    rte_eth_fp_ops_setup;
>>>>>>>>>>>       rte_eth_hairpin_queue_peer_bind;
>>>>>>>>>>>       rte_eth_hairpin_queue_peer_unbind;
>>>>>>>>>>>       rte_eth_hairpin_queue_peer_update;
>>>>>>>>>>> --
>>>>>>>>>>   Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
>>>>>>>>>>
>>>>>>>>>>> 2.17.1
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>

^ permalink raw reply	[flat|nested] 85+ messages in thread

* RE: [PATCH 1/5] ethdev: fix race-condition of proactive error handling mode
  2023-03-07 12:26                       ` fengchengwen
@ 2023-03-07 12:39                         ` Konstantin Ananyev
  2023-03-09  2:05                           ` Ajit Khaparde
  0 siblings, 1 reply; 85+ messages in thread
From: Konstantin Ananyev @ 2023-03-07 12:39 UTC (permalink / raw)
  To: Fengchengwen, Ferruh Yigit, Ajit Khaparde
  Cc: Konstantin Ananyev, Thomas Monjalon, Andrew Rybchenko, dev



> >>>>>>>>>>> In the proactive error handling mode, the PMD will set the data path
> >>>>>>>>>>> pointers to dummy functions and then try recovery, in this period the
> >>>>>>>>>>> application may still invoking data path API. This will introduce a
> >>>>>>>>>>> race-condition with data path which may lead to crash [1].
> >>>>>>>>>>>
> >>>>>>>>>>> Although the PMD added delay after setting data path pointers to cover
> >>>>>>>>>>> the above race-condition, it reduces the probability, but it doesn't
> >>>>>>>>>>> solve the problem.
> >>>>>>>>>>>
> >>>>>>>>>>> To solve the race-condition problem fundamentally, the following
> >>>>>>>>>>> requirements are added:
> >>>>>>>>>>> 1. The PMD should set the data path pointers to dummy functions after
> >>>>>>>>>>>     report RTE_ETH_EVENT_ERR_RECOVERING event.
> >>>>>>>>>>> 2. The application should stop data path API invocation when process
> >>>>>>>>>>>     the RTE_ETH_EVENT_ERR_RECOVERING event.
> >>>>>>>>>>> 3. The PMD should set the data path pointers to valid functions before
> >>>>>>>>>>>     report RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> >>>>>>>>>>> 4. The application should enable data path API invocation when process
> >>>>>>>>>>>     the RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> How this is solving the race-condition, by pushing responsibility to
> >>>>>>>>> stop data path to application?
> >>>>>>>>
> >>>>>>>> Exactly, it becomes application responsibility to make sure data-path is
> >>>>>>>> stopped/suspended before recovery will continue.
> >>>>>>>>
> >>>>>>>
> >>>>>>> From documentation of the feature:
> >>>>>>>
> >>>>>>> ``
> >>>>>>> Because the PMD recovers automatically,
> >>>>>>> the application can only sense that the data flow is disconnected for a
> >>>>>>> while and the control API returns an error in this period.
> >>>>>>>
> >>>>>>> In order to sense the error happening/recovering, as well as to restore
> >>>>>>> some additional configuration, three events are available:
> >>>>>>> ``
> >>>>>>>
> >>>>>>> It looks like initial design is to use events mainly inform application
> >>>>>>> about what happened and mainly for re-configuration.
> >>>>>>>
> >>>>>>> Although I don't disagree with involving the application, I am not sure
> >>>>>>> that is part of the current design.
> >>>>>>
> >>>>>> I thought we all agreed that initial design contain some fallacies that
> >>>>>> need to fixed, no?
> >>>>>> Statement that with current rte_ethdev design error recovery can be done
> >>>>>> without interaction with the app (to stop/suspend data/control path)
> >>>>>> is the main one I think.
> >>>>>> It needs some interaction with app layer, one way or another.
> >>>>>>
> >>>>>>>>>
> >>>>>>>>> What if application is not interested in recovery modes at all and not
> >>>>>>>>> registered any callback for the recovery?
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Are you saying there is no way for application to disable
> >>>>>>>> automatic recovery in PMD if it is not interested
> >>>>>>>> (or can't fulfill the prerequisites for it)?
> >>>>>>>> If so, then yes it is a problem and we need to fix it.
> >>>>>>>> I assumed that such mechanism to disable unwanted events already exists,
> >>>>>>>> but I can't find anything.
> >>>>>>>> Wonder what would be the easiest way here - can PMD make a decision
> >>>>>>>> based on callback return value, or do we need a new API to
> >>>>>>>> enable/disable callbacks, or ...?
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>> As far as I can see automatic recovery is not configurable by app.
> >>>>>>>
> >>>>>>> But that is not all, PMD sends events to application but PMD can't know
> >>>>>>> if application is handling them or not, so with current design PMD can't
> >>>>>>> rely on to app.
> >>>>>>
> >>>>>> Well, PMD invokes user provided callback.
> >>>>>> One way to fix that problem - if there is no callback provided,
> >>>>>> or callback returns an error code - PMD can assume that recovery
> >>>>>> should not be done.
> >>>>>> That is probably not the best design choice, but at least it will allow
> >>>>>> to fix the problem without too many changes and introducing new API.
> >>>>>> That could be sort of a 'quick fix'.
> >>>>>> In a meanwhile we can think about new/better approach for that.
> >>>>>>
> >>>>>
> >>>>> -rc2 for 23.03 is a few days away.
> >>>>>
> >>>>> What do you think to have 'quick fix' as modifying how driver updates
> >>>>> burst ops to prevent the race condition, for this release?
> >>
> >> The 'quick fix', do you mean only update function pointer (without rxq setting) ?
> >> Currently the PMDs which announced support "proactive error handling mode" already
> >> do this.
> >>
> >
> > Yes.
> > I checked hns3 and it does as you said: 'hns3_eth_dev_fp_ops_config()'
> > updates all fields in 'rte_eth_fp_ops', but only the function pointers seem
> > to change in the driver, so only the function pointers end up being updated.
> >
> > The discussion about race condition started with patch [1], which
> > mentions a crash because of a race condition. Later in discussions,
> > recovery event given as a sample for where the race can occur, that is
> > why we are here.
> >
> > But after above info, although there is race condition and a bigger
> > update (that needs application involvement) is required for recovery
> > mechanism, there is no crash and NO 'quick fix' is required for recovery.
> >
> > @Konstantin, @Chengwen, can you please confirm above understanding is
> > correct?
> 
> Yes, that's what.

Yes, I think with Chengwen's patch the race condition should be fixed,
though for that the user has to provide a properly implemented callback.
What is not currently addressed: the user cannot disable this auto-recovery procedure at will.
So if the user does not provide a proper callback, the recovery can still proceed and the race can still happen.
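
To make the callback requirement above concrete, here is a minimal sketch of
the event ordering from the patch description (requirements 1 to 4). It is a
self-contained model, not ethdev code: every model_* and app_*/pmd_* name is
hypothetical and only stands in for the RTE_ETH_EVENT_* events and the
fast-path pointers. It also assumes a single polling thread; a real
application would additionally have to wait until all worker lcores have
observed the closed gate before returning from the callback.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the three recovery events discussed above. */
enum model_event {
	MODEL_EVENT_ERR_RECOVERING,
	MODEL_EVENT_RECOVERY_SUCCESS,
	MODEL_EVENT_RECOVERY_FAILED,
};

typedef uint16_t (*model_rx_burst_t)(void);

/* Dummy burst function: receives nothing while recovery is in progress. */
static uint16_t model_rx_dummy(void) { return 0; }
/* "Valid" burst function: pretends that 4 packets arrived. */
static uint16_t model_rx_real(void) { return 4; }

struct model_port {
	model_rx_burst_t rx_burst;    /* data path pointer, owned by the PMD */
	atomic_bool datapath_enabled; /* gate, owned by the application */
	void (*event_cb)(struct model_port *port, enum model_event ev);
};

/* Application callback: requirements 2 and 4 of the patch description. */
static void app_event_cb(struct model_port *port, enum model_event ev)
{
	assert(port != NULL);
	switch (ev) {
	case MODEL_EVENT_ERR_RECOVERING:
		/* Stop data path API invocation before recovery continues. */
		atomic_store(&port->datapath_enabled, false);
		break;
	case MODEL_EVENT_RECOVERY_SUCCESS:
		/* Additional configuration would be restored here, then the
		 * data path is enabled again. */
		atomic_store(&port->datapath_enabled, true);
		break;
	case MODEL_EVENT_RECOVERY_FAILED:
		/* Leave the gate closed; the port is not usable. */
		break;
	}
}

/* One iteration of the application's polling loop. */
static uint16_t app_poll_once(struct model_port *port)
{
	if (!atomic_load(&port->datapath_enabled))
		return 0;
	return port->rx_burst();
}

/* PMD side, requirement 1: report the event first, then swap to dummy. */
static void pmd_begin_recovery(struct model_port *port)
{
	port->event_cb(port, MODEL_EVENT_ERR_RECOVERING);
	port->rx_burst = model_rx_dummy;
}

/* PMD side, requirement 3: restore valid pointers, then report success. */
static void pmd_finish_recovery(struct model_port *port)
{
	port->rx_burst = model_rx_real;
	port->event_cb(port, MODEL_EVENT_RECOVERY_SUCCESS);
}

static void model_port_init(struct model_port *port)
{
	port->rx_burst = model_rx_real;
	atomic_init(&port->datapath_enabled, true);
	port->event_cb = app_event_cb;
}
```

The point of the ordering is that the pointer swap only happens while the
application's gate is closed, so the polling loop never calls through a
pointer that is being changed. If event_cb is missing or does not close the
gate, the model has the same hole as the one described above.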

> 
> >
> >
> >
> > [1]
> > https://patches.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-ashok.k.kaladi@intel.com/
> >
> >>>>>
> >>>>> And plan a design update for the next release?
> >>>> +1 on the overall approach.
> >>>
> >>> Yep, agree.
> >>
> >> Hope for better solution.
> >> Also, I notice that among all open-source software based on DPDK, only openvswitch
> >> registers an RTE_ETH_EVENT_INTR_RESET callback.
> >>
> >> Therefore, I hope we can build a recovery framework at the DPDK SDK level that is compatible
> >> with both the RTE_ETH_EVENT_INTR_RESET and RTE_ETH_EVENT_ERR_RECOVERING mechanisms.
> >>
> >>>
> >>>>
> >>>>>
> >>>>>
> >>>>>>>
> >>>>>>>>> I think driver should not rely on application for this, unless
> >>>>>>>>> application explicitly says (to driver) that it is handling recovery,
> >>>>>>>>> right now there is no way for driver to know this.
> >>>>>>>>
> >>>>>>>> I think it is vice versa:
> >>>>>>>> the application should not enable auto-recovery if it can't meet
> >>>>>>>> the prerequisites for it (provide an appropriate callback).
> >>>>>>>>
> >>>>>>>
> >>>>>>> I agree on above, we are saying similar thing in different perspective.
> >>>>>>
> >>>>>> Ok, that's good we are on the same page.
> >>>>>>
> >>>>>>
> >>>>>>>
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>>>> Also, this patch introduce a driver internal function
> >>>>>>>>>>> rte_eth_fp_ops_setup which used as an help function for PMD.
> >>>>>>>>>>>
> >>>>>>>>>>> [1]
> >>>>>>>>>>> http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-ashok.k.kaladi@intel.com/
> >>>>>>>>>>>
> >>>>>>>>>>> Fixes: eb0d471a8941 ("ethdev: add proactive error handling mode")
> >>>>>>>>>>> Cc: stable@dpdk.org
> >>>>>>>>>>>
> >>>>>>>>>>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> >>>>>>>>>>> ---
> >>>>>>>>>>>   doc/guides/prog_guide/poll_mode_drv.rst | 20 +++++++---------
> >>>>>>>>>>>   lib/ethdev/ethdev_driver.c              |  8 +++++++
> >>>>>>>>>>>   lib/ethdev/ethdev_driver.h              | 10 ++++++++
> >>>>>>>>>>>   lib/ethdev/rte_ethdev.h                 | 32
> >>>>>>>>>>> +++++++++++++++----------
> >>>>>>>>>>>   lib/ethdev/version.map                  |  1 +
> >>>>>>>>>>>   5 files changed, 46 insertions(+), 25 deletions(-)
> >>>>>>>>>>>
> >>>>>>>>>>> diff --git a/doc/guides/prog_guide/poll_mode_drv.rst
> >>>>>>>>>>> b/doc/guides/prog_guide/poll_mode_drv.rst
> >>>>>>>>>>> index c145a9066c..e380ff135a 100644
> >>>>>>>>>>> --- a/doc/guides/prog_guide/poll_mode_drv.rst
> >>>>>>>>>>> +++ b/doc/guides/prog_guide/poll_mode_drv.rst
> >>>>>>>>>>> @@ -638,14 +638,9 @@ different from the application invokes recovery
> >>>>>>>>>>> in PASSIVE mode,
> >>>>>>>>>>>   the PMD automatically recovers from error in PROACTIVE mode,
> >>>>>>>>>>>   and only a small amount of work is required for the application.
> >>>>>>>>>>>
> >>>>>>>>>>> -During error detection and automatic recovery,
> >>>>>>>>>>> -the PMD sets the data path pointers to dummy functions
> >>>>>>>>>>> -(which will prevent the crash),
> >>>>>>>>>>> -and also make sure the control path operations fail with a return
> >>>>>>>>>>> code ``-EBUSY``.
> >>>>>>>>>>> -
> >>>>>>>>>>> -Because the PMD recovers automatically,
> >>>>>>>>>>> -the application can only sense that the data flow is disconnected
> >>>>>>>>>>> for a while
> >>>>>>>>>>> -and the control API returns an error in this period.
> >>>>>>>>>>> +During error detection and automatic recovery, the PMD sets the
> >>>>>>>>>>> data path
> >>>>>>>>>>> +pointers to dummy functions and also make sure the control path
> >>>>>>>>>>> operations
> >>>>>>>>>>> +failed with a return code ``-EBUSY``.
> >>>>>>>>>>>
> >>>>>>>>>>>   In order to sense the error happening/recovering,
> >>>>>>>>>>>   as well as to restore some additional configuration,
> >>>>>>>>>>> @@ -653,9 +648,9 @@ three events are available:
> >>>>>>>>>>>
> >>>>>>>>>>>   ``RTE_ETH_EVENT_ERR_RECOVERING``
> >>>>>>>>>>>      Notify the application that an error is detected
> >>>>>>>>>>> -   and the recovery is being started.
> >>>>>>>>>>> +   and the recovery is about to start.
> >>>>>>>>>>>      Upon receiving the event, the application should not invoke
> >>>>>>>>>>> -   any control path function until receiving
> >>>>>>>>>>> +   any control and data path API until receiving
> >>>>>>>>>>>      ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` or
> >>>>>>>>>>> ``RTE_ETH_EVENT_RECOVERY_FAILED`` event.
> >>>>>>>>>>>
> >>>>>>>>>>>   .. note::
> >>>>>>>>>>> @@ -666,8 +661,9 @@ three events are available:
> >>>>>>>>>>>
> >>>>>>>>>>>   ``RTE_ETH_EVENT_RECOVERY_SUCCESS``
> >>>>>>>>>>>      Notify the application that the recovery from error is successful,
> >>>>>>>>>>> -   the PMD already re-configures the port,
> >>>>>>>>>>> -   and the effect is the same as a restart operation.
> >>>>>>>>>>> +   the PMD already re-configures the port.
> >>>>>>>>>>> +   The application should restore some additional configuration,
> >>>>>>>>>>> and then
> >>>>>>>>>>> +   enable data path API invocation.
> >>>>>>>>>>>
> >>>>>>>>>>>   ``RTE_ETH_EVENT_RECOVERY_FAILED``
> >>>>>>>>>>>      Notify the application that the recovery from error failed,
> >>>>>>>>>>> diff --git a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c
> >>>>>>>>>>> index 0be1e8ca04..f994653fe9 100644
> >>>>>>>>>>> --- a/lib/ethdev/ethdev_driver.c
> >>>>>>>>>>> +++ b/lib/ethdev/ethdev_driver.c
> >>>>>>>>>>> @@ -515,6 +515,14 @@ rte_eth_dma_zone_free(const struct rte_eth_dev
> >>>>>>>>>>> *dev, const char *ring_name,
> >>>>>>>>>>>       return rc;
> >>>>>>>>>>>   }
> >>>>>>>>>>>
> >>>>>>>>>>> +void
> >>>>>>>>>>> +rte_eth_fp_ops_setup(struct rte_eth_dev *dev)
> >>>>>>>>>>> +{
> >>>>>>>>>>> +    if (dev == NULL)
> >>>>>>>>>>> +        return;
> >>>>>>>>>>> +    eth_dev_fp_ops_setup(rte_eth_fp_ops + dev->data->port_id, dev);
> >>>>>>>>>>> +}
> >>>>>>>>>>> +
> >>>>>>>>>>>   const struct rte_memzone *
> >>>>>>>>>>>   rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char
> >>>>>>>>>>> *ring_name,
> >>>>>>>>>>>                uint16_t queue_id, size_t size, unsigned int align,
> >>>>>>>>>>> diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
> >>>>>>>>>>> index 2c9d615fb5..0d964d1f67 100644
> >>>>>>>>>>> --- a/lib/ethdev/ethdev_driver.h
> >>>>>>>>>>> +++ b/lib/ethdev/ethdev_driver.h
> >>>>>>>>>>> @@ -1621,6 +1621,16 @@ int
> >>>>>>>>>>>   rte_eth_dma_zone_free(const struct rte_eth_dev *eth_dev, const
> >>>>>>>>>>> char *name,
> >>>>>>>>>>>            uint16_t queue_id);
> >>>>>>>>>>>
> >>>>>>>>>>> +/**
> >>>>>>>>>>> + * @internal
> >>>>>>>>>>> + * Setup eth fast-path API to ethdev values.
> >>>>>>>>>>> + *
> >>>>>>>>>>> + * @param dev
> >>>>>>>>>>> + *  Pointer to struct rte_eth_dev.
> >>>>>>>>>>> + */
> >>>>>>>>>>> +__rte_internal
> >>>>>>>>>>> +void rte_eth_fp_ops_setup(struct rte_eth_dev *dev);
> >>>>>>>>>>> +
> >>>>>>>>>>>   /**
> >>>>>>>>>>>    * @internal
> >>>>>>>>>>>    * Atomically set the link status for the specific device.
> >>>>>>>>>>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> >>>>>>>>>>> index 049641d57c..44ee7229c1 100644
> >>>>>>>>>>> --- a/lib/ethdev/rte_ethdev.h
> >>>>>>>>>>> +++ b/lib/ethdev/rte_ethdev.h
> >>>>>>>>>>> @@ -3944,25 +3944,28 @@ enum rte_eth_event_type {
> >>>>>>>>>>>        */
> >>>>>>>>>>>       RTE_ETH_EVENT_RX_AVAIL_THRESH,
> >>>>>>>>>>>       /** Port recovering from a hardware or firmware error.
> >>>>>>>>>>> -     * If PMD supports proactive error recovery,
> >>>>>>>>>>> -     * it should trigger this event to notify application
> >>>>>>>>>>> -     * that it detected an error and the recovery is being started.
> >>>>>>>>>>> -     * Upon receiving the event, the application should not invoke
> >>>>>>>>>>> any control path API
> >>>>>>>>>>> -     * (such as rte_eth_dev_configure/rte_eth_dev_stop...) until
> >>>>>>>>>>> receiving
> >>>>>>>>>>> -     * RTE_ETH_EVENT_RECOVERY_SUCCESS or
> >>>>>>>>>>> RTE_ETH_EVENT_RECOVERY_FAILED event.
> >>>>>>>>>>> -     * The PMD will set the data path pointers to dummy functions,
> >>>>>>>>>>> -     * and re-set the data path pointers to non-dummy functions
> >>>>>>>>>>> -     * before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> >>>>>>>>>>> -     * It means that the application cannot send or receive any
> >>>>>>>>>>> packets
> >>>>>>>>>>> -     * during this period.
> >>>>>>>>>>> +     *
> >>>>>>>>>>> +     * If PMD supports proactive error recovery, it should trigger
> >>>>>>>>>>> this
> >>>>>>>>>>> +     * event to notify application that it detected an error and the
> >>>>>>>>>>> +     * recovery is about to start.
> >>>>>>>>>>> +     *
> >>>>>>>>>>> +     * Upon receiving the event, the application should not invoke any
> >>>>>>>>>>> +     * control and data path API until receiving
> >>>>>>>>>>> +     * RTE_ETH_EVENT_RECOVERY_SUCCESS or RTE_ETH_EVENT_RECOVERY_FAILED
> >>>>>>>>>>> +     * event.
> >>>>>>>>>>> +     *
> >>>>>>>>>>> +     * Once this event is reported, the PMD will set the data path
> >>>>>>>>>>> pointers
> >>>>>>>>>>> +     * to dummy functions, and re-set the data path pointers to valid
> >>>>>>>>>>> +     * functions before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS
> >>>>>>>>>>> event.
> >>>>>>>>>>> +     *
> >>>>>>>>>>>        * @note Before the PMD reports the recovery result,
> >>>>>>>>>>>        * the PMD may report the RTE_ETH_EVENT_ERR_RECOVERING event
> >>>>>>>>>>> again,
> >>>>>>>>>>>        * because a larger error may occur during the recovery.
> >>>>>>>>>>>        */
> >>>>>>>>>>>       RTE_ETH_EVENT_ERR_RECOVERING,
> >>>>>>>>>>>       /** Port recovers successfully from the error.
> >>>>>>>>>>> -     * The PMD already re-configured the port,
> >>>>>>>>>>> -     * and the effect is the same as a restart operation.
> >>>>>>>>>>> +     *
> >>>>>>>>>>> +     * The PMD already re-configured the port:
> >>>>>>>>>>>        * a) The following operation will be retained: (alphabetically)
> >>>>>>>>>>>        *    - DCB configuration
> >>>>>>>>>>>        *    - FEC configuration
> >>>>>>>>>>> @@ -3989,6 +3992,9 @@ enum rte_eth_event_type {
> >>>>>>>>>>>        *      (@see RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP)
> >>>>>>>>>>>        * c) Any other configuration will not be stored
> >>>>>>>>>>>        *    and will need to be re-configured.
> >>>>>>>>>>> +     *
> >>>>>>>>>>> +     * The application should restore some additional configuration
> >>>>>>>>>>> +     * (see above case b/c), and then enable data path API invocation.
> >>>>>>>>>>>        */
> >>>>>>>>>>>       RTE_ETH_EVENT_RECOVERY_SUCCESS,
> >>>>>>>>>>>       /** Port recovery failed.
> >>>>>>>>>>> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
> >>>>>>>>>>> index 357d1a88c0..c273e0bdae 100644
> >>>>>>>>>>> --- a/lib/ethdev/version.map
> >>>>>>>>>>> +++ b/lib/ethdev/version.map
> >>>>>>>>>>> @@ -320,6 +320,7 @@ INTERNAL {
> >>>>>>>>>>>       rte_eth_devices;
> >>>>>>>>>>>       rte_eth_dma_zone_free;
> >>>>>>>>>>>       rte_eth_dma_zone_reserve;
> >>>>>>>>>>> +    rte_eth_fp_ops_setup;
> >>>>>>>>>>>       rte_eth_hairpin_queue_peer_bind;
> >>>>>>>>>>>       rte_eth_hairpin_queue_peer_unbind;
> >>>>>>>>>>>       rte_eth_hairpin_queue_peer_update;
> >>>>>>>>>>> --
> >>>>>>>>>>   Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
> >>>>>>>>>>
> >>>>>>>>>>> 2.17.1
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>>

^ permalink raw reply	[flat|nested] 85+ messages in thread

* RE: [PATCH 1/5] ethdev: fix race-condition of proactive error handling mode
  2023-03-07  8:39             ` fengchengwen
@ 2023-03-08  1:09               ` Honnappa Nagarahalli
  2023-03-09  0:59                 ` fengchengwen
  0 siblings, 1 reply; 85+ messages in thread
From: Honnappa Nagarahalli @ 2023-03-08  1:09 UTC (permalink / raw)
  To: fengchengwen, Konstantin Ananyev, dev, thomas, Ferruh Yigit,
	Andrew Rybchenko, Kalesh AP,
	Ajit Khaparde (ajit.khaparde@broadcom.com)
  Cc: nd, nd

<snip>

> >>>>>
> >>>
> >>> Is there any reason not to design this in the same way as
> >> 'rte_eth_dev_reset'? Why does the PMD have to recover by itself?
> >>
> >> I suppose it is a question for the authors of original patch...
> > Appreciate if the authors could comment on this.
> 
> The main cause is a hardware implementation limit; I will try to explain
> from the hns3 PMD's point of view.
> For a global reset, all the functions need to respond within a certain period of
> time, otherwise the reset will fail. The reset also requires a few steps (all of
> which may take a long time).
> 
> With multiple functions in one DPDK process, when a global reset is triggered,
> rte_eth_dev_reset will not cover this scenario:
> 1. Each port will report RTE_ETH_EVENT_INTR_RESET in the interrupt thread.
> 2. The application callback is then invoked, but because the callbacks run in the same
>     thread and each port's recovery takes a long time, the later ports will fail to reset.
If the design were to introduce RTE_ETH_EVENT_INTR_RECOVER and rte_eth_dev_recover, what problems do you see?

> 
> >
> >>
> >>> We could have a similar API 'rte_eth_dev_recover' to do the recovery
> >> functionality.
> >>
> >> I suppose such approach is also possible.
> >> Personally I am fine with both ways: either existing one or what you
> >> propose, as long as we'll fix existing race-condition.
> >> What is good with what you suggest - that way we probably don't need
> >> to worry how to allow user to enable/disable auto-recovery inside PMD.
> >>
> >> Konstantin
> >>
> >

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 1/5] ethdev: fix race-condition of proactive error handling mode
  2023-03-08  1:09               ` Honnappa Nagarahalli
@ 2023-03-09  0:59                 ` fengchengwen
  2023-03-09  3:03                   ` Honnappa Nagarahalli
  0 siblings, 1 reply; 85+ messages in thread
From: fengchengwen @ 2023-03-09  0:59 UTC (permalink / raw)
  To: Honnappa Nagarahalli, Konstantin Ananyev, dev, thomas,
	Ferruh Yigit, Andrew Rybchenko, Kalesh AP,
	Ajit Khaparde (ajit.khaparde@broadcom.com)
  Cc: nd



On 2023/3/8 9:09, Honnappa Nagarahalli wrote:
> <snip>
> 
>>>>>>>
>>>>>
>>>>> Is there any reason not to design this in the same way as
>>>> 'rte_eth_dev_reset'? Why does the PMD have to recover by itself?
>>>>
>>>> I suppose it is a question for the authors of original patch...
>>> Appreciate if the authors could comment on this.
>>
>> The main cause is a hardware implementation limit; I will try to explain
>> from the hns3 PMD's point of view.
>> For a global reset, all the functions need to respond within a certain period of
>> time, otherwise the reset will fail. The reset also requires a few steps (all of
>> which may take a long time).
>>
>> With multiple functions in one DPDK process, when a global reset is triggered,
>> rte_eth_dev_reset will not cover this scenario:
>> 1. Each port will report RTE_ETH_EVENT_INTR_RESET in the interrupt thread.
>> 2. The application callback is then invoked, but because the callbacks run in the same
>>     thread and each port's recovery takes a long time, the later ports will fail to reset.
> If the design were to introduce RTE_ETH_EVENT_INTR_RECOVER and rte_eth_dev_recover, what problems do you see?

I see no difference between the proposed 'RTE_ETH_EVENT_INTR_RECOVER and rte_eth_dev_recover'
and the existing RTE_ETH_EVENT_INTR_RESET mechanism.
Could you give more detail?

> 
>>
>>>
>>>>
>>>>> We could have a similar API 'rte_eth_dev_recover' to do the recovery
>>>> functionality.
>>>>
>>>> I suppose such approach is also possible.
>>>> Personally I am fine with both ways: either existing one or what you
>>>> propose, as long as we'll fix existing race-condition.
>>>> What is good with what you suggest - that way we probably don't need
>>>> to worry how to allow user to enable/disable auto-recovery inside PMD.
>>>>
>>>> Konstantin
>>>>
>>>

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 1/5] ethdev: fix race-condition of proactive error handling mode
  2023-03-07 12:39                         ` Konstantin Ananyev
@ 2023-03-09  2:05                           ` Ajit Khaparde
  0 siblings, 0 replies; 85+ messages in thread
From: Ajit Khaparde @ 2023-03-09  2:05 UTC (permalink / raw)
  To: Konstantin Ananyev
  Cc: Fengchengwen, Ferruh Yigit, Konstantin Ananyev, Thomas Monjalon,
	Andrew Rybchenko, dev

On Tue, Mar 7, 2023 at 4:40 AM Konstantin Ananyev
<konstantin.ananyev@huawei.com> wrote:
>
>
>
> > >>>>>>>>>>> In the proactive error handling mode, the PMD will set the data path
> > >>>>>>>>>>> pointers to dummy functions and then try recovery, in this period the
> > >>>>>>>>>>> application may still invoking data path API. This will introduce a
> > >>>>>>>>>>> race-condition with data path which may lead to crash [1].
> > >>>>>>>>>>>
> > >>>>>>>>>>> Although the PMD added delay after setting data path pointers to cover
> > >>>>>>>>>>> the above race-condition, it reduces the probability, but it doesn't
> > >>>>>>>>>>> solve the problem.
> > >>>>>>>>>>>
> > >>>>>>>>>>> To solve the race-condition problem fundamentally, the following
> > >>>>>>>>>>> requirements are added:
> > >>>>>>>>>>> 1. The PMD should set the data path pointers to dummy functions after
> > >>>>>>>>>>>     report RTE_ETH_EVENT_ERR_RECOVERING event.
> > >>>>>>>>>>> 2. The application should stop data path API invocation when process
> > >>>>>>>>>>>     the RTE_ETH_EVENT_ERR_RECOVERING event.
> > >>>>>>>>>>> 3. The PMD should set the data path pointers to valid functions before
> > >>>>>>>>>>>     report RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> > >>>>>>>>>>> 4. The application should enable data path API invocation when process
> > >>>>>>>>>>>     the RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> > >>>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> How this is solving the race-condition, by pushing responsibility to
> > >>>>>>>>> stop data path to application?
> > >>>>>>>>
> > >>>>>>>> Exactly, it becomes application responsibility to make sure data-path is
> > >>>>>>>> stopped/suspended before recovery will continue.
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>> From documentation of the feature:
> > >>>>>>>
> > >>>>>>> ``
> > >>>>>>> Because the PMD recovers automatically,
> > >>>>>>> the application can only sense that the data flow is disconnected for a
> > >>>>>>> while and the control API returns an error in this period.
> > >>>>>>>
> > >>>>>>> In order to sense the error happening/recovering, as well as to restore
> > >>>>>>> some additional configuration, three events are available:
> > >>>>>>> ``
> > >>>>>>>
> > >>>>>>> It looks like initial design is to use events mainly inform application
> > >>>>>>> about what happened and mainly for re-configuration.
> > >>>>>>>
> > >>>>>>> Although I don't disagree with involving the application, I am not sure
> > >>>>>>> that is part of the current design.
> > >>>>>>
> > >>>>>> I thought we all agreed that initial design contain some fallacies that
> > >>>>>> need to fixed, no?
> > >>>>>> Statement that with current rte_ethdev design error recovery can be done
> > >>>>>> without interaction with the app (to stop/suspend data/control path)
> > >>>>>> is the main one I think.
> > >>>>>> It needs some interaction with app layer, one way or another.
> > >>>>>>
> > >>>>>>>>>
> > >>>>>>>>> What if application is not interested in recovery modes at all and not
> > >>>>>>>>> registered any callback for the recovery?
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Are you saying there is no way for application to disable
> > >>>>>>>> automatic recovery in PMD if it is not interested
> > >>>>>>>> (or can't fulfill the prerequisites for it)?
> > >>>>>>>> If so, then yes it is a problem and we need to fix it.
> > >>>>>>>> I assumed that such mechanism to disable unwanted events already exists,
> > >>>>>>>> but I can't find anything.
> > >>>>>>>> Wonder what would be the easiest way here - can PMD make a decision
> > >>>>>>>> based on callback return value, or do we need a new API to
> > >>>>>>>> enable/disable callbacks, or ...?
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>> As far as I can see automatic recovery is not configurable by app.
> > >>>>>>>
> > >>>>>>> But that is not all, PMD sends events to application but PMD can't know
> > >>>>>>> if application is handling them or not, so with current design PMD can't
> > >>>>>>> rely on to app.
> > >>>>>>
> > >>>>>> Well, the PMD invokes the user-provided callback.
> > >>>>>> One way to fix that problem: if there is no callback provided,
> > >>>>>> or the callback returns an error code, the PMD can assume that recovery
> > >>>>>> should not be done.
> > >>>>>> That is probably not the best design choice, but at least it will allow us
> > >>>>>> to fix the problem without too many changes or introducing a new API.
> > >>>>>> That could be sort of a 'quick fix'.
> > >>>>>> In the meantime we can think about a new/better approach for that.
> > >>>>>>
> > >>>>>
> > >>>>> -rc2 for 23.03 is a few days away.
> > >>>>>
> > >>>>> What do you think about having, as a 'quick fix' for this release, a change
> > >>>>> to how the driver updates burst ops to prevent the race condition?
> > >>
> > >> By 'quick fix', do you mean only updating the function pointers (without the rxq setting)?
> > >> Currently the PMDs which announce support for "proactive error handling mode" already
> > >> do this.
> > >>
> > >
> > > Yes.
> > > I checked hns3, and it does as you said: 'hns3_eth_dev_fp_ops_config()'
> > > updates all fields in 'rte_eth_fp_ops', but only the function pointers seem
> > > to change in the driver, so in effect only function pointers are updated.
> > >
> > > The discussion about the race condition started with patch [1], which
> > > mentions a crash because of a race condition. Later in the discussion,
> > > the recovery event was given as an example of where the race can occur,
> > > which is why we are here.
> > >
> > > But given the above info, although there is a race condition and a bigger
> > > update (that needs application involvement) is required for the recovery
> > > mechanism, there is no crash and NO 'quick fix' is required for recovery.
> > >
> > > @Konstantin, @Chengwen, can you please confirm above understanding is
> > > correct?
> >
> > Yes, that's what.
>
> Yes, I think with Chengwen's patch the race condition problem should be fixed.
> Though for that the user has to provide a properly implemented callback.
> What is not currently addressed: the user cannot disable this auto-recovery procedure at will.
> So if the user does not provide a proper callback, the recovery can still proceed and the race can happen.
Ideally the user or the application should participate in the recovery
to prevent more catastrophic results which may need a system reboot.
Not all scenarios are recoverable, but based on implementation that
could be a very small percentage.
But the application awareness and participation as an end goal is a
good idea nevertheless.
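
A minimal sketch of the fallback discussed above, in which the PMD goes
ahead with automatic recovery only when the application has registered a
callback and that callback did not return an error. All names here
(demo_port, demo_recovery_cb_t, demo_may_auto_recover, demo_cb_accept,
demo_cb_refuse) are hypothetical and exist only for illustration; they are
not part of ethdev or of any driver, and this is not how the patch under
discussion is implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: returns 0 if the application agrees to
 * take part in recovery, non-zero otherwise. */
typedef int (*demo_recovery_cb_t)(uint16_t port_id, void *arg);

/* Hypothetical per-port state kept by the driver. */
struct demo_port {
    demo_recovery_cb_t cb; /* NULL when nothing is registered */
    void *cb_arg;
};

/* Decide whether automatic recovery may proceed for this port. */
bool demo_may_auto_recover(const struct demo_port *port, uint16_t port_id)
{
    assert(port != NULL);
    if (port->cb == NULL)
        return false; /* application never opted in */
    return port->cb(port_id, port->cb_arg) == 0;
}

/* Example callbacks an application might register. */
int demo_cb_accept(uint16_t port_id, void *arg)
{
    (void)port_id;
    (void)arg;
    return 0;
}

int demo_cb_refuse(uint16_t port_id, void *arg)
{
    (void)port_id;
    (void)arg;
    return -1;
}
```

With this shape, an application that registers nothing keeps the port out
of automatic recovery, which is the "quick fix" behaviour described above
rather than a new enable/disable API.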

>
> >
> > >
> > >
> > >
> > > [1]
> > > https://patches.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-ashok.k.kaladi@intel.com/
> > >
> > >>>>>
> > >>>>> And plan a design update for the next release?
> > >>>> +1 on the overall approach.
> > >>>
> > >>> Yep, agree.
> > >>
> > >> I hope for a better solution.
> > >> Also, I notice that only openvswitch (among all open-source software based on DPDK)
> > >> registers the RTE_ETH_EVENT_INTR_RESET callback.
> > >>
> > >> Therefore, I hope we build a recovery framework at the DPDK SDK level that is compatible
> > >> with the RTE_ETH_EVENT_INTR_RESET and RTE_ETH_EVENT_ERR_RECOVERING mechanisms.
> > >>
> > >>>
> > >>>>
> > >>>>>
> > >>>>>
> > >>>>>>>
> > >>>>>>>>> I think the driver should not rely on the application for this, unless
> > >>>>>>>>> the application explicitly says (to the driver) that it is handling recovery;
> > >>>>>>>>> right now there is no way for the driver to know this.
> > >>>>>>>>
> > >>>>>>>> I think it is vice versa:
> > >>>>>>>> the application should not enable auto-recovery if it can't meet
> > >>>>>>>> the prerequisites for it (provide an appropriate callback).
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>> I agree with the above; we are saying similar things from different perspectives.
> > >>>>>>
> > >>>>>> Ok, that's good we are on the same page.
> > >>>>>>
> > >>>>>>
> > >>>>>>>
> > >>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>>> Also, this patch introduce a driver internal function
> > >>>>>>>>>>> rte_eth_fp_ops_setup which used as an help function for PMD.
> > >>>>>>>>>>>
> > >>>>>>>>>>> [1]
> > >>>>>>>>>>> http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-ashok.k.kaladi@intel.com/
> > >>>>>>>>>>>
> > >>>>>>>>>>> Fixes: eb0d471a8941 ("ethdev: add proactive error handling mode")
> > >>>>>>>>>>> Cc: stable@dpdk.org
> > >>>>>>>>>>>
> > >>>>>>>>>>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> > >>>>>>>>>>> ---
> > >>>>>>>>>>>   doc/guides/prog_guide/poll_mode_drv.rst | 20 +++++++---------
> > >>>>>>>>>>>   lib/ethdev/ethdev_driver.c              |  8 +++++++
> > >>>>>>>>>>>   lib/ethdev/ethdev_driver.h              | 10 ++++++++
> > >>>>>>>>>>>   lib/ethdev/rte_ethdev.h                 | 32
> > >>>>>>>>>>> +++++++++++++++----------
> > >>>>>>>>>>>   lib/ethdev/version.map                  |  1 +
> > >>>>>>>>>>>   5 files changed, 46 insertions(+), 25 deletions(-)
> > >>>>>>>>>>>
> > >>>>>>>>>>> diff --git a/doc/guides/prog_guide/poll_mode_drv.rst
> > >>>>>>>>>>> b/doc/guides/prog_guide/poll_mode_drv.rst
> > >>>>>>>>>>> index c145a9066c..e380ff135a 100644
> > >>>>>>>>>>> --- a/doc/guides/prog_guide/poll_mode_drv.rst
> > >>>>>>>>>>> +++ b/doc/guides/prog_guide/poll_mode_drv.rst
> > >>>>>>>>>>> @@ -638,14 +638,9 @@ different from the application invokes recovery
> > >>>>>>>>>>> in PASSIVE mode,
> > >>>>>>>>>>>   the PMD automatically recovers from error in PROACTIVE mode,
> > >>>>>>>>>>>   and only a small amount of work is required for the application.
> > >>>>>>>>>>>
> > >>>>>>>>>>> -During error detection and automatic recovery,
> > >>>>>>>>>>> -the PMD sets the data path pointers to dummy functions
> > >>>>>>>>>>> -(which will prevent the crash),
> > >>>>>>>>>>> -and also make sure the control path operations fail with a return
> > >>>>>>>>>>> code ``-EBUSY``.
> > >>>>>>>>>>> -
> > >>>>>>>>>>> -Because the PMD recovers automatically,
> > >>>>>>>>>>> -the application can only sense that the data flow is disconnected
> > >>>>>>>>>>> for a while
> > >>>>>>>>>>> -and the control API returns an error in this period.
> > >>>>>>>>>>> +During error detection and automatic recovery, the PMD sets the
> > >>>>>>>>>>> data path
> > >>>>>>>>>>> +pointers to dummy functions and also make sure the control path
> > >>>>>>>>>>> operations
> > >>>>>>>>>>> +failed with a return code ``-EBUSY``.
> > >>>>>>>>>>>
> > >>>>>>>>>>>   In order to sense the error happening/recovering,
> > >>>>>>>>>>>   as well as to restore some additional configuration,
> > >>>>>>>>>>> @@ -653,9 +648,9 @@ three events are available:
> > >>>>>>>>>>>
> > >>>>>>>>>>>   ``RTE_ETH_EVENT_ERR_RECOVERING``
> > >>>>>>>>>>>      Notify the application that an error is detected
> > >>>>>>>>>>> -   and the recovery is being started.
> > >>>>>>>>>>> +   and the recovery is about to start.
> > >>>>>>>>>>>      Upon receiving the event, the application should not invoke
> > >>>>>>>>>>> -   any control path function until receiving
> > >>>>>>>>>>> +   any control and data path API until receiving
> > >>>>>>>>>>>      ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` or
> > >>>>>>>>>>> ``RTE_ETH_EVENT_RECOVERY_FAILED`` event.
> > >>>>>>>>>>>
> > >>>>>>>>>>>   .. note::
> > >>>>>>>>>>> @@ -666,8 +661,9 @@ three events are available:
> > >>>>>>>>>>>
> > >>>>>>>>>>>   ``RTE_ETH_EVENT_RECOVERY_SUCCESS``
> > >>>>>>>>>>>      Notify the application that the recovery from error is successful,
> > >>>>>>>>>>> -   the PMD already re-configures the port,
> > >>>>>>>>>>> -   and the effect is the same as a restart operation.
> > >>>>>>>>>>> +   the PMD already re-configures the port.
> > >>>>>>>>>>> +   The application should restore some additional configuration,
> > >>>>>>>>>>> and then
> > >>>>>>>>>>> +   enable data path API invocation.
> > >>>>>>>>>>>
> > >>>>>>>>>>>   ``RTE_ETH_EVENT_RECOVERY_FAILED``
> > >>>>>>>>>>>      Notify the application that the recovery from error failed,
> > >>>>>>>>>>> diff --git a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c
> > >>>>>>>>>>> index 0be1e8ca04..f994653fe9 100644
> > >>>>>>>>>>> --- a/lib/ethdev/ethdev_driver.c
> > >>>>>>>>>>> +++ b/lib/ethdev/ethdev_driver.c
> > >>>>>>>>>>> @@ -515,6 +515,14 @@ rte_eth_dma_zone_free(const struct rte_eth_dev
> > >>>>>>>>>>> *dev, const char *ring_name,
> > >>>>>>>>>>>       return rc;
> > >>>>>>>>>>>   }
> > >>>>>>>>>>>
> > >>>>>>>>>>> +void
> > >>>>>>>>>>> +rte_eth_fp_ops_setup(struct rte_eth_dev *dev)
> > >>>>>>>>>>> +{
> > >>>>>>>>>>> +    if (dev == NULL)
> > >>>>>>>>>>> +        return;
> > >>>>>>>>>>> +    eth_dev_fp_ops_setup(rte_eth_fp_ops + dev->data->port_id, dev);
> > >>>>>>>>>>> +}
> > >>>>>>>>>>> +
> > >>>>>>>>>>>   const struct rte_memzone *
> > >>>>>>>>>>>   rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char
> > >>>>>>>>>>> *ring_name,
> > >>>>>>>>>>>                uint16_t queue_id, size_t size, unsigned int align,
> > >>>>>>>>>>> diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
> > >>>>>>>>>>> index 2c9d615fb5..0d964d1f67 100644
> > >>>>>>>>>>> --- a/lib/ethdev/ethdev_driver.h
> > >>>>>>>>>>> +++ b/lib/ethdev/ethdev_driver.h
> > >>>>>>>>>>> @@ -1621,6 +1621,16 @@ int
> > >>>>>>>>>>>   rte_eth_dma_zone_free(const struct rte_eth_dev *eth_dev, const
> > >>>>>>>>>>> char *name,
> > >>>>>>>>>>>            uint16_t queue_id);
> > >>>>>>>>>>>
> > >>>>>>>>>>> +/**
> > >>>>>>>>>>> + * @internal
> > >>>>>>>>>>> + * Setup eth fast-path API to ethdev values.
> > >>>>>>>>>>> + *
> > >>>>>>>>>>> + * @param dev
> > >>>>>>>>>>> + *  Pointer to struct rte_eth_dev.
> > >>>>>>>>>>> + */
> > >>>>>>>>>>> +__rte_internal
> > >>>>>>>>>>> +void rte_eth_fp_ops_setup(struct rte_eth_dev *dev);
> > >>>>>>>>>>> +
> > >>>>>>>>>>>   /**
> > >>>>>>>>>>>    * @internal
> > >>>>>>>>>>>    * Atomically set the link status for the specific device.
> > >>>>>>>>>>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> > >>>>>>>>>>> index 049641d57c..44ee7229c1 100644
> > >>>>>>>>>>> --- a/lib/ethdev/rte_ethdev.h
> > >>>>>>>>>>> +++ b/lib/ethdev/rte_ethdev.h
> > >>>>>>>>>>> @@ -3944,25 +3944,28 @@ enum rte_eth_event_type {
> > >>>>>>>>>>>        */
> > >>>>>>>>>>>       RTE_ETH_EVENT_RX_AVAIL_THRESH,
> > >>>>>>>>>>>       /** Port recovering from a hardware or firmware error.
> > >>>>>>>>>>> -     * If PMD supports proactive error recovery,
> > >>>>>>>>>>> -     * it should trigger this event to notify application
> > >>>>>>>>>>> -     * that it detected an error and the recovery is being started.
> > >>>>>>>>>>> -     * Upon receiving the event, the application should not invoke
> > >>>>>>>>>>> any control path API
> > >>>>>>>>>>> -     * (such as rte_eth_dev_configure/rte_eth_dev_stop...) until
> > >>>>>>>>>>> receiving
> > >>>>>>>>>>> -     * RTE_ETH_EVENT_RECOVERY_SUCCESS or
> > >>>>>>>>>>> RTE_ETH_EVENT_RECOVERY_FAILED event.
> > >>>>>>>>>>> -     * The PMD will set the data path pointers to dummy functions,
> > >>>>>>>>>>> -     * and re-set the data path pointers to non-dummy functions
> > >>>>>>>>>>> -     * before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> > >>>>>>>>>>> -     * It means that the application cannot send or receive any
> > >>>>>>>>>>> packets
> > >>>>>>>>>>> -     * during this period.
> > >>>>>>>>>>> +     *
> > >>>>>>>>>>> +     * If PMD supports proactive error recovery, it should trigger
> > >>>>>>>>>>> this
> > >>>>>>>>>>> +     * event to notify application that it detected an error and the
> > >>>>>>>>>>> +     * recovery is about to start.
> > >>>>>>>>>>> +     *
> > >>>>>>>>>>> +     * Upon receiving the event, the application should not invoke any
> > >>>>>>>>>>> +     * control and data path API until receiving
> > >>>>>>>>>>> +     * RTE_ETH_EVENT_RECOVERY_SUCCESS or RTE_ETH_EVENT_RECOVERY_FAILED
> > >>>>>>>>>>> +     * event.
> > >>>>>>>>>>> +     *
> > >>>>>>>>>>> +     * Once this event is reported, the PMD will set the data path
> > >>>>>>>>>>> pointers
> > >>>>>>>>>>> +     * to dummy functions, and re-set the data path pointers to valid
> > >>>>>>>>>>> +     * functions before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS
> > >>>>>>>>>>> event.
> > >>>>>>>>>>> +     *
> > >>>>>>>>>>>        * @note Before the PMD reports the recovery result,
> > >>>>>>>>>>>        * the PMD may report the RTE_ETH_EVENT_ERR_RECOVERING event
> > >>>>>>>>>>> again,
> > >>>>>>>>>>>        * because a larger error may occur during the recovery.
> > >>>>>>>>>>>        */
> > >>>>>>>>>>>       RTE_ETH_EVENT_ERR_RECOVERING,
> > >>>>>>>>>>>       /** Port recovers successfully from the error.
> > >>>>>>>>>>> -     * The PMD already re-configured the port,
> > >>>>>>>>>>> -     * and the effect is the same as a restart operation.
> > >>>>>>>>>>> +     *
> > >>>>>>>>>>> +     * The PMD already re-configured the port:
> > >>>>>>>>>>>        * a) The following operation will be retained: (alphabetically)
> > >>>>>>>>>>>        *    - DCB configuration
> > >>>>>>>>>>>        *    - FEC configuration
> > >>>>>>>>>>> @@ -3989,6 +3992,9 @@ enum rte_eth_event_type {
> > >>>>>>>>>>>        *      (@see RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP)
> > >>>>>>>>>>>        * c) Any other configuration will not be stored
> > >>>>>>>>>>>        *    and will need to be re-configured.
> > >>>>>>>>>>> +     *
> > >>>>>>>>>>> +     * The application should restore some additional configuration
> > >>>>>>>>>>> +     * (see above case b/c), and then enable data path API invocation.
> > >>>>>>>>>>>        */
> > >>>>>>>>>>>       RTE_ETH_EVENT_RECOVERY_SUCCESS,
> > >>>>>>>>>>>       /** Port recovery failed.
> > >>>>>>>>>>> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
> > >>>>>>>>>>> index 357d1a88c0..c273e0bdae 100644
> > >>>>>>>>>>> --- a/lib/ethdev/version.map
> > >>>>>>>>>>> +++ b/lib/ethdev/version.map
> > >>>>>>>>>>> @@ -320,6 +320,7 @@ INTERNAL {
> > >>>>>>>>>>>       rte_eth_devices;
> > >>>>>>>>>>>       rte_eth_dma_zone_free;
> > >>>>>>>>>>>       rte_eth_dma_zone_reserve;
> > >>>>>>>>>>> +    rte_eth_fp_ops_setup;
> > >>>>>>>>>>>       rte_eth_hairpin_queue_peer_bind;
> > >>>>>>>>>>>       rte_eth_hairpin_queue_peer_unbind;
> > >>>>>>>>>>>       rte_eth_hairpin_queue_peer_update;
> > >>>>>>>>>>> --
> > >>>>>>>>>>   Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
> > >>>>>>>>>>
> > >>>>>>>>>>> 2.17.1
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>
> > >>>>>
> > >
> > > .
> > >

^ permalink raw reply	[flat|nested] 85+ messages in thread

* RE: [PATCH 1/5] ethdev: fix race-condition of proactive error handling mode
  2023-03-09  0:59                 ` fengchengwen
@ 2023-03-09  3:03                   ` Honnappa Nagarahalli
  2023-03-09 11:30                     ` fengchengwen
  0 siblings, 1 reply; 85+ messages in thread
From: Honnappa Nagarahalli @ 2023-03-09  3:03 UTC (permalink / raw)
  To: fengchengwen, Konstantin Ananyev, dev, thomas, Ferruh Yigit,
	Andrew Rybchenko, Kalesh AP,
	Ajit Khaparde (ajit.khaparde@broadcom.com)
  Cc: nd, nd



> -----Original Message-----
> From: fengchengwen <fengchengwen@huawei.com>
> Sent: Wednesday, March 8, 2023 7:00 PM
> To: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; Konstantin
> Ananyev <konstantin.v.ananyev@yandex.ru>; dev@dpdk.org;
> thomas@monjalon.net; Ferruh Yigit <ferruh.yigit@amd.com>; Andrew
> Rybchenko <andrew.rybchenko@oktetlabs.ru>; Kalesh AP <kalesh-
> anakkur.purayil@broadcom.com>; Ajit Khaparde
> (ajit.khaparde@broadcom.com) <ajit.khaparde@broadcom.com>
> Cc: nd <nd@arm.com>
> Subject: Re: [PATCH 1/5] ethdev: fix race-condition of proactive error handling
> mode
> 
> 
> 
> On 2023/3/8 9:09, Honnappa Nagarahalli wrote:
> > <snip>
> >
> >>>>>>>
> >>>>>
> >>>>> Is there any reason not to design this in the same way as
> >>>> 'rte_eth_dev_reset'? Why does the PMD have to recover by itself?
> >>>>
> >>>> I suppose it is a question for the authors of original patch...
> >>> Appreciate if the authors could comment on this.
> >>
> >> The main cause is a hardware implementation limit; I will try
> >> to explain from the hns3 PMD's point of view.
> >> For a global reset, all functions need to respond within a certain
> >> period of time, otherwise the reset will fail. The reset also
> >> requires several steps (all of which may take a long time).
> >>
> >> With multiple functions in one DPDK application, when a global reset is
> >> triggered, rte_eth_dev_reset does not cover this scenario:
> >> 1. Each port reports RTE_ETH_EVENT_INTR_RESET in the interrupt thread.
> >> 2. The application callback is then invoked, but because it runs in the
> >>    same thread and each port's recovery takes a long time, the later
> >>    ports fail to reset.
I am reading this again. What you are saying is, a single thread running the recovery process in sequence for multiple ports will not meet the required time limits. Hence, the recovery process needs to run in multiple threads simultaneously. This way each thread could run the recovery for a different port. Do I understand this correctly?

(Assuming my understanding is correct) The current implementation is running the recovery process in the context of data plane threads and not in the interrupt thread. Is this correct?

> > If the design were to introduce RTE_ETH_EVENT_INTR_RECOVER and
> > rte_eth_dev_recover, what problems do you see?
> 
> I see no difference between 'RTE_ETH_EVENT_INTR_RECOVER and
> rte_eth_dev_recover' and the RTE_ETH_EVENT_INTR_RESET mechanism.
> Could you give more detail?
> 
> >
> >>
> >>>
> >>>>
> >>>>> We could have a similar API 'rte_eth_dev_recover' to do the
> >>>>> recovery
> >>>> functionality.
> >>>>
> >>>> I suppose such approach is also possible.
> >>>> Personally I am fine with both ways: either existing one or what
> >>>> you propose, as long as we'll fix existing race-condition.
> >>>> What is good with what you suggest - that way we probably don't
> >>>> need to worry how to allow user to enable/disable auto-recovery inside
> PMD.
> >>>>
> >>>> Konstantin
> >>>>
> >>>

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 1/5] ethdev: fix race-condition of proactive error handling mode
  2023-03-09  3:03                   ` Honnappa Nagarahalli
@ 2023-03-09 11:30                     ` fengchengwen
  2023-03-10  3:25                       ` Honnappa Nagarahalli
  0 siblings, 1 reply; 85+ messages in thread
From: fengchengwen @ 2023-03-09 11:30 UTC (permalink / raw)
  To: Honnappa Nagarahalli, Konstantin Ananyev, dev, thomas,
	Ferruh Yigit, Andrew Rybchenko, Kalesh AP,
	Ajit Khaparde (ajit.khaparde@broadcom.com)
  Cc: nd



On 2023/3/9 11:03, Honnappa Nagarahalli wrote:
> 
> 
>> -----Original Message-----
>> From: fengchengwen <fengchengwen@huawei.com>
>> Sent: Wednesday, March 8, 2023 7:00 PM
>> To: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; Konstantin
>> Ananyev <konstantin.v.ananyev@yandex.ru>; dev@dpdk.org;
>> thomas@monjalon.net; Ferruh Yigit <ferruh.yigit@amd.com>; Andrew
>> Rybchenko <andrew.rybchenko@oktetlabs.ru>; Kalesh AP <kalesh-
>> anakkur.purayil@broadcom.com>; Ajit Khaparde
>> (ajit.khaparde@broadcom.com) <ajit.khaparde@broadcom.com>
>> Cc: nd <nd@arm.com>
>> Subject: Re: [PATCH 1/5] ethdev: fix race-condition of proactive error handling
>> mode
>>
>>
>>
>> On 2023/3/8 9:09, Honnappa Nagarahalli wrote:
>>> <snip>
>>>
>>>>>>>>>
>>>>>>>
>>>>>>> Is there any reason not to design this in the same way as
>>>>>> 'rte_eth_dev_reset'? Why does the PMD have to recover by itself?
>>>>>>
>>>>>> I suppose it is a question for the authors of original patch...
>>>>> Appreciate if the authors could comment on this.
>>>>
>>>> The main cause is a hardware implementation limit; I will try
>>>> to explain from the hns3 PMD's point of view.
>>>> For a global reset, all functions need to respond within a certain
>>>> period of time, otherwise the reset will fail. The reset also
>>>> requires several steps (all of which may take a long time).
>>>>
>>>> With multiple functions in one DPDK application, when a global reset is
>>>> triggered, rte_eth_dev_reset does not cover this scenario:
>>>> 1. Each port reports RTE_ETH_EVENT_INTR_RESET in the interrupt thread.
>>>> 2. The application callback is then invoked, but because it runs in the
>>>>    same thread and each port's recovery takes a long time, the later
>>>>    ports fail to reset.
> I am reading this again. What you are saying is, a single thread running the recovery process in sequence for multiple ports will not meet the required time limits. Hence, the recovery process needs to run in multiple threads simultaneously. This way each thread could run the recovery for a different port. Do I understand this correctly?

No.
It's not realistic to have a thread for every port.

> 
> (Assuming my understanding is correct) The current implementation is running the recovery process in the context of data plane threads and not in the interrupt thread. Is this correct?

No, the recovery process is running in the interrupt thread.

> 
>>> If the design were to introduce RTE_ETH_EVENT_INTR_RECOVER and
>> rte_eth_dev_recover, what problems do you see?
>>
>> I see no difference between 'RTE_ETH_EVENT_INTR_RECOVER and
>> rte_eth_dev_recover' and the RTE_ETH_EVENT_INTR_RESET mechanism.
>> Could you give more detail?
>>
>>>
>>>>
>>>>>
>>>>>>
>>>>>>> We could have a similar API 'rte_eth_dev_recover' to do the
>>>>>>> recovery
>>>>>> functionality.
>>>>>>
>>>>>> I suppose such approach is also possible.
>>>>>> Personally I am fine with both ways: either existing one or what
>>>>>> you propose, as long as we'll fix existing race-condition.
>>>>>> What is good with what you suggest - that way we probably don't
>>>>>> need to worry how to allow user to enable/disable auto-recovery inside
>> PMD.
>>>>>>
>>>>>> Konstantin
>>>>>>
>>>>>

^ permalink raw reply	[flat|nested] 85+ messages in thread

* RE: [PATCH 1/5] ethdev: fix race-condition of proactive error handling mode
  2023-03-09 11:30                     ` fengchengwen
@ 2023-03-10  3:25                       ` Honnappa Nagarahalli
  0 siblings, 0 replies; 85+ messages in thread
From: Honnappa Nagarahalli @ 2023-03-10  3:25 UTC (permalink / raw)
  To: fengchengwen, Konstantin Ananyev, dev, thomas, Ferruh Yigit,
	Andrew Rybchenko, Kalesh AP,
	Ajit Khaparde (ajit.khaparde@broadcom.com)
  Cc: nd, nd



> -----Original Message-----
> From: fengchengwen <fengchengwen@huawei.com>
> Sent: Thursday, March 9, 2023 5:31 AM
> To: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; Konstantin
> Ananyev <konstantin.v.ananyev@yandex.ru>; dev@dpdk.org;
> thomas@monjalon.net; Ferruh Yigit <ferruh.yigit@amd.com>; Andrew
> Rybchenko <andrew.rybchenko@oktetlabs.ru>; Kalesh AP <kalesh-
> anakkur.purayil@broadcom.com>; Ajit Khaparde
> (ajit.khaparde@broadcom.com) <ajit.khaparde@broadcom.com>
> Cc: nd <nd@arm.com>
> Subject: Re: [PATCH 1/5] ethdev: fix race-condition of proactive error handling
> mode
> 
> 
> 
> On 2023/3/9 11:03, Honnappa Nagarahalli wrote:
> >
> >
> >> -----Original Message-----
> >> From: fengchengwen <fengchengwen@huawei.com>
> >> Sent: Wednesday, March 8, 2023 7:00 PM
> >> To: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>;
> Konstantin
> >> Ananyev <konstantin.v.ananyev@yandex.ru>; dev@dpdk.org;
> >> thomas@monjalon.net; Ferruh Yigit <ferruh.yigit@amd.com>; Andrew
> >> Rybchenko <andrew.rybchenko@oktetlabs.ru>; Kalesh AP <kalesh-
> >> anakkur.purayil@broadcom.com>; Ajit Khaparde
> >> (ajit.khaparde@broadcom.com) <ajit.khaparde@broadcom.com>
> >> Cc: nd <nd@arm.com>
> >> Subject: Re: [PATCH 1/5] ethdev: fix race-condition of proactive
> >> error handling mode
> >>
> >>
> >>
> >> On 2023/3/8 9:09, Honnappa Nagarahalli wrote:
> >>> <snip>
> >>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>> Is there any reason not to design this in the same way as
> >>>>>> 'rte_eth_dev_reset'? Why does the PMD have to recover by itself?
> >>>>>>
> >>>>>> I suppose it is a question for the authors of original patch...
> >>>>> Appreciate if the authors could comment on this.
> >>>>
> >>>> The main cause is a hardware implementation limit; I will try
> >>>> to explain from the hns3 PMD's point of view.
> >>>> For a global reset, all functions need to respond within a certain
> >>>> period of time, otherwise the reset will fail. The reset also
> >>>> requires several steps (all of which may take a long time).
> >>>>
> >>>> With multiple functions in one DPDK application, when a global reset is
> >>>> triggered, rte_eth_dev_reset does not cover this scenario:
> >>>> 1. Each port reports RTE_ETH_EVENT_INTR_RESET in the interrupt thread.
> >>>> 2. The application callback is then invoked, but because it runs in the
> >>>>    same thread and each port's recovery takes a long time, the later
> >>>>    ports fail to reset.
> > I am reading this again. What you are saying is, a single thread running the
> > recovery process in sequence for multiple ports will not meet the required
> > time limits. Hence, the recovery process needs to run in multiple threads
> > simultaneously. This way each thread could run the recovery for a different
> > port. Do I understand this correctly?
> 
> No
> It's not realistic to have threads on every port.
> 
> >
> > (Assuming my understanding is correct) The current implementation is
> > running the recovery process in the context of data plane threads and not in
> > the interrupt thread. Is this correct?
> 
> No, the recovery process is running in the interrupt thread.
Ok.

> 
> >
> >>> If the design were to introduce RTE_ETH_EVENT_INTR_RECOVER and
> >> rte_eth_dev_recover, what problems do you see?
> >>
> >> I see no difference between 'RTE_ETH_EVENT_INTR_RECOVER and
> >> rte_eth_dev_recover' and the RTE_ETH_EVENT_INTR_RESET mechanism.
> >> Could you give more detail?
They are similar, i.e. we would use RTE_ETH_EVENT_INTR_RECOVER to indicate that it is a recovery interrupt (not a reset event). The recovery process would be invoked through a new rte_eth_dev_recover API. What problems do you see with it?
I am unable to understand the problems you have described above.
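
To make the proposal concrete, here is a rough sketch of the flow, modeled
on how the reset event is usually handled: the event callback only records
that a port needs recovery, and an application control thread later calls
the recovery function. Everything in it is hypothetical: neither
RTE_ETH_EVENT_INTR_RECOVER nor rte_eth_dev_recover exists today, and the
sketch_* names are stand-ins so that the example compiles without DPDK.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_PORTS 4

/* Set by the event callback, cleared by the control thread. */
static atomic_bool sketch_recover_pending[SKETCH_MAX_PORTS];
/* Counts recovery calls per port, only so the sketch can be checked. */
static unsigned int sketch_recover_calls[SKETCH_MAX_PORTS];

/* Stand-in for the proposed rte_eth_dev_recover(); hypothetical. */
int sketch_dev_recover(uint16_t port)
{
    assert(port < SKETCH_MAX_PORTS);
    sketch_recover_calls[port]++;
    return 0;
}

/* Would run in the interrupt thread when the proposed
 * RTE_ETH_EVENT_INTR_RECOVER is reported. It does no heavy work. */
void sketch_on_recover_event(uint16_t port)
{
    assert(port < SKETCH_MAX_PORTS);
    atomic_store(&sketch_recover_pending[port], true);
}

/* Would run in an application control thread. Returns the number of
 * ports for which recovery was invoked successfully. */
unsigned int sketch_control_poll(void)
{
    unsigned int done = 0;

    for (uint16_t p = 0; p < SKETCH_MAX_PORTS; p++) {
        if (!atomic_exchange(&sketch_recover_pending[p], false))
            continue;
        /* The application would quiesce its data path for port p here. */
        if (sketch_dev_recover(p) == 0)
            done++;
    }
    return done;
}
```

Whether a single control thread can finish all ports within the hardware
time limit mentioned earlier in the thread is exactly the open question;
the sketch does not address it.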

> >>
> >>>
> >>>>
> >>>>>
> >>>>>>
> >>>>>>> We could have a similar API 'rte_eth_dev_recover' to do the
> >>>>>>> recovery
> >>>>>> functionality.
> >>>>>>
> >>>>>> I suppose such approach is also possible.
> >>>>>> Personally I am fine with both ways: either existing one or what
> >>>>>> you propose, as long as we'll fix existing race-condition.
> >>>>>> What is good with what you suggest - that way we probably don't
> >>>>>> need to worry how to allow user to enable/disable auto-recovery
> >>>>>> inside
> >> PMD.
> >>>>>>
> >>>>>> Konstantin
> >>>>>>
> >>>>>

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 0/5] fix race-condition of proactive error handling mode
  2023-03-01  3:06 [PATCH 0/5] fix race-condition of proactive error handling mode Chengwen Feng
                   ` (4 preceding siblings ...)
  2023-03-01  3:06 ` [PATCH 5/5] app/testpmd: add error recovery usage demo Chengwen Feng
@ 2023-09-21 11:12 ` Ferruh Yigit
  2023-10-07  2:32   ` fengchengwen
  2023-10-20 10:07 ` [PATCH v2 0/7] " Chengwen Feng
  2023-11-06 13:11 ` [PATCH v3 " Chengwen Feng
  7 siblings, 1 reply; 85+ messages in thread
From: Ferruh Yigit @ 2023-09-21 11:12 UTC (permalink / raw)
  To: Chengwen Feng, thomas, konstantin.ananyev
  Cc: dev, Ajit Khaparde, Honnappa Nagarahalli

On 3/1/2023 3:06 AM, Chengwen Feng wrote:
> This patch fixes race-condition of proactive error handling mode, the
> discussion thread [1].
> 
> [1] http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-ashok.k.kaladi@intel.com/
> 
> Chengwen Feng (5):
>   ethdev: fix race-condition of proactive error handling mode
>   net/hns3: replace fp ops config function
>   net/bnxt: fix race-condition when report error recovery
>   net/bnxt: use fp ops setup function
>   app/testpmd: add error recovery usage demo
> 

Hi Chengwen,

This patchset is old, and as the discussion got longer it became hard to
follow/manage.

If the issue is still valid, can you please refresh the patchset?
Sorry for the inconvenience.


^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 0/5] fix race-condition of proactive error handling mode
  2023-09-21 11:12 ` [PATCH 0/5] fix race-condition of proactive error handling mode Ferruh Yigit
@ 2023-10-07  2:32   ` fengchengwen
  0 siblings, 0 replies; 85+ messages in thread
From: fengchengwen @ 2023-10-07  2:32 UTC (permalink / raw)
  To: Ferruh Yigit, thomas, konstantin.ananyev
  Cc: dev, Ajit Khaparde, Honnappa Nagarahalli

Hi Ferruh,

Thanks for the reminder.

I will send a new version as soon as possible.

Thanks.

On 2023/9/21 19:12, Ferruh Yigit wrote:
> On 3/1/2023 3:06 AM, Chengwen Feng wrote:
>> This patch fixes race-condition of proactive error handling mode, the
>> discussion thread [1].
>>
>> [1] http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-ashok.k.kaladi@intel.com/
>>
>> Chengwen Feng (5):
>>   ethdev: fix race-condition of proactive error handling mode
>>   net/hns3: replace fp ops config function
>>   net/bnxt: fix race-condition when report error recovery
>>   net/bnxt: use fp ops setup function
>>   app/testpmd: add error recovery usage demo
>>
> 
> Hi Chengwen,
> 
> This patch is old and as discussion get longer it became hard to
> follow/manage.
> 
> If the issue is valid, can you please refresh the patchset?
> Sorry for the inconvenience.
> 
> .
> 

^ permalink raw reply	[flat|nested] 85+ messages in thread

* [PATCH v2 0/7] fix race-condition of proactive error handling mode
  2023-03-01  3:06 [PATCH 0/5] fix race-condition of proactive error handling mode Chengwen Feng
                   ` (5 preceding siblings ...)
  2023-09-21 11:12 ` [PATCH 0/5] fix race-condition of proactive error handling mode Ferruh Yigit
@ 2023-10-20 10:07 ` Chengwen Feng
  2023-10-20 10:07   ` [PATCH v2 1/7] ethdev: " Chengwen Feng
                     ` (7 more replies)
  2023-11-06 13:11 ` [PATCH v3 " Chengwen Feng
  7 siblings, 8 replies; 85+ messages in thread
From: Chengwen Feng @ 2023-10-20 10:07 UTC (permalink / raw)
  To: thomas, ferruh.yigit, konstantin.ananyev, ajit.khaparde
  Cc: dev, andrew.rybchenko, kalesh-anakkur.purayil, Honnappa.Nagarahalli

This patchset fixes the race condition of proactive error handling mode;
see the discussion thread [1].

[1] http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-ashok.k.kaladi@intel.com/

Chengwen Feng (7):
  ethdev: fix race-condition of proactive error handling mode
  net/hns3: replace fp ops config function
  net/bnxt: fix race-condition when report error recovery
  net/bnxt: use fp ops setup function
  app/testpmd: add error recovery usage demo
  app/testpmd: extract event handling to event.c
  doc: testpmd support event handling section

---
v2:
- extract event handling to event.c and document it, which addresses
  Ferruh's comment.
- add Acked-by from Konstantin Ananyev and Dongdong Liu.

 app/test-pmd/event.c                         | 390 +++++++++++++++++++
 app/test-pmd/meson.build                     |   1 +
 app/test-pmd/parameters.c                    |  36 +-
 app/test-pmd/testpmd.c                       | 247 +-----------
 app/test-pmd/testpmd.h                       |  10 +-
 doc/guides/prog_guide/poll_mode_drv.rst      |  20 +-
 doc/guides/testpmd_app_ug/event_handling.rst |  80 ++++
 doc/guides/testpmd_app_ug/index.rst          |   1 +
 drivers/net/bnxt/bnxt_cpr.c                  |  18 +-
 drivers/net/bnxt/bnxt_ethdev.c               |   9 +-
 drivers/net/hns3/hns3_rxtx.c                 |  21 +-
 lib/ethdev/ethdev_driver.c                   |   8 +
 lib/ethdev/ethdev_driver.h                   |  10 +
 lib/ethdev/rte_ethdev.h                      |  32 +-
 lib/ethdev/version.map                       |   1 +
 15 files changed, 551 insertions(+), 333 deletions(-)
 create mode 100644 app/test-pmd/event.c
 create mode 100644 doc/guides/testpmd_app_ug/event_handling.rst

-- 
2.17.1


^ permalink raw reply	[flat|nested] 85+ messages in thread

* [PATCH v2 1/7] ethdev: fix race-condition of proactive error handling mode
  2023-10-20 10:07 ` [PATCH v2 0/7] " Chengwen Feng
@ 2023-10-20 10:07   ` Chengwen Feng
  2023-11-01  3:39     ` lihuisong (C)
  2023-10-20 10:07   ` [PATCH v2 2/7] net/hns3: replace fp ops config function Chengwen Feng
                     ` (6 subsequent siblings)
  7 siblings, 1 reply; 85+ messages in thread
From: Chengwen Feng @ 2023-10-20 10:07 UTC (permalink / raw)
  To: thomas, ferruh.yigit, konstantin.ananyev, ajit.khaparde,
	Andrew Rybchenko, Somnath Kotur, Kalesh AP
  Cc: dev, Honnappa.Nagarahalli

In the proactive error handling mode, the PMD sets the data path
pointers to dummy functions and then attempts recovery. During this
period the application may still be invoking the data path API, which
introduces a race condition with the data path that may lead to a
crash [1].

Although the PMD adds a delay after setting the data path pointers to
cover this race condition, the delay only reduces the probability of
hitting it; it does not solve the problem.

To solve the race condition fundamentally, the following requirements
are added:
1. The PMD should set the data path pointers to dummy functions only
   after reporting the RTE_ETH_EVENT_ERR_RECOVERING event.
2. The application should stop data path API invocation when processing
   the RTE_ETH_EVENT_ERR_RECOVERING event.
3. The PMD should set the data path pointers to valid functions before
   reporting the RTE_ETH_EVENT_RECOVERY_SUCCESS event.
4. The application should enable data path API invocation when
   processing the RTE_ETH_EVENT_RECOVERY_SUCCESS event.

Also, this patch introduces a driver-internal function,
rte_eth_fp_ops_setup(), which serves as a helper function for PMDs.
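
The application-side half of these requirements (items 2 and 4) can be
sketched as below. This is only an illustrative model, not DPDK code:
every demo_* name is hypothetical, the event enum merely stands in for
the three RTE_ETH_EVENT_* recovery events, and a single worker per port
is assumed. The point it tries to show is that the event handler should
not return from the "recovering" event until the worker has left the
data path.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the three ethdev recovery events. */
enum demo_event {
	DEMO_EVENT_ERR_RECOVERING,
	DEMO_EVENT_RECOVERY_SUCCESS,
	DEMO_EVENT_RECOVERY_FAILED,
};

#define DEMO_MAX_PORTS 4

struct demo_port {
	atomic_bool enabled;  /* application may call the data path API */
	atomic_bool in_burst; /* the worker is inside a data path call */
};

static struct demo_port demo_ports[DEMO_MAX_PORTS];

/* Mark a port as started so that the worker may poll it. */
void
demo_port_start(uint16_t port)
{
	assert(port < DEMO_MAX_PORTS);
	atomic_store(&demo_ports[port].enabled, true);
}

/* Event handler; in a real application it runs outside the worker. */
void
demo_event_cb(uint16_t port, enum demo_event ev)
{
	assert(port < DEMO_MAX_PORTS);
	struct demo_port *p = &demo_ports[port];

	switch (ev) {
	case DEMO_EVENT_ERR_RECOVERING:
		/* Requirement 2: stop the data path before returning. */
		atomic_store(&p->enabled, false);
		while (atomic_load(&p->in_burst))
			; /* wait for the worker to leave the data path */
		break;
	case DEMO_EVENT_RECOVERY_SUCCESS:
		/* Requirement 4: restore extra config, then resume. */
		atomic_store(&p->enabled, true);
		break;
	case DEMO_EVENT_RECOVERY_FAILED:
		/* The port stays stopped until it is removed. */
		atomic_store(&p->enabled, false);
		break;
	}
}

/* One worker iteration; returns the number of packets "handled". */
unsigned int
demo_poll_once(uint16_t port, unsigned int burst)
{
	assert(port < DEMO_MAX_PORTS);
	struct demo_port *p = &demo_ports[port];
	unsigned int done = 0;

	/* Announce entry first, then re-check the gate. */
	atomic_store(&p->in_burst, true);
	if (atomic_load(&p->enabled))
		done = burst; /* stand-in for the rx/tx burst calls */
	atomic_store(&p->in_burst, false);

	return done;
}
```

The worker publishes in_burst before it checks enabled, and the handler
clears enabled before it reads in_burst, both with sequentially
consistent atomics. Either the worker sees the gate closed, or the
handler sees the worker in flight and waits for it.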

[1] http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-ashok.k.kaladi@intel.com/

Fixes: eb0d471a8941 ("ethdev: add proactive error handling mode")
Cc: stable@dpdk.org

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
---
 doc/guides/prog_guide/poll_mode_drv.rst | 20 +++++++---------
 lib/ethdev/ethdev_driver.c              |  8 +++++++
 lib/ethdev/ethdev_driver.h              | 10 ++++++++
 lib/ethdev/rte_ethdev.h                 | 32 +++++++++++++++----------
 lib/ethdev/version.map                  |  1 +
 5 files changed, 46 insertions(+), 25 deletions(-)

diff --git a/doc/guides/prog_guide/poll_mode_drv.rst b/doc/guides/prog_guide/poll_mode_drv.rst
index c145a9066c..e380ff135a 100644
--- a/doc/guides/prog_guide/poll_mode_drv.rst
+++ b/doc/guides/prog_guide/poll_mode_drv.rst
@@ -638,14 +638,9 @@ different from the application invokes recovery in PASSIVE mode,
 the PMD automatically recovers from error in PROACTIVE mode,
 and only a small amount of work is required for the application.
 
-During error detection and automatic recovery,
-the PMD sets the data path pointers to dummy functions
-(which will prevent the crash),
-and also make sure the control path operations fail with a return code ``-EBUSY``.
-
-Because the PMD recovers automatically,
-the application can only sense that the data flow is disconnected for a while
-and the control API returns an error in this period.
+During error detection and automatic recovery, the PMD sets the data path
+pointers to dummy functions and also make sure the control path operations
+failed with a return code ``-EBUSY``.
 
 In order to sense the error happening/recovering,
 as well as to restore some additional configuration,
@@ -653,9 +648,9 @@ three events are available:
 
 ``RTE_ETH_EVENT_ERR_RECOVERING``
    Notify the application that an error is detected
-   and the recovery is being started.
+   and the recovery is about to start.
    Upon receiving the event, the application should not invoke
-   any control path function until receiving
+   any control and data path API until receiving
    ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` or ``RTE_ETH_EVENT_RECOVERY_FAILED`` event.
 
 .. note::
@@ -666,8 +661,9 @@ three events are available:
 
 ``RTE_ETH_EVENT_RECOVERY_SUCCESS``
    Notify the application that the recovery from error is successful,
-   the PMD already re-configures the port,
-   and the effect is the same as a restart operation.
+   the PMD already re-configures the port.
+   The application should restore some additional configuration, and then
+   enable data path API invocation.
 
 ``RTE_ETH_EVENT_RECOVERY_FAILED``
    Notify the application that the recovery from error failed,
diff --git a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c
index fff4b7b4cd..65ead7b910 100644
--- a/lib/ethdev/ethdev_driver.c
+++ b/lib/ethdev/ethdev_driver.c
@@ -537,6 +537,14 @@ rte_eth_dma_zone_free(const struct rte_eth_dev *dev, const char *ring_name,
 	return rc;
 }
 
+void
+rte_eth_fp_ops_setup(struct rte_eth_dev *dev)
+{
+	if (dev == NULL)
+		return;
+	eth_dev_fp_ops_setup(rte_eth_fp_ops + dev->data->port_id, dev);
+}
+
 const struct rte_memzone *
 rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char *ring_name,
 			 uint16_t queue_id, size_t size, unsigned int align,
diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
index deb23ada18..8567b96f53 100644
--- a/lib/ethdev/ethdev_driver.h
+++ b/lib/ethdev/ethdev_driver.h
@@ -1636,6 +1636,16 @@ int
 rte_eth_dma_zone_free(const struct rte_eth_dev *eth_dev, const char *name,
 		 uint16_t queue_id);
 
+/**
+ * @internal
+ * Setup eth fast-path API to ethdev values.
+ *
+ * @param dev
+ *  Pointer to struct rte_eth_dev.
+ */
+__rte_internal
+void rte_eth_fp_ops_setup(struct rte_eth_dev *dev);
+
 /**
  * @internal
  * Atomically set the link status for the specific device.
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index 85b9af7a02..dbe2d9c745 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -3994,25 +3994,28 @@ enum rte_eth_event_type {
 	 */
 	RTE_ETH_EVENT_RX_AVAIL_THRESH,
 	/** Port recovering from a hardware or firmware error.
-	 * If PMD supports proactive error recovery,
-	 * it should trigger this event to notify application
-	 * that it detected an error and the recovery is being started.
-	 * Upon receiving the event, the application should not invoke any control path API
-	 * (such as rte_eth_dev_configure/rte_eth_dev_stop...) until receiving
-	 * RTE_ETH_EVENT_RECOVERY_SUCCESS or RTE_ETH_EVENT_RECOVERY_FAILED event.
-	 * The PMD will set the data path pointers to dummy functions,
-	 * and re-set the data path pointers to non-dummy functions
-	 * before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
-	 * It means that the application cannot send or receive any packets
-	 * during this period.
+	 *
+	 * If PMD supports proactive error recovery, it should trigger this
+	 * event to notify application that it detected an error and the
+	 * recovery is about to start.
+	 *
+	 * Upon receiving the event, the application should not invoke any
+	 * control and data path API until receiving
+	 * RTE_ETH_EVENT_RECOVERY_SUCCESS or RTE_ETH_EVENT_RECOVERY_FAILED
+	 * event.
+	 *
+	 * Once this event is reported, the PMD will set the data path pointers
+	 * to dummy functions, and re-set the data path pointers to valid
+	 * functions before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
+	 *
 	 * @note Before the PMD reports the recovery result,
 	 * the PMD may report the RTE_ETH_EVENT_ERR_RECOVERING event again,
 	 * because a larger error may occur during the recovery.
 	 */
 	RTE_ETH_EVENT_ERR_RECOVERING,
 	/** Port recovers successfully from the error.
-	 * The PMD already re-configured the port,
-	 * and the effect is the same as a restart operation.
+	 *
+	 * The PMD already re-configured the port:
 	 * a) The following operation will be retained: (alphabetically)
 	 *    - DCB configuration
 	 *    - FEC configuration
@@ -4039,6 +4042,9 @@ enum rte_eth_event_type {
 	 *      (@see RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP)
 	 * c) Any other configuration will not be stored
 	 *    and will need to be re-configured.
+	 *
+	 * The application should restore some additional configuration
+	 * (see above case b/c), and then enable data path API invocation.
 	 */
 	RTE_ETH_EVENT_RECOVERY_SUCCESS,
 	/** Port recovery failed.
diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
index 919ba5b8e6..1e6ee0a6f1 100644
--- a/lib/ethdev/version.map
+++ b/lib/ethdev/version.map
@@ -338,6 +338,7 @@ INTERNAL {
 	rte_eth_devices;
 	rte_eth_dma_zone_free;
 	rte_eth_dma_zone_reserve;
+	rte_eth_fp_ops_setup;
 	rte_eth_hairpin_queue_peer_bind;
 	rte_eth_hairpin_queue_peer_unbind;
 	rte_eth_hairpin_queue_peer_update;
-- 
2.17.1



* [PATCH v2 2/7] net/hns3: replace fp ops config function
  2023-10-20 10:07 ` [PATCH v2 0/7] " Chengwen Feng
  2023-10-20 10:07   ` [PATCH v2 1/7] ethdev: " Chengwen Feng
@ 2023-10-20 10:07   ` Chengwen Feng
  2023-11-01  3:40     ` lihuisong (C)
  2023-11-02 10:34     ` Konstantin Ananyev
  2023-10-20 10:07   ` [PATCH v2 3/7] net/bnxt: fix race-condition when report error recovery Chengwen Feng
                     ` (5 subsequent siblings)
  7 siblings, 2 replies; 85+ messages in thread
From: Chengwen Feng @ 2023-10-20 10:07 UTC (permalink / raw)
  To: thomas, ferruh.yigit, konstantin.ananyev, ajit.khaparde, Jie Hai,
	Yisen Zhuang
  Cc: dev, andrew.rybchenko, kalesh-anakkur.purayil, Honnappa.Nagarahalli

This patch replaces hns3_eth_dev_fp_ops_config() with
rte_eth_fp_ops_setup().

Cc: stable@dpdk.org

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Acked-by: Dongdong Liu <liudongdong3@huawei.com>
---
 drivers/net/hns3/hns3_rxtx.c | 21 +++------------------
 1 file changed, 3 insertions(+), 18 deletions(-)

diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c
index f3c3b38c55..f43f1eb9ad 100644
--- a/drivers/net/hns3/hns3_rxtx.c
+++ b/drivers/net/hns3/hns3_rxtx.c
@@ -4434,21 +4434,6 @@ hns3_trace_rxtx_function(struct rte_eth_dev *dev)
 		 rx_mode.info, tx_mode.info);
 }
 
-static void
-hns3_eth_dev_fp_ops_config(const struct rte_eth_dev *dev)
-{
-	struct rte_eth_fp_ops *fpo = rte_eth_fp_ops;
-	uint16_t port_id = dev->data->port_id;
-
-	fpo[port_id].rx_pkt_burst = dev->rx_pkt_burst;
-	fpo[port_id].tx_pkt_burst = dev->tx_pkt_burst;
-	fpo[port_id].tx_pkt_prepare = dev->tx_pkt_prepare;
-	fpo[port_id].rx_descriptor_status = dev->rx_descriptor_status;
-	fpo[port_id].tx_descriptor_status = dev->tx_descriptor_status;
-	fpo[port_id].rxq.data = dev->data->rx_queues;
-	fpo[port_id].txq.data = dev->data->tx_queues;
-}
-
 void
 hns3_set_rxtx_function(struct rte_eth_dev *eth_dev)
 {
@@ -4471,7 +4456,7 @@ hns3_set_rxtx_function(struct rte_eth_dev *eth_dev)
 	}
 
 	hns3_trace_rxtx_function(eth_dev);
-	hns3_eth_dev_fp_ops_config(eth_dev);
+	rte_eth_fp_ops_setup(eth_dev);
 }
 
 void
@@ -4824,7 +4809,7 @@ hns3_stop_tx_datapath(struct rte_eth_dev *dev)
 {
 	dev->tx_pkt_burst = rte_eth_pkt_burst_dummy;
 	dev->tx_pkt_prepare = NULL;
-	hns3_eth_dev_fp_ops_config(dev);
+	rte_eth_fp_ops_setup(dev);
 
 	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
 		return;
@@ -4841,7 +4826,7 @@ hns3_start_tx_datapath(struct rte_eth_dev *dev)
 {
 	dev->tx_pkt_burst = hns3_get_tx_function(dev);
 	dev->tx_pkt_prepare = hns3_get_tx_prepare(dev);
-	hns3_eth_dev_fp_ops_config(dev);
+	rte_eth_fp_ops_setup(dev);
 
 	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
 		return;
-- 
2.17.1



* [PATCH v2 3/7] net/bnxt: fix race-condition when report error recovery
  2023-10-20 10:07 ` [PATCH v2 0/7] " Chengwen Feng
  2023-10-20 10:07   ` [PATCH v2 1/7] ethdev: " Chengwen Feng
  2023-10-20 10:07   ` [PATCH v2 2/7] net/hns3: replace fp ops config function Chengwen Feng
@ 2023-10-20 10:07   ` Chengwen Feng
  2023-11-02 16:28     ` Ajit Khaparde
  2023-10-20 10:07   ` [PATCH v2 4/7] net/bnxt: use fp ops setup function Chengwen Feng
                     ` (4 subsequent siblings)
  7 siblings, 1 reply; 85+ messages in thread
From: Chengwen Feng @ 2023-10-20 10:07 UTC (permalink / raw)
  To: thomas, ferruh.yigit, konstantin.ananyev, ajit.khaparde,
	Somnath Kotur, Kalesh AP
  Cc: dev, andrew.rybchenko, Honnappa.Nagarahalli

If the data path functions are set to dummy functions before the error
recovering event is reported, there may be a race condition with the
data path threads. This patch fixes it by setting the data path
functions to dummy functions only after that event has been reported.
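
The ordering described above can be modelled with a small sketch. It is
not the bnxt code: all demo_* names are hypothetical, and it assumes
that the event callback runs synchronously, so the application has
already stopped its data path by the time the notification returns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { DEMO_ERR_RECOVERING, DEMO_RECOVERY_SUCCESS, DEMO_EVENT_COUNT };

struct demo_dev;
typedef uint16_t (*demo_burst_t)(void *queue, void **pkts, uint16_t n);
typedef void (*demo_notify_t)(struct demo_dev *dev, int event);

struct demo_dev {
	demo_burst_t rx_burst;      /* pointer the data path calls through */
	demo_burst_t real_rx_burst; /* burst function used in normal operation */
	demo_notify_t notify;       /* application event callback */
};

/* Dummy burst: touches no hardware and reports zero packets. */
uint16_t
demo_burst_dummy(void *queue, void **pkts, uint16_t n)
{
	(void)queue; (void)pkts; (void)n;
	return 0;
}

/* Stand-in for a real burst function: pretends that n packets arrived. */
uint16_t
demo_burst_real(void *queue, void **pkts, uint16_t n)
{
	(void)queue; (void)pkts;
	return n;
}

/* For each event, was the real burst pointer installed when it fired? */
int demo_seen_real[DEMO_EVENT_COUNT];

/* Example callback that only records the pointer state at event time. */
void
demo_record_cb(struct demo_dev *dev, int event)
{
	demo_seen_real[event] = (dev->rx_burst == dev->real_rx_burst);
}

void
demo_pmd_start_recovery(struct demo_dev *dev)
{
	assert(dev != NULL && dev->notify != NULL);
	/* 1. Report first: the application stops its data path. */
	dev->notify(dev, DEMO_ERR_RECOVERING);
	/* 2. Only then switch to the dummy function. */
	dev->rx_burst = demo_burst_dummy;
}

void
demo_pmd_finish_recovery(struct demo_dev *dev)
{
	assert(dev != NULL && dev->notify != NULL);
	/* 3. Restore the valid function before reporting success. */
	dev->rx_burst = dev->real_rx_burst;
	dev->notify(dev, DEMO_RECOVERY_SUCCESS);
}
```

In both directions the callback observes the real burst function, so a
data path thread that is still running when the event fires never races
with the pointer switch.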

Fixes: e11052f3a46f ("net/bnxt: support proactive error handling mode")
Cc: stable@dpdk.org

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
---
 drivers/net/bnxt/bnxt_cpr.c    | 13 +++++++------
 drivers/net/bnxt/bnxt_ethdev.c |  4 ++--
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/drivers/net/bnxt/bnxt_cpr.c b/drivers/net/bnxt/bnxt_cpr.c
index 0733cf4df2..d8947d5b5f 100644
--- a/drivers/net/bnxt/bnxt_cpr.c
+++ b/drivers/net/bnxt/bnxt_cpr.c
@@ -168,14 +168,9 @@ void bnxt_handle_async_event(struct bnxt *bp,
 		PMD_DRV_LOG(INFO, "Port conn async event\n");
 		break;
 	case HWRM_ASYNC_EVENT_CMPL_EVENT_ID_RESET_NOTIFY:
-		/*
-		 * Avoid any rx/tx packet processing during firmware reset
-		 * operation.
-		 */
-		bnxt_stop_rxtx(bp->eth_dev);
-
 		/* Ignore reset notify async events when stopping the port */
 		if (!bp->eth_dev->data->dev_started) {
+			bnxt_stop_rxtx(bp->eth_dev);
 			bp->flags |= BNXT_FLAG_FATAL_ERROR;
 			return;
 		}
@@ -184,6 +179,12 @@ void bnxt_handle_async_event(struct bnxt *bp,
 					     RTE_ETH_EVENT_ERR_RECOVERING,
 					     NULL);
 
+		/*
+		 * Avoid any rx/tx packet processing during firmware reset
+		 * operation.
+		 */
+		bnxt_stop_rxtx(bp->eth_dev);
+
 		pthread_mutex_lock(&bp->err_recovery_lock);
 		event_data = data1;
 		/* timestamp_lo/hi values are in units of 100ms */
diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index 5c4d96d4b1..003a6eec11 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -4616,14 +4616,14 @@ static void bnxt_check_fw_health(void *arg)
 	bp->flags |= BNXT_FLAG_FATAL_ERROR;
 	bp->flags |= BNXT_FLAG_FW_RESET;
 
-	bnxt_stop_rxtx(bp->eth_dev);
-
 	PMD_DRV_LOG(ERR, "Detected FW dead condition\n");
 
 	rte_eth_dev_callback_process(bp->eth_dev,
 				     RTE_ETH_EVENT_ERR_RECOVERING,
 				     NULL);
 
+	bnxt_stop_rxtx(bp->eth_dev);
+
 	if (bnxt_is_primary_func(bp))
 		wait_msec = info->primary_func_wait_period;
 	else
-- 
2.17.1



* [PATCH v2 4/7] net/bnxt: use fp ops setup function
  2023-10-20 10:07 ` [PATCH v2 0/7] " Chengwen Feng
                     ` (2 preceding siblings ...)
  2023-10-20 10:07   ` [PATCH v2 3/7] net/bnxt: fix race-condition when report error recovery Chengwen Feng
@ 2023-10-20 10:07   ` Chengwen Feng
  2023-11-01  3:48     ` lihuisong (C)
  2023-11-02 10:34     ` Konstantin Ananyev
  2023-10-20 10:07   ` [PATCH v2 5/7] app/testpmd: add error recovery usage demo Chengwen Feng
                     ` (3 subsequent siblings)
  7 siblings, 2 replies; 85+ messages in thread
From: Chengwen Feng @ 2023-10-20 10:07 UTC (permalink / raw)
  To: thomas, ferruh.yigit, konstantin.ananyev, ajit.khaparde, Somnath Kotur
  Cc: dev, andrew.rybchenko, kalesh-anakkur.purayil, Honnappa.Nagarahalli

Use rte_eth_fp_ops_setup() instead of directly manipulating
rte_eth_fp_ops variable.

Cc: stable@dpdk.org

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
---
 drivers/net/bnxt/bnxt_cpr.c    | 5 +----
 drivers/net/bnxt/bnxt_ethdev.c | 5 +----
 2 files changed, 2 insertions(+), 8 deletions(-)

diff --git a/drivers/net/bnxt/bnxt_cpr.c b/drivers/net/bnxt/bnxt_cpr.c
index d8947d5b5f..3a08028331 100644
--- a/drivers/net/bnxt/bnxt_cpr.c
+++ b/drivers/net/bnxt/bnxt_cpr.c
@@ -416,10 +416,7 @@ void bnxt_stop_rxtx(struct rte_eth_dev *eth_dev)
 	eth_dev->rx_pkt_burst = rte_eth_pkt_burst_dummy;
 	eth_dev->tx_pkt_burst = rte_eth_pkt_burst_dummy;
 
-	rte_eth_fp_ops[eth_dev->data->port_id].rx_pkt_burst =
-		eth_dev->rx_pkt_burst;
-	rte_eth_fp_ops[eth_dev->data->port_id].tx_pkt_burst =
-		eth_dev->tx_pkt_burst;
+	rte_eth_fp_ops_setup(eth_dev);
 	rte_mb();
 
 	/* Allow time for threads to exit the real burst functions. */
diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index 003a6eec11..9d9b9ae8cf 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -4428,10 +4428,7 @@ static void bnxt_dev_recover(void *arg)
 	if (rc)
 		goto err_start;
 
-	rte_eth_fp_ops[bp->eth_dev->data->port_id].rx_pkt_burst =
-		bp->eth_dev->rx_pkt_burst;
-	rte_eth_fp_ops[bp->eth_dev->data->port_id].tx_pkt_burst =
-		bp->eth_dev->tx_pkt_burst;
+	rte_eth_fp_ops_setup(bp->eth_dev);
 	rte_mb();
 
 	PMD_DRV_LOG(INFO, "Port: %u Recovered from FW reset\n",
-- 
2.17.1



* [PATCH v2 5/7] app/testpmd: add error recovery usage demo
  2023-10-20 10:07 ` [PATCH v2 0/7] " Chengwen Feng
                     ` (3 preceding siblings ...)
  2023-10-20 10:07   ` [PATCH v2 4/7] net/bnxt: use fp ops setup function Chengwen Feng
@ 2023-10-20 10:07   ` Chengwen Feng
  2023-11-01  4:08     ` lihuisong (C)
  2023-10-20 10:07   ` [PATCH v2 6/7] app/testpmd: extract event handling to event.c Chengwen Feng
                     ` (2 subsequent siblings)
  7 siblings, 1 reply; 85+ messages in thread
From: Chengwen Feng @ 2023-10-20 10:07 UTC (permalink / raw)
  To: thomas, ferruh.yigit, konstantin.ananyev, ajit.khaparde,
	Aman Singh, Yuying Zhang
  Cc: dev, andrew.rybchenko, kalesh-anakkur.purayil, Honnappa.Nagarahalli

This patch adds an error recovery usage demo which will:
1. stop packet forwarding when the RTE_ETH_EVENT_ERR_RECOVERING event
   is received.
2. restart packet forwarding when the RTE_ETH_EVENT_RECOVERY_SUCCESS
   event is received.
3. report the ports that failed to recover and need to be removed when
   the RTE_ETH_EVENT_RECOVERY_FAILED event is received.

In addition, a message is printed asking the user not to execute any
commands during error recovery.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
---
 app/test-pmd/testpmd.c | 80 ++++++++++++++++++++++++++++++++++++++++++
 app/test-pmd/testpmd.h |  4 ++-
 2 files changed, 83 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 595b77748c..39a25238e5 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -3942,6 +3942,77 @@ rmv_port_callback(void *arg)
 		start_packet_forwarding(0);
 }
 
+static int need_start_when_recovery_over;
+
+static bool
+has_port_in_err_recovering(void)
+{
+	struct rte_port *port;
+	portid_t pid;
+
+	RTE_ETH_FOREACH_DEV(pid) {
+		port = &ports[pid];
+		if (port->err_recovering)
+			return true;
+	}
+
+	return false;
+}
+
+static void
+err_recovering_callback(portid_t port_id)
+{
+	if (!has_port_in_err_recovering())
+		printf("Please stop executing any commands until recovery result events are received!\n");
+
+	ports[port_id].err_recovering = 1;
+	ports[port_id].recover_failed = 0;
+
+	/* To simplify implementation, stop forwarding regardless of whether the port is used. */
+	if (!test_done) {
+		printf("Stop packet forwarding because some ports are in error recovering!\n");
+		stop_packet_forwarding();
+		need_start_when_recovery_over = 1;
+	}
+}
+
+static void
+recover_success_callback(portid_t port_id)
+{
+	ports[port_id].err_recovering = 0;
+	if (has_port_in_err_recovering())
+		return;
+
+	if (need_start_when_recovery_over) {
+		printf("Recovery success! Restart packet forwarding!\n");
+		start_packet_forwarding(0);
+		need_start_when_recovery_over = 0;
+	} else {
+		printf("Recovery success!\n");
+	}
+}
+
+static void
+recover_failed_callback(portid_t port_id)
+{
+	struct rte_port *port;
+	portid_t pid;
+
+	ports[port_id].err_recovering = 0;
+	ports[port_id].recover_failed = 1;
+	if (has_port_in_err_recovering())
+		return;
+
+	need_start_when_recovery_over = 0;
+	printf("The ports:");
+	RTE_ETH_FOREACH_DEV(pid) {
+		port = &ports[pid];
+		if (port->recover_failed)
+			printf(" %u", pid);
+	}
+	printf(" recovery failed! Please remove them!\n");
+}
+
 /* This function is used by the interrupt thread */
 static int
 eth_event_callback(portid_t port_id, enum rte_eth_event_type type, void *param,
@@ -3997,6 +4068,15 @@ eth_event_callback(portid_t port_id, enum rte_eth_event_type type, void *param,
 		}
 		break;
 	}
+	case RTE_ETH_EVENT_ERR_RECOVERING:
+		err_recovering_callback(port_id);
+		break;
+	case RTE_ETH_EVENT_RECOVERY_SUCCESS:
+		recover_success_callback(port_id);
+		break;
+	case RTE_ETH_EVENT_RECOVERY_FAILED:
+		recover_failed_callback(port_id);
+		break;
 	default:
 		break;
 	}
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 09a36b90b8..42782d5a05 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -342,7 +342,9 @@ struct rte_port {
 	uint8_t                 member_flag : 1, /**< bonding member port */
 				bond_flag : 1, /**< port is bond device */
 				fwd_mac_swap : 1, /**< swap packet MAC before forward */
-				update_conf : 1; /**< need to update bonding device configuration */
+				update_conf : 1, /**< need to update bonding device configuration */
+				err_recovering : 1, /**< port is in error recovering */
+				recover_failed : 1; /**< port recover failed */
 	struct port_template    *pattern_templ_list; /**< Pattern templates. */
 	struct port_template    *actions_templ_list; /**< Actions templates. */
 	struct port_table       *table_list; /**< Flow tables. */
-- 
2.17.1



* [PATCH v2 6/7] app/testpmd: extract event handling to event.c
  2023-10-20 10:07 ` [PATCH v2 0/7] " Chengwen Feng
                     ` (4 preceding siblings ...)
  2023-10-20 10:07   ` [PATCH v2 5/7] app/testpmd: add error recovery usage demo Chengwen Feng
@ 2023-10-20 10:07   ` Chengwen Feng
  2023-11-01  4:09     ` lihuisong (C)
  2023-10-20 10:07   ` [PATCH v2 7/7] doc: testpmd support event handling section Chengwen Feng
  2023-11-06  1:35   ` [PATCH v2 0/7] fix race-condition of proactive error handling mode fengchengwen
  7 siblings, 1 reply; 85+ messages in thread
From: Chengwen Feng @ 2023-10-20 10:07 UTC (permalink / raw)
  To: thomas, ferruh.yigit, konstantin.ananyev, ajit.khaparde,
	Aman Singh, Yuying Zhang
  Cc: dev, andrew.rybchenko, kalesh-anakkur.purayil, Honnappa.Nagarahalli

This patch extracts event handling (including eth-event and dev-event)
to a new file 'event.c'.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
---
 app/test-pmd/event.c      | 390 ++++++++++++++++++++++++++++++++++++++
 app/test-pmd/meson.build  |   1 +
 app/test-pmd/parameters.c |  36 +---
 app/test-pmd/testpmd.c    | 327 +-------------------------------
 app/test-pmd/testpmd.h    |   6 +
 5 files changed, 407 insertions(+), 353 deletions(-)
 create mode 100644 app/test-pmd/event.c

diff --git a/app/test-pmd/event.c b/app/test-pmd/event.c
new file mode 100644
index 0000000000..8393e105d7
--- /dev/null
+++ b/app/test-pmd/event.c
@@ -0,0 +1,390 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2023 HiSilicon Limited
+ */
+
+#include <stdint.h>
+
+#include <rte_alarm.h>
+#include <rte_ethdev.h>
+#include <rte_dev.h>
+#include <rte_log.h>
+#ifdef RTE_NET_MLX5
+#include "mlx5_testpmd.h"
+#endif
+
+#include "testpmd.h"
+
+/* Pretty printing of ethdev events */
+static const char * const eth_event_desc[] = {
+	[RTE_ETH_EVENT_UNKNOWN] = "unknown",
+	[RTE_ETH_EVENT_INTR_LSC] = "link state change",
+	[RTE_ETH_EVENT_QUEUE_STATE] = "queue state",
+	[RTE_ETH_EVENT_INTR_RESET] = "reset",
+	[RTE_ETH_EVENT_VF_MBOX] = "VF mbox",
+	[RTE_ETH_EVENT_IPSEC] = "IPsec",
+	[RTE_ETH_EVENT_MACSEC] = "MACsec",
+	[RTE_ETH_EVENT_INTR_RMV] = "device removal",
+	[RTE_ETH_EVENT_NEW] = "device probed",
+	[RTE_ETH_EVENT_DESTROY] = "device released",
+	[RTE_ETH_EVENT_FLOW_AGED] = "flow aged",
+	[RTE_ETH_EVENT_RX_AVAIL_THRESH] = "RxQ available descriptors threshold reached",
+	[RTE_ETH_EVENT_ERR_RECOVERING] = "error recovering",
+	[RTE_ETH_EVENT_RECOVERY_SUCCESS] = "error recovery successful",
+	[RTE_ETH_EVENT_RECOVERY_FAILED] = "error recovery failed",
+	[RTE_ETH_EVENT_MAX] = NULL,
+};
+
+/*
+ * Display or mask ether events
+ * Default to all events except VF_MBOX
+ */
+uint32_t event_print_mask = (UINT32_C(1) << RTE_ETH_EVENT_UNKNOWN) |
+			    (UINT32_C(1) << RTE_ETH_EVENT_INTR_LSC) |
+			    (UINT32_C(1) << RTE_ETH_EVENT_QUEUE_STATE) |
+			    (UINT32_C(1) << RTE_ETH_EVENT_INTR_RESET) |
+			    (UINT32_C(1) << RTE_ETH_EVENT_IPSEC) |
+			    (UINT32_C(1) << RTE_ETH_EVENT_MACSEC) |
+			    (UINT32_C(1) << RTE_ETH_EVENT_INTR_RMV) |
+			    (UINT32_C(1) << RTE_ETH_EVENT_FLOW_AGED) |
+			    (UINT32_C(1) << RTE_ETH_EVENT_ERR_RECOVERING) |
+			    (UINT32_C(1) << RTE_ETH_EVENT_RECOVERY_SUCCESS) |
+			    (UINT32_C(1) << RTE_ETH_EVENT_RECOVERY_FAILED);
+
+int
+get_event_name_mask(const char *name, uint32_t *mask)
+{
+	if (!strcmp(name, "unknown"))
+		*mask = UINT32_C(1) << RTE_ETH_EVENT_UNKNOWN;
+	else if (!strcmp(name, "intr_lsc"))
+		*mask = UINT32_C(1) << RTE_ETH_EVENT_INTR_LSC;
+	else if (!strcmp(name, "queue_state"))
+		*mask = UINT32_C(1) << RTE_ETH_EVENT_QUEUE_STATE;
+	else if (!strcmp(name, "intr_reset"))
+		*mask = UINT32_C(1) << RTE_ETH_EVENT_INTR_RESET;
+	else if (!strcmp(name, "vf_mbox"))
+		*mask = UINT32_C(1) << RTE_ETH_EVENT_VF_MBOX;
+	else if (!strcmp(name, "ipsec"))
+		*mask = UINT32_C(1) << RTE_ETH_EVENT_IPSEC;
+	else if (!strcmp(name, "macsec"))
+		*mask = UINT32_C(1) << RTE_ETH_EVENT_MACSEC;
+	else if (!strcmp(name, "intr_rmv"))
+		*mask = UINT32_C(1) << RTE_ETH_EVENT_INTR_RMV;
+	else if (!strcmp(name, "dev_probed"))
+		*mask = UINT32_C(1) << RTE_ETH_EVENT_NEW;
+	else if (!strcmp(name, "dev_released"))
+		*mask = UINT32_C(1) << RTE_ETH_EVENT_DESTROY;
+	else if (!strcmp(name, "flow_aged"))
+		*mask = UINT32_C(1) << RTE_ETH_EVENT_FLOW_AGED;
+	else if (!strcmp(name, "err_recovering"))
+		*mask = UINT32_C(1) << RTE_ETH_EVENT_ERR_RECOVERING;
+	else if (!strcmp(name, "recovery_success"))
+		*mask = UINT32_C(1) << RTE_ETH_EVENT_RECOVERY_SUCCESS;
+	else if (!strcmp(name, "recovery_failed"))
+		*mask = UINT32_C(1) << RTE_ETH_EVENT_RECOVERY_FAILED;
+	else if (!strcmp(name, "all"))
+		*mask = ~UINT32_C(0);
+	else
+		return -1;
+
+	return 0;
+}
+
+static void
+rmv_port_callback(void *arg)
+{
+	int need_to_start = 0;
+	int org_no_link_check = no_link_check;
+	portid_t port_id = (intptr_t)arg;
+	struct rte_eth_dev_info dev_info;
+	int ret;
+
+	RTE_ETH_VALID_PORTID_OR_RET(port_id);
+
+	if (!test_done && port_is_forwarding(port_id)) {
+		need_to_start = 1;
+		stop_packet_forwarding();
+	}
+	no_link_check = 1;
+	stop_port(port_id);
+	no_link_check = org_no_link_check;
+
+	ret = eth_dev_info_get_print_err(port_id, &dev_info);
+	if (ret != 0)
+		TESTPMD_LOG(ERR,
+			"Failed to get device info for port %d, not detaching\n",
+			port_id);
+	else {
+		struct rte_device *device = dev_info.device;
+		close_port(port_id);
+		detach_device(device); /* might be already removed or have more ports */
+	}
+	if (need_to_start)
+		start_packet_forwarding(0);
+}
+
+static int need_start_when_recovery_over;
+
+static bool
+has_port_in_err_recovering(void)
+{
+	struct rte_port *port;
+	portid_t pid;
+
+	RTE_ETH_FOREACH_DEV(pid) {
+		port = &ports[pid];
+		if (port->err_recovering)
+			return true;
+	}
+
+	return false;
+}
+
+static void
+err_recovering_callback(portid_t port_id)
+{
+	if (!has_port_in_err_recovering())
+		printf("Please stop executing any commands until recovery result events are received!\n");
+
+	ports[port_id].err_recovering = 1;
+	ports[port_id].recover_failed = 0;
+
+	/* To simplify implementation, stop forwarding regardless of whether the port is used. */
+	if (!test_done) {
+		printf("Stop packet forwarding because some ports are in error recovering!\n");
+		stop_packet_forwarding();
+		need_start_when_recovery_over = 1;
+	}
+}
+
+static void
+recover_success_callback(portid_t port_id)
+{
+	ports[port_id].err_recovering = 0;
+	if (has_port_in_err_recovering())
+		return;
+
+	if (need_start_when_recovery_over) {
+		printf("Recovery success! Restart packet forwarding!\n");
+		start_packet_forwarding(0);
+		need_start_when_recovery_over = 0;
+	} else {
+		printf("Recovery success!\n");
+	}
+}
+
+static void
+recover_failed_callback(portid_t port_id)
+{
+	struct rte_port *port;
+	portid_t pid;
+
+	ports[port_id].err_recovering = 0;
+	ports[port_id].recover_failed = 1;
+	if (has_port_in_err_recovering())
+		return;
+
+	need_start_when_recovery_over = 0;
+	printf("The ports:");
+	RTE_ETH_FOREACH_DEV(pid) {
+		port = &ports[pid];
+		if (port->recover_failed)
+			printf(" %u", pid);
+	}
+	printf(" recovery failed! Please remove them!\n");
+}
+
+/* This function is used by the interrupt thread */
+static int
+eth_event_callback(portid_t port_id, enum rte_eth_event_type type, void *param,
+		  void *ret_param)
+{
+	RTE_SET_USED(param);
+	RTE_SET_USED(ret_param);
+
+	if (type >= RTE_ETH_EVENT_MAX) {
+		fprintf(stderr,
+			"\nPort %" PRIu16 ": %s called upon invalid event %d\n",
+			port_id, __func__, type);
+		fflush(stderr);
+	} else if (event_print_mask & (UINT32_C(1) << type)) {
+		printf("\nPort %" PRIu16 ": %s event\n", port_id,
+			eth_event_desc[type]);
+		fflush(stdout);
+	}
+
+	switch (type) {
+	case RTE_ETH_EVENT_NEW:
+		ports[port_id].need_setup = 1;
+		ports[port_id].port_status = RTE_PORT_HANDLING;
+		break;
+	case RTE_ETH_EVENT_INTR_RMV:
+		if (port_id_is_invalid(port_id, DISABLED_WARN))
+			break;
+		if (rte_eal_alarm_set(100000,
+				rmv_port_callback, (void *)(intptr_t)port_id))
+			fprintf(stderr,
+				"Could not set up deferred device removal\n");
+		break;
+	case RTE_ETH_EVENT_DESTROY:
+		ports[port_id].port_status = RTE_PORT_CLOSED;
+		printf("Port %u is closed\n", port_id);
+		break;
+	case RTE_ETH_EVENT_RX_AVAIL_THRESH: {
+		uint16_t rxq_id;
+		int ret;
+
+		/* avail_thresh query API rewinds rxq_id, no need to check max RxQ num */
+		for (rxq_id = 0; ; rxq_id++) {
+			ret = rte_eth_rx_avail_thresh_query(port_id, &rxq_id,
+							    NULL);
+			if (ret <= 0)
+				break;
+			printf("Received avail_thresh event, port: %u, rxq_id: %u\n",
+			       port_id, rxq_id);
+
+#ifdef RTE_NET_MLX5
+			mlx5_test_avail_thresh_event_handler(port_id, rxq_id);
+#endif
+		}
+		break;
+	}
+	case RTE_ETH_EVENT_ERR_RECOVERING:
+		err_recovering_callback(port_id);
+		break;
+	case RTE_ETH_EVENT_RECOVERY_SUCCESS:
+		recover_success_callback(port_id);
+		break;
+	case RTE_ETH_EVENT_RECOVERY_FAILED:
+		recover_failed_callback(port_id);
+		break;
+	default:
+		break;
+	}
+	return 0;
+}
+
+int
+register_eth_event_callback(void)
+{
+	int ret;
+	enum rte_eth_event_type event;
+
+	for (event = RTE_ETH_EVENT_UNKNOWN;
+			event < RTE_ETH_EVENT_MAX; event++) {
+		ret = rte_eth_dev_callback_register(RTE_ETH_ALL,
+				event,
+				eth_event_callback,
+				NULL);
+		if (ret != 0) {
+			TESTPMD_LOG(ERR, "Failed to register callback for "
+					"%s event\n", eth_event_desc[event]);
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
+int
+unregister_eth_event_callback(void)
+{
+	int ret;
+	enum rte_eth_event_type event;
+
+	for (event = RTE_ETH_EVENT_UNKNOWN;
+			event < RTE_ETH_EVENT_MAX; event++) {
+		ret = rte_eth_dev_callback_unregister(RTE_ETH_ALL,
+				event,
+				eth_event_callback,
+				NULL);
+		if (ret != 0) {
+			TESTPMD_LOG(ERR, "Failed to unregister callback for "
+					"%s event\n", eth_event_desc[event]);
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
+/* This function is used by the interrupt thread */
+static void
+dev_event_callback(const char *device_name, enum rte_dev_event_type type,
+			     __rte_unused void *arg)
+{
+	uint16_t port_id;
+	int ret;
+
+	if (type >= RTE_DEV_EVENT_MAX) {
+		fprintf(stderr, "%s called upon invalid event %d\n",
+			__func__, type);
+		fflush(stderr);
+	}
+
+	switch (type) {
+	case RTE_DEV_EVENT_REMOVE:
+		RTE_LOG(DEBUG, EAL, "The device: %s has been removed!\n",
+			device_name);
+		ret = rte_eth_dev_get_port_by_name(device_name, &port_id);
+		if (ret) {
+			RTE_LOG(ERR, EAL, "can not get port by device %s!\n",
+				device_name);
+			return;
+		}
+		/*
+		 * Because the user's callback is invoked in the EAL interrupt
+		 * callback, the interrupt callback must finish before it can
+		 * be unregistered when detaching the device. So return from
+		 * the callback quickly and use a deferred removal to detach
+		 * the device. This is a workaround: once device detaching is
+		 * moved into the EAL in the future, the deferred removal can
+		 * be deleted.
+		 */
+		if (rte_eal_alarm_set(100000,
+				rmv_port_callback, (void *)(intptr_t)port_id))
+			RTE_LOG(ERR, EAL,
+				"Could not set up deferred device removal\n");
+		break;
+	case RTE_DEV_EVENT_ADD:
+		RTE_LOG(ERR, EAL, "The device: %s has been added!\n",
+			device_name);
+		/* TODO: After finish kernel driver binding,
+		 * begin to attach port.
+		 */
+		break;
+	default:
+		break;
+	}
+}
+
+int
+register_dev_event_callback(void)
+{
+	int ret;
+
+	ret = rte_dev_event_callback_register(NULL,
+		dev_event_callback, NULL);
+	if (ret != 0) {
+		RTE_LOG(ERR, EAL,
+			"fail to register device event callback\n");
+		return -1;
+	}
+
+	return 0;
+}
+
+int
+unregister_dev_event_callback(void)
+{
+	int ret;
+
+	ret = rte_dev_event_callback_unregister(NULL,
+		dev_event_callback, NULL);
+	if (ret < 0) {
+		RTE_LOG(ERR, EAL,
+			"fail to unregister device event callback.\n");
+		return -1;
+	}
+
+	return 0;
+}
diff --git a/app/test-pmd/meson.build b/app/test-pmd/meson.build
index 719f875be0..b7860f3ab0 100644
--- a/app/test-pmd/meson.build
+++ b/app/test-pmd/meson.build
@@ -14,6 +14,7 @@ sources = files(
         'cmd_flex_item.c',
         'config.c',
         'csumonly.c',
+        'event.c',
         'flowgen.c',
         'icmpecho.c',
         'ieee1588fwd.c',
diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
index a9ca58339d..504315da8b 100644
--- a/app/test-pmd/parameters.c
+++ b/app/test-pmd/parameters.c
@@ -434,45 +434,19 @@ static int
 parse_event_printing_config(const char *optarg, int enable)
 {
 	uint32_t mask = 0;
+	int ret;
 
-	if (!strcmp(optarg, "unknown"))
-		mask = UINT32_C(1) << RTE_ETH_EVENT_UNKNOWN;
-	else if (!strcmp(optarg, "intr_lsc"))
-		mask = UINT32_C(1) << RTE_ETH_EVENT_INTR_LSC;
-	else if (!strcmp(optarg, "queue_state"))
-		mask = UINT32_C(1) << RTE_ETH_EVENT_QUEUE_STATE;
-	else if (!strcmp(optarg, "intr_reset"))
-		mask = UINT32_C(1) << RTE_ETH_EVENT_INTR_RESET;
-	else if (!strcmp(optarg, "vf_mbox"))
-		mask = UINT32_C(1) << RTE_ETH_EVENT_VF_MBOX;
-	else if (!strcmp(optarg, "ipsec"))
-		mask = UINT32_C(1) << RTE_ETH_EVENT_IPSEC;
-	else if (!strcmp(optarg, "macsec"))
-		mask = UINT32_C(1) << RTE_ETH_EVENT_MACSEC;
-	else if (!strcmp(optarg, "intr_rmv"))
-		mask = UINT32_C(1) << RTE_ETH_EVENT_INTR_RMV;
-	else if (!strcmp(optarg, "dev_probed"))
-		mask = UINT32_C(1) << RTE_ETH_EVENT_NEW;
-	else if (!strcmp(optarg, "dev_released"))
-		mask = UINT32_C(1) << RTE_ETH_EVENT_DESTROY;
-	else if (!strcmp(optarg, "flow_aged"))
-		mask = UINT32_C(1) << RTE_ETH_EVENT_FLOW_AGED;
-	else if (!strcmp(optarg, "err_recovering"))
-		mask = UINT32_C(1) << RTE_ETH_EVENT_ERR_RECOVERING;
-	else if (!strcmp(optarg, "recovery_success"))
-		mask = UINT32_C(1) << RTE_ETH_EVENT_RECOVERY_SUCCESS;
-	else if (!strcmp(optarg, "recovery_failed"))
-		mask = UINT32_C(1) << RTE_ETH_EVENT_RECOVERY_FAILED;
-	else if (!strcmp(optarg, "all"))
-		mask = ~UINT32_C(0);
-	else {
+	ret = get_event_name_mask(optarg, &mask);
+	if (ret != 0) {
 		fprintf(stderr, "Invalid event: %s\n", optarg);
 		return -1;
 	}
+
 	if (enable)
 		event_print_mask |= mask;
 	else
 		event_print_mask &= ~mask;
+
 	return 0;
 }
 
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 39a25238e5..3a664fec66 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -435,41 +435,6 @@ uint8_t clear_ptypes = true;
 /* Hairpin ports configuration mode. */
 uint32_t hairpin_mode;
 
-/* Pretty printing of ethdev events */
-static const char * const eth_event_desc[] = {
-	[RTE_ETH_EVENT_UNKNOWN] = "unknown",
-	[RTE_ETH_EVENT_INTR_LSC] = "link state change",
-	[RTE_ETH_EVENT_QUEUE_STATE] = "queue state",
-	[RTE_ETH_EVENT_INTR_RESET] = "reset",
-	[RTE_ETH_EVENT_VF_MBOX] = "VF mbox",
-	[RTE_ETH_EVENT_IPSEC] = "IPsec",
-	[RTE_ETH_EVENT_MACSEC] = "MACsec",
-	[RTE_ETH_EVENT_INTR_RMV] = "device removal",
-	[RTE_ETH_EVENT_NEW] = "device probed",
-	[RTE_ETH_EVENT_DESTROY] = "device released",
-	[RTE_ETH_EVENT_FLOW_AGED] = "flow aged",
-	[RTE_ETH_EVENT_RX_AVAIL_THRESH] = "RxQ available descriptors threshold reached",
-	[RTE_ETH_EVENT_ERR_RECOVERING] = "error recovering",
-	[RTE_ETH_EVENT_RECOVERY_SUCCESS] = "error recovery successful",
-	[RTE_ETH_EVENT_RECOVERY_FAILED] = "error recovery failed",
-	[RTE_ETH_EVENT_MAX] = NULL,
-};
-
-/*
- * Display or mask ether events
- * Default to all events except VF_MBOX
- */
-uint32_t event_print_mask = (UINT32_C(1) << RTE_ETH_EVENT_UNKNOWN) |
-			    (UINT32_C(1) << RTE_ETH_EVENT_INTR_LSC) |
-			    (UINT32_C(1) << RTE_ETH_EVENT_QUEUE_STATE) |
-			    (UINT32_C(1) << RTE_ETH_EVENT_INTR_RESET) |
-			    (UINT32_C(1) << RTE_ETH_EVENT_IPSEC) |
-			    (UINT32_C(1) << RTE_ETH_EVENT_MACSEC) |
-			    (UINT32_C(1) << RTE_ETH_EVENT_INTR_RMV) |
-			    (UINT32_C(1) << RTE_ETH_EVENT_FLOW_AGED) |
-			    (UINT32_C(1) << RTE_ETH_EVENT_ERR_RECOVERING) |
-			    (UINT32_C(1) << RTE_ETH_EVENT_RECOVERY_SUCCESS) |
-			    (UINT32_C(1) << RTE_ETH_EVENT_RECOVERY_FAILED);
 /*
  * Decide if all memory are locked for performance.
  */
@@ -701,12 +666,6 @@ eth_dev_set_mtu_mp(uint16_t port_id, uint16_t mtu)
 /* Forward function declarations */
 static void setup_attached_port(portid_t pi);
 static void check_all_ports_link_status(uint32_t port_mask);
-static int eth_event_callback(portid_t port_id,
-			      enum rte_eth_event_type type,
-			      void *param, void *ret_param);
-static void dev_event_callback(const char *device_name,
-				enum rte_dev_event_type type,
-				void *param);
 static void fill_xstats_display_info(void);
 
 /*
@@ -3672,7 +3631,7 @@ setup_attached_port(portid_t pi)
 	printf("Done\n");
 }
 
-static void
+void
 detach_device(struct rte_device *dev)
 {
 	portid_t sibling;
@@ -3818,13 +3777,9 @@ pmd_test_exit(void)
 			return;
 		}
 
-		ret = rte_dev_event_callback_unregister(NULL,
-			dev_event_callback, NULL);
-		if (ret < 0) {
-			RTE_LOG(ERR, EAL,
-				"fail to unregister device event callback.\n");
+		ret = unregister_dev_event_callback();
+		if (ret != 0)
 			return;
-		}
 
 		ret = rte_dev_hotplug_handle_disable();
 		if (ret) {
@@ -3909,274 +3864,6 @@ check_all_ports_link_status(uint32_t port_mask)
 	}
 }
 
-static void
-rmv_port_callback(void *arg)
-{
-	int need_to_start = 0;
-	int org_no_link_check = no_link_check;
-	portid_t port_id = (intptr_t)arg;
-	struct rte_eth_dev_info dev_info;
-	int ret;
-
-	RTE_ETH_VALID_PORTID_OR_RET(port_id);
-
-	if (!test_done && port_is_forwarding(port_id)) {
-		need_to_start = 1;
-		stop_packet_forwarding();
-	}
-	no_link_check = 1;
-	stop_port(port_id);
-	no_link_check = org_no_link_check;
-
-	ret = eth_dev_info_get_print_err(port_id, &dev_info);
-	if (ret != 0)
-		TESTPMD_LOG(ERR,
-			"Failed to get device info for port %d, not detaching\n",
-			port_id);
-	else {
-		struct rte_device *device = dev_info.device;
-		close_port(port_id);
-		detach_device(device); /* might be already removed or have more ports */
-	}
-	if (need_to_start)
-		start_packet_forwarding(0);
-}
-
-static int need_start_when_recovery_over;
-
-static bool
-has_port_in_err_recovering(void)
-{
-	struct rte_port *port;
-	portid_t pid;
-
-	RTE_ETH_FOREACH_DEV(pid) {
-		port = &ports[pid];
-		if (port->err_recovering)
-			return true;
-	}
-
-	return false;
-}
-
-static void
-err_recovering_callback(portid_t port_id)
-{
-	if (!has_port_in_err_recovering())
-		printf("Please stop executing any commands until recovery result events are received!\n");
-
-	ports[port_id].err_recovering = 1;
-	ports[port_id].recover_failed = 0;
-
-	/* To simplify implementation, stop forwarding regardless of whether the port is used. */
-	if (!test_done) {
-		printf("Stop packet forwarding because some ports are in error recovering!\n");
-		stop_packet_forwarding();
-		need_start_when_recovery_over = 1;
-	}
-}
-
-static void
-recover_success_callback(portid_t port_id)
-{
-	ports[port_id].err_recovering = 0;
-	if (has_port_in_err_recovering())
-		return;
-
-	if (need_start_when_recovery_over) {
-		printf("Recovery success! Restart packet forwarding!\n");
-		start_packet_forwarding(0);
-		need_start_when_recovery_over = 0;
-	} else {
-		printf("Recovery success!\n");
-	}
-}
-
-static void
-recover_failed_callback(portid_t port_id)
-{
-	struct rte_port *port;
-	portid_t pid;
-
-	ports[port_id].err_recovering = 0;
-	ports[port_id].recover_failed = 1;
-	if (has_port_in_err_recovering())
-		return;
-
-	need_start_when_recovery_over = 0;
-	printf("The ports:");
-	RTE_ETH_FOREACH_DEV(pid) {
-		port = &ports[pid];
-		if (port->recover_failed)
-			printf(" %u", pid);
-	}
-	printf(" recovery failed! Please remove them!\n");
-}
-
-/* This function is used by the interrupt thread */
-static int
-eth_event_callback(portid_t port_id, enum rte_eth_event_type type, void *param,
-		  void *ret_param)
-{
-	RTE_SET_USED(param);
-	RTE_SET_USED(ret_param);
-
-	if (type >= RTE_ETH_EVENT_MAX) {
-		fprintf(stderr,
-			"\nPort %" PRIu16 ": %s called upon invalid event %d\n",
-			port_id, __func__, type);
-		fflush(stderr);
-	} else if (event_print_mask & (UINT32_C(1) << type)) {
-		printf("\nPort %" PRIu16 ": %s event\n", port_id,
-			eth_event_desc[type]);
-		fflush(stdout);
-	}
-
-	switch (type) {
-	case RTE_ETH_EVENT_NEW:
-		ports[port_id].need_setup = 1;
-		ports[port_id].port_status = RTE_PORT_HANDLING;
-		break;
-	case RTE_ETH_EVENT_INTR_RMV:
-		if (port_id_is_invalid(port_id, DISABLED_WARN))
-			break;
-		if (rte_eal_alarm_set(100000,
-				rmv_port_callback, (void *)(intptr_t)port_id))
-			fprintf(stderr,
-				"Could not set up deferred device removal\n");
-		break;
-	case RTE_ETH_EVENT_DESTROY:
-		ports[port_id].port_status = RTE_PORT_CLOSED;
-		printf("Port %u is closed\n", port_id);
-		break;
-	case RTE_ETH_EVENT_RX_AVAIL_THRESH: {
-		uint16_t rxq_id;
-		int ret;
-
-		/* avail_thresh query API rewinds rxq_id, no need to check max RxQ num */
-		for (rxq_id = 0; ; rxq_id++) {
-			ret = rte_eth_rx_avail_thresh_query(port_id, &rxq_id,
-							    NULL);
-			if (ret <= 0)
-				break;
-			printf("Received avail_thresh event, port: %u, rxq_id: %u\n",
-			       port_id, rxq_id);
-
-#ifdef RTE_NET_MLX5
-			mlx5_test_avail_thresh_event_handler(port_id, rxq_id);
-#endif
-		}
-		break;
-	}
-	case RTE_ETH_EVENT_ERR_RECOVERING:
-		err_recovering_callback(port_id);
-		break;
-	case RTE_ETH_EVENT_RECOVERY_SUCCESS:
-		recover_success_callback(port_id);
-		break;
-	case RTE_ETH_EVENT_RECOVERY_FAILED:
-		recover_failed_callback(port_id);
-		break;
-	default:
-		break;
-	}
-	return 0;
-}
-
-static int
-register_eth_event_callback(void)
-{
-	int ret;
-	enum rte_eth_event_type event;
-
-	for (event = RTE_ETH_EVENT_UNKNOWN;
-			event < RTE_ETH_EVENT_MAX; event++) {
-		ret = rte_eth_dev_callback_register(RTE_ETH_ALL,
-				event,
-				eth_event_callback,
-				NULL);
-		if (ret != 0) {
-			TESTPMD_LOG(ERR, "Failed to register callback for "
-					"%s event\n", eth_event_desc[event]);
-			return -1;
-		}
-	}
-
-	return 0;
-}
-
-static int
-unregister_eth_event_callback(void)
-{
-	int ret;
-	enum rte_eth_event_type event;
-
-	for (event = RTE_ETH_EVENT_UNKNOWN;
-			event < RTE_ETH_EVENT_MAX; event++) {
-		ret = rte_eth_dev_callback_unregister(RTE_ETH_ALL,
-				event,
-				eth_event_callback,
-				NULL);
-		if (ret != 0) {
-			TESTPMD_LOG(ERR, "Failed to unregister callback for "
-					"%s event\n", eth_event_desc[event]);
-			return -1;
-		}
-	}
-
-	return 0;
-}
-
-/* This function is used by the interrupt thread */
-static void
-dev_event_callback(const char *device_name, enum rte_dev_event_type type,
-			     __rte_unused void *arg)
-{
-	uint16_t port_id;
-	int ret;
-
-	if (type >= RTE_DEV_EVENT_MAX) {
-		fprintf(stderr, "%s called upon invalid event %d\n",
-			__func__, type);
-		fflush(stderr);
-	}
-
-	switch (type) {
-	case RTE_DEV_EVENT_REMOVE:
-		RTE_LOG(DEBUG, EAL, "The device: %s has been removed!\n",
-			device_name);
-		ret = rte_eth_dev_get_port_by_name(device_name, &port_id);
-		if (ret) {
-			RTE_LOG(ERR, EAL, "can not get port by device %s!\n",
-				device_name);
-			return;
-		}
-		/*
-		 * Because the user's callback is invoked in eal interrupt
-		 * callback, the interrupt callback need to be finished before
-		 * it can be unregistered when detaching device. So finish
-		 * callback soon and use a deferred removal to detach device
-		 * is need. It is a workaround, once the device detaching be
-		 * moved into the eal in the future, the deferred removal could
-		 * be deleted.
-		 */
-		if (rte_eal_alarm_set(100000,
-				rmv_port_callback, (void *)(intptr_t)port_id))
-			RTE_LOG(ERR, EAL,
-				"Could not set up deferred device removal\n");
-		break;
-	case RTE_DEV_EVENT_ADD:
-		RTE_LOG(ERR, EAL, "The device: %s has been added!\n",
-			device_name);
-		/* TODO: After finish kernel driver binding,
-		 * begin to attach port.
-		 */
-		break;
-	default:
-		break;
-	}
-}
-
 static void
 rxtx_port_config(portid_t pid)
 {
@@ -4725,13 +4412,9 @@ main(int argc, char** argv)
 			return -1;
 		}
 
-		ret = rte_dev_event_callback_register(NULL,
-			dev_event_callback, NULL);
-		if (ret) {
-			RTE_LOG(ERR, EAL,
-				"fail  to register device event callback\n");
+		ret = register_dev_event_callback();
+		if (ret != 0)
 			return -1;
-		}
 	}
 
 	if (!no_device_start && start_port(RTE_PORT_ALL) != 0) {
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 42782d5a05..5c8a052b43 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -1109,6 +1109,11 @@ void set_nb_pkt_per_burst(uint16_t pkt_burst);
 char *list_pkt_forwarding_modes(void);
 char *list_pkt_forwarding_retry_modes(void);
 void set_pkt_forwarding_mode(const char *fwd_mode);
+int get_event_name_mask(const char *name, uint32_t *mask);
+int register_eth_event_callback(void);
+int unregister_eth_event_callback(void);
+int register_dev_event_callback(void);
+int unregister_dev_event_callback(void);
 void start_packet_forwarding(int with_tx_first);
 void fwd_stats_display(void);
 void fwd_stats_reset(void);
@@ -1128,6 +1133,7 @@ void stop_port(portid_t pid);
 void close_port(portid_t pid);
 void reset_port(portid_t pid);
 void attach_port(char *identifier);
+void detach_device(struct rte_device *dev);
 void detach_devargs(char *identifier);
 void detach_port_device(portid_t port_id);
 int all_ports_stopped(void);
-- 
2.17.1


^ permalink raw reply	[flat|nested] 85+ messages in thread

* [PATCH v2 7/7] doc: testpmd support event handling section
  2023-10-20 10:07 ` [PATCH v2 0/7] " Chengwen Feng
                     ` (5 preceding siblings ...)
  2023-10-20 10:07   ` [PATCH v2 6/7] app/testpmd: extract event handling to event.c Chengwen Feng
@ 2023-10-20 10:07   ` Chengwen Feng
  2023-11-06  9:28     ` lihuisong (C)
  2023-11-06  1:35   ` [PATCH v2 0/7] fix race-condition of proactive error handling mode fengchengwen
  7 siblings, 1 reply; 85+ messages in thread
From: Chengwen Feng @ 2023-10-20 10:07 UTC (permalink / raw)
  To: thomas, ferruh.yigit, konstantin.ananyev, ajit.khaparde,
	Aman Singh, Yuying Zhang
  Cc: dev, andrew.rybchenko, kalesh-anakkur.purayil, Honnappa.Nagarahalli

Add a new event handling section, which documents the ethdev and
device events.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
---
 doc/guides/testpmd_app_ug/event_handling.rst | 80 ++++++++++++++++++++
 doc/guides/testpmd_app_ug/index.rst          |  1 +
 2 files changed, 81 insertions(+)
 create mode 100644 doc/guides/testpmd_app_ug/event_handling.rst

diff --git a/doc/guides/testpmd_app_ug/event_handling.rst b/doc/guides/testpmd_app_ug/event_handling.rst
new file mode 100644
index 0000000000..c116753ad0
--- /dev/null
+++ b/doc/guides/testpmd_app_ug/event_handling.rst
@@ -0,0 +1,80 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2023 HiSilicon Limited.
+
+Event Handling
+==============
+
+The ``testpmd`` application handles the following two types of events:
+
+ethdev events
+-------------
+
+The ``testpmd`` options "--print-event" and "--mask-event" control whether a
+message such as "Port x y event" is displayed when event "y" is received on
+port "x". This is called the default processing.
+
+This section details the supported events. Unless otherwise specified, only
+the default processing is supported.
+
+- ``RTE_ETH_EVENT_INTR_LSC``:
+  If the device is started with LSC enabled, the PMD reports this event when
+  it detects a link status change.
+
+- ``RTE_ETH_EVENT_QUEUE_STATE``:
+  Used only by the vhost PMD to report whether a vring is enabled.
+
+- ``RTE_ETH_EVENT_INTR_RESET``:
+  Used to report that a reset interrupt happened. This event is only reported
+  when the PMD supports ``RTE_ETH_ERROR_HANDLE_MODE_PASSIVE``.
+
+- ``RTE_ETH_EVENT_VF_MBOX``:
+  Used by a PF to process mailbox messages from the VFs that belong to it.
+
+- ``RTE_ETH_EVENT_INTR_RMV``:
+  Used to report a device removal event. ``testpmd`` will remove the port
+  later.
+
+- ``RTE_ETH_EVENT_NEW``:
+  Used to report that a port was probed. ``testpmd`` will set up the port
+  later.
+
+- ``RTE_ETH_EVENT_DESTROY``:
+  Used to report that a port was released. ``testpmd`` will change the
+  port's status.
+
+- ``RTE_ETH_EVENT_MACSEC``:
+  Used to report MACsec offload related event.
+
+- ``RTE_ETH_EVENT_IPSEC``:
+  Used to report IPsec offload related event.
+
+- ``RTE_ETH_EVENT_FLOW_AGED``:
+  Used to report newly detected aged-out flows. Only valid with the mlx5 PMD.
+
+- ``RTE_ETH_EVENT_RX_AVAIL_THRESH``:
+  Used to report that the number of available Rx descriptors fell below the
+  threshold. Only valid with the mlx5 PMD.
+
+- ``RTE_ETH_EVENT_ERR_RECOVERING``:
+  Used to report that an error happened and that the PMD will start recovery
+  after reporting it. ``testpmd`` stops packet forwarding on this event.
+
+- ``RTE_ETH_EVENT_RECOVERY_SUCCESS``:
+  Used to report that error recovery succeeded. ``testpmd`` restarts packet
+  forwarding when it receives the event.
+
+- ``RTE_ETH_EVENT_RECOVERY_FAILED``:
+  Used to report that error recovery failed. ``testpmd`` displays a message
+  listing the ports that failed.
+
+.. note::
+
+   The ``RTE_ETH_EVENT_ERR_RECOVERING``, ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` and
+   ``RTE_ETH_EVENT_RECOVERY_FAILED`` events are only reported when the PMD
+   supports ``RTE_ETH_ERROR_HANDLE_MODE_PROACTIVE``.
+
+device events
+-------------
+
+There are two device events, ``RTE_DEV_EVENT_ADD`` and ``RTE_DEV_EVENT_REMOVE``.
+They are enabled only when ``testpmd`` is started with the "--hot-plug" option.
diff --git a/doc/guides/testpmd_app_ug/index.rst b/doc/guides/testpmd_app_ug/index.rst
index 1ac0d25d57..3c09448c4e 100644
--- a/doc/guides/testpmd_app_ug/index.rst
+++ b/doc/guides/testpmd_app_ug/index.rst
@@ -14,3 +14,4 @@ Testpmd Application User Guide
     build_app
     run_app
     testpmd_funcs
+    event_handling
-- 
2.17.1


^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH v2 1/7] ethdev: fix race-condition of proactive error handling mode
  2023-10-20 10:07   ` [PATCH v2 1/7] ethdev: " Chengwen Feng
@ 2023-11-01  3:39     ` lihuisong (C)
  0 siblings, 0 replies; 85+ messages in thread
From: lihuisong (C) @ 2023-11-01  3:39 UTC (permalink / raw)
  To: Chengwen Feng, thomas, ferruh.yigit, konstantin.ananyev,
	ajit.khaparde, Andrew Rybchenko, Somnath Kotur, Kalesh AP
  Cc: dev, Honnappa.Nagarahalli

lgtm,
Acked-by: Huisong Li <lihuisong@huawei.com>

On 2023/10/20 18:07, Chengwen Feng wrote:
> In the proactive error handling mode, the PMD will set the data path
> pointers to dummy functions and then try recovery, in this period the
> application may still invoking data path API. This will introduce a
> race-condition with data path which may lead to crash [1].
>
> Although the PMD added delay after setting data path pointers to cover
> the above race-condition, it reduces the probability, but it doesn't
> solve the problem.
>
> To solve the race-condition problem fundamentally, the following
> requirements are added:
> 1. The PMD should set the data path pointers to dummy functions after
>     report RTE_ETH_EVENT_ERR_RECOVERING event.
> 2. The application should stop data path API invocation when process
>     the RTE_ETH_EVENT_ERR_RECOVERING event.
> 3. The PMD should set the data path pointers to valid functions before
>     report RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> 4. The application should enable data path API invocation when process
>     the RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>
> Also, this patch introduce a driver internal function
> rte_eth_fp_ops_setup which used as an help function for PMD.
Agreed with adding this internal interface for PMDs to use.
Otherwise drivers have to operate on the global variable directly.
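
To illustrate the point, here is a minimal standalone sketch of the pattern.
It does not use the DPDK headers: every model_* name and MAX_PORTS are
hypothetical stand-ins (for struct rte_eth_dev, the rte_eth_fp_ops[] table and
rte_eth_pkt_burst_dummy), and only two of the fast-path pointers are modelled.
The idea is that a driver only updates its own device structure and then calls
one setup helper, instead of indexing the global table itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PORTS 4 /* hypothetical; stands in for the real port limit */

typedef uint16_t (*burst_fn)(void *queue, void **pkts, uint16_t nb_pkts);

/* Simplified stand-in for the driver-owned device structure. */
struct model_dev {
	uint16_t port_id;
	burst_fn rx_pkt_burst;
	burst_fn tx_pkt_burst;
};

/* Simplified stand-in for the per-port table read by the data path. */
struct model_fp_ops {
	burst_fn rx_pkt_burst;
	burst_fn tx_pkt_burst;
};

static struct model_fp_ops model_fp_ops[MAX_PORTS];

/* Dummy burst: handles no packets, whatever the caller passes in. */
static uint16_t
model_burst_dummy(void *queue, void **pkts, uint16_t nb_pkts)
{
	(void)queue;
	(void)pkts;
	(void)nb_pkts;
	return 0;
}

/* "Real" burst for the model: pretends every packet was handled. */
static uint16_t
model_burst_real(void *queue, void **pkts, uint16_t nb_pkts)
{
	(void)queue;
	(void)pkts;
	return nb_pkts;
}

/* The one place that copies device pointers into the global table. */
static void
model_fp_ops_setup(const struct model_dev *dev)
{
	if (dev == NULL)
		return;
	assert(dev->port_id < MAX_PORTS);
	model_fp_ops[dev->port_id].rx_pkt_burst = dev->rx_pkt_burst;
	model_fp_ops[dev->port_id].tx_pkt_burst = dev->tx_pkt_burst;
}

/* Driver side: switch the data path to dummies before recovery. */
static void
model_stop_rxtx(struct model_dev *dev)
{
	dev->rx_pkt_burst = model_burst_dummy;
	dev->tx_pkt_burst = model_burst_dummy;
	model_fp_ops_setup(dev);
}

/* Driver side: restore valid functions once recovery has finished. */
static void
model_start_rxtx(struct model_dev *dev)
{
	dev->rx_pkt_burst = model_burst_real;
	dev->tx_pkt_burst = model_burst_real;
	model_fp_ops_setup(dev);
}
```

With this shape, adding another fast-path pointer only touches the setup
helper, not every driver that stops or restarts its data path.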
>
> [1] http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-ashok.k.kaladi@intel.com/
>
> Fixes: eb0d471a8941 ("ethdev: add proactive error handling mode")
> Cc: stable@dpdk.org
>
> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
> ---
>   doc/guides/prog_guide/poll_mode_drv.rst | 20 +++++++---------
>   lib/ethdev/ethdev_driver.c              |  8 +++++++
>   lib/ethdev/ethdev_driver.h              | 10 ++++++++
>   lib/ethdev/rte_ethdev.h                 | 32 +++++++++++++++----------
>   lib/ethdev/version.map                  |  1 +
>   5 files changed, 46 insertions(+), 25 deletions(-)
>
> diff --git a/doc/guides/prog_guide/poll_mode_drv.rst b/doc/guides/prog_guide/poll_mode_drv.rst
> index c145a9066c..e380ff135a 100644
> --- a/doc/guides/prog_guide/poll_mode_drv.rst
> +++ b/doc/guides/prog_guide/poll_mode_drv.rst
> @@ -638,14 +638,9 @@ different from the application invokes recovery in PASSIVE mode,
>   the PMD automatically recovers from error in PROACTIVE mode,
>   and only a small amount of work is required for the application.
>   
> -During error detection and automatic recovery,
> -the PMD sets the data path pointers to dummy functions
> -(which will prevent the crash),
> -and also make sure the control path operations fail with a return code ``-EBUSY``.
> -
> -Because the PMD recovers automatically,
> -the application can only sense that the data flow is disconnected for a while
> -and the control API returns an error in this period.
> +During error detection and automatic recovery, the PMD sets the data path
> +pointers to dummy functions and also make sure the control path operations
> +failed with a return code ``-EBUSY``.
>   
>   In order to sense the error happening/recovering,
>   as well as to restore some additional configuration,
> @@ -653,9 +648,9 @@ three events are available:
>   
>   ``RTE_ETH_EVENT_ERR_RECOVERING``
>      Notify the application that an error is detected
> -   and the recovery is being started.
> +   and the recovery is about to start.
>      Upon receiving the event, the application should not invoke
> -   any control path function until receiving
> +   any control and data path API until receiving
>      ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` or ``RTE_ETH_EVENT_RECOVERY_FAILED`` event.
>   
>   .. note::
> @@ -666,8 +661,9 @@ three events are available:
>   
>   ``RTE_ETH_EVENT_RECOVERY_SUCCESS``
>      Notify the application that the recovery from error is successful,
> -   the PMD already re-configures the port,
> -   and the effect is the same as a restart operation.
> +   the PMD already re-configures the port.
> +   The application should restore some additional configuration, and then
> +   enable data path API invocation.
>   
>   ``RTE_ETH_EVENT_RECOVERY_FAILED``
>      Notify the application that the recovery from error failed,
> diff --git a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c
> index fff4b7b4cd..65ead7b910 100644
> --- a/lib/ethdev/ethdev_driver.c
> +++ b/lib/ethdev/ethdev_driver.c
> @@ -537,6 +537,14 @@ rte_eth_dma_zone_free(const struct rte_eth_dev *dev, const char *ring_name,
>   	return rc;
>   }
>   
> +void
> +rte_eth_fp_ops_setup(struct rte_eth_dev *dev)
> +{
> +	if (dev == NULL)
> +		return;
> +	eth_dev_fp_ops_setup(rte_eth_fp_ops + dev->data->port_id, dev);
> +}
> +
>   const struct rte_memzone *
>   rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char *ring_name,
>   			 uint16_t queue_id, size_t size, unsigned int align,
> diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
> index deb23ada18..8567b96f53 100644
> --- a/lib/ethdev/ethdev_driver.h
> +++ b/lib/ethdev/ethdev_driver.h
> @@ -1636,6 +1636,16 @@ int
>   rte_eth_dma_zone_free(const struct rte_eth_dev *eth_dev, const char *name,
>   		 uint16_t queue_id);
>   
> +/**
> + * @internal
> + * Setup eth fast-path API to ethdev values.
> + *
> + * @param dev
> + *  Pointer to struct rte_eth_dev.
> + */
> +__rte_internal
> +void rte_eth_fp_ops_setup(struct rte_eth_dev *dev);
> +
>   /**
>    * @internal
>    * Atomically set the link status for the specific device.
> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> index 85b9af7a02..dbe2d9c745 100644
> --- a/lib/ethdev/rte_ethdev.h
> +++ b/lib/ethdev/rte_ethdev.h
> @@ -3994,25 +3994,28 @@ enum rte_eth_event_type {
>   	 */
>   	RTE_ETH_EVENT_RX_AVAIL_THRESH,
>   	/** Port recovering from a hardware or firmware error.
> -	 * If PMD supports proactive error recovery,
> -	 * it should trigger this event to notify application
> -	 * that it detected an error and the recovery is being started.
> -	 * Upon receiving the event, the application should not invoke any control path API
> -	 * (such as rte_eth_dev_configure/rte_eth_dev_stop...) until receiving
> -	 * RTE_ETH_EVENT_RECOVERY_SUCCESS or RTE_ETH_EVENT_RECOVERY_FAILED event.
> -	 * The PMD will set the data path pointers to dummy functions,
> -	 * and re-set the data path pointers to non-dummy functions
> -	 * before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> -	 * It means that the application cannot send or receive any packets
> -	 * during this period.
> +	 *
> +	 * If PMD supports proactive error recovery, it should trigger this
> +	 * event to notify application that it detected an error and the
> +	 * recovery is about to start.
> +	 *
> +	 * Upon receiving the event, the application should not invoke any
> +	 * control and data path API until receiving
> +	 * RTE_ETH_EVENT_RECOVERY_SUCCESS or RTE_ETH_EVENT_RECOVERY_FAILED
> +	 * event.
> +	 *
> +	 * Once this event is reported, the PMD will set the data path pointers
> +	 * to dummy functions, and re-set the data path pointers to valid
> +	 * functions before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> +	 *
>   	 * @note Before the PMD reports the recovery result,
>   	 * the PMD may report the RTE_ETH_EVENT_ERR_RECOVERING event again,
>   	 * because a larger error may occur during the recovery.
>   	 */
>   	RTE_ETH_EVENT_ERR_RECOVERING,
>   	/** Port recovers successfully from the error.
> -	 * The PMD already re-configured the port,
> -	 * and the effect is the same as a restart operation.
> +	 *
> +	 * The PMD already re-configured the port:
>   	 * a) The following operation will be retained: (alphabetically)
>   	 *    - DCB configuration
>   	 *    - FEC configuration
> @@ -4039,6 +4042,9 @@ enum rte_eth_event_type {
>   	 *      (@see RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP)
>   	 * c) Any other configuration will not be stored
>   	 *    and will need to be re-configured.
> +	 *
> +	 * The application should restore some additional configuration
> +	 * (see above case b/c), and then enable data path API invocation.
>   	 */
>   	RTE_ETH_EVENT_RECOVERY_SUCCESS,
>   	/** Port recovery failed.
> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
> index 919ba5b8e6..1e6ee0a6f1 100644
> --- a/lib/ethdev/version.map
> +++ b/lib/ethdev/version.map
> @@ -338,6 +338,7 @@ INTERNAL {
>   	rte_eth_devices;
>   	rte_eth_dma_zone_free;
>   	rte_eth_dma_zone_reserve;
> +	rte_eth_fp_ops_setup;
>   	rte_eth_hairpin_queue_peer_bind;
>   	rte_eth_hairpin_queue_peer_unbind;
>   	rte_eth_hairpin_queue_peer_update;

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH v2 2/7] net/hns3: replace fp ops config function
  2023-10-20 10:07   ` [PATCH v2 2/7] net/hns3: replace fp ops config function Chengwen Feng
@ 2023-11-01  3:40     ` lihuisong (C)
  2023-11-02 10:34     ` Konstantin Ananyev
  1 sibling, 0 replies; 85+ messages in thread
From: lihuisong (C) @ 2023-11-01  3:40 UTC (permalink / raw)
  To: Chengwen Feng, thomas, ferruh.yigit, konstantin.ananyev,
	ajit.khaparde, Jie Hai, Yisen Zhuang
  Cc: dev, andrew.rybchenko, kalesh-anakkur.purayil, Honnappa.Nagarahalli

lgtm,
Acked-by: Huisong Li <lihuisong@huawei.com>

On 2023/10/20 18:07, Chengwen Feng wrote:
> This patch replace hns3_eth_dev_fp_ops_config() with
> rte_eth_fp_ops_setup().
>
> Cc: stable@dpdk.org
>
> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> Acked-by: Dongdong Liu <liudongdong3@huawei.com>
> ---
>   drivers/net/hns3/hns3_rxtx.c | 21 +++------------------
>   1 file changed, 3 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c
> index f3c3b38c55..f43f1eb9ad 100644
> --- a/drivers/net/hns3/hns3_rxtx.c
> +++ b/drivers/net/hns3/hns3_rxtx.c
> @@ -4434,21 +4434,6 @@ hns3_trace_rxtx_function(struct rte_eth_dev *dev)
>   		 rx_mode.info, tx_mode.info);
>   }
>   
> -static void
> -hns3_eth_dev_fp_ops_config(const struct rte_eth_dev *dev)
> -{
> -	struct rte_eth_fp_ops *fpo = rte_eth_fp_ops;
> -	uint16_t port_id = dev->data->port_id;
> -
> -	fpo[port_id].rx_pkt_burst = dev->rx_pkt_burst;
> -	fpo[port_id].tx_pkt_burst = dev->tx_pkt_burst;
> -	fpo[port_id].tx_pkt_prepare = dev->tx_pkt_prepare;
> -	fpo[port_id].rx_descriptor_status = dev->rx_descriptor_status;
> -	fpo[port_id].tx_descriptor_status = dev->tx_descriptor_status;
> -	fpo[port_id].rxq.data = dev->data->rx_queues;
> -	fpo[port_id].txq.data = dev->data->tx_queues;
> -}
> -
>   void
>   hns3_set_rxtx_function(struct rte_eth_dev *eth_dev)
>   {
> @@ -4471,7 +4456,7 @@ hns3_set_rxtx_function(struct rte_eth_dev *eth_dev)
>   	}
>   
>   	hns3_trace_rxtx_function(eth_dev);
> -	hns3_eth_dev_fp_ops_config(eth_dev);
> +	rte_eth_fp_ops_setup(eth_dev);
>   }
>   
>   void
> @@ -4824,7 +4809,7 @@ hns3_stop_tx_datapath(struct rte_eth_dev *dev)
>   {
>   	dev->tx_pkt_burst = rte_eth_pkt_burst_dummy;
>   	dev->tx_pkt_prepare = NULL;
> -	hns3_eth_dev_fp_ops_config(dev);
> +	rte_eth_fp_ops_setup(dev);
>   
>   	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
>   		return;
> @@ -4841,7 +4826,7 @@ hns3_start_tx_datapath(struct rte_eth_dev *dev)
>   {
>   	dev->tx_pkt_burst = hns3_get_tx_function(dev);
>   	dev->tx_pkt_prepare = hns3_get_tx_prepare(dev);
> -	hns3_eth_dev_fp_ops_config(dev);
> +	rte_eth_fp_ops_setup(dev);
>   
>   	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
>   		return;

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH v2 4/7] net/bnxt: use fp ops setup function
  2023-10-20 10:07   ` [PATCH v2 4/7] net/bnxt: use fp ops setup function Chengwen Feng
@ 2023-11-01  3:48     ` lihuisong (C)
  2023-11-02 10:34     ` Konstantin Ananyev
  1 sibling, 0 replies; 85+ messages in thread
From: lihuisong (C) @ 2023-11-01  3:48 UTC (permalink / raw)
  To: Chengwen Feng, thomas, ferruh.yigit, konstantin.ananyev,
	ajit.khaparde, Somnath Kotur
  Cc: dev, andrew.rybchenko, kalesh-anakkur.purayil, Honnappa.Nagarahalli

+1 to using the new API to modify rte_eth_fp_ops[]
Acked-by: Huisong Li <lihuisong@huawei.com>


On 2023/10/20 18:07, Chengwen Feng wrote:
> Use rte_eth_fp_ops_setup() instead of directly manipulating
> rte_eth_fp_ops variable.
>
> Cc: stable@dpdk.org
>
> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> ---
>   drivers/net/bnxt/bnxt_cpr.c    | 5 +----
>   drivers/net/bnxt/bnxt_ethdev.c | 5 +----
>   2 files changed, 2 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/net/bnxt/bnxt_cpr.c b/drivers/net/bnxt/bnxt_cpr.c
> index d8947d5b5f..3a08028331 100644
> --- a/drivers/net/bnxt/bnxt_cpr.c
> +++ b/drivers/net/bnxt/bnxt_cpr.c
> @@ -416,10 +416,7 @@ void bnxt_stop_rxtx(struct rte_eth_dev *eth_dev)
>   	eth_dev->rx_pkt_burst = rte_eth_pkt_burst_dummy;
>   	eth_dev->tx_pkt_burst = rte_eth_pkt_burst_dummy;
>   
> -	rte_eth_fp_ops[eth_dev->data->port_id].rx_pkt_burst =
> -		eth_dev->rx_pkt_burst;
> -	rte_eth_fp_ops[eth_dev->data->port_id].tx_pkt_burst =
> -		eth_dev->tx_pkt_burst;
> +	rte_eth_fp_ops_setup(eth_dev);
>   	rte_mb();
>   
>   	/* Allow time for threads to exit the real burst functions. */
> diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
> index 003a6eec11..9d9b9ae8cf 100644
> --- a/drivers/net/bnxt/bnxt_ethdev.c
> +++ b/drivers/net/bnxt/bnxt_ethdev.c
> @@ -4428,10 +4428,7 @@ static void bnxt_dev_recover(void *arg)
>   	if (rc)
>   		goto err_start;
>   
> -	rte_eth_fp_ops[bp->eth_dev->data->port_id].rx_pkt_burst =
> -		bp->eth_dev->rx_pkt_burst;
> -	rte_eth_fp_ops[bp->eth_dev->data->port_id].tx_pkt_burst =
> -		bp->eth_dev->tx_pkt_burst;
> +	rte_eth_fp_ops_setup(bp->eth_dev);
>   	rte_mb();
>   
>   	PMD_DRV_LOG(INFO, "Port: %u Recovered from FW reset\n",
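
For readers following the thread, the idea behind this change can be shown with a toy model: the driver updates the burst pointers in its own device structure, and a single helper then mirrors them into the per-port fast-path table, so no driver writes the table fields by hand. This is only a sketch under stated assumptions, not the DPDK implementation. The names toy_dev, toy_fp_ops, toy_fp_ops_setup() and the toy burst functions are hypothetical stand-ins for rte_eth_dev, rte_eth_fp_ops[] and rte_eth_fp_ops_setup().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_PORTS 4

/* Hypothetical burst function type: returns the number of packets handled. */
typedef uint16_t (*toy_burst_t)(void *queue, void **pkts, uint16_t nb_pkts);

/* Stand-in for the slow-path device structure owned by the driver. */
struct toy_dev {
	uint16_t port_id;
	toy_burst_t rx_pkt_burst;
	toy_burst_t tx_pkt_burst;
};

/* Stand-in for the flat per-port table that the datapath reads. */
struct toy_fp_ops {
	toy_burst_t rx_pkt_burst;
	toy_burst_t tx_pkt_burst;
};

static struct toy_fp_ops toy_fp_ops[TOY_MAX_PORTS];

/* Dummy burst: handles nothing, so it is safe while the port is stopped. */
static uint16_t
toy_burst_dummy(void *queue, void **pkts, uint16_t nb_pkts)
{
	(void)queue;
	(void)pkts;
	(void)nb_pkts;
	return 0;
}

/* "Real" burst for the model: pretends every packet was handled. */
static uint16_t
toy_burst_real(void *queue, void **pkts, uint16_t nb_pkts)
{
	(void)queue;
	(void)pkts;
	return nb_pkts;
}

/* The single place that copies the device's burst pointers into the table. */
static void
toy_fp_ops_setup(const struct toy_dev *dev)
{
	assert(dev->port_id < TOY_MAX_PORTS);
	toy_fp_ops[dev->port_id].rx_pkt_burst = dev->rx_pkt_burst;
	toy_fp_ops[dev->port_id].tx_pkt_burst = dev->tx_pkt_burst;
}

/* Driver-side stop: switch to the dummies, then publish via the helper. */
static void
toy_stop_rxtx(struct toy_dev *dev)
{
	dev->rx_pkt_burst = toy_burst_dummy;
	dev->tx_pkt_burst = toy_burst_dummy;
	toy_fp_ops_setup(dev);
}
```

The model leaves out the memory barrier and the wait for threads to leave the real burst functions, which the bnxt code quoted above handles separately.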


* Re: [PATCH v2 5/7] app/testpmd: add error recovery usage demo
  2023-10-20 10:07   ` [PATCH v2 5/7] app/testpmd: add error recovery usage demo Chengwen Feng
@ 2023-11-01  4:08     ` lihuisong (C)
  2023-11-06 13:01       ` fengchengwen
  0 siblings, 1 reply; 85+ messages in thread
From: lihuisong (C) @ 2023-11-01  4:08 UTC (permalink / raw)
  To: Chengwen Feng, thomas, ferruh.yigit, konstantin.ananyev,
	ajit.khaparde, Aman Singh, Yuying Zhang
  Cc: dev, andrew.rybchenko, kalesh-anakkur.purayil, Honnappa.Nagarahalli


On 2023/10/20 18:07, Chengwen Feng wrote:
> This patch adds error recovery usage demo which will:
> 1. stop packet forwarding when the RTE_ETH_EVENT_ERR_RECOVERING event
>     is received.
> 2. restart packet forwarding when the RTE_ETH_EVENT_RECOVERY_SUCCESS
>     event is received.
> 3. report the ports that failed to recover and need to be removed when
>     the RTE_ETH_EVENT_RECOVERY_FAILED event is received.
Why not suggest trying dev_reset(), or some other way to recover, instead?
>
> In addition, a message is added to the printed information, requiring
> no command to be executed during the error recovery.
>
> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
> ---
>   app/test-pmd/testpmd.c | 80 ++++++++++++++++++++++++++++++++++++++++++
>   app/test-pmd/testpmd.h |  4 ++-
>   2 files changed, 83 insertions(+), 1 deletion(-)
>
> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
> index 595b77748c..39a25238e5 100644
> --- a/app/test-pmd/testpmd.c
> +++ b/app/test-pmd/testpmd.c
> @@ -3942,6 +3942,77 @@ rmv_port_callback(void *arg)
>   		start_packet_forwarding(0);
>   }
>   
> +static int need_start_when_recovery_over;
> +
> +static bool
> +has_port_in_err_recovering(void)
> +{
> +	struct rte_port *port;
> +	portid_t pid;
> +
> +	RTE_ETH_FOREACH_DEV(pid) {
> +		port = &ports[pid];
> +		if (port->err_recovering)
> +			return true;
> +	}
> +
> +	return false;
> +}
> +
> +static void
> +err_recovering_callback(portid_t port_id)
> +{
> +	if (!has_port_in_err_recovering())
> +		printf("Please stop executing any commands until recovery result events are received!\n");
> +
> +	ports[port_id].err_recovering = 1;
> +	ports[port_id].recover_failed = 0;
> +
> +	/* To simplify implementation, stop forwarding regardless of whether the port is used. */
> +	if (!test_done) {
> +		printf("Stop packet forwarding because some ports are in error recovering!\n");
> +		stop_packet_forwarding();
> +		need_start_when_recovery_over = 1;
> +	}
> +}
> +
> +static void
> +recover_success_callback(portid_t port_id)
> +{
> +	ports[port_id].err_recovering = 0;
> +	if (has_port_in_err_recovering())
> +		return;
> +
> +	if (need_start_when_recovery_over) {
> +		printf("Recovery success! Restart packet forwarding!\n");
> +		start_packet_forwarding(0);
s/start_packet_forwarding(0)/start_packet_forwarding() ?
> +		need_start_when_recovery_over = 0;
> +	} else {
> +		printf("Recovery success!\n");
> +	}
> +}
> +
> +static void
> +recover_failed_callback(portid_t port_id)
> +{
> +	struct rte_port *port;
> +	portid_t pid;
> +
> +	ports[port_id].err_recovering = 0;
> +	ports[port_id].recover_failed = 1;
> +	if (has_port_in_err_recovering())
> +		return;
> +
> +	need_start_when_recovery_over = 0;
> +	printf("The ports:");
> +	RTE_ETH_FOREACH_DEV(pid) {
> +		port = &ports[pid];
> +		if (port->recover_failed)
> +			printf(" %u", pid);
> +	}
> +	printf(" recovery failed! Please remove them!\n");
> +}
> +
>   /* This function is used by the interrupt thread */
>   static int
>   eth_event_callback(portid_t port_id, enum rte_eth_event_type type, void *param,
> @@ -3997,6 +4068,15 @@ eth_event_callback(portid_t port_id, enum rte_eth_event_type type, void *param,
>   		}
>   		break;
>   	}
> +	case RTE_ETH_EVENT_ERR_RECOVERING:
> +		err_recovering_callback(port_id);
> +		break;
> +	case RTE_ETH_EVENT_RECOVERY_SUCCESS:
> +		recover_success_callback(port_id);
> +		break;
> +	case RTE_ETH_EVENT_RECOVERY_FAILED:
> +		recover_failed_callback(port_id);
> +		break;
>   	default:
>   		break;
>   	}
> diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
> index 09a36b90b8..42782d5a05 100644
> --- a/app/test-pmd/testpmd.h
> +++ b/app/test-pmd/testpmd.h
> @@ -342,7 +342,9 @@ struct rte_port {
>   	uint8_t                 member_flag : 1, /**< bonding member port */
>   				bond_flag : 1, /**< port is bond device */
>   				fwd_mac_swap : 1, /**< swap packet MAC before forward */
> -				update_conf : 1; /**< need to update bonding device configuration */
> +				update_conf : 1, /**< need to update bonding device configuration */
> +				err_recovering : 1, /**< port is in error recovering */
> +				recover_failed : 1; /**< port recover failed */
>   	struct port_template    *pattern_templ_list; /**< Pattern templates. */
>   	struct port_template    *actions_templ_list; /**< Actions templates. */
>   	struct port_table       *table_list; /**< Flow tables. */
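
To make the event handling in this patch easier to reason about, here is a small stand-alone model of the three callbacks. It is a sketch under assumptions, not testpmd code: toy_ports, toy_forwarding and toy_restart_pending are hypothetical stand-ins for ports[], !test_done and need_start_when_recovery_over, and the printf() and forwarding calls are reduced to flag updates.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_PORTS 4

/* Per-port recovery flags, mirroring the two bit-fields added by the patch. */
struct toy_port {
	unsigned int err_recovering : 1;
	unsigned int recover_failed : 1;
};

static struct toy_port toy_ports[TOY_MAX_PORTS];
static bool toy_forwarding = true;  /* models "forwarding is running" */
static bool toy_restart_pending;    /* models need_start_when_recovery_over */

/* True while at least one port has not yet reported a recovery result. */
static bool
toy_any_port_recovering(void)
{
	for (int i = 0; i < TOY_MAX_PORTS; i++)
		if (toy_ports[i].err_recovering)
			return true;
	return false;
}

/* ERR_RECOVERING: mark the port and stop forwarding if it was running. */
static void
toy_on_err_recovering(int port_id)
{
	assert(port_id >= 0 && port_id < TOY_MAX_PORTS);
	toy_ports[port_id].err_recovering = 1;
	toy_ports[port_id].recover_failed = 0;
	if (toy_forwarding) {
		toy_forwarding = false;
		toy_restart_pending = true;
	}
}

/* RECOVERY_SUCCESS: restart only once no port is still recovering. */
static void
toy_on_recovery_success(int port_id)
{
	assert(port_id >= 0 && port_id < TOY_MAX_PORTS);
	toy_ports[port_id].err_recovering = 0;
	if (toy_any_port_recovering())
		return;
	if (toy_restart_pending) {
		toy_forwarding = true;
		toy_restart_pending = false;
	}
}

/* RECOVERY_FAILED: record the failure; drop the pending restart if last. */
static void
toy_on_recovery_failed(int port_id)
{
	assert(port_id >= 0 && port_id < TOY_MAX_PORTS);
	toy_ports[port_id].err_recovering = 0;
	toy_ports[port_id].recover_failed = 1;
	if (toy_any_port_recovering())
		return;
	toy_restart_pending = false;
}
```

One behaviour the model makes easy to see is that, with several ports recovering at once, the restart decision depends on which result event arrives last: a final success restarts forwarding even if an earlier port failed, while a final failure cancels the restart.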


* Re: [PATCH v2 6/7] app/testpmd: extract event handling to event.c
  2023-10-20 10:07   ` [PATCH v2 6/7] app/testpmd: extract event handling to event.c Chengwen Feng
@ 2023-11-01  4:09     ` lihuisong (C)
  0 siblings, 0 replies; 85+ messages in thread
From: lihuisong (C) @ 2023-11-01  4:09 UTC (permalink / raw)
  To: Chengwen Feng, thomas, ferruh.yigit, konstantin.ananyev,
	ajit.khaparde, Aman Singh, Yuying Zhang
  Cc: dev, andrew.rybchenko, kalesh-anakkur.purayil, Honnappa.Nagarahalli

LGTM.
Acked-by: Huisong Li <lihuisong@huawei.com>

On 2023/10/20 18:07, Chengwen Feng wrote:
> This patch extracts event handling (including eth-event and dev-event)
> to a new file 'event.c'.
>
> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> ---
>   app/test-pmd/event.c      | 390 ++++++++++++++++++++++++++++++++++++++
>   app/test-pmd/meson.build  |   1 +
>   app/test-pmd/parameters.c |  36 +---
>   app/test-pmd/testpmd.c    | 327 +-------------------------------
>   app/test-pmd/testpmd.h    |   6 +
>   5 files changed, 407 insertions(+), 353 deletions(-)
>   create mode 100644 app/test-pmd/event.c
>
> diff --git a/app/test-pmd/event.c b/app/test-pmd/event.c
> new file mode 100644
> index 0000000000..8393e105d7
> --- /dev/null
> +++ b/app/test-pmd/event.c
> @@ -0,0 +1,390 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2023 HiSilicon Limited
> + */
> +
> +#include <stdint.h>
> +
> +#include <rte_alarm.h>
> +#include <rte_ethdev.h>
> +#include <rte_dev.h>
> +#include <rte_log.h>
> +#ifdef RTE_NET_MLX5
> +#include "mlx5_testpmd.h"
> +#endif
> +
> +#include "testpmd.h"
> +
> +/* Pretty printing of ethdev events */
> +static const char * const eth_event_desc[] = {
> +	[RTE_ETH_EVENT_UNKNOWN] = "unknown",
> +	[RTE_ETH_EVENT_INTR_LSC] = "link state change",
> +	[RTE_ETH_EVENT_QUEUE_STATE] = "queue state",
> +	[RTE_ETH_EVENT_INTR_RESET] = "reset",
> +	[RTE_ETH_EVENT_VF_MBOX] = "VF mbox",
> +	[RTE_ETH_EVENT_IPSEC] = "IPsec",
> +	[RTE_ETH_EVENT_MACSEC] = "MACsec",
> +	[RTE_ETH_EVENT_INTR_RMV] = "device removal",
> +	[RTE_ETH_EVENT_NEW] = "device probed",
> +	[RTE_ETH_EVENT_DESTROY] = "device released",
> +	[RTE_ETH_EVENT_FLOW_AGED] = "flow aged",
> +	[RTE_ETH_EVENT_RX_AVAIL_THRESH] = "RxQ available descriptors threshold reached",
> +	[RTE_ETH_EVENT_ERR_RECOVERING] = "error recovering",
> +	[RTE_ETH_EVENT_RECOVERY_SUCCESS] = "error recovery successful",
> +	[RTE_ETH_EVENT_RECOVERY_FAILED] = "error recovery failed",
> +	[RTE_ETH_EVENT_MAX] = NULL,
> +};
> +
> +/*
> + * Display or mask ether events
> + * Default to all events except VF_MBOX
> + */
> +uint32_t event_print_mask = (UINT32_C(1) << RTE_ETH_EVENT_UNKNOWN) |
> +			    (UINT32_C(1) << RTE_ETH_EVENT_INTR_LSC) |
> +			    (UINT32_C(1) << RTE_ETH_EVENT_QUEUE_STATE) |
> +			    (UINT32_C(1) << RTE_ETH_EVENT_INTR_RESET) |
> +			    (UINT32_C(1) << RTE_ETH_EVENT_IPSEC) |
> +			    (UINT32_C(1) << RTE_ETH_EVENT_MACSEC) |
> +			    (UINT32_C(1) << RTE_ETH_EVENT_INTR_RMV) |
> +			    (UINT32_C(1) << RTE_ETH_EVENT_FLOW_AGED) |
> +			    (UINT32_C(1) << RTE_ETH_EVENT_ERR_RECOVERING) |
> +			    (UINT32_C(1) << RTE_ETH_EVENT_RECOVERY_SUCCESS) |
> +			    (UINT32_C(1) << RTE_ETH_EVENT_RECOVERY_FAILED);
> +
> +int
> +get_event_name_mask(const char *name, uint32_t *mask)
> +{
> +	if (!strcmp(name, "unknown"))
> +		*mask = UINT32_C(1) << RTE_ETH_EVENT_UNKNOWN;
> +	else if (!strcmp(name, "intr_lsc"))
> +		*mask = UINT32_C(1) << RTE_ETH_EVENT_INTR_LSC;
> +	else if (!strcmp(name, "queue_state"))
> +		*mask = UINT32_C(1) << RTE_ETH_EVENT_QUEUE_STATE;
> +	else if (!strcmp(name, "intr_reset"))
> +		*mask = UINT32_C(1) << RTE_ETH_EVENT_INTR_RESET;
> +	else if (!strcmp(name, "vf_mbox"))
> +		*mask = UINT32_C(1) << RTE_ETH_EVENT_VF_MBOX;
> +	else if (!strcmp(name, "ipsec"))
> +		*mask = UINT32_C(1) << RTE_ETH_EVENT_IPSEC;
> +	else if (!strcmp(name, "macsec"))
> +		*mask = UINT32_C(1) << RTE_ETH_EVENT_MACSEC;
> +	else if (!strcmp(name, "intr_rmv"))
> +		*mask = UINT32_C(1) << RTE_ETH_EVENT_INTR_RMV;
> +	else if (!strcmp(name, "dev_probed"))
> +		*mask = UINT32_C(1) << RTE_ETH_EVENT_NEW;
> +	else if (!strcmp(name, "dev_released"))
> +		*mask = UINT32_C(1) << RTE_ETH_EVENT_DESTROY;
> +	else if (!strcmp(name, "flow_aged"))
> +		*mask = UINT32_C(1) << RTE_ETH_EVENT_FLOW_AGED;
> +	else if (!strcmp(name, "err_recovering"))
> +		*mask = UINT32_C(1) << RTE_ETH_EVENT_ERR_RECOVERING;
> +	else if (!strcmp(name, "recovery_success"))
> +		*mask = UINT32_C(1) << RTE_ETH_EVENT_RECOVERY_SUCCESS;
> +	else if (!strcmp(name, "recovery_failed"))
> +		*mask = UINT32_C(1) << RTE_ETH_EVENT_RECOVERY_FAILED;
> +	else if (!strcmp(name, "all"))
> +		*mask = ~UINT32_C(0);
> +	else
> +		return -1;
> +
> +	return 0;
> +}
> +
> +static void
> +rmv_port_callback(void *arg)
> +{
> +	int need_to_start = 0;
> +	int org_no_link_check = no_link_check;
> +	portid_t port_id = (intptr_t)arg;
> +	struct rte_eth_dev_info dev_info;
> +	int ret;
> +
> +	RTE_ETH_VALID_PORTID_OR_RET(port_id);
> +
> +	if (!test_done && port_is_forwarding(port_id)) {
> +		need_to_start = 1;
> +		stop_packet_forwarding();
> +	}
> +	no_link_check = 1;
> +	stop_port(port_id);
> +	no_link_check = org_no_link_check;
> +
> +	ret = eth_dev_info_get_print_err(port_id, &dev_info);
> +	if (ret != 0)
> +		TESTPMD_LOG(ERR,
> +			"Failed to get device info for port %d, not detaching\n",
> +			port_id);
> +	else {
> +		struct rte_device *device = dev_info.device;
> +		close_port(port_id);
> +		detach_device(device); /* might be already removed or have more ports */
> +	}
> +	if (need_to_start)
> +		start_packet_forwarding(0);
> +}
> +
> +static int need_start_when_recovery_over;
> +
> +static bool
> +has_port_in_err_recovering(void)
> +{
> +	struct rte_port *port;
> +	portid_t pid;
> +
> +	RTE_ETH_FOREACH_DEV(pid) {
> +		port = &ports[pid];
> +		if (port->err_recovering)
> +			return true;
> +	}
> +
> +	return false;
> +}
> +
> +static void
> +err_recovering_callback(portid_t port_id)
> +{
> +	if (!has_port_in_err_recovering())
> +		printf("Please stop executing any commands until recovery result events are received!\n");
> +
> +	ports[port_id].err_recovering = 1;
> +	ports[port_id].recover_failed = 0;
> +
> +	/* To simplify implementation, stop forwarding regardless of whether the port is used. */
> +	if (!test_done) {
> +		printf("Stop packet forwarding because some ports are in error recovering!\n");
> +		stop_packet_forwarding();
> +		need_start_when_recovery_over = 1;
> +	}
> +}
> +
> +static void
> +recover_success_callback(portid_t port_id)
> +{
> +	ports[port_id].err_recovering = 0;
> +	if (has_port_in_err_recovering())
> +		return;
> +
> +	if (need_start_when_recovery_over) {
> +		printf("Recovery success! Restart packet forwarding!\n");
> +		start_packet_forwarding(0);
> +		need_start_when_recovery_over = 0;
> +	} else {
> +		printf("Recovery success!\n");
> +	}
> +}
> +
> +static void
> +recover_failed_callback(portid_t port_id)
> +{
> +	struct rte_port *port;
> +	portid_t pid;
> +
> +	ports[port_id].err_recovering = 0;
> +	ports[port_id].recover_failed = 1;
> +	if (has_port_in_err_recovering())
> +		return;
> +
> +	need_start_when_recovery_over = 0;
> +	printf("The ports:");
> +	RTE_ETH_FOREACH_DEV(pid) {
> +		port = &ports[pid];
> +		if (port->recover_failed)
> +			printf(" %u", pid);
> +	}
> +	printf(" recovery failed! Please remove them!\n");
> +}
> +
> +/* This function is used by the interrupt thread */
> +static int
> +eth_event_callback(portid_t port_id, enum rte_eth_event_type type, void *param,
> +		  void *ret_param)
> +{
> +	RTE_SET_USED(param);
> +	RTE_SET_USED(ret_param);
> +
> +	if (type >= RTE_ETH_EVENT_MAX) {
> +		fprintf(stderr,
> +			"\nPort %" PRIu16 ": %s called upon invalid event %d\n",
> +			port_id, __func__, type);
> +		fflush(stderr);
> +	} else if (event_print_mask & (UINT32_C(1) << type)) {
> +		printf("\nPort %" PRIu16 ": %s event\n", port_id,
> +			eth_event_desc[type]);
> +		fflush(stdout);
> +	}
> +
> +	switch (type) {
> +	case RTE_ETH_EVENT_NEW:
> +		ports[port_id].need_setup = 1;
> +		ports[port_id].port_status = RTE_PORT_HANDLING;
> +		break;
> +	case RTE_ETH_EVENT_INTR_RMV:
> +		if (port_id_is_invalid(port_id, DISABLED_WARN))
> +			break;
> +		if (rte_eal_alarm_set(100000,
> +				rmv_port_callback, (void *)(intptr_t)port_id))
> +			fprintf(stderr,
> +				"Could not set up deferred device removal\n");
> +		break;
> +	case RTE_ETH_EVENT_DESTROY:
> +		ports[port_id].port_status = RTE_PORT_CLOSED;
> +		printf("Port %u is closed\n", port_id);
> +		break;
> +	case RTE_ETH_EVENT_RX_AVAIL_THRESH: {
> +		uint16_t rxq_id;
> +		int ret;
> +
> +		/* avail_thresh query API rewinds rxq_id, no need to check max RxQ num */
> +		for (rxq_id = 0; ; rxq_id++) {
> +			ret = rte_eth_rx_avail_thresh_query(port_id, &rxq_id,
> +							    NULL);
> +			if (ret <= 0)
> +				break;
> +			printf("Received avail_thresh event, port: %u, rxq_id: %u\n",
> +			       port_id, rxq_id);
> +
> +#ifdef RTE_NET_MLX5
> +			mlx5_test_avail_thresh_event_handler(port_id, rxq_id);
> +#endif
> +		}
> +		break;
> +	}
> +	case RTE_ETH_EVENT_ERR_RECOVERING:
> +		err_recovering_callback(port_id);
> +		break;
> +	case RTE_ETH_EVENT_RECOVERY_SUCCESS:
> +		recover_success_callback(port_id);
> +		break;
> +	case RTE_ETH_EVENT_RECOVERY_FAILED:
> +		recover_failed_callback(port_id);
> +		break;
> +	default:
> +		break;
> +	}
> +	return 0;
> +}
> +
> +int
> +register_eth_event_callback(void)
> +{
> +	int ret;
> +	enum rte_eth_event_type event;
> +
> +	for (event = RTE_ETH_EVENT_UNKNOWN;
> +			event < RTE_ETH_EVENT_MAX; event++) {
> +		ret = rte_eth_dev_callback_register(RTE_ETH_ALL,
> +				event,
> +				eth_event_callback,
> +				NULL);
> +		if (ret != 0) {
> +			TESTPMD_LOG(ERR, "Failed to register callback for "
> +					"%s event\n", eth_event_desc[event]);
> +			return -1;
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +int
> +unregister_eth_event_callback(void)
> +{
> +	int ret;
> +	enum rte_eth_event_type event;
> +
> +	for (event = RTE_ETH_EVENT_UNKNOWN;
> +			event < RTE_ETH_EVENT_MAX; event++) {
> +		ret = rte_eth_dev_callback_unregister(RTE_ETH_ALL,
> +				event,
> +				eth_event_callback,
> +				NULL);
> +		if (ret != 0) {
> +			TESTPMD_LOG(ERR, "Failed to unregister callback for "
> +					"%s event\n", eth_event_desc[event]);
> +			return -1;
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +/* This function is used by the interrupt thread */
> +static void
> +dev_event_callback(const char *device_name, enum rte_dev_event_type type,
> +			     __rte_unused void *arg)
> +{
> +	uint16_t port_id;
> +	int ret;
> +
> +	if (type >= RTE_DEV_EVENT_MAX) {
> +		fprintf(stderr, "%s called upon invalid event %d\n",
> +			__func__, type);
> +		fflush(stderr);
> +	}
> +
> +	switch (type) {
> +	case RTE_DEV_EVENT_REMOVE:
> +		RTE_LOG(DEBUG, EAL, "The device: %s has been removed!\n",
> +			device_name);
> +		ret = rte_eth_dev_get_port_by_name(device_name, &port_id);
> +		if (ret) {
> +			RTE_LOG(ERR, EAL, "can not get port by device %s!\n",
> +				device_name);
> +			return;
> +		}
> +		/*
> +		 * Because the user's callback is invoked in eal interrupt
> +		 * callback, the interrupt callback need to be finished before
> +		 * it can be unregistered when detaching device. So finish
> +		 * callback soon and use a deferred removal to detach device
> +		 * is need. It is a workaround, once the device detaching be
> +		 * moved into the eal in the future, the deferred removal could
> +		 * be deleted.
> +		 */
> +		if (rte_eal_alarm_set(100000,
> +				rmv_port_callback, (void *)(intptr_t)port_id))
> +			RTE_LOG(ERR, EAL,
> +				"Could not set up deferred device removal\n");
> +		break;
> +	case RTE_DEV_EVENT_ADD:
> +		RTE_LOG(ERR, EAL, "The device: %s has been added!\n",
> +			device_name);
> +		/* TODO: After finish kernel driver binding,
> +		 * begin to attach port.
> +		 */
> +		break;
> +	default:
> +		break;
> +	}
> +}
> +
> +int
> +register_dev_event_callback(void)
> +{
> +	int ret;
> +
> +	ret = rte_dev_event_callback_register(NULL,
> +		dev_event_callback, NULL);
> +	if (ret != 0) {
> +		RTE_LOG(ERR, EAL,
> +			"fail  to register device event callback\n");
> +		return -1;
> +	}
> +
> +	return 0;
> +}
> +
> +int
> +unregister_dev_event_callback(void)
> +{
> +	int ret;
> +
> +	ret = rte_dev_event_callback_unregister(NULL,
> +		dev_event_callback, NULL);
> +	if (ret < 0) {
> +		RTE_LOG(ERR, EAL,
> +			"fail to unregister device event callback.\n");
> +		return -1;
> +	}
> +
> +	return 0;
> +}
> diff --git a/app/test-pmd/meson.build b/app/test-pmd/meson.build
> index 719f875be0..b7860f3ab0 100644
> --- a/app/test-pmd/meson.build
> +++ b/app/test-pmd/meson.build
> @@ -14,6 +14,7 @@ sources = files(
>           'cmd_flex_item.c',
>           'config.c',
>           'csumonly.c',
> +        'event.c',
>           'flowgen.c',
>           'icmpecho.c',
>           'ieee1588fwd.c',
> diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
> index a9ca58339d..504315da8b 100644
> --- a/app/test-pmd/parameters.c
> +++ b/app/test-pmd/parameters.c
> @@ -434,45 +434,19 @@ static int
>   parse_event_printing_config(const char *optarg, int enable)
>   {
>   	uint32_t mask = 0;
> +	int ret;
>   
> -	if (!strcmp(optarg, "unknown"))
> -		mask = UINT32_C(1) << RTE_ETH_EVENT_UNKNOWN;
> -	else if (!strcmp(optarg, "intr_lsc"))
> -		mask = UINT32_C(1) << RTE_ETH_EVENT_INTR_LSC;
> -	else if (!strcmp(optarg, "queue_state"))
> -		mask = UINT32_C(1) << RTE_ETH_EVENT_QUEUE_STATE;
> -	else if (!strcmp(optarg, "intr_reset"))
> -		mask = UINT32_C(1) << RTE_ETH_EVENT_INTR_RESET;
> -	else if (!strcmp(optarg, "vf_mbox"))
> -		mask = UINT32_C(1) << RTE_ETH_EVENT_VF_MBOX;
> -	else if (!strcmp(optarg, "ipsec"))
> -		mask = UINT32_C(1) << RTE_ETH_EVENT_IPSEC;
> -	else if (!strcmp(optarg, "macsec"))
> -		mask = UINT32_C(1) << RTE_ETH_EVENT_MACSEC;
> -	else if (!strcmp(optarg, "intr_rmv"))
> -		mask = UINT32_C(1) << RTE_ETH_EVENT_INTR_RMV;
> -	else if (!strcmp(optarg, "dev_probed"))
> -		mask = UINT32_C(1) << RTE_ETH_EVENT_NEW;
> -	else if (!strcmp(optarg, "dev_released"))
> -		mask = UINT32_C(1) << RTE_ETH_EVENT_DESTROY;
> -	else if (!strcmp(optarg, "flow_aged"))
> -		mask = UINT32_C(1) << RTE_ETH_EVENT_FLOW_AGED;
> -	else if (!strcmp(optarg, "err_recovering"))
> -		mask = UINT32_C(1) << RTE_ETH_EVENT_ERR_RECOVERING;
> -	else if (!strcmp(optarg, "recovery_success"))
> -		mask = UINT32_C(1) << RTE_ETH_EVENT_RECOVERY_SUCCESS;
> -	else if (!strcmp(optarg, "recovery_failed"))
> -		mask = UINT32_C(1) << RTE_ETH_EVENT_RECOVERY_FAILED;
> -	else if (!strcmp(optarg, "all"))
> -		mask = ~UINT32_C(0);
> -	else {
> +	ret = get_event_name_mask(optarg, &mask);
> +	if (ret != 0) {
>   		fprintf(stderr, "Invalid event: %s\n", optarg);
>   		return -1;
>   	}
> +
>   	if (enable)
>   		event_print_mask |= mask;
>   	else
>   		event_print_mask &= ~mask;
> +
>   	return 0;
>   }
>   
> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
> index 39a25238e5..3a664fec66 100644
> --- a/app/test-pmd/testpmd.c
> +++ b/app/test-pmd/testpmd.c
> @@ -435,41 +435,6 @@ uint8_t clear_ptypes = true;
>   /* Hairpin ports configuration mode. */
>   uint32_t hairpin_mode;
>   
> -/* Pretty printing of ethdev events */
> -static const char * const eth_event_desc[] = {
> -	[RTE_ETH_EVENT_UNKNOWN] = "unknown",
> -	[RTE_ETH_EVENT_INTR_LSC] = "link state change",
> -	[RTE_ETH_EVENT_QUEUE_STATE] = "queue state",
> -	[RTE_ETH_EVENT_INTR_RESET] = "reset",
> -	[RTE_ETH_EVENT_VF_MBOX] = "VF mbox",
> -	[RTE_ETH_EVENT_IPSEC] = "IPsec",
> -	[RTE_ETH_EVENT_MACSEC] = "MACsec",
> -	[RTE_ETH_EVENT_INTR_RMV] = "device removal",
> -	[RTE_ETH_EVENT_NEW] = "device probed",
> -	[RTE_ETH_EVENT_DESTROY] = "device released",
> -	[RTE_ETH_EVENT_FLOW_AGED] = "flow aged",
> -	[RTE_ETH_EVENT_RX_AVAIL_THRESH] = "RxQ available descriptors threshold reached",
> -	[RTE_ETH_EVENT_ERR_RECOVERING] = "error recovering",
> -	[RTE_ETH_EVENT_RECOVERY_SUCCESS] = "error recovery successful",
> -	[RTE_ETH_EVENT_RECOVERY_FAILED] = "error recovery failed",
> -	[RTE_ETH_EVENT_MAX] = NULL,
> -};
> -
> -/*
> - * Display or mask ether events
> - * Default to all events except VF_MBOX
> - */
> -uint32_t event_print_mask = (UINT32_C(1) << RTE_ETH_EVENT_UNKNOWN) |
> -			    (UINT32_C(1) << RTE_ETH_EVENT_INTR_LSC) |
> -			    (UINT32_C(1) << RTE_ETH_EVENT_QUEUE_STATE) |
> -			    (UINT32_C(1) << RTE_ETH_EVENT_INTR_RESET) |
> -			    (UINT32_C(1) << RTE_ETH_EVENT_IPSEC) |
> -			    (UINT32_C(1) << RTE_ETH_EVENT_MACSEC) |
> -			    (UINT32_C(1) << RTE_ETH_EVENT_INTR_RMV) |
> -			    (UINT32_C(1) << RTE_ETH_EVENT_FLOW_AGED) |
> -			    (UINT32_C(1) << RTE_ETH_EVENT_ERR_RECOVERING) |
> -			    (UINT32_C(1) << RTE_ETH_EVENT_RECOVERY_SUCCESS) |
> -			    (UINT32_C(1) << RTE_ETH_EVENT_RECOVERY_FAILED);
>   /*
>    * Decide if all memory are locked for performance.
>    */
> @@ -701,12 +666,6 @@ eth_dev_set_mtu_mp(uint16_t port_id, uint16_t mtu)
>   /* Forward function declarations */
>   static void setup_attached_port(portid_t pi);
>   static void check_all_ports_link_status(uint32_t port_mask);
> -static int eth_event_callback(portid_t port_id,
> -			      enum rte_eth_event_type type,
> -			      void *param, void *ret_param);
> -static void dev_event_callback(const char *device_name,
> -				enum rte_dev_event_type type,
> -				void *param);
>   static void fill_xstats_display_info(void);
>   
>   /*
> @@ -3672,7 +3631,7 @@ setup_attached_port(portid_t pi)
>   	printf("Done\n");
>   }
>   
> -static void
> +void
>   detach_device(struct rte_device *dev)
>   {
>   	portid_t sibling;
> @@ -3818,13 +3777,9 @@ pmd_test_exit(void)
>   			return;
>   		}
>   
> -		ret = rte_dev_event_callback_unregister(NULL,
> -			dev_event_callback, NULL);
> -		if (ret < 0) {
> -			RTE_LOG(ERR, EAL,
> -				"fail to unregister device event callback.\n");
> +		ret = unregister_dev_event_callback();
> +		if (ret != 0)
>   			return;
> -		}
>   
>   		ret = rte_dev_hotplug_handle_disable();
>   		if (ret) {
> @@ -3909,274 +3864,6 @@ check_all_ports_link_status(uint32_t port_mask)
>   	}
>   }
>   
> -static void
> -rmv_port_callback(void *arg)
> -{
> -	int need_to_start = 0;
> -	int org_no_link_check = no_link_check;
> -	portid_t port_id = (intptr_t)arg;
> -	struct rte_eth_dev_info dev_info;
> -	int ret;
> -
> -	RTE_ETH_VALID_PORTID_OR_RET(port_id);
> -
> -	if (!test_done && port_is_forwarding(port_id)) {
> -		need_to_start = 1;
> -		stop_packet_forwarding();
> -	}
> -	no_link_check = 1;
> -	stop_port(port_id);
> -	no_link_check = org_no_link_check;
> -
> -	ret = eth_dev_info_get_print_err(port_id, &dev_info);
> -	if (ret != 0)
> -		TESTPMD_LOG(ERR,
> -			"Failed to get device info for port %d, not detaching\n",
> -			port_id);
> -	else {
> -		struct rte_device *device = dev_info.device;
> -		close_port(port_id);
> -		detach_device(device); /* might be already removed or have more ports */
> -	}
> -	if (need_to_start)
> -		start_packet_forwarding(0);
> -}
> -
> -static int need_start_when_recovery_over;
> -
> -static bool
> -has_port_in_err_recovering(void)
> -{
> -	struct rte_port *port;
> -	portid_t pid;
> -
> -	RTE_ETH_FOREACH_DEV(pid) {
> -		port = &ports[pid];
> -		if (port->err_recovering)
> -			return true;
> -	}
> -
> -	return false;
> -}
> -
> -static void
> -err_recovering_callback(portid_t port_id)
> -{
> -	if (!has_port_in_err_recovering())
> -		printf("Please stop executing any commands until recovery result events are received!\n");
> -
> -	ports[port_id].err_recovering = 1;
> -	ports[port_id].recover_failed = 0;
> -
> -	/* To simplify implementation, stop forwarding regardless of whether the port is used. */
> -	if (!test_done) {
> -		printf("Stop packet forwarding because some ports are in error recovering!\n");
> -		stop_packet_forwarding();
> -		need_start_when_recovery_over = 1;
> -	}
> -}
> -
> -static void
> -recover_success_callback(portid_t port_id)
> -{
> -	ports[port_id].err_recovering = 0;
> -	if (has_port_in_err_recovering())
> -		return;
> -
> -	if (need_start_when_recovery_over) {
> -		printf("Recovery success! Restart packet forwarding!\n");
> -		start_packet_forwarding(0);
> -		need_start_when_recovery_over = 0;
> -	} else {
> -		printf("Recovery success!\n");
> -	}
> -}
> -
> -static void
> -recover_failed_callback(portid_t port_id)
> -{
> -	struct rte_port *port;
> -	portid_t pid;
> -
> -	ports[port_id].err_recovering = 0;
> -	ports[port_id].recover_failed = 1;
> -	if (has_port_in_err_recovering())
> -		return;
> -
> -	need_start_when_recovery_over = 0;
> -	printf("The ports:");
> -	RTE_ETH_FOREACH_DEV(pid) {
> -		port = &ports[pid];
> -		if (port->recover_failed)
> -			printf(" %u", pid);
> -	}
> -	printf(" recovery failed! Please remove them!\n");
> -}
> -
> -/* This function is used by the interrupt thread */
> -static int
> -eth_event_callback(portid_t port_id, enum rte_eth_event_type type, void *param,
> -		  void *ret_param)
> -{
> -	RTE_SET_USED(param);
> -	RTE_SET_USED(ret_param);
> -
> -	if (type >= RTE_ETH_EVENT_MAX) {
> -		fprintf(stderr,
> -			"\nPort %" PRIu16 ": %s called upon invalid event %d\n",
> -			port_id, __func__, type);
> -		fflush(stderr);
> -	} else if (event_print_mask & (UINT32_C(1) << type)) {
> -		printf("\nPort %" PRIu16 ": %s event\n", port_id,
> -			eth_event_desc[type]);
> -		fflush(stdout);
> -	}
> -
> -	switch (type) {
> -	case RTE_ETH_EVENT_NEW:
> -		ports[port_id].need_setup = 1;
> -		ports[port_id].port_status = RTE_PORT_HANDLING;
> -		break;
> -	case RTE_ETH_EVENT_INTR_RMV:
> -		if (port_id_is_invalid(port_id, DISABLED_WARN))
> -			break;
> -		if (rte_eal_alarm_set(100000,
> -				rmv_port_callback, (void *)(intptr_t)port_id))
> -			fprintf(stderr,
> -				"Could not set up deferred device removal\n");
> -		break;
> -	case RTE_ETH_EVENT_DESTROY:
> -		ports[port_id].port_status = RTE_PORT_CLOSED;
> -		printf("Port %u is closed\n", port_id);
> -		break;
> -	case RTE_ETH_EVENT_RX_AVAIL_THRESH: {
> -		uint16_t rxq_id;
> -		int ret;
> -
> -		/* avail_thresh query API rewinds rxq_id, no need to check max RxQ num */
> -		for (rxq_id = 0; ; rxq_id++) {
> -			ret = rte_eth_rx_avail_thresh_query(port_id, &rxq_id,
> -							    NULL);
> -			if (ret <= 0)
> -				break;
> -			printf("Received avail_thresh event, port: %u, rxq_id: %u\n",
> -			       port_id, rxq_id);
> -
> -#ifdef RTE_NET_MLX5
> -			mlx5_test_avail_thresh_event_handler(port_id, rxq_id);
> -#endif
> -		}
> -		break;
> -	}
> -	case RTE_ETH_EVENT_ERR_RECOVERING:
> -		err_recovering_callback(port_id);
> -		break;
> -	case RTE_ETH_EVENT_RECOVERY_SUCCESS:
> -		recover_success_callback(port_id);
> -		break;
> -	case RTE_ETH_EVENT_RECOVERY_FAILED:
> -		recover_failed_callback(port_id);
> -		break;
> -	default:
> -		break;
> -	}
> -	return 0;
> -}
> -
> -static int
> -register_eth_event_callback(void)
> -{
> -	int ret;
> -	enum rte_eth_event_type event;
> -
> -	for (event = RTE_ETH_EVENT_UNKNOWN;
> -			event < RTE_ETH_EVENT_MAX; event++) {
> -		ret = rte_eth_dev_callback_register(RTE_ETH_ALL,
> -				event,
> -				eth_event_callback,
> -				NULL);
> -		if (ret != 0) {
> -			TESTPMD_LOG(ERR, "Failed to register callback for "
> -					"%s event\n", eth_event_desc[event]);
> -			return -1;
> -		}
> -	}
> -
> -	return 0;
> -}
> -
> -static int
> -unregister_eth_event_callback(void)
> -{
> -	int ret;
> -	enum rte_eth_event_type event;
> -
> -	for (event = RTE_ETH_EVENT_UNKNOWN;
> -			event < RTE_ETH_EVENT_MAX; event++) {
> -		ret = rte_eth_dev_callback_unregister(RTE_ETH_ALL,
> -				event,
> -				eth_event_callback,
> -				NULL);
> -		if (ret != 0) {
> -			TESTPMD_LOG(ERR, "Failed to unregister callback for "
> -					"%s event\n", eth_event_desc[event]);
> -			return -1;
> -		}
> -	}
> -
> -	return 0;
> -}
> -
> -/* This function is used by the interrupt thread */
> -static void
> -dev_event_callback(const char *device_name, enum rte_dev_event_type type,
> -			     __rte_unused void *arg)
> -{
> -	uint16_t port_id;
> -	int ret;
> -
> -	if (type >= RTE_DEV_EVENT_MAX) {
> -		fprintf(stderr, "%s called upon invalid event %d\n",
> -			__func__, type);
> -		fflush(stderr);
> -	}
> -
> -	switch (type) {
> -	case RTE_DEV_EVENT_REMOVE:
> -		RTE_LOG(DEBUG, EAL, "The device: %s has been removed!\n",
> -			device_name);
> -		ret = rte_eth_dev_get_port_by_name(device_name, &port_id);
> -		if (ret) {
> -			RTE_LOG(ERR, EAL, "can not get port by device %s!\n",
> -				device_name);
> -			return;
> -		}
> -		/*
> -		 * Because the user's callback is invoked in eal interrupt
> -		 * callback, the interrupt callback need to be finished before
> -		 * it can be unregistered when detaching device. So finish
> -		 * callback soon and use a deferred removal to detach device
> -		 * is need. It is a workaround, once the device detaching be
> -		 * moved into the eal in the future, the deferred removal could
> -		 * be deleted.
> -		 */
> -		if (rte_eal_alarm_set(100000,
> -				rmv_port_callback, (void *)(intptr_t)port_id))
> -			RTE_LOG(ERR, EAL,
> -				"Could not set up deferred device removal\n");
> -		break;
> -	case RTE_DEV_EVENT_ADD:
> -		RTE_LOG(ERR, EAL, "The device: %s has been added!\n",
> -			device_name);
> -		/* TODO: After finish kernel driver binding,
> -		 * begin to attach port.
> -		 */
> -		break;
> -	default:
> -		break;
> -	}
> -}
> -
>   static void
>   rxtx_port_config(portid_t pid)
>   {
> @@ -4725,13 +4412,9 @@ main(int argc, char** argv)
>   			return -1;
>   		}
>   
> -		ret = rte_dev_event_callback_register(NULL,
> -			dev_event_callback, NULL);
> -		if (ret) {
> -			RTE_LOG(ERR, EAL,
> -				"fail  to register device event callback\n");
> +		ret = register_dev_event_callback();
> +		if (ret != 0)
>   			return -1;
> -		}
>   	}
>   
>   	if (!no_device_start && start_port(RTE_PORT_ALL) != 0) {
> diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
> index 42782d5a05..5c8a052b43 100644
> --- a/app/test-pmd/testpmd.h
> +++ b/app/test-pmd/testpmd.h
> @@ -1109,6 +1109,11 @@ void set_nb_pkt_per_burst(uint16_t pkt_burst);
>   char *list_pkt_forwarding_modes(void);
>   char *list_pkt_forwarding_retry_modes(void);
>   void set_pkt_forwarding_mode(const char *fwd_mode);
> +int get_event_name_mask(const char *name, uint32_t *mask);
> +int register_eth_event_callback(void);
> +int unregister_eth_event_callback(void);
> +int register_dev_event_callback(void);
> +int unregister_dev_event_callback(void);
>   void start_packet_forwarding(int with_tx_first);
>   void fwd_stats_display(void);
>   void fwd_stats_reset(void);
> @@ -1128,6 +1133,7 @@ void stop_port(portid_t pid);
>   void close_port(portid_t pid);
>   void reset_port(portid_t pid);
>   void attach_port(char *identifier);
> +void detach_device(struct rte_device *dev);
>   void detach_devargs(char *identifier);
>   void detach_port_device(portid_t port_id);
>   int all_ports_stopped(void);

^ permalink raw reply	[flat|nested] 85+ messages in thread

* RE: [PATCH v2 2/7] net/hns3: replace fp ops config function
  2023-10-20 10:07   ` [PATCH v2 2/7] net/hns3: replace fp ops config function Chengwen Feng
  2023-11-01  3:40     ` lihuisong (C)
@ 2023-11-02 10:34     ` Konstantin Ananyev
  1 sibling, 0 replies; 85+ messages in thread
From: Konstantin Ananyev @ 2023-11-02 10:34 UTC (permalink / raw)
  To: Fengchengwen, thomas, ferruh.yigit, ajit.khaparde, haijie,
	Zhuangyuzeng (Yisen)
  Cc: dev, andrew.rybchenko, kalesh-anakkur.purayil, Honnappa.Nagarahalli



> -----Original Message-----
> From: Fengchengwen <fengchengwen@huawei.com>
> Sent: Friday, October 20, 2023 11:08 AM
> To: thomas@monjalon.net; ferruh.yigit@amd.com; Konstantin Ananyev <konstantin.ananyev@huawei.com>;
> ajit.khaparde@broadcom.com; haijie <haijie1@huawei.com>; Zhuangyuzeng (Yisen) <yisen.zhuang@huawei.com>
> Cc: dev@dpdk.org; andrew.rybchenko@oktetlabs.ru; kalesh-anakkur.purayil@broadcom.com; Honnappa.Nagarahalli@arm.com
> Subject: [PATCH v2 2/7] net/hns3: replace fp ops config function
> 
> This patch replaces hns3_eth_dev_fp_ops_config() with
> rte_eth_fp_ops_setup().
> 
> Cc: stable@dpdk.org
> 
> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> Acked-by: Dongdong Liu <liudongdong3@huawei.com>
> ---
>  drivers/net/hns3/hns3_rxtx.c | 21 +++------------------
>  1 file changed, 3 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c
> index f3c3b38c55..f43f1eb9ad 100644
> --- a/drivers/net/hns3/hns3_rxtx.c
> +++ b/drivers/net/hns3/hns3_rxtx.c
> @@ -4434,21 +4434,6 @@ hns3_trace_rxtx_function(struct rte_eth_dev *dev)
>  		 rx_mode.info, tx_mode.info);
>  }
> 
> -static void
> -hns3_eth_dev_fp_ops_config(const struct rte_eth_dev *dev)
> -{
> -	struct rte_eth_fp_ops *fpo = rte_eth_fp_ops;
> -	uint16_t port_id = dev->data->port_id;
> -
> -	fpo[port_id].rx_pkt_burst = dev->rx_pkt_burst;
> -	fpo[port_id].tx_pkt_burst = dev->tx_pkt_burst;
> -	fpo[port_id].tx_pkt_prepare = dev->tx_pkt_prepare;
> -	fpo[port_id].rx_descriptor_status = dev->rx_descriptor_status;
> -	fpo[port_id].tx_descriptor_status = dev->tx_descriptor_status;
> -	fpo[port_id].rxq.data = dev->data->rx_queues;
> -	fpo[port_id].txq.data = dev->data->tx_queues;
> -}
> -
>  void
>  hns3_set_rxtx_function(struct rte_eth_dev *eth_dev)
>  {
> @@ -4471,7 +4456,7 @@ hns3_set_rxtx_function(struct rte_eth_dev *eth_dev)
>  	}
> 
>  	hns3_trace_rxtx_function(eth_dev);
> -	hns3_eth_dev_fp_ops_config(eth_dev);
> +	rte_eth_fp_ops_setup(eth_dev);
>  }
> 
>  void
> @@ -4824,7 +4809,7 @@ hns3_stop_tx_datapath(struct rte_eth_dev *dev)
>  {
>  	dev->tx_pkt_burst = rte_eth_pkt_burst_dummy;
>  	dev->tx_pkt_prepare = NULL;
> -	hns3_eth_dev_fp_ops_config(dev);
> +	rte_eth_fp_ops_setup(dev);
> 
>  	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
>  		return;
> @@ -4841,7 +4826,7 @@ hns3_start_tx_datapath(struct rte_eth_dev *dev)
>  {
>  	dev->tx_pkt_burst = hns3_get_tx_function(dev);
>  	dev->tx_pkt_prepare = hns3_get_tx_prepare(dev);
> -	hns3_eth_dev_fp_ops_config(dev);
> +	rte_eth_fp_ops_setup(dev);
> 
>  	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
>  		return;
> --

Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com> 


> 2.17.1
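
For readers following the thread, the idea behind this replacement can be sketched in isolation: instead of each driver copying individual fields into the per-port fast-path table, one shared helper mirrors the device's current callbacks. The sketch below is a simplified model, not the ethdev implementation; every `sketch_*` name is hypothetical, and only two of the callbacks from the removed hns3 helper are modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_PORTS 4

/* Burst function signature, simplified for the sketch. */
typedef uint16_t (*sketch_burst_t)(void *queue, void **pkts, uint16_t nb_pkts);

/* Slow-path device object (hypothetical stand-in for struct rte_eth_dev). */
struct sketch_dev {
	uint16_t port_id;
	sketch_burst_t rx_pkt_burst;
	sketch_burst_t tx_pkt_burst;
};

/* Per-port fast-path entry (hypothetical stand-in for an rte_eth_fp_ops slot). */
struct sketch_fp_ops {
	sketch_burst_t rx_pkt_burst;
	sketch_burst_t tx_pkt_burst;
};

static struct sketch_fp_ops sketch_fp_table[SKETCH_MAX_PORTS];

/* Dummy burst: accepts or returns no packets. */
static uint16_t
sketch_burst_dummy(void *queue, void **pkts, uint16_t nb_pkts)
{
	(void)queue;
	(void)pkts;
	(void)nb_pkts;
	return 0;
}

/* "Real" burst: pretends every packet was processed. */
static uint16_t
sketch_burst_real(void *queue, void **pkts, uint16_t nb_pkts)
{
	(void)queue;
	(void)pkts;
	return nb_pkts;
}

/* Single shared helper: mirror the device callbacks into the fast-path table. */
static void
sketch_fp_ops_setup(const struct sketch_dev *dev)
{
	assert(dev->port_id < SKETCH_MAX_PORTS);
	struct sketch_fp_ops *fpo = &sketch_fp_table[dev->port_id];

	fpo->rx_pkt_burst = dev->rx_pkt_burst;
	fpo->tx_pkt_burst = dev->tx_pkt_burst;
}
```

With a helper like this, stopping the Tx data path reduces to assigning the dummy function on the device and calling the helper once, which is the shape of the change in hns3_stop_tx_datapath() above.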


^ permalink raw reply	[flat|nested] 85+ messages in thread

* RE: [PATCH v2 4/7] net/bnxt: use fp ops setup function
  2023-10-20 10:07   ` [PATCH v2 4/7] net/bnxt: use fp ops setup function Chengwen Feng
  2023-11-01  3:48     ` lihuisong (C)
@ 2023-11-02 10:34     ` Konstantin Ananyev
  2023-11-02 16:29       ` Ajit Khaparde
  1 sibling, 1 reply; 85+ messages in thread
From: Konstantin Ananyev @ 2023-11-02 10:34 UTC (permalink / raw)
  To: Fengchengwen, thomas, ferruh.yigit, ajit.khaparde, Somnath Kotur
  Cc: dev, andrew.rybchenko, kalesh-anakkur.purayil, Honnappa.Nagarahalli



> Use rte_eth_fp_ops_setup() instead of directly manipulating
> rte_eth_fp_ops variable.
> 
> Cc: stable@dpdk.org
> 
> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> ---
>  drivers/net/bnxt/bnxt_cpr.c    | 5 +----
>  drivers/net/bnxt/bnxt_ethdev.c | 5 +----
>  2 files changed, 2 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/net/bnxt/bnxt_cpr.c b/drivers/net/bnxt/bnxt_cpr.c
> index d8947d5b5f..3a08028331 100644
> --- a/drivers/net/bnxt/bnxt_cpr.c
> +++ b/drivers/net/bnxt/bnxt_cpr.c
> @@ -416,10 +416,7 @@ void bnxt_stop_rxtx(struct rte_eth_dev *eth_dev)
>  	eth_dev->rx_pkt_burst = rte_eth_pkt_burst_dummy;
>  	eth_dev->tx_pkt_burst = rte_eth_pkt_burst_dummy;
> 
> -	rte_eth_fp_ops[eth_dev->data->port_id].rx_pkt_burst =
> -		eth_dev->rx_pkt_burst;
> -	rte_eth_fp_ops[eth_dev->data->port_id].tx_pkt_burst =
> -		eth_dev->tx_pkt_burst;
> +	rte_eth_fp_ops_setup(eth_dev);
>  	rte_mb();
> 
>  	/* Allow time for threads to exit the real burst functions. */
> diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
> index 003a6eec11..9d9b9ae8cf 100644
> --- a/drivers/net/bnxt/bnxt_ethdev.c
> +++ b/drivers/net/bnxt/bnxt_ethdev.c
> @@ -4428,10 +4428,7 @@ static void bnxt_dev_recover(void *arg)
>  	if (rc)
>  		goto err_start;
> 
> -	rte_eth_fp_ops[bp->eth_dev->data->port_id].rx_pkt_burst =
> -		bp->eth_dev->rx_pkt_burst;
> -	rte_eth_fp_ops[bp->eth_dev->data->port_id].tx_pkt_burst =
> -		bp->eth_dev->tx_pkt_burst;
> +	rte_eth_fp_ops_setup(bp->eth_dev);
>  	rte_mb();
> 
>  	PMD_DRV_LOG(INFO, "Port: %u Recovered from FW reset\n",
> --

Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
 

> 2.17.1


^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH v2 3/7] net/bnxt: fix race-condition when report error recovery
  2023-10-20 10:07   ` [PATCH v2 3/7] net/bnxt: fix race-condition when report error recovery Chengwen Feng
@ 2023-11-02 16:28     ` Ajit Khaparde
  0 siblings, 0 replies; 85+ messages in thread
From: Ajit Khaparde @ 2023-11-02 16:28 UTC (permalink / raw)
  To: Chengwen Feng
  Cc: thomas, ferruh.yigit, konstantin.ananyev, Somnath Kotur,
	Kalesh AP, dev, andrew.rybchenko, Honnappa.Nagarahalli

On Fri, Oct 20, 2023 at 3:11 AM Chengwen Feng <fengchengwen@huawei.com> wrote:
>
> If the data path functions are set to dummy functions before the error
> recovering event is reported, there may be a race condition with data path
> threads. This patch fixes it by setting the data path functions to dummy
> functions only after such an event is reported.
>
> Fixes: e11052f3a46f ("net/bnxt: support proactive error handling mode")
> Cc: stable@dpdk.org
>
> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>

> ---
>  drivers/net/bnxt/bnxt_cpr.c    | 13 +++++++------
>  drivers/net/bnxt/bnxt_ethdev.c |  4 ++--
>  2 files changed, 9 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/net/bnxt/bnxt_cpr.c b/drivers/net/bnxt/bnxt_cpr.c
> index 0733cf4df2..d8947d5b5f 100644
> --- a/drivers/net/bnxt/bnxt_cpr.c
> +++ b/drivers/net/bnxt/bnxt_cpr.c
> @@ -168,14 +168,9 @@ void bnxt_handle_async_event(struct bnxt *bp,
>                 PMD_DRV_LOG(INFO, "Port conn async event\n");
>                 break;
>         case HWRM_ASYNC_EVENT_CMPL_EVENT_ID_RESET_NOTIFY:
> -               /*
> -                * Avoid any rx/tx packet processing during firmware reset
> -                * operation.
> -                */
> -               bnxt_stop_rxtx(bp->eth_dev);
> -
>                 /* Ignore reset notify async events when stopping the port */
>                 if (!bp->eth_dev->data->dev_started) {
> +                       bnxt_stop_rxtx(bp->eth_dev);
>                         bp->flags |= BNXT_FLAG_FATAL_ERROR;
>                         return;
>                 }
> @@ -184,6 +179,12 @@ void bnxt_handle_async_event(struct bnxt *bp,
>                                              RTE_ETH_EVENT_ERR_RECOVERING,
>                                              NULL);
>
> +               /*
> +                * Avoid any rx/tx packet processing during firmware reset
> +                * operation.
> +                */
> +               bnxt_stop_rxtx(bp->eth_dev);
> +
>                 pthread_mutex_lock(&bp->err_recovery_lock);
>                 event_data = data1;
>                 /* timestamp_lo/hi values are in units of 100ms */
> diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
> index 5c4d96d4b1..003a6eec11 100644
> --- a/drivers/net/bnxt/bnxt_ethdev.c
> +++ b/drivers/net/bnxt/bnxt_ethdev.c
> @@ -4616,14 +4616,14 @@ static void bnxt_check_fw_health(void *arg)
>         bp->flags |= BNXT_FLAG_FATAL_ERROR;
>         bp->flags |= BNXT_FLAG_FW_RESET;
>
> -       bnxt_stop_rxtx(bp->eth_dev);
> -
>         PMD_DRV_LOG(ERR, "Detected FW dead condition\n");
>
>         rte_eth_dev_callback_process(bp->eth_dev,
>                                      RTE_ETH_EVENT_ERR_RECOVERING,
>                                      NULL);
>
> +       bnxt_stop_rxtx(bp->eth_dev);
> +
>         if (bnxt_is_primary_func(bp))
>                 wait_msec = info->primary_func_wait_period;
>         else
> --
> 2.17.1
>
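
The reordering in this patch can be illustrated with a small single-threaded model: the recovering event is delivered first, so the application can stop calling the burst functions, and only then are the burst functions swapped for dummies. This is only a sketch of the ordering, assuming the application quiesces its data path inside the event callback (as the testpmd demo in this series does); it is not a proof of race freedom, and all `sketch_*` names and `STEP_*` values are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Steps recorded in the trace, in the order they happen. */
enum sketch_step {
	STEP_APP_QUIESCED = 1, /* application stopped calling rx/tx burst */
	STEP_PMD_DUMMY = 2     /* PMD swapped burst functions to dummies */
};

struct sketch_port {
	bool app_forwarding; /* application still invokes the data path */
	bool pmd_dummy_ops;  /* burst functions currently point at dummies */
	int trace[4];
	int ntrace;
};

/* Application callback for the "error recovering" event. */
static void
sketch_app_on_recovering(struct sketch_port *p)
{
	p->app_forwarding = false;
	p->trace[p->ntrace++] = STEP_APP_QUIESCED;
}

/* Driver side: replace the burst functions with dummies. */
static void
sketch_pmd_stop_rxtx(struct sketch_port *p)
{
	p->pmd_dummy_ops = true;
	p->trace[p->ntrace++] = STEP_PMD_DUMMY;
}

/*
 * Driver reset handler using the order from the patch: report first,
 * swap afterwards. Returns true when the application had already
 * quiesced at the moment the burst functions were swapped.
 */
static bool
sketch_pmd_handle_reset(struct sketch_port *p)
{
	sketch_app_on_recovering(p); /* stands in for the event callback */
	bool quiesced_first = !p->app_forwarding;
	sketch_pmd_stop_rxtx(p);
	return quiesced_first;
}
```

Swapping the two calls inside sketch_pmd_handle_reset() would model the old order, where the burst functions change while the application may still be forwarding.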

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH v2 4/7] net/bnxt: use fp ops setup function
  2023-11-02 10:34     ` Konstantin Ananyev
@ 2023-11-02 16:29       ` Ajit Khaparde
  0 siblings, 0 replies; 85+ messages in thread
From: Ajit Khaparde @ 2023-11-02 16:29 UTC (permalink / raw)
  To: Konstantin Ananyev
  Cc: Fengchengwen, thomas, ferruh.yigit, Somnath Kotur, dev,
	andrew.rybchenko, kalesh-anakkur.purayil, Honnappa.Nagarahalli

On Thu, Nov 2, 2023 at 3:35 AM Konstantin Ananyev
<konstantin.ananyev@huawei.com> wrote:
>
>
>
> > Use rte_eth_fp_ops_setup() instead of directly manipulating
> > rte_eth_fp_ops variable.
> >
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> > ---
> >  drivers/net/bnxt/bnxt_cpr.c    | 5 +----
> >  drivers/net/bnxt/bnxt_ethdev.c | 5 +----
> >  2 files changed, 2 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/net/bnxt/bnxt_cpr.c b/drivers/net/bnxt/bnxt_cpr.c
> > index d8947d5b5f..3a08028331 100644
> > --- a/drivers/net/bnxt/bnxt_cpr.c
> > +++ b/drivers/net/bnxt/bnxt_cpr.c
> > @@ -416,10 +416,7 @@ void bnxt_stop_rxtx(struct rte_eth_dev *eth_dev)
> >       eth_dev->rx_pkt_burst = rte_eth_pkt_burst_dummy;
> >       eth_dev->tx_pkt_burst = rte_eth_pkt_burst_dummy;
> >
> > -     rte_eth_fp_ops[eth_dev->data->port_id].rx_pkt_burst =
> > -             eth_dev->rx_pkt_burst;
> > -     rte_eth_fp_ops[eth_dev->data->port_id].tx_pkt_burst =
> > -             eth_dev->tx_pkt_burst;
> > +     rte_eth_fp_ops_setup(eth_dev);
> >       rte_mb();
> >
> >       /* Allow time for threads to exit the real burst functions. */
> > diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
> > index 003a6eec11..9d9b9ae8cf 100644
> > --- a/drivers/net/bnxt/bnxt_ethdev.c
> > +++ b/drivers/net/bnxt/bnxt_ethdev.c
> > @@ -4428,10 +4428,7 @@ static void bnxt_dev_recover(void *arg)
> >       if (rc)
> >               goto err_start;
> >
> > -     rte_eth_fp_ops[bp->eth_dev->data->port_id].rx_pkt_burst =
> > -             bp->eth_dev->rx_pkt_burst;
> > -     rte_eth_fp_ops[bp->eth_dev->data->port_id].tx_pkt_burst =
> > -             bp->eth_dev->tx_pkt_burst;
> > +     rte_eth_fp_ops_setup(bp->eth_dev);
> >       rte_mb();
> >
> >       PMD_DRV_LOG(INFO, "Port: %u Recovered from FW reset\n",
> > --
>
> Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>

>
>
> > 2.17.1
>

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH v2 0/7] fix race-condition of proactive error handling mode
  2023-10-20 10:07 ` [PATCH v2 0/7] " Chengwen Feng
                     ` (6 preceding siblings ...)
  2023-10-20 10:07   ` [PATCH v2 7/7] doc: testpmd support event handling section Chengwen Feng
@ 2023-11-06  1:35   ` fengchengwen
  7 siblings, 0 replies; 85+ messages in thread
From: fengchengwen @ 2023-11-06  1:35 UTC (permalink / raw)
  To: thomas, ferruh.yigit, konstantin.ananyev, ajit.khaparde
  Cc: dev, andrew.rybchenko, kalesh-anakkur.purayil, Honnappa.Nagarahalli

Friendly ping.

On 2023/10/20 18:07, Chengwen Feng wrote:
> This patchset fixes the race condition in proactive error handling mode;
> see the discussion thread [1].
> 
> [1] http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-ashok.k.kaladi@intel.com/
> 
> Chengwen Feng (7):
>   ethdev: fix race-condition of proactive error handling mode
>   net/hns3: replace fp ops config function
>   net/bnxt: fix race-condition when report error recovery
>   net/bnxt: use fp ops setup function
>   app/testpmd: add error recovery usage demo
>   app/testpmd: extract event handling to event.c
>   doc: testpmd support event handling section
> 
> ---
> v2: 
> - extract event handling to event.c and document it, which addresses
>   Ferruh's comment.
> - add Acked-by from Konstantin Ananyev and Dongdong Liu.
> 
>  app/test-pmd/event.c                         | 390 +++++++++++++++++++
>  app/test-pmd/meson.build                     |   1 +
>  app/test-pmd/parameters.c                    |  36 +-
>  app/test-pmd/testpmd.c                       | 247 +-----------
>  app/test-pmd/testpmd.h                       |  10 +-
>  doc/guides/prog_guide/poll_mode_drv.rst      |  20 +-
>  doc/guides/testpmd_app_ug/event_handling.rst |  80 ++++
>  doc/guides/testpmd_app_ug/index.rst          |   1 +
>  drivers/net/bnxt/bnxt_cpr.c                  |  18 +-
>  drivers/net/bnxt/bnxt_ethdev.c               |   9 +-
>  drivers/net/hns3/hns3_rxtx.c                 |  21 +-
>  lib/ethdev/ethdev_driver.c                   |   8 +
>  lib/ethdev/ethdev_driver.h                   |  10 +
>  lib/ethdev/rte_ethdev.h                      |  32 +-
>  lib/ethdev/version.map                       |   1 +
>  15 files changed, 551 insertions(+), 333 deletions(-)
>  create mode 100644 app/test-pmd/event.c
>  create mode 100644 doc/guides/testpmd_app_ug/event_handling.rst
> 

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH v2 7/7] doc: testpmd support event handling section
  2023-10-20 10:07   ` [PATCH v2 7/7] doc: testpmd support event handling section Chengwen Feng
@ 2023-11-06  9:28     ` lihuisong (C)
  2023-11-06 12:39       ` fengchengwen
  0 siblings, 1 reply; 85+ messages in thread
From: lihuisong (C) @ 2023-11-06  9:28 UTC (permalink / raw)
  To: Chengwen Feng, thomas, ferruh.yigit, konstantin.ananyev,
	ajit.khaparde, Aman Singh, Yuying Zhang
  Cc: dev, andrew.rybchenko, kalesh-anakkur.purayil, Honnappa.Nagarahalli


On 2023/10/20 18:07, Chengwen Feng wrote:
> Add a new event handling section, which documents the ethdev and
> device events.
>
> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> ---
>   doc/guides/testpmd_app_ug/event_handling.rst | 80 ++++++++++++++++++++
>   doc/guides/testpmd_app_ug/index.rst          |  1 +
>   2 files changed, 81 insertions(+)
>   create mode 100644 doc/guides/testpmd_app_ug/event_handling.rst
>
> diff --git a/doc/guides/testpmd_app_ug/event_handling.rst b/doc/guides/testpmd_app_ug/event_handling.rst
> new file mode 100644
> index 0000000000..c116753ad0
> --- /dev/null
> +++ b/doc/guides/testpmd_app_ug/event_handling.rst
> @@ -0,0 +1,80 @@
> +..  SPDX-License-Identifier: BSD-3-Clause
> +    Copyright(c) 2023 HiSilicon Limited.
> +
> +Event Handling
> +==============
> +
> +The ``testpmd`` application supports the following two types of event handling:
> +
> +ethdev events
> +-------------
> +
> +The ``testpmd`` provides the options "--print-event" and "--mask-event" to control
> +whether a message such as "Port x y event" is displayed when event "y" is received on port "x".
> +This is referred to as the default processing.
> +
> +This section details the supported events. Unless otherwise specified, only the
> +default processing is supported.
> +
> +- ``RTE_ETH_EVENT_INTR_LSC``:
> +  If device started with lsc enabled, the PMD will launch this event when it
> +  detect link status changes.
> +
> +- ``RTE_ETH_EVENT_QUEUE_STATE``:
> +  Used only within vhost PMD to report vring whether enabled.
Used only within the vhost PMD? It seems that this is currently only used by vhost,
but the ethdev lib says:
/** queue state event (enabled/disabled) */
     RTE_ETH_EVENT_QUEUE_STATE,
testpmd is also a demo for users, so I suggest changing this comment to
avoid confusion.
> +
> +- ``RTE_ETH_EVENT_INTR_RESET``:
> +  Used to report reset interrupt happened, this event only reported when the
> +  PMD supports ``RTE_ETH_ERROR_HANDLE_MODE_PASSIVE``.
> +
> +- ``RTE_ETH_EVENT_VF_MBOX``:
> +  Used as a PF to process mailbox messages of the VFs to which the PF belongs.
> +
> +- ``RTE_ETH_EVENT_INTR_RMV``:
> +  Used to report device removal event. The ``testpmd`` will remove the port
> +  later.
> +
> +- ``RTE_ETH_EVENT_NEW``:
> +  Used to report port was probed event. The ``testpmd`` will setup the port
> +  later.
> +
> +- ``RTE_ETH_EVENT_DESTROY``:
> +  Used to report port was released event. The ``testpmd`` will changes the
> +  port's status.
> +
> +- ``RTE_ETH_EVENT_MACSEC``:
> +  Used to report MACsec offload related event.
> +
> +- ``RTE_ETH_EVENT_IPSEC``:
> +  Used to report IPsec offload related event.
> +
> +- ``RTE_ETH_EVENT_FLOW_AGED``:
> +  Used to report new aged-out flows was detected. Only valid with mlx5 PMD.
> +
> +- ``RTE_ETH_EVENT_RX_AVAIL_THRESH``:
> +  Used to report available Rx descriptors was smaller than the threshold. Only
> +  valid with mlx5 PMD.
> +
> +- ``RTE_ETH_EVENT_ERR_RECOVERING``:
> +  Used to report error happened, and PMD will do recover after report this
> +  event. The ``testpmd`` will stop packet forwarding when received the event.
> +
> +- ``RTE_ETH_EVENT_RECOVERY_SUCCESS``:
> +  Used to report error recovery success. The ``testpmd`` will restart packet
> +  forwarding when received the event.
> +
> +- ``RTE_ETH_EVENT_RECOVERY_FAILED``:
> +  Used to report error recovery failed. The ``testpmd`` will display one
> +  message to show which ports failed.
> +
> +.. note::
> +
> +   The ``RTE_ETH_EVENT_ERR_RECOVERING``, ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` and
> +   ``RTE_ETH_EVENT_RECOVERY_FAILED`` only reported when the PMD supports
> +   ``RTE_ETH_ERROR_HANDLE_MODE_PROACTIVE``.
> +
> +device events
> +-------------
> +
> +This includes two events, ``RTE_DEV_EVENT_ADD`` and ``RTE_DEV_EVENT_REMOVE``, which are
> +enabled only when ``testpmd`` is started with the "--hot-plug" option.
> diff --git a/doc/guides/testpmd_app_ug/index.rst b/doc/guides/testpmd_app_ug/index.rst
> index 1ac0d25d57..3c09448c4e 100644
> --- a/doc/guides/testpmd_app_ug/index.rst
> +++ b/doc/guides/testpmd_app_ug/index.rst
> @@ -14,3 +14,4 @@ Testpmd Application User Guide
>       build_app
>       run_app
>       testpmd_funcs
> +    event_handling

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH v2 7/7] doc: testpmd support event handling section
  2023-11-06  9:28     ` lihuisong (C)
@ 2023-11-06 12:39       ` fengchengwen
  2023-11-08  3:02         ` lihuisong (C)
  0 siblings, 1 reply; 85+ messages in thread
From: fengchengwen @ 2023-11-06 12:39 UTC (permalink / raw)
  To: lihuisong (C),
	thomas, ferruh.yigit, konstantin.ananyev, ajit.khaparde,
	Aman Singh, Yuying Zhang
  Cc: dev, andrew.rybchenko, kalesh-anakkur.purayil, Honnappa.Nagarahalli

Hi Huisong,

On 2023/11/6 17:28, lihuisong (C) wrote:
> 
> On 2023/10/20 18:07, Chengwen Feng wrote:
>> Add a new event handling section, which documents the ethdev and
>> device events.
>>
>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
>> ---
>>   doc/guides/testpmd_app_ug/event_handling.rst | 80 ++++++++++++++++++++
>>   doc/guides/testpmd_app_ug/index.rst          |  1 +
>>   2 files changed, 81 insertions(+)
>>   create mode 100644 doc/guides/testpmd_app_ug/event_handling.rst
>>
>> diff --git a/doc/guides/testpmd_app_ug/event_handling.rst b/doc/guides/testpmd_app_ug/event_handling.rst
>> new file mode 100644
>> index 0000000000..c116753ad0
>> --- /dev/null
>> +++ b/doc/guides/testpmd_app_ug/event_handling.rst
>> @@ -0,0 +1,80 @@
>> +..  SPDX-License-Identifier: BSD-3-Clause
>> +    Copyright(c) 2023 HiSilicon Limited.
>> +
>> +Event Handling
>> +==============
>> +
>> +The ``testpmd`` application supports following two type event handling:
>> +
>> +ethdev events
>> +-------------
>> +
>> +The ``testpmd`` provide options "--print-event" and "--mask-event" to control
>> +whether display such as "Port x y event" when received "y" event on port "x".
>> +This is named as default processing.
>> +
>> +This section details the support events, unless otherwise specified, only the
>> +default processing is support.
>> +
>> +- ``RTE_ETH_EVENT_INTR_LSC``:
>> +  If device started with lsc enabled, the PMD will launch this event when it
>> +  detect link status changes.
>> +
>> +- ``RTE_ETH_EVENT_QUEUE_STATE``:
>> +  Used only within vhost PMD to report vring whether enabled.
> Used only within the vhost PMD? It seems that this is currently only used by vhost,
> but the ethdev lib says:
> /** queue state event (enabled/disabled) */
>     RTE_ETH_EVENT_QUEUE_STATE,
> testpmd is also a demo for users, so I suggest changing this comment to avoid confusion.

OK, I think vhost could serve as an example, e.g.:
Used to notify a queue state change; for example, the vhost PMD uses this event to report whether a vring is enabled.

Thanks
Chengwen

>> +
>> +- ``RTE_ETH_EVENT_INTR_RESET``:
>> +  Used to report reset interrupt happened, this event only reported when the
>> +  PMD supports ``RTE_ETH_ERROR_HANDLE_MODE_PASSIVE``.
>> +
>> +- ``RTE_ETH_EVENT_VF_MBOX``:
>> +  Used as a PF to process mailbox messages of the VFs to which the PF belongs.
>> +
>> +- ``RTE_ETH_EVENT_INTR_RMV``:
>> +  Used to report device removal event. The ``testpmd`` will remove the port
>> +  later.
>> +
>> +- ``RTE_ETH_EVENT_NEW``:
>> +  Used to report port was probed event. The ``testpmd`` will setup the port
>> +  later.
>> +
>> +- ``RTE_ETH_EVENT_DESTROY``:
>> +  Used to report port was released event. The ``testpmd`` will changes the
>> +  port's status.
>> +
>> +- ``RTE_ETH_EVENT_MACSEC``:
>> +  Used to report MACsec offload related event.
>> +
>> +- ``RTE_ETH_EVENT_IPSEC``:
>> +  Used to report IPsec offload related event.
>> +
>> +- ``RTE_ETH_EVENT_FLOW_AGED``:
>> +  Used to report new aged-out flows was detected. Only valid with mlx5 PMD.
>> +
>> +- ``RTE_ETH_EVENT_RX_AVAIL_THRESH``:
>> +  Used to report available Rx descriptors was smaller than the threshold. Only
>> +  valid with mlx5 PMD.
>> +
>> +- ``RTE_ETH_EVENT_ERR_RECOVERING``:
>> +  Used to report error happened, and PMD will do recover after report this
>> +  event. The ``testpmd`` will stop packet forwarding when received the event.
>> +
>> +- ``RTE_ETH_EVENT_RECOVERY_SUCCESS``:
>> +  Used to report error recovery success. The ``testpmd`` will restart packet
>> +  forwarding when received the event.
>> +
>> +- ``RTE_ETH_EVENT_RECOVERY_FAILED``:
>> +  Used to report error recovery failed. The ``testpmd`` will display one
>> +  message to show which ports failed.
>> +
>> +.. note::
>> +
>> +   The ``RTE_ETH_EVENT_ERR_RECOVERING``, ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` and
>> +   ``RTE_ETH_EVENT_RECOVERY_FAILED`` only reported when the PMD supports
>> +   ``RTE_ETH_ERROR_HANDLE_MODE_PROACTIVE``.
>> +
>> +device events
>> +-------------
>> +
>> +This includes two events, ``RTE_DEV_EVENT_ADD`` and ``RTE_DEV_EVENT_REMOVE``, which are
>> +enabled only when ``testpmd`` is started with the "--hot-plug" option.
>> diff --git a/doc/guides/testpmd_app_ug/index.rst b/doc/guides/testpmd_app_ug/index.rst
>> index 1ac0d25d57..3c09448c4e 100644
>> --- a/doc/guides/testpmd_app_ug/index.rst
>> +++ b/doc/guides/testpmd_app_ug/index.rst
>> @@ -14,3 +14,4 @@ Testpmd Application User Guide
>>       build_app
>>       run_app
>>       testpmd_funcs
>> +    event_handling
> .

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH v2 5/7] app/testpmd: add error recovery usage demo
  2023-11-01  4:08     ` lihuisong (C)
@ 2023-11-06 13:01       ` fengchengwen
  0 siblings, 0 replies; 85+ messages in thread
From: fengchengwen @ 2023-11-06 13:01 UTC (permalink / raw)
  To: lihuisong (C),
	thomas, ferruh.yigit, konstantin.ananyev, ajit.khaparde,
	Aman Singh, Yuying Zhang
  Cc: dev, andrew.rybchenko, kalesh-anakkur.purayil, Honnappa.Nagarahalli

Hi Huisong,

On 2023/11/1 12:08, lihuisong (C) wrote:
> 
> On 2023/10/20 18:07, Chengwen Feng wrote:
>> This patch adds error recovery usage demo which will:
>> 1. stop packet forwarding when the RTE_ETH_EVENT_ERR_RECOVERING event
>>     is received.
>> 2. restart packet forwarding when the RTE_ETH_EVENT_RECOVERY_SUCCESS
>>     event is received.
>> 3. report the ports that fail to recover and need to be removed when
>>     the RTE_ETH_EVENT_RECOVERY_FAILED event is received.
> Why not suggest trying to call dev_reset() or some other way to recover?

This has already been discussed many times, and it is the reason the
RTE_ETH_EVENT_RECOVERY_XXX events were introduced; please refer to the previous thread.

>>
>> In addition, a message is added to the printed information, requiring
>> no command to be executed during the error recovery.
>>
>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
>> Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
>> ---
>>   app/test-pmd/testpmd.c | 80 ++++++++++++++++++++++++++++++++++++++++++
>>   app/test-pmd/testpmd.h |  4 ++-
>>   2 files changed, 83 insertions(+), 1 deletion(-)
>>
>> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
>> index 595b77748c..39a25238e5 100644
>> --- a/app/test-pmd/testpmd.c
>> +++ b/app/test-pmd/testpmd.c
>> @@ -3942,6 +3942,77 @@ rmv_port_callback(void *arg)
>>           start_packet_forwarding(0);
>>   }
>>   +static int need_start_when_recovery_over;
>> +
>> +static bool
>> +has_port_in_err_recovering(void)
>> +{
>> +    struct rte_port *port;
>> +    portid_t pid;
>> +
>> +    RTE_ETH_FOREACH_DEV(pid) {
>> +        port = &ports[pid];
>> +        if (port->err_recovering)
>> +            return true;
>> +    }
>> +
>> +    return false;
>> +}
>> +
>> +static void
>> +err_recovering_callback(portid_t port_id)
>> +{
>> +    if (!has_port_in_err_recovering())
>> +        printf("Please stop executing any commands until recovery result events are received!\n");
>> +
>> +    ports[port_id].err_recovering = 1;
>> +    ports[port_id].recover_failed = 0;
>> +
>> +    /* To simplify implementation, stop forwarding regardless of whether the port is used. */
>> +    if (!test_done) {
>> +        printf("Stop packet forwarding because some ports are in error recovering!\n");
>> +        stop_packet_forwarding();
>> +        need_start_when_recovery_over = 1;
>> +    }
>> +}
>> +
>> +static void
>> +recover_success_callback(portid_t port_id)
>> +{
>> +    ports[port_id].err_recovering = 0;
>> +    if (has_port_in_err_recovering())
>> +        return;
>> +
>> +    if (need_start_when_recovery_over) {
>> +        printf("Recovery success! Restart packet forwarding!\n");
>> +        start_packet_forwarding(0);
> s/start_packet_forwarding(0)/start_packet_forwarding() ?

start_packet_forwarding() takes one parameter; 0 is the proper value to use here.

Thanks
Chengwen

>> +        need_start_when_recovery_over = 0;
>> +    } else {
>> +        printf("Recovery success!\n");
>> +    }
>> +}
>> +
>> +static void
>> +recover_failed_callback(portid_t port_id)
>> +{
>> +    struct rte_port *port;
>> +    portid_t pid;
>> +
>> +    ports[port_id].err_recovering = 0;
>> +    ports[port_id].recover_failed = 1;
>> +    if (has_port_in_err_recovering())
>> +        return;
>> +
>> +    need_start_when_recovery_over = 0;
>> +    printf("The ports:");
>> +    RTE_ETH_FOREACH_DEV(pid) {
>> +        port = &ports[pid];
>> +        if (port->recover_failed)
>> +            printf(" %u", pid);
>> +    }
>> +    printf(" recovery failed! Please remove them!\n");
>> +}
>> +
>>   /* This function is used by the interrupt thread */
>>   static int
>>   eth_event_callback(portid_t port_id, enum rte_eth_event_type type, void *param,
>> @@ -3997,6 +4068,15 @@ eth_event_callback(portid_t port_id, enum rte_eth_event_type type, void *param,
>>           }
>>           break;
>>       }
>> +    case RTE_ETH_EVENT_ERR_RECOVERING:
>> +        err_recovering_callback(port_id);
>> +        break;
>> +    case RTE_ETH_EVENT_RECOVERY_SUCCESS:
>> +        recover_success_callback(port_id);
>> +        break;
>> +    case RTE_ETH_EVENT_RECOVERY_FAILED:
>> +        recover_failed_callback(port_id);
>> +        break;
>>       default:
>>           break;
>>       }
>> diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
>> index 09a36b90b8..42782d5a05 100644
>> --- a/app/test-pmd/testpmd.h
>> +++ b/app/test-pmd/testpmd.h
>> @@ -342,7 +342,9 @@ struct rte_port {
>>       uint8_t                 member_flag : 1, /**< bonding member port */
>>                   bond_flag : 1, /**< port is bond device */
>>                   fwd_mac_swap : 1, /**< swap packet MAC before forward */
>> -                update_conf : 1; /**< need to update bonding device configuration */
>> +                update_conf : 1, /**< need to update bonding device configuration */
>> +                err_recovering : 1, /**< port is in error recovering */
>> +                recover_failed : 1; /**< port recover failed */
>>       struct port_template    *pattern_templ_list; /**< Pattern templates. */
>>       struct port_template    *actions_templ_list; /**< Actions templates. */
>>       struct port_table       *table_list; /**< Flow tables. */
> .
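
The interesting part of the demo is the multi-port case: forwarding is stopped when the first port reports an error, and restarted only once no port remains in recovery. A stripped-down model of that bookkeeping is sketched below; it assumes two ports and omits the failed-recovery path, and the `sketch_*` names are hypothetical rather than testpmd symbols.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NB_PORTS 2

struct sketch_state {
	bool err_recovering[SKETCH_NB_PORTS]; /* per-port recovery flag */
	bool forwarding;                      /* packet forwarding is running */
	bool restart_pending;                 /* restart once recovery is over */
};

/* True while at least one port is still recovering. */
static bool
sketch_any_recovering(const struct sketch_state *s)
{
	for (int i = 0; i < SKETCH_NB_PORTS; i++)
		if (s->err_recovering[i])
			return true;
	return false;
}

/* Error recovering event: stop forwarding regardless of which port failed. */
static void
sketch_on_recovering(struct sketch_state *s, int port)
{
	s->err_recovering[port] = true;
	if (s->forwarding) {
		s->forwarding = false;
		s->restart_pending = true;
	}
}

/* Recovery success event: restart only when every port has recovered. */
static void
sketch_on_success(struct sketch_state *s, int port)
{
	s->err_recovering[port] = false;
	if (sketch_any_recovering(s))
		return;
	if (s->restart_pending) {
		s->forwarding = true;
		s->restart_pending = false;
	}
}
```

If both ports enter recovery and only port 0 reports success, forwarding stays stopped until port 1 also succeeds, matching the has_port_in_err_recovering() check in the patch.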

^ permalink raw reply	[flat|nested] 85+ messages in thread

* [PATCH v3 0/7] fix race-condition of proactive error handling mode
  2023-03-01  3:06 [PATCH 0/5] fix race-condition of proactive error handling mode Chengwen Feng
                   ` (6 preceding siblings ...)
  2023-10-20 10:07 ` [PATCH v2 0/7] " Chengwen Feng
@ 2023-11-06 13:11 ` Chengwen Feng
  2023-11-06 13:11   ` [PATCH v3 1/7] ethdev: " Chengwen Feng
                     ` (7 more replies)
  7 siblings, 8 replies; 85+ messages in thread
From: Chengwen Feng @ 2023-11-06 13:11 UTC (permalink / raw)
  To: thomas, ferruh.yigit, konstantin.ananyev, ajit.khaparde
  Cc: dev, andrew.rybchenko, kalesh-anakkur.purayil, Honnappa.Nagarahalli

This patchset fixes the race condition in proactive error handling mode;
see the discussion thread [1].

[1] http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-ashok.k.kaladi@intel.com/

Chengwen Feng (7):
  ethdev: fix race-condition of proactive error handling mode
  net/hns3: replace fp ops config function
  net/bnxt: fix race-condition when report error recovery
  net/bnxt: use fp ops setup function
  app/testpmd: add error recovery usage demo
  app/testpmd: extract event handling to event.c
  doc: testpmd support event handling section

---
v3:
- adjust the usage of RTE_ETH_EVENT_QUEUE_STATE in the 7/7 commit.
- add Acked-by from Konstantin Ananyev, Ajit Khaparde and Huisong Li.
v2:
- extract event handling to event.c and document it, which addresses
  Ferruh's comment.
- add Acked-by from Konstantin Ananyev and Dongdong Liu.

 app/test-pmd/event.c                         | 390 +++++++++++++++++++
 app/test-pmd/meson.build                     |   1 +
 app/test-pmd/parameters.c                    |  36 +-
 app/test-pmd/testpmd.c                       | 247 +-----------
 app/test-pmd/testpmd.h                       |  10 +-
 doc/guides/prog_guide/poll_mode_drv.rst      |  20 +-
 doc/guides/testpmd_app_ug/event_handling.rst |  81 ++++
 doc/guides/testpmd_app_ug/index.rst          |   1 +
 drivers/net/bnxt/bnxt_cpr.c                  |  18 +-
 drivers/net/bnxt/bnxt_ethdev.c               |   9 +-
 drivers/net/hns3/hns3_rxtx.c                 |  21 +-
 lib/ethdev/ethdev_driver.c                   |   8 +
 lib/ethdev/ethdev_driver.h                   |  10 +
 lib/ethdev/rte_ethdev.h                      |  32 +-
 lib/ethdev/version.map                       |   1 +
 15 files changed, 552 insertions(+), 333 deletions(-)
 create mode 100644 app/test-pmd/event.c
 create mode 100644 doc/guides/testpmd_app_ug/event_handling.rst

-- 
2.17.1


^ permalink raw reply	[flat|nested] 85+ messages in thread

* [PATCH v3 1/7] ethdev: fix race-condition of proactive error handling mode
  2023-11-06 13:11 ` [PATCH v3 " Chengwen Feng
@ 2023-11-06 13:11   ` Chengwen Feng
  2023-11-06 13:11   ` [PATCH v3 2/7] net/hns3: replace fp ops config function Chengwen Feng
                     ` (6 subsequent siblings)
  7 siblings, 0 replies; 85+ messages in thread
From: Chengwen Feng @ 2023-11-06 13:11 UTC (permalink / raw)
  To: thomas, ferruh.yigit, konstantin.ananyev, ajit.khaparde,
	Andrew Rybchenko, Somnath Kotur, Kalesh AP
  Cc: dev, Honnappa.Nagarahalli

In the proactive error handling mode, the PMD sets the data path
pointers to dummy functions and then attempts recovery. During this
period the application may still be invoking the data path API, which
introduces a race condition with the data path that may lead to a
crash [1].

Although the PMD adds a delay after setting the data path pointers to
cover this race condition, the delay only reduces the probability of
the race; it does not solve the problem.

To solve the race condition fundamentally, the following requirements
are added:
1. The PMD should set the data path pointers to dummy functions after
   reporting the RTE_ETH_EVENT_ERR_RECOVERING event.
2. The application should stop data path API invocation when
   processing the RTE_ETH_EVENT_ERR_RECOVERING event.
3. The PMD should set the data path pointers to valid functions before
   reporting the RTE_ETH_EVENT_RECOVERY_SUCCESS event.
4. The application should enable data path API invocation when
   processing the RTE_ETH_EVENT_RECOVERY_SUCCESS event.
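
The application side of requirements 2 and 4 could look roughly like
the sketch below. It is a self-contained model, not DPDK code: every
demo_* name is hypothetical, the enum only stands in for the three
ethdev recovery events, and the busy-wait is one possible way (an
assumption, not something this patch prescribes) for the event handler
to make sure no forwarding thread is still inside a burst call.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the three ethdev recovery events. */
enum demo_event {
	DEMO_EVENT_ERR_RECOVERING,
	DEMO_EVENT_RECOVERY_SUCCESS,
	DEMO_EVENT_RECOVERY_FAILED,
};

/* True while the application may call the data path API. */
static atomic_bool demo_datapath_enabled = true;
/* Number of forwarding threads currently inside demo_poll_once(). */
static atomic_uint demo_threads_in_burst;

/* Stand-in for a real Rx burst call; pretends 4 packets arrived. */
static uint16_t demo_fake_burst(void)
{
	return 4;
}

/* Event handler, as it might run in the interrupt thread. */
static void demo_on_event(enum demo_event ev)
{
	switch (ev) {
	case DEMO_EVENT_ERR_RECOVERING:
		/* Requirement 2: stop data path API invocation. */
		atomic_store(&demo_datapath_enabled, false);
		/* Wait until every thread has left its burst call. */
		while (atomic_load(&demo_threads_in_burst) != 0)
			;
		break;
	case DEMO_EVENT_RECOVERY_SUCCESS:
		/* Restore extra configuration here, then requirement 4. */
		atomic_store(&demo_datapath_enabled, true);
		break;
	case DEMO_EVENT_RECOVERY_FAILED:
		/* Keep the data path disabled; the port must be removed. */
		break;
	}
}

/* One iteration of a forwarding loop. */
static uint16_t demo_poll_once(uint16_t (*burst)(void))
{
	uint16_t n = 0;

	assert(burst != NULL);
	/* Announce entry first, then check the gate. */
	atomic_fetch_add(&demo_threads_in_burst, 1);
	if (atomic_load(&demo_datapath_enabled))
		n = burst();
	atomic_fetch_sub(&demo_threads_in_burst, 1);
	return n;
}
```

The forwarding thread increments the counter before it reads the gate,
and the handler clears the gate before it reads the counter, so with
sequentially consistent atomics either the thread skips the burst or
the handler waits for it to finish.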

Also, this patch introduces a driver-internal function,
rte_eth_fp_ops_setup(), which serves as a helper function for PMDs.

[1] http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-ashok.k.kaladi@intel.com/

Fixes: eb0d471a8941 ("ethdev: add proactive error handling mode")
Cc: stable@dpdk.org

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
Acked-by: Huisong Li <lihuisong@huawei.com>
---
 doc/guides/prog_guide/poll_mode_drv.rst | 20 +++++++---------
 lib/ethdev/ethdev_driver.c              |  8 +++++++
 lib/ethdev/ethdev_driver.h              | 10 ++++++++
 lib/ethdev/rte_ethdev.h                 | 32 +++++++++++++++----------
 lib/ethdev/version.map                  |  1 +
 5 files changed, 46 insertions(+), 25 deletions(-)

diff --git a/doc/guides/prog_guide/poll_mode_drv.rst b/doc/guides/prog_guide/poll_mode_drv.rst
index c145a9066c..e380ff135a 100644
--- a/doc/guides/prog_guide/poll_mode_drv.rst
+++ b/doc/guides/prog_guide/poll_mode_drv.rst
@@ -638,14 +638,9 @@ different from the application invokes recovery in PASSIVE mode,
 the PMD automatically recovers from error in PROACTIVE mode,
 and only a small amount of work is required for the application.
 
-During error detection and automatic recovery,
-the PMD sets the data path pointers to dummy functions
-(which will prevent the crash),
-and also make sure the control path operations fail with a return code ``-EBUSY``.
-
-Because the PMD recovers automatically,
-the application can only sense that the data flow is disconnected for a while
-and the control API returns an error in this period.
+During error detection and automatic recovery, the PMD sets the data path
+pointers to dummy functions and also make sure the control path operations
+failed with a return code ``-EBUSY``.
 
 In order to sense the error happening/recovering,
 as well as to restore some additional configuration,
@@ -653,9 +648,9 @@ three events are available:
 
 ``RTE_ETH_EVENT_ERR_RECOVERING``
    Notify the application that an error is detected
-   and the recovery is being started.
+   and the recovery is about to start.
    Upon receiving the event, the application should not invoke
-   any control path function until receiving
+   any control and data path API until receiving
    ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` or ``RTE_ETH_EVENT_RECOVERY_FAILED`` event.
 
 .. note::
@@ -666,8 +661,9 @@ three events are available:
 
 ``RTE_ETH_EVENT_RECOVERY_SUCCESS``
    Notify the application that the recovery from error is successful,
-   the PMD already re-configures the port,
-   and the effect is the same as a restart operation.
+   the PMD already re-configures the port.
+   The application should restore some additional configuration, and then
+   enable data path API invocation.
 
 ``RTE_ETH_EVENT_RECOVERY_FAILED``
    Notify the application that the recovery from error failed,
diff --git a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c
index fff4b7b4cd..65ead7b910 100644
--- a/lib/ethdev/ethdev_driver.c
+++ b/lib/ethdev/ethdev_driver.c
@@ -537,6 +537,14 @@ rte_eth_dma_zone_free(const struct rte_eth_dev *dev, const char *ring_name,
 	return rc;
 }
 
+void
+rte_eth_fp_ops_setup(struct rte_eth_dev *dev)
+{
+	if (dev == NULL)
+		return;
+	eth_dev_fp_ops_setup(rte_eth_fp_ops + dev->data->port_id, dev);
+}
+
 const struct rte_memzone *
 rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char *ring_name,
 			 uint16_t queue_id, size_t size, unsigned int align,
diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
index b482cd12bb..eaf2c9ca6d 100644
--- a/lib/ethdev/ethdev_driver.h
+++ b/lib/ethdev/ethdev_driver.h
@@ -1636,6 +1636,16 @@ int
 rte_eth_dma_zone_free(const struct rte_eth_dev *eth_dev, const char *name,
 		 uint16_t queue_id);
 
+/**
+ * @internal
+ * Setup eth fast-path API to ethdev values.
+ *
+ * @param dev
+ *  Pointer to struct rte_eth_dev.
+ */
+__rte_internal
+void rte_eth_fp_ops_setup(struct rte_eth_dev *dev);
+
 /**
  * @internal
  * Atomically set the link status for the specific device.
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index 77331ce652..89de57f214 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -4034,25 +4034,28 @@ enum rte_eth_event_type {
 	 */
 	RTE_ETH_EVENT_RX_AVAIL_THRESH,
 	/** Port recovering from a hardware or firmware error.
-	 * If PMD supports proactive error recovery,
-	 * it should trigger this event to notify application
-	 * that it detected an error and the recovery is being started.
-	 * Upon receiving the event, the application should not invoke any control path API
-	 * (such as rte_eth_dev_configure/rte_eth_dev_stop...) until receiving
-	 * RTE_ETH_EVENT_RECOVERY_SUCCESS or RTE_ETH_EVENT_RECOVERY_FAILED event.
-	 * The PMD will set the data path pointers to dummy functions,
-	 * and re-set the data path pointers to non-dummy functions
-	 * before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
-	 * It means that the application cannot send or receive any packets
-	 * during this period.
+	 *
+	 * If PMD supports proactive error recovery, it should trigger this
+	 * event to notify application that it detected an error and the
+	 * recovery is about to start.
+	 *
+	 * Upon receiving the event, the application should not invoke any
+	 * control and data path API until receiving
+	 * RTE_ETH_EVENT_RECOVERY_SUCCESS or RTE_ETH_EVENT_RECOVERY_FAILED
+	 * event.
+	 *
+	 * Once this event is reported, the PMD will set the data path pointers
+	 * to dummy functions, and re-set the data path pointers to valid
+	 * functions before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
+	 *
 	 * @note Before the PMD reports the recovery result,
 	 * the PMD may report the RTE_ETH_EVENT_ERR_RECOVERING event again,
 	 * because a larger error may occur during the recovery.
 	 */
 	RTE_ETH_EVENT_ERR_RECOVERING,
 	/** Port recovers successfully from the error.
-	 * The PMD already re-configured the port,
-	 * and the effect is the same as a restart operation.
+	 *
+	 * The PMD already re-configured the port:
 	 * a) The following operation will be retained: (alphabetically)
 	 *    - DCB configuration
 	 *    - FEC configuration
@@ -4079,6 +4082,9 @@ enum rte_eth_event_type {
 	 *      (@see RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP)
 	 * c) Any other configuration will not be stored
 	 *    and will need to be re-configured.
+	 *
+	 * The application should restore some additional configuration
+	 * (see above case b/c), and then enable data path API invocation.
 	 */
 	RTE_ETH_EVENT_RECOVERY_SUCCESS,
 	/** Port recovery failed.
diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
index 9336522b71..0a4ea077ac 100644
--- a/lib/ethdev/version.map
+++ b/lib/ethdev/version.map
@@ -339,6 +339,7 @@ INTERNAL {
 	rte_eth_devices;
 	rte_eth_dma_zone_free;
 	rte_eth_dma_zone_reserve;
+	rte_eth_fp_ops_setup;
 	rte_eth_hairpin_queue_peer_bind;
 	rte_eth_hairpin_queue_peer_unbind;
 	rte_eth_hairpin_queue_peer_update;
-- 
2.17.1


^ permalink raw reply	[flat|nested] 85+ messages in thread

* [PATCH v3 2/7] net/hns3: replace fp ops config function
  2023-11-06 13:11 ` [PATCH v3 " Chengwen Feng
  2023-11-06 13:11   ` [PATCH v3 1/7] ethdev: " Chengwen Feng
@ 2023-11-06 13:11   ` Chengwen Feng
  2023-11-06 13:11   ` [PATCH v3 3/7] net/bnxt: fix race-condition when report error recovery Chengwen Feng
                     ` (5 subsequent siblings)
  7 siblings, 0 replies; 85+ messages in thread
From: Chengwen Feng @ 2023-11-06 13:11 UTC (permalink / raw)
  To: thomas, ferruh.yigit, konstantin.ananyev, ajit.khaparde, Jie Hai,
	Yisen Zhuang
  Cc: dev, andrew.rybchenko, kalesh-anakkur.purayil, Honnappa.Nagarahalli

This patch replaces hns3_eth_dev_fp_ops_config() with
rte_eth_fp_ops_setup().

Cc: stable@dpdk.org

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Acked-by: Dongdong Liu <liudongdong3@huawei.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
Acked-by: Huisong Li <lihuisong@huawei.com>
---
 drivers/net/hns3/hns3_rxtx.c | 21 +++------------------
 1 file changed, 3 insertions(+), 18 deletions(-)

diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c
index 09b7e90c70..ecee74cf11 100644
--- a/drivers/net/hns3/hns3_rxtx.c
+++ b/drivers/net/hns3/hns3_rxtx.c
@@ -4443,21 +4443,6 @@ hns3_trace_rxtx_function(struct rte_eth_dev *dev)
 		 rx_mode.info, tx_mode.info);
 }
 
-static void
-hns3_eth_dev_fp_ops_config(const struct rte_eth_dev *dev)
-{
-	struct rte_eth_fp_ops *fpo = rte_eth_fp_ops;
-	uint16_t port_id = dev->data->port_id;
-
-	fpo[port_id].rx_pkt_burst = dev->rx_pkt_burst;
-	fpo[port_id].tx_pkt_burst = dev->tx_pkt_burst;
-	fpo[port_id].tx_pkt_prepare = dev->tx_pkt_prepare;
-	fpo[port_id].rx_descriptor_status = dev->rx_descriptor_status;
-	fpo[port_id].tx_descriptor_status = dev->tx_descriptor_status;
-	fpo[port_id].rxq.data = dev->data->rx_queues;
-	fpo[port_id].txq.data = dev->data->tx_queues;
-}
-
 void
 hns3_set_rxtx_function(struct rte_eth_dev *eth_dev)
 {
@@ -4480,7 +4465,7 @@ hns3_set_rxtx_function(struct rte_eth_dev *eth_dev)
 	}
 
 	hns3_trace_rxtx_function(eth_dev);
-	hns3_eth_dev_fp_ops_config(eth_dev);
+	rte_eth_fp_ops_setup(eth_dev);
 }
 
 void
@@ -4833,7 +4818,7 @@ hns3_stop_tx_datapath(struct rte_eth_dev *dev)
 {
 	dev->tx_pkt_burst = rte_eth_pkt_burst_dummy;
 	dev->tx_pkt_prepare = NULL;
-	hns3_eth_dev_fp_ops_config(dev);
+	rte_eth_fp_ops_setup(dev);
 
 	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
 		return;
@@ -4850,7 +4835,7 @@ hns3_start_tx_datapath(struct rte_eth_dev *dev)
 {
 	dev->tx_pkt_burst = hns3_get_tx_function(dev);
 	dev->tx_pkt_prepare = hns3_get_tx_prepare(dev);
-	hns3_eth_dev_fp_ops_config(dev);
+	rte_eth_fp_ops_setup(dev);
 
 	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
 		return;
-- 
2.17.1


^ permalink raw reply	[flat|nested] 85+ messages in thread

* [PATCH v3 3/7] net/bnxt: fix race-condition when report error recovery
  2023-11-06 13:11 ` [PATCH v3 " Chengwen Feng
  2023-11-06 13:11   ` [PATCH v3 1/7] ethdev: " Chengwen Feng
  2023-11-06 13:11   ` [PATCH v3 2/7] net/hns3: replace fp ops config function Chengwen Feng
@ 2023-11-06 13:11   ` Chengwen Feng
  2023-11-06 13:11   ` [PATCH v3 4/7] net/bnxt: use fp ops setup function Chengwen Feng
                     ` (4 subsequent siblings)
  7 siblings, 0 replies; 85+ messages in thread
From: Chengwen Feng @ 2023-11-06 13:11 UTC (permalink / raw)
  To: thomas, ferruh.yigit, konstantin.ananyev, ajit.khaparde,
	Somnath Kotur, Kalesh AP
  Cc: dev, andrew.rybchenko, Honnappa.Nagarahalli

If the data path functions are set to dummy functions before the error
recovering event is reported, there may be a race condition with the
data path threads. This patch fixes it by setting the data path
functions to dummy functions only after reporting the event.
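
The ordering can be illustrated with the small model below. It is a
sketch under stated assumptions, not bnxt or ethdev code: struct
demo_port, the demo_* functions and the single rx_burst pointer are
hypothetical stand-ins for the device, the event callback machinery and
the fast-path pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t (*demo_burst_t)(void);

/* Stand-ins for the real and the dummy burst functions. */
static uint16_t demo_real_burst(void)  { return 32; }
static uint16_t demo_dummy_burst(void) { return 0; }

struct demo_port {
	demo_burst_t rx_burst;                    /* models the fast-path pointer */
	void (*event_cb)(struct demo_port *port); /* application callback */
	demo_burst_t seen_in_cb;                  /* pointer observed by the callback */
};

/* Example callback: record which burst function is installed. */
static void demo_record_cb(struct demo_port *port)
{
	port->seen_in_cb = port->rx_burst;
}

/*
 * Fixed ordering: report the "error recovering" event first, so the
 * application can quiesce its data path threads inside the callback,
 * and only then install the dummy burst function.
 */
static void demo_start_recovery(struct demo_port *port)
{
	assert(port != NULL);
	if (port->event_cb != NULL)
		port->event_cb(port);
	port->rx_burst = demo_dummy_burst;
}
```

In this model the callback still observes the real burst function,
which corresponds to the application being notified before the
pointers change.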

Fixes: e11052f3a46f ("net/bnxt: support proactive error handling mode")
Cc: stable@dpdk.org

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
---
 drivers/net/bnxt/bnxt_cpr.c    | 13 +++++++------
 drivers/net/bnxt/bnxt_ethdev.c |  4 ++--
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/drivers/net/bnxt/bnxt_cpr.c b/drivers/net/bnxt/bnxt_cpr.c
index 0733cf4df2..d8947d5b5f 100644
--- a/drivers/net/bnxt/bnxt_cpr.c
+++ b/drivers/net/bnxt/bnxt_cpr.c
@@ -168,14 +168,9 @@ void bnxt_handle_async_event(struct bnxt *bp,
 		PMD_DRV_LOG(INFO, "Port conn async event\n");
 		break;
 	case HWRM_ASYNC_EVENT_CMPL_EVENT_ID_RESET_NOTIFY:
-		/*
-		 * Avoid any rx/tx packet processing during firmware reset
-		 * operation.
-		 */
-		bnxt_stop_rxtx(bp->eth_dev);
-
 		/* Ignore reset notify async events when stopping the port */
 		if (!bp->eth_dev->data->dev_started) {
+			bnxt_stop_rxtx(bp->eth_dev);
 			bp->flags |= BNXT_FLAG_FATAL_ERROR;
 			return;
 		}
@@ -184,6 +179,12 @@ void bnxt_handle_async_event(struct bnxt *bp,
 					     RTE_ETH_EVENT_ERR_RECOVERING,
 					     NULL);
 
+		/*
+		 * Avoid any rx/tx packet processing during firmware reset
+		 * operation.
+		 */
+		bnxt_stop_rxtx(bp->eth_dev);
+
 		pthread_mutex_lock(&bp->err_recovery_lock);
 		event_data = data1;
 		/* timestamp_lo/hi values are in units of 100ms */
diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index 5c4d96d4b1..003a6eec11 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -4616,14 +4616,14 @@ static void bnxt_check_fw_health(void *arg)
 	bp->flags |= BNXT_FLAG_FATAL_ERROR;
 	bp->flags |= BNXT_FLAG_FW_RESET;
 
-	bnxt_stop_rxtx(bp->eth_dev);
-
 	PMD_DRV_LOG(ERR, "Detected FW dead condition\n");
 
 	rte_eth_dev_callback_process(bp->eth_dev,
 				     RTE_ETH_EVENT_ERR_RECOVERING,
 				     NULL);
 
+	bnxt_stop_rxtx(bp->eth_dev);
+
 	if (bnxt_is_primary_func(bp))
 		wait_msec = info->primary_func_wait_period;
 	else
-- 
2.17.1


^ permalink raw reply	[flat|nested] 85+ messages in thread

* [PATCH v3 4/7] net/bnxt: use fp ops setup function
  2023-11-06 13:11 ` [PATCH v3 " Chengwen Feng
                     ` (2 preceding siblings ...)
  2023-11-06 13:11   ` [PATCH v3 3/7] net/bnxt: fix race-condition when report error recovery Chengwen Feng
@ 2023-11-06 13:11   ` Chengwen Feng
  2023-11-06 13:11   ` [PATCH v3 5/7] app/testpmd: add error recovery usage demo Chengwen Feng
                     ` (3 subsequent siblings)
  7 siblings, 0 replies; 85+ messages in thread
From: Chengwen Feng @ 2023-11-06 13:11 UTC (permalink / raw)
  To: thomas, ferruh.yigit, konstantin.ananyev, ajit.khaparde, Somnath Kotur
  Cc: dev, andrew.rybchenko, kalesh-anakkur.purayil, Honnappa.Nagarahalli

Use rte_eth_fp_ops_setup() instead of directly manipulating
rte_eth_fp_ops variable.

Cc: stable@dpdk.org

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Huisong Li <lihuisong@huawei.com>
---
 drivers/net/bnxt/bnxt_cpr.c    | 5 +----
 drivers/net/bnxt/bnxt_ethdev.c | 5 +----
 2 files changed, 2 insertions(+), 8 deletions(-)

diff --git a/drivers/net/bnxt/bnxt_cpr.c b/drivers/net/bnxt/bnxt_cpr.c
index d8947d5b5f..3a08028331 100644
--- a/drivers/net/bnxt/bnxt_cpr.c
+++ b/drivers/net/bnxt/bnxt_cpr.c
@@ -416,10 +416,7 @@ void bnxt_stop_rxtx(struct rte_eth_dev *eth_dev)
 	eth_dev->rx_pkt_burst = rte_eth_pkt_burst_dummy;
 	eth_dev->tx_pkt_burst = rte_eth_pkt_burst_dummy;
 
-	rte_eth_fp_ops[eth_dev->data->port_id].rx_pkt_burst =
-		eth_dev->rx_pkt_burst;
-	rte_eth_fp_ops[eth_dev->data->port_id].tx_pkt_burst =
-		eth_dev->tx_pkt_burst;
+	rte_eth_fp_ops_setup(eth_dev);
 	rte_mb();
 
 	/* Allow time for threads to exit the real burst functions. */
diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index 003a6eec11..9d9b9ae8cf 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -4428,10 +4428,7 @@ static void bnxt_dev_recover(void *arg)
 	if (rc)
 		goto err_start;
 
-	rte_eth_fp_ops[bp->eth_dev->data->port_id].rx_pkt_burst =
-		bp->eth_dev->rx_pkt_burst;
-	rte_eth_fp_ops[bp->eth_dev->data->port_id].tx_pkt_burst =
-		bp->eth_dev->tx_pkt_burst;
+	rte_eth_fp_ops_setup(bp->eth_dev);
 	rte_mb();
 
 	PMD_DRV_LOG(INFO, "Port: %u Recovered from FW reset\n",
-- 
2.17.1


^ permalink raw reply	[flat|nested] 85+ messages in thread

* [PATCH v3 5/7] app/testpmd: add error recovery usage demo
  2023-11-06 13:11 ` [PATCH v3 " Chengwen Feng
                     ` (3 preceding siblings ...)
  2023-11-06 13:11   ` [PATCH v3 4/7] net/bnxt: use fp ops setup function Chengwen Feng
@ 2023-11-06 13:11   ` Chengwen Feng
  2023-11-06 13:11   ` [PATCH v3 6/7] app/testpmd: extract event handling to event.c Chengwen Feng
                     ` (2 subsequent siblings)
  7 siblings, 0 replies; 85+ messages in thread
From: Chengwen Feng @ 2023-11-06 13:11 UTC (permalink / raw)
  To: thomas, ferruh.yigit, konstantin.ananyev, ajit.khaparde,
	Aman Singh, Yuying Zhang
  Cc: dev, andrew.rybchenko, kalesh-anakkur.purayil, Honnappa.Nagarahalli

This patch adds an error recovery usage demo which will:
1. stop packet forwarding when the RTE_ETH_EVENT_ERR_RECOVERING event
   is received.
2. restart packet forwarding when the RTE_ETH_EVENT_RECOVERY_SUCCESS
   event is received.
3. report the ports that failed to recover and need to be removed when
   the RTE_ETH_EVENT_RECOVERY_FAILED event is received.

In addition, a message is printed asking the user not to execute any
command during the error recovery.
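
With several ports, forwarding is restarted only once no port is left
in recovery. The stand-alone model below sketches that bookkeeping. It
is an illustration only: the demo_* names and the fixed
DEMO_MAX_PORTS are hypothetical, and two booleans replace the calls to
stop_packet_forwarding() and start_packet_forwarding().

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_MAX_PORTS 4 /* hypothetical port count for the model */

static bool demo_recovering[DEMO_MAX_PORTS]; /* per-port recovery flag */
static bool demo_forwarding = true;          /* models "forwarding is running" */
static bool demo_restart_pending;            /* restart once recovery is over */

/* Return true if at least one port is still recovering. */
static bool demo_any_recovering(void)
{
	unsigned int i;

	for (i = 0; i < DEMO_MAX_PORTS; i++)
		if (demo_recovering[i])
			return true;
	return false;
}

/* Models the handling of the "error recovering" event. */
static void demo_err_recovering(unsigned int port)
{
	assert(port < DEMO_MAX_PORTS);
	demo_recovering[port] = true;
	if (demo_forwarding) {
		/* Stop forwarding and remember to restart it later. */
		demo_forwarding = false;
		demo_restart_pending = true;
	}
}

/* Models the handling of the "recovery success" event. */
static void demo_recovery_success(unsigned int port)
{
	assert(port < DEMO_MAX_PORTS);
	demo_recovering[port] = false;
	if (demo_any_recovering())
		return; /* another port is still recovering */
	if (demo_restart_pending) {
		demo_forwarding = true;
		demo_restart_pending = false;
	}
}
```

For example, if ports 0 and 1 both enter recovery, forwarding stays
stopped after port 0 recovers and resumes only after port 1 recovers
as well.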

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
---
 app/test-pmd/testpmd.c | 80 ++++++++++++++++++++++++++++++++++++++++++
 app/test-pmd/testpmd.h |  4 ++-
 2 files changed, 83 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 9e4e99e53b..a45c411398 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -3941,6 +3941,77 @@ rmv_port_callback(void *arg)
 		start_packet_forwarding(0);
 }
 
+static int need_start_when_recovery_over;
+
+static bool
+has_port_in_err_recovering(void)
+{
+	struct rte_port *port;
+	portid_t pid;
+
+	RTE_ETH_FOREACH_DEV(pid) {
+		port = &ports[pid];
+		if (port->err_recovering)
+			return true;
+	}
+
+	return false;
+}
+
+static void
+err_recovering_callback(portid_t port_id)
+{
+	if (!has_port_in_err_recovering())
+		printf("Please stop executing any commands until recovery result events are received!\n");
+
+	ports[port_id].err_recovering = 1;
+	ports[port_id].recover_failed = 0;
+
+	/* To simplify implementation, stop forwarding regardless of whether the port is used. */
+	if (!test_done) {
+		printf("Stop packet forwarding because some ports are in error recovering!\n");
+		stop_packet_forwarding();
+		need_start_when_recovery_over = 1;
+	}
+}
+
+static void
+recover_success_callback(portid_t port_id)
+{
+	ports[port_id].err_recovering = 0;
+	if (has_port_in_err_recovering())
+		return;
+
+	if (need_start_when_recovery_over) {
+		printf("Recovery success! Restart packet forwarding!\n");
+		start_packet_forwarding(0);
+		need_start_when_recovery_over = 0;
+	} else {
+		printf("Recovery success!\n");
+	}
+}
+
+static void
+recover_failed_callback(portid_t port_id)
+{
+	struct rte_port *port;
+	portid_t pid;
+
+	ports[port_id].err_recovering = 0;
+	ports[port_id].recover_failed = 1;
+	if (has_port_in_err_recovering())
+		return;
+
+	need_start_when_recovery_over = 0;
+	printf("The ports:");
+	RTE_ETH_FOREACH_DEV(pid) {
+		port = &ports[pid];
+		if (port->recover_failed)
+			printf(" %u", pid);
+	}
+	printf(" recovery failed! Please remove them!\n");
+}
+
 /* This function is used by the interrupt thread */
 static int
 eth_event_callback(portid_t port_id, enum rte_eth_event_type type, void *param,
@@ -3996,6 +4067,15 @@ eth_event_callback(portid_t port_id, enum rte_eth_event_type type, void *param,
 		}
 		break;
 	}
+	case RTE_ETH_EVENT_ERR_RECOVERING:
+		err_recovering_callback(port_id);
+		break;
+	case RTE_ETH_EVENT_RECOVERY_SUCCESS:
+		recover_success_callback(port_id);
+		break;
+	case RTE_ETH_EVENT_RECOVERY_FAILED:
+		recover_failed_callback(port_id);
+		break;
 	default:
 		break;
 	}
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 9b10a9ea1c..b8a0a4715a 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -342,7 +342,9 @@ struct rte_port {
 	uint8_t                 member_flag : 1, /**< bonding member port */
 				bond_flag : 1, /**< port is bond device */
 				fwd_mac_swap : 1, /**< swap packet MAC before forward */
-				update_conf : 1; /**< need to update bonding device configuration */
+				update_conf : 1, /**< need to update bonding device configuration */
+				err_recovering : 1, /**< port is in error recovering */
+				recover_failed : 1; /**< port recover failed */
 	struct port_template    *pattern_templ_list; /**< Pattern templates. */
 	struct port_template    *actions_templ_list; /**< Actions templates. */
 	struct port_table       *table_list; /**< Flow tables. */
-- 
2.17.1


^ permalink raw reply	[flat|nested] 85+ messages in thread

* [PATCH v3 6/7] app/testpmd: extract event handling to event.c
  2023-11-06 13:11 ` [PATCH v3 " Chengwen Feng
                     ` (4 preceding siblings ...)
  2023-11-06 13:11   ` [PATCH v3 5/7] app/testpmd: add error recovery usage demo Chengwen Feng
@ 2023-11-06 13:11   ` Chengwen Feng
  2023-11-06 13:11   ` [PATCH v3 7/7] doc: testpmd support event handling section Chengwen Feng
  2023-12-05  2:30   ` [PATCH v3 0/7] fix race-condition of proactive error handling mode fengchengwen
  7 siblings, 0 replies; 85+ messages in thread
From: Chengwen Feng @ 2023-11-06 13:11 UTC (permalink / raw)
  To: thomas, ferruh.yigit, konstantin.ananyev, ajit.khaparde,
	Aman Singh, Yuying Zhang
  Cc: dev, andrew.rybchenko, kalesh-anakkur.purayil, Honnappa.Nagarahalli

This patch extracts event handling (including eth-event and dev-event)
to a new file 'event.c'.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Acked-by: Huisong Li <lihuisong@huawei.com>
---
 app/test-pmd/event.c      | 390 ++++++++++++++++++++++++++++++++++++++
 app/test-pmd/meson.build  |   1 +
 app/test-pmd/parameters.c |  36 +---
 app/test-pmd/testpmd.c    | 327 +-------------------------------
 app/test-pmd/testpmd.h    |   6 +
 5 files changed, 407 insertions(+), 353 deletions(-)
 create mode 100644 app/test-pmd/event.c

diff --git a/app/test-pmd/event.c b/app/test-pmd/event.c
new file mode 100644
index 0000000000..8393e105d7
--- /dev/null
+++ b/app/test-pmd/event.c
@@ -0,0 +1,390 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2023 HiSilicon Limited
+ */
+
+#include <stdint.h>
+
+#include <rte_alarm.h>
+#include <rte_ethdev.h>
+#include <rte_dev.h>
+#include <rte_log.h>
+#ifdef RTE_NET_MLX5
+#include "mlx5_testpmd.h"
+#endif
+
+#include "testpmd.h"
+
+/* Pretty printing of ethdev events */
+static const char * const eth_event_desc[] = {
+	[RTE_ETH_EVENT_UNKNOWN] = "unknown",
+	[RTE_ETH_EVENT_INTR_LSC] = "link state change",
+	[RTE_ETH_EVENT_QUEUE_STATE] = "queue state",
+	[RTE_ETH_EVENT_INTR_RESET] = "reset",
+	[RTE_ETH_EVENT_VF_MBOX] = "VF mbox",
+	[RTE_ETH_EVENT_IPSEC] = "IPsec",
+	[RTE_ETH_EVENT_MACSEC] = "MACsec",
+	[RTE_ETH_EVENT_INTR_RMV] = "device removal",
+	[RTE_ETH_EVENT_NEW] = "device probed",
+	[RTE_ETH_EVENT_DESTROY] = "device released",
+	[RTE_ETH_EVENT_FLOW_AGED] = "flow aged",
+	[RTE_ETH_EVENT_RX_AVAIL_THRESH] = "RxQ available descriptors threshold reached",
+	[RTE_ETH_EVENT_ERR_RECOVERING] = "error recovering",
+	[RTE_ETH_EVENT_RECOVERY_SUCCESS] = "error recovery successful",
+	[RTE_ETH_EVENT_RECOVERY_FAILED] = "error recovery failed",
+	[RTE_ETH_EVENT_MAX] = NULL,
+};
+
+/*
+ * Display or mask ether events
+ * Default to all events except VF_MBOX
+ */
+uint32_t event_print_mask = (UINT32_C(1) << RTE_ETH_EVENT_UNKNOWN) |
+			    (UINT32_C(1) << RTE_ETH_EVENT_INTR_LSC) |
+			    (UINT32_C(1) << RTE_ETH_EVENT_QUEUE_STATE) |
+			    (UINT32_C(1) << RTE_ETH_EVENT_INTR_RESET) |
+			    (UINT32_C(1) << RTE_ETH_EVENT_IPSEC) |
+			    (UINT32_C(1) << RTE_ETH_EVENT_MACSEC) |
+			    (UINT32_C(1) << RTE_ETH_EVENT_INTR_RMV) |
+			    (UINT32_C(1) << RTE_ETH_EVENT_FLOW_AGED) |
+			    (UINT32_C(1) << RTE_ETH_EVENT_ERR_RECOVERING) |
+			    (UINT32_C(1) << RTE_ETH_EVENT_RECOVERY_SUCCESS) |
+			    (UINT32_C(1) << RTE_ETH_EVENT_RECOVERY_FAILED);
+
+int
+get_event_name_mask(const char *name, uint32_t *mask)
+{
+	if (!strcmp(name, "unknown"))
+		*mask = UINT32_C(1) << RTE_ETH_EVENT_UNKNOWN;
+	else if (!strcmp(name, "intr_lsc"))
+		*mask = UINT32_C(1) << RTE_ETH_EVENT_INTR_LSC;
+	else if (!strcmp(name, "queue_state"))
+		*mask = UINT32_C(1) << RTE_ETH_EVENT_QUEUE_STATE;
+	else if (!strcmp(name, "intr_reset"))
+		*mask = UINT32_C(1) << RTE_ETH_EVENT_INTR_RESET;
+	else if (!strcmp(name, "vf_mbox"))
+		*mask = UINT32_C(1) << RTE_ETH_EVENT_VF_MBOX;
+	else if (!strcmp(name, "ipsec"))
+		*mask = UINT32_C(1) << RTE_ETH_EVENT_IPSEC;
+	else if (!strcmp(name, "macsec"))
+		*mask = UINT32_C(1) << RTE_ETH_EVENT_MACSEC;
+	else if (!strcmp(name, "intr_rmv"))
+		*mask = UINT32_C(1) << RTE_ETH_EVENT_INTR_RMV;
+	else if (!strcmp(name, "dev_probed"))
+		*mask = UINT32_C(1) << RTE_ETH_EVENT_NEW;
+	else if (!strcmp(name, "dev_released"))
+		*mask = UINT32_C(1) << RTE_ETH_EVENT_DESTROY;
+	else if (!strcmp(name, "flow_aged"))
+		*mask = UINT32_C(1) << RTE_ETH_EVENT_FLOW_AGED;
+	else if (!strcmp(name, "err_recovering"))
+		*mask = UINT32_C(1) << RTE_ETH_EVENT_ERR_RECOVERING;
+	else if (!strcmp(name, "recovery_success"))
+		*mask = UINT32_C(1) << RTE_ETH_EVENT_RECOVERY_SUCCESS;
+	else if (!strcmp(name, "recovery_failed"))
+		*mask = UINT32_C(1) << RTE_ETH_EVENT_RECOVERY_FAILED;
+	else if (!strcmp(name, "all"))
+		*mask = ~UINT32_C(0);
+	else
+		return -1;
+
+	return 0;
+}
+
+static void
+rmv_port_callback(void *arg)
+{
+	int need_to_start = 0;
+	int org_no_link_check = no_link_check;
+	portid_t port_id = (intptr_t)arg;
+	struct rte_eth_dev_info dev_info;
+	int ret;
+
+	RTE_ETH_VALID_PORTID_OR_RET(port_id);
+
+	if (!test_done && port_is_forwarding(port_id)) {
+		need_to_start = 1;
+		stop_packet_forwarding();
+	}
+	no_link_check = 1;
+	stop_port(port_id);
+	no_link_check = org_no_link_check;
+
+	ret = eth_dev_info_get_print_err(port_id, &dev_info);
+	if (ret != 0)
+		TESTPMD_LOG(ERR,
+			"Failed to get device info for port %d, not detaching\n",
+			port_id);
+	else {
+		struct rte_device *device = dev_info.device;
+		close_port(port_id);
+		detach_device(device); /* might be already removed or have more ports */
+	}
+	if (need_to_start)
+		start_packet_forwarding(0);
+}
+
+static int need_start_when_recovery_over;
+
+static bool
+has_port_in_err_recovering(void)
+{
+	struct rte_port *port;
+	portid_t pid;
+
+	RTE_ETH_FOREACH_DEV(pid) {
+		port = &ports[pid];
+		if (port->err_recovering)
+			return true;
+	}
+
+	return false;
+}
+
+static void
+err_recovering_callback(portid_t port_id)
+{
+	if (!has_port_in_err_recovering())
+		printf("Please stop executing any commands until recovery result events are received!\n");
+
+	ports[port_id].err_recovering = 1;
+	ports[port_id].recover_failed = 0;
+
+	/* To simplify implementation, stop forwarding regardless of whether the port is used. */
+	if (!test_done) {
+		printf("Stop packet forwarding because some ports are in error recovering!\n");
+		stop_packet_forwarding();
+		need_start_when_recovery_over = 1;
+	}
+}
+
+static void
+recover_success_callback(portid_t port_id)
+{
+	ports[port_id].err_recovering = 0;
+	if (has_port_in_err_recovering())
+		return;
+
+	if (need_start_when_recovery_over) {
+		printf("Recovery success! Restart packet forwarding!\n");
+		start_packet_forwarding(0);
+		need_start_when_recovery_over = 0;
+	} else {
+		printf("Recovery success!\n");
+	}
+}
+
+static void
+recover_failed_callback(portid_t port_id)
+{
+	struct rte_port *port;
+	portid_t pid;
+
+	ports[port_id].err_recovering = 0;
+	ports[port_id].recover_failed = 1;
+	if (has_port_in_err_recovering())
+		return;
+
+	need_start_when_recovery_over = 0;
+	printf("The ports:");
+	RTE_ETH_FOREACH_DEV(pid) {
+		port = &ports[pid];
+		if (port->recover_failed)
+			printf(" %u", pid);
+	}
+	printf(" recovery failed! Please remove them!\n");
+}
+
+/* This function is used by the interrupt thread */
+static int
+eth_event_callback(portid_t port_id, enum rte_eth_event_type type, void *param,
+		  void *ret_param)
+{
+	RTE_SET_USED(param);
+	RTE_SET_USED(ret_param);
+
+	if (type >= RTE_ETH_EVENT_MAX) {
+		fprintf(stderr,
+			"\nPort %" PRIu16 ": %s called upon invalid event %d\n",
+			port_id, __func__, type);
+		fflush(stderr);
+	} else if (event_print_mask & (UINT32_C(1) << type)) {
+		printf("\nPort %" PRIu16 ": %s event\n", port_id,
+			eth_event_desc[type]);
+		fflush(stdout);
+	}
+
+	switch (type) {
+	case RTE_ETH_EVENT_NEW:
+		ports[port_id].need_setup = 1;
+		ports[port_id].port_status = RTE_PORT_HANDLING;
+		break;
+	case RTE_ETH_EVENT_INTR_RMV:
+		if (port_id_is_invalid(port_id, DISABLED_WARN))
+			break;
+		if (rte_eal_alarm_set(100000,
+				rmv_port_callback, (void *)(intptr_t)port_id))
+			fprintf(stderr,
+				"Could not set up deferred device removal\n");
+		break;
+	case RTE_ETH_EVENT_DESTROY:
+		ports[port_id].port_status = RTE_PORT_CLOSED;
+		printf("Port %u is closed\n", port_id);
+		break;
+	case RTE_ETH_EVENT_RX_AVAIL_THRESH: {
+		uint16_t rxq_id;
+		int ret;
+
+		/* avail_thresh query API rewinds rxq_id, no need to check max RxQ num */
+		for (rxq_id = 0; ; rxq_id++) {
+			ret = rte_eth_rx_avail_thresh_query(port_id, &rxq_id,
+							    NULL);
+			if (ret <= 0)
+				break;
+			printf("Received avail_thresh event, port: %u, rxq_id: %u\n",
+			       port_id, rxq_id);
+
+#ifdef RTE_NET_MLX5
+			mlx5_test_avail_thresh_event_handler(port_id, rxq_id);
+#endif
+		}
+		break;
+	}
+	case RTE_ETH_EVENT_ERR_RECOVERING:
+		err_recovering_callback(port_id);
+		break;
+	case RTE_ETH_EVENT_RECOVERY_SUCCESS:
+		recover_success_callback(port_id);
+		break;
+	case RTE_ETH_EVENT_RECOVERY_FAILED:
+		recover_failed_callback(port_id);
+		break;
+	default:
+		break;
+	}
+	return 0;
+}
+
+int
+register_eth_event_callback(void)
+{
+	int ret;
+	enum rte_eth_event_type event;
+
+	for (event = RTE_ETH_EVENT_UNKNOWN;
+			event < RTE_ETH_EVENT_MAX; event++) {
+		ret = rte_eth_dev_callback_register(RTE_ETH_ALL,
+				event,
+				eth_event_callback,
+				NULL);
+		if (ret != 0) {
+			TESTPMD_LOG(ERR, "Failed to register callback for "
+					"%s event\n", eth_event_desc[event]);
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
+int
+unregister_eth_event_callback(void)
+{
+	int ret;
+	enum rte_eth_event_type event;
+
+	for (event = RTE_ETH_EVENT_UNKNOWN;
+			event < RTE_ETH_EVENT_MAX; event++) {
+		ret = rte_eth_dev_callback_unregister(RTE_ETH_ALL,
+				event,
+				eth_event_callback,
+				NULL);
+		if (ret != 0) {
+			TESTPMD_LOG(ERR, "Failed to unregister callback for "
+					"%s event\n", eth_event_desc[event]);
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
+/* This function is used by the interrupt thread */
+static void
+dev_event_callback(const char *device_name, enum rte_dev_event_type type,
+			     __rte_unused void *arg)
+{
+	uint16_t port_id;
+	int ret;
+
+	if (type >= RTE_DEV_EVENT_MAX) {
+		fprintf(stderr, "%s called upon invalid event %d\n",
+			__func__, type);
+		fflush(stderr);
+	}
+
+	switch (type) {
+	case RTE_DEV_EVENT_REMOVE:
+		RTE_LOG(DEBUG, EAL, "The device: %s has been removed!\n",
+			device_name);
+		ret = rte_eth_dev_get_port_by_name(device_name, &port_id);
+		if (ret) {
+			RTE_LOG(ERR, EAL, "can not get port by device %s!\n",
+				device_name);
+			return;
+		}
+		/*
+		 * Because the user's callback is invoked from the EAL interrupt
+		 * callback, the interrupt callback needs to finish before it
+		 * can be unregistered when detaching the device. So finish the
+		 * callback quickly and use a deferred removal to detach the
+		 * device. This is a workaround: once device detaching is moved
+		 * into the EAL in the future, the deferred removal can be
+		 * deleted.
+		 */
+		if (rte_eal_alarm_set(100000,
+				rmv_port_callback, (void *)(intptr_t)port_id))
+			RTE_LOG(ERR, EAL,
+				"Could not set up deferred device removal\n");
+		break;
+	case RTE_DEV_EVENT_ADD:
+		RTE_LOG(ERR, EAL, "The device: %s has been added!\n",
+			device_name);
+		/* TODO: After finish kernel driver binding,
+		 * begin to attach port.
+		 */
+		break;
+	default:
+		break;
+	}
+}
+
+int
+register_dev_event_callback(void)
+{
+	int ret;
+
+	ret = rte_dev_event_callback_register(NULL,
+		dev_event_callback, NULL);
+	if (ret != 0) {
+		RTE_LOG(ERR, EAL,
+			"fail to register device event callback\n");
+		return -1;
+	}
+
+	return 0;
+}
+
+int
+unregister_dev_event_callback(void)
+{
+	int ret;
+
+	ret = rte_dev_event_callback_unregister(NULL,
+		dev_event_callback, NULL);
+	if (ret < 0) {
+		RTE_LOG(ERR, EAL,
+			"fail to unregister device event callback.\n");
+		return -1;
+	}
+
+	return 0;
+}
diff --git a/app/test-pmd/meson.build b/app/test-pmd/meson.build
index 719f875be0..b7860f3ab0 100644
--- a/app/test-pmd/meson.build
+++ b/app/test-pmd/meson.build
@@ -14,6 +14,7 @@ sources = files(
         'cmd_flex_item.c',
         'config.c',
         'csumonly.c',
+        'event.c',
         'flowgen.c',
         'icmpecho.c',
         'ieee1588fwd.c',
diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
index a9ca58339d..504315da8b 100644
--- a/app/test-pmd/parameters.c
+++ b/app/test-pmd/parameters.c
@@ -434,45 +434,19 @@ static int
 parse_event_printing_config(const char *optarg, int enable)
 {
 	uint32_t mask = 0;
+	int ret;
 
-	if (!strcmp(optarg, "unknown"))
-		mask = UINT32_C(1) << RTE_ETH_EVENT_UNKNOWN;
-	else if (!strcmp(optarg, "intr_lsc"))
-		mask = UINT32_C(1) << RTE_ETH_EVENT_INTR_LSC;
-	else if (!strcmp(optarg, "queue_state"))
-		mask = UINT32_C(1) << RTE_ETH_EVENT_QUEUE_STATE;
-	else if (!strcmp(optarg, "intr_reset"))
-		mask = UINT32_C(1) << RTE_ETH_EVENT_INTR_RESET;
-	else if (!strcmp(optarg, "vf_mbox"))
-		mask = UINT32_C(1) << RTE_ETH_EVENT_VF_MBOX;
-	else if (!strcmp(optarg, "ipsec"))
-		mask = UINT32_C(1) << RTE_ETH_EVENT_IPSEC;
-	else if (!strcmp(optarg, "macsec"))
-		mask = UINT32_C(1) << RTE_ETH_EVENT_MACSEC;
-	else if (!strcmp(optarg, "intr_rmv"))
-		mask = UINT32_C(1) << RTE_ETH_EVENT_INTR_RMV;
-	else if (!strcmp(optarg, "dev_probed"))
-		mask = UINT32_C(1) << RTE_ETH_EVENT_NEW;
-	else if (!strcmp(optarg, "dev_released"))
-		mask = UINT32_C(1) << RTE_ETH_EVENT_DESTROY;
-	else if (!strcmp(optarg, "flow_aged"))
-		mask = UINT32_C(1) << RTE_ETH_EVENT_FLOW_AGED;
-	else if (!strcmp(optarg, "err_recovering"))
-		mask = UINT32_C(1) << RTE_ETH_EVENT_ERR_RECOVERING;
-	else if (!strcmp(optarg, "recovery_success"))
-		mask = UINT32_C(1) << RTE_ETH_EVENT_RECOVERY_SUCCESS;
-	else if (!strcmp(optarg, "recovery_failed"))
-		mask = UINT32_C(1) << RTE_ETH_EVENT_RECOVERY_FAILED;
-	else if (!strcmp(optarg, "all"))
-		mask = ~UINT32_C(0);
-	else {
+	ret = get_event_name_mask(optarg, &mask);
+	if (ret != 0) {
 		fprintf(stderr, "Invalid event: %s\n", optarg);
 		return -1;
 	}
+
 	if (enable)
 		event_print_mask |= mask;
 	else
 		event_print_mask &= ~mask;
+
 	return 0;
 }
 
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index a45c411398..0831180f3b 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -434,41 +434,6 @@ uint8_t clear_ptypes = true;
 /* Hairpin ports configuration mode. */
 uint32_t hairpin_mode;
 
-/* Pretty printing of ethdev events */
-static const char * const eth_event_desc[] = {
-	[RTE_ETH_EVENT_UNKNOWN] = "unknown",
-	[RTE_ETH_EVENT_INTR_LSC] = "link state change",
-	[RTE_ETH_EVENT_QUEUE_STATE] = "queue state",
-	[RTE_ETH_EVENT_INTR_RESET] = "reset",
-	[RTE_ETH_EVENT_VF_MBOX] = "VF mbox",
-	[RTE_ETH_EVENT_IPSEC] = "IPsec",
-	[RTE_ETH_EVENT_MACSEC] = "MACsec",
-	[RTE_ETH_EVENT_INTR_RMV] = "device removal",
-	[RTE_ETH_EVENT_NEW] = "device probed",
-	[RTE_ETH_EVENT_DESTROY] = "device released",
-	[RTE_ETH_EVENT_FLOW_AGED] = "flow aged",
-	[RTE_ETH_EVENT_RX_AVAIL_THRESH] = "RxQ available descriptors threshold reached",
-	[RTE_ETH_EVENT_ERR_RECOVERING] = "error recovering",
-	[RTE_ETH_EVENT_RECOVERY_SUCCESS] = "error recovery successful",
-	[RTE_ETH_EVENT_RECOVERY_FAILED] = "error recovery failed",
-	[RTE_ETH_EVENT_MAX] = NULL,
-};
-
-/*
- * Display or mask ether events
- * Default to all events except VF_MBOX
- */
-uint32_t event_print_mask = (UINT32_C(1) << RTE_ETH_EVENT_UNKNOWN) |
-			    (UINT32_C(1) << RTE_ETH_EVENT_INTR_LSC) |
-			    (UINT32_C(1) << RTE_ETH_EVENT_QUEUE_STATE) |
-			    (UINT32_C(1) << RTE_ETH_EVENT_INTR_RESET) |
-			    (UINT32_C(1) << RTE_ETH_EVENT_IPSEC) |
-			    (UINT32_C(1) << RTE_ETH_EVENT_MACSEC) |
-			    (UINT32_C(1) << RTE_ETH_EVENT_INTR_RMV) |
-			    (UINT32_C(1) << RTE_ETH_EVENT_FLOW_AGED) |
-			    (UINT32_C(1) << RTE_ETH_EVENT_ERR_RECOVERING) |
-			    (UINT32_C(1) << RTE_ETH_EVENT_RECOVERY_SUCCESS) |
-			    (UINT32_C(1) << RTE_ETH_EVENT_RECOVERY_FAILED);
 /*
  * Decide if all memory are locked for performance.
  */
@@ -700,12 +665,6 @@ eth_dev_set_mtu_mp(uint16_t port_id, uint16_t mtu)
 /* Forward function declarations */
 static void setup_attached_port(portid_t pi);
 static void check_all_ports_link_status(uint32_t port_mask);
-static int eth_event_callback(portid_t port_id,
-			      enum rte_eth_event_type type,
-			      void *param, void *ret_param);
-static void dev_event_callback(const char *device_name,
-				enum rte_dev_event_type type,
-				void *param);
 static void fill_xstats_display_info(void);
 
 /*
@@ -3671,7 +3630,7 @@ setup_attached_port(portid_t pi)
 	printf("Done\n");
 }
 
-static void
+void
 detach_device(struct rte_device *dev)
 {
 	portid_t sibling;
@@ -3817,13 +3776,9 @@ pmd_test_exit(void)
 			return;
 		}
 
-		ret = rte_dev_event_callback_unregister(NULL,
-			dev_event_callback, NULL);
-		if (ret < 0) {
-			RTE_LOG(ERR, EAL,
-				"fail to unregister device event callback.\n");
+		ret = unregister_dev_event_callback();
+		if (ret != 0)
 			return;
-		}
 
 		ret = rte_dev_hotplug_handle_disable();
 		if (ret) {
@@ -3908,274 +3863,6 @@ check_all_ports_link_status(uint32_t port_mask)
 	}
 }
 
-static void
-rmv_port_callback(void *arg)
-{
-	int need_to_start = 0;
-	int org_no_link_check = no_link_check;
-	portid_t port_id = (intptr_t)arg;
-	struct rte_eth_dev_info dev_info;
-	int ret;
-
-	RTE_ETH_VALID_PORTID_OR_RET(port_id);
-
-	if (!test_done && port_is_forwarding(port_id)) {
-		need_to_start = 1;
-		stop_packet_forwarding();
-	}
-	no_link_check = 1;
-	stop_port(port_id);
-	no_link_check = org_no_link_check;
-
-	ret = eth_dev_info_get_print_err(port_id, &dev_info);
-	if (ret != 0)
-		TESTPMD_LOG(ERR,
-			"Failed to get device info for port %d, not detaching\n",
-			port_id);
-	else {
-		struct rte_device *device = dev_info.device;
-		close_port(port_id);
-		detach_device(device); /* might be already removed or have more ports */
-	}
-	if (need_to_start)
-		start_packet_forwarding(0);
-}
-
-static int need_start_when_recovery_over;
-
-static bool
-has_port_in_err_recovering(void)
-{
-	struct rte_port *port;
-	portid_t pid;
-
-	RTE_ETH_FOREACH_DEV(pid) {
-		port = &ports[pid];
-		if (port->err_recovering)
-			return true;
-	}
-
-	return false;
-}
-
-static void
-err_recovering_callback(portid_t port_id)
-{
-	if (!has_port_in_err_recovering())
-		printf("Please stop executing any commands until recovery result events are received!\n");
-
-	ports[port_id].err_recovering = 1;
-	ports[port_id].recover_failed = 0;
-
-	/* To simplify implementation, stop forwarding regardless of whether the port is used. */
-	if (!test_done) {
-		printf("Stop packet forwarding because some ports are in error recovering!\n");
-		stop_packet_forwarding();
-		need_start_when_recovery_over = 1;
-	}
-}
-
-static void
-recover_success_callback(portid_t port_id)
-{
-	ports[port_id].err_recovering = 0;
-	if (has_port_in_err_recovering())
-		return;
-
-	if (need_start_when_recovery_over) {
-		printf("Recovery success! Restart packet forwarding!\n");
-		start_packet_forwarding(0);
-		need_start_when_recovery_over = 0;
-	} else {
-		printf("Recovery success!\n");
-	}
-}
-
-static void
-recover_failed_callback(portid_t port_id)
-{
-	struct rte_port *port;
-	portid_t pid;
-
-	ports[port_id].err_recovering = 0;
-	ports[port_id].recover_failed = 1;
-	if (has_port_in_err_recovering())
-		return;
-
-	need_start_when_recovery_over = 0;
-	printf("The ports:");
-	RTE_ETH_FOREACH_DEV(pid) {
-		port = &ports[pid];
-		if (port->recover_failed)
-			printf(" %u", pid);
-	}
-	printf(" recovery failed! Please remove them!\n");
-}
-
-/* This function is used by the interrupt thread */
-static int
-eth_event_callback(portid_t port_id, enum rte_eth_event_type type, void *param,
-		  void *ret_param)
-{
-	RTE_SET_USED(param);
-	RTE_SET_USED(ret_param);
-
-	if (type >= RTE_ETH_EVENT_MAX) {
-		fprintf(stderr,
-			"\nPort %" PRIu16 ": %s called upon invalid event %d\n",
-			port_id, __func__, type);
-		fflush(stderr);
-	} else if (event_print_mask & (UINT32_C(1) << type)) {
-		printf("\nPort %" PRIu16 ": %s event\n", port_id,
-			eth_event_desc[type]);
-		fflush(stdout);
-	}
-
-	switch (type) {
-	case RTE_ETH_EVENT_NEW:
-		ports[port_id].need_setup = 1;
-		ports[port_id].port_status = RTE_PORT_HANDLING;
-		break;
-	case RTE_ETH_EVENT_INTR_RMV:
-		if (port_id_is_invalid(port_id, DISABLED_WARN))
-			break;
-		if (rte_eal_alarm_set(100000,
-				rmv_port_callback, (void *)(intptr_t)port_id))
-			fprintf(stderr,
-				"Could not set up deferred device removal\n");
-		break;
-	case RTE_ETH_EVENT_DESTROY:
-		ports[port_id].port_status = RTE_PORT_CLOSED;
-		printf("Port %u is closed\n", port_id);
-		break;
-	case RTE_ETH_EVENT_RX_AVAIL_THRESH: {
-		uint16_t rxq_id;
-		int ret;
-
-		/* avail_thresh query API rewinds rxq_id, no need to check max RxQ num */
-		for (rxq_id = 0; ; rxq_id++) {
-			ret = rte_eth_rx_avail_thresh_query(port_id, &rxq_id,
-							    NULL);
-			if (ret <= 0)
-				break;
-			printf("Received avail_thresh event, port: %u, rxq_id: %u\n",
-			       port_id, rxq_id);
-
-#ifdef RTE_NET_MLX5
-			mlx5_test_avail_thresh_event_handler(port_id, rxq_id);
-#endif
-		}
-		break;
-	}
-	case RTE_ETH_EVENT_ERR_RECOVERING:
-		err_recovering_callback(port_id);
-		break;
-	case RTE_ETH_EVENT_RECOVERY_SUCCESS:
-		recover_success_callback(port_id);
-		break;
-	case RTE_ETH_EVENT_RECOVERY_FAILED:
-		recover_failed_callback(port_id);
-		break;
-	default:
-		break;
-	}
-	return 0;
-}
-
-static int
-register_eth_event_callback(void)
-{
-	int ret;
-	enum rte_eth_event_type event;
-
-	for (event = RTE_ETH_EVENT_UNKNOWN;
-			event < RTE_ETH_EVENT_MAX; event++) {
-		ret = rte_eth_dev_callback_register(RTE_ETH_ALL,
-				event,
-				eth_event_callback,
-				NULL);
-		if (ret != 0) {
-			TESTPMD_LOG(ERR, "Failed to register callback for "
-					"%s event\n", eth_event_desc[event]);
-			return -1;
-		}
-	}
-
-	return 0;
-}
-
-static int
-unregister_eth_event_callback(void)
-{
-	int ret;
-	enum rte_eth_event_type event;
-
-	for (event = RTE_ETH_EVENT_UNKNOWN;
-			event < RTE_ETH_EVENT_MAX; event++) {
-		ret = rte_eth_dev_callback_unregister(RTE_ETH_ALL,
-				event,
-				eth_event_callback,
-				NULL);
-		if (ret != 0) {
-			TESTPMD_LOG(ERR, "Failed to unregister callback for "
-					"%s event\n", eth_event_desc[event]);
-			return -1;
-		}
-	}
-
-	return 0;
-}
-
-/* This function is used by the interrupt thread */
-static void
-dev_event_callback(const char *device_name, enum rte_dev_event_type type,
-			     __rte_unused void *arg)
-{
-	uint16_t port_id;
-	int ret;
-
-	if (type >= RTE_DEV_EVENT_MAX) {
-		fprintf(stderr, "%s called upon invalid event %d\n",
-			__func__, type);
-		fflush(stderr);
-	}
-
-	switch (type) {
-	case RTE_DEV_EVENT_REMOVE:
-		RTE_LOG(DEBUG, EAL, "The device: %s has been removed!\n",
-			device_name);
-		ret = rte_eth_dev_get_port_by_name(device_name, &port_id);
-		if (ret) {
-			RTE_LOG(ERR, EAL, "can not get port by device %s!\n",
-				device_name);
-			return;
-		}
-		/*
-		 * Because the user's callback is invoked in eal interrupt
-		 * callback, the interrupt callback need to be finished before
-		 * it can be unregistered when detaching device. So finish
-		 * callback soon and use a deferred removal to detach device
-		 * is need. It is a workaround, once the device detaching be
-		 * moved into the eal in the future, the deferred removal could
-		 * be deleted.
-		 */
-		if (rte_eal_alarm_set(100000,
-				rmv_port_callback, (void *)(intptr_t)port_id))
-			RTE_LOG(ERR, EAL,
-				"Could not set up deferred device removal\n");
-		break;
-	case RTE_DEV_EVENT_ADD:
-		RTE_LOG(ERR, EAL, "The device: %s has been added!\n",
-			device_name);
-		/* TODO: After finish kernel driver binding,
-		 * begin to attach port.
-		 */
-		break;
-	default:
-		break;
-	}
-}
-
 static void
 rxtx_port_config(portid_t pid)
 {
@@ -4724,13 +4411,9 @@ main(int argc, char** argv)
 			return -1;
 		}
 
-		ret = rte_dev_event_callback_register(NULL,
-			dev_event_callback, NULL);
-		if (ret) {
-			RTE_LOG(ERR, EAL,
-				"fail  to register device event callback\n");
+		ret = register_dev_event_callback();
+		if (ret != 0)
 			return -1;
-		}
 	}
 
 	if (!no_device_start && start_port(RTE_PORT_ALL) != 0) {
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index b8a0a4715a..8d3fb3475d 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -1109,6 +1109,11 @@ void set_nb_pkt_per_burst(uint16_t pkt_burst);
 char *list_pkt_forwarding_modes(void);
 char *list_pkt_forwarding_retry_modes(void);
 void set_pkt_forwarding_mode(const char *fwd_mode);
+int get_event_name_mask(const char *name, uint32_t *mask);
+int register_eth_event_callback(void);
+int unregister_eth_event_callback(void);
+int register_dev_event_callback(void);
+int unregister_dev_event_callback(void);
 void start_packet_forwarding(int with_tx_first);
 void fwd_stats_display(void);
 void fwd_stats_reset(void);
@@ -1128,6 +1133,7 @@ void stop_port(portid_t pid);
 void close_port(portid_t pid);
 void reset_port(portid_t pid);
 void attach_port(char *identifier);
+void detach_device(struct rte_device *dev);
 void detach_devargs(char *identifier);
 void detach_port_device(portid_t port_id);
 int all_ports_stopped(void);
-- 
2.17.1


^ permalink raw reply	[flat|nested] 85+ messages in thread

* [PATCH v3 7/7] doc: testpmd support event handling section
  2023-11-06 13:11 ` [PATCH v3 " Chengwen Feng
                     ` (5 preceding siblings ...)
  2023-11-06 13:11   ` [PATCH v3 6/7] app/testpmd: extract event handling to event.c Chengwen Feng
@ 2023-11-06 13:11   ` Chengwen Feng
  2023-11-08  3:03     ` lihuisong (C)
  2023-12-05  2:30   ` [PATCH v3 0/7] fix race-condition of proactive error handling mode fengchengwen
  7 siblings, 1 reply; 85+ messages in thread
From: Chengwen Feng @ 2023-11-06 13:11 UTC (permalink / raw)
  To: thomas, ferruh.yigit, konstantin.ananyev, ajit.khaparde,
	Aman Singh, Yuying Zhang
  Cc: dev, andrew.rybchenko, kalesh-anakkur.purayil, Honnappa.Nagarahalli

Add a new event handling section, which documents the ethdev and
device events.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
---
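A minimal standalone sketch of the proactive-recovery bookkeeping that the
new section describes for RTE_ETH_EVENT_ERR_RECOVERING,
RTE_ETH_EVENT_RECOVERY_SUCCESS and RTE_ETH_EVENT_RECOVERY_FAILED. It is an
illustration only, not the testpmd code: every sketch_* name and the port
limit are hypothetical, and stopping or restarting packet forwarding is
reduced to a boolean flag so that the block builds without DPDK.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical port limit, chosen only for this sketch. */
#define SKETCH_MAX_PORTS 4

struct sketch_port {
	bool err_recovering;	/* recovery in progress on this port */
	bool recover_failed;	/* last recovery on this port failed */
};

static struct sketch_port sketch_ports[SKETCH_MAX_PORTS];
static bool sketch_forwarding = true;	/* stands in for "forwarding is running" */
static bool sketch_restart_pending;	/* restart forwarding once recovery is over */

static bool
sketch_any_port_recovering(void)
{
	for (unsigned int i = 0; i < SKETCH_MAX_PORTS; i++)
		if (sketch_ports[i].err_recovering)
			return true;
	return false;
}

/* Reaction to RTE_ETH_EVENT_ERR_RECOVERING: mark the port, stop forwarding. */
static void
sketch_on_err_recovering(unsigned int port)
{
	assert(port < SKETCH_MAX_PORTS);
	sketch_ports[port].err_recovering = true;
	sketch_ports[port].recover_failed = false;
	if (sketch_forwarding) {
		sketch_forwarding = false;
		sketch_restart_pending = true;
	}
}

/* Reaction to RTE_ETH_EVENT_RECOVERY_SUCCESS: restart once no port is recovering. */
static void
sketch_on_recovery_success(unsigned int port)
{
	assert(port < SKETCH_MAX_PORTS);
	sketch_ports[port].err_recovering = false;
	if (sketch_any_port_recovering())
		return;
	if (sketch_restart_pending) {
		sketch_forwarding = true;
		sketch_restart_pending = false;
	}
}

/* Reaction to RTE_ETH_EVENT_RECOVERY_FAILED: report failed ports, do not restart. */
static void
sketch_on_recovery_failed(unsigned int port)
{
	assert(port < SKETCH_MAX_PORTS);
	sketch_ports[port].err_recovering = false;
	sketch_ports[port].recover_failed = true;
	if (sketch_any_port_recovering())
		return;
	sketch_restart_pending = false;
	printf("The ports:");
	for (unsigned int i = 0; i < SKETCH_MAX_PORTS; i++)
		if (sketch_ports[i].recover_failed)
			printf(" %u", i);
	printf(" recovery failed! Please remove them!\n");
}
```

In this sketch forwarding resumes only after the last recovering port
reports success, and a failure on the last recovering port cancels the
pending restart.
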
 doc/guides/testpmd_app_ug/event_handling.rst | 81 ++++++++++++++++++++
 doc/guides/testpmd_app_ug/index.rst          |  1 +
 2 files changed, 82 insertions(+)
 create mode 100644 doc/guides/testpmd_app_ug/event_handling.rst

diff --git a/doc/guides/testpmd_app_ug/event_handling.rst b/doc/guides/testpmd_app_ug/event_handling.rst
new file mode 100644
index 0000000000..1c39e0c486
--- /dev/null
+++ b/doc/guides/testpmd_app_ug/event_handling.rst
@@ -0,0 +1,81 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2023 HiSilicon Limited.
+
+Event Handling
+==============
+
+The ``testpmd`` application supports the following two types of event handling:
+
+ethdev events
+-------------
+
+``testpmd`` provides the options "--print-event" and "--mask-event" to control
+whether a message such as "Port x: y event" is displayed when event "y" is
+received on port "x". This is referred to as the default processing.
+
+This section details the supported events. Unless otherwise specified, only
+the default processing is supported.
+
+- ``RTE_ETH_EVENT_INTR_LSC``:
+  If the device is started with LSC enabled, the PMD raises this event when it
+  detects a link status change.
+
+- ``RTE_ETH_EVENT_QUEUE_STATE``:
+  Used to notify a queue state change. For example, the vhost PMD uses this
+  event to report whether a vring is enabled.
+
+- ``RTE_ETH_EVENT_INTR_RESET``:
+  Used to report that a reset interrupt happened. This event is only reported
+  when the PMD supports ``RTE_ETH_ERROR_HANDLE_MODE_PASSIVE``.
+
+- ``RTE_ETH_EVENT_VF_MBOX``:
+  Used by a PF to process mailbox messages from the VFs that belong to it.
+
+- ``RTE_ETH_EVENT_INTR_RMV``:
+  Used to report a device removal event. ``testpmd`` will remove the port
+  later.
+
+- ``RTE_ETH_EVENT_NEW``:
+  Used to report that a port was probed. ``testpmd`` will set up the port
+  later.
+
+- ``RTE_ETH_EVENT_DESTROY``:
+  Used to report that a port was released. ``testpmd`` will change the
+  port's status.
+
+- ``RTE_ETH_EVENT_MACSEC``:
+  Used to report a MACsec offload related event.
+
+- ``RTE_ETH_EVENT_IPSEC``:
+  Used to report an IPsec offload related event.
+
+- ``RTE_ETH_EVENT_FLOW_AGED``:
+  Used to report that new aged-out flows were detected. Only valid with mlx5 PMD.
+
+- ``RTE_ETH_EVENT_RX_AVAIL_THRESH``:
+  Used to report that the number of available Rx descriptors fell below the
+  threshold. Only valid with mlx5 PMD.
+
+- ``RTE_ETH_EVENT_ERR_RECOVERING``:
+  Used to report that an error happened; the PMD starts recovery after
+  reporting this event. ``testpmd`` stops packet forwarding on receiving it.
+
+- ``RTE_ETH_EVENT_RECOVERY_SUCCESS``:
+  Used to report that error recovery succeeded. ``testpmd`` restarts packet
+  forwarding when it receives the event.
+
+- ``RTE_ETH_EVENT_RECOVERY_FAILED``:
+  Used to report that error recovery failed. ``testpmd`` displays a message
+  showing which ports failed.
+
+.. note::
+
+   The ``RTE_ETH_EVENT_ERR_RECOVERING``, ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` and
+   ``RTE_ETH_EVENT_RECOVERY_FAILED`` events are only reported when the PMD
+   supports ``RTE_ETH_ERROR_HANDLE_MODE_PROACTIVE``.
+
+device events
+-------------
+
+The two device events, ``RTE_DEV_EVENT_ADD`` and ``RTE_DEV_EVENT_REMOVE``, are
+enabled only when ``testpmd`` is started with the "--hot-plug" option.
diff --git a/doc/guides/testpmd_app_ug/index.rst b/doc/guides/testpmd_app_ug/index.rst
index 1ac0d25d57..3c09448c4e 100644
--- a/doc/guides/testpmd_app_ug/index.rst
+++ b/doc/guides/testpmd_app_ug/index.rst
@@ -14,3 +14,4 @@ Testpmd Application User Guide
     build_app
     run_app
     testpmd_funcs
+    event_handling
-- 
2.17.1


^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH v2 7/7] doc: testpmd support event handling section
  2023-11-06 12:39       ` fengchengwen
@ 2023-11-08  3:02         ` lihuisong (C)
  0 siblings, 0 replies; 85+ messages in thread
From: lihuisong (C) @ 2023-11-08  3:02 UTC (permalink / raw)
  To: fengchengwen, thomas, ferruh.yigit, konstantin.ananyev,
	ajit.khaparde, Aman Singh, Yuying Zhang
  Cc: dev, andrew.rybchenko, kalesh-anakkur.purayil, Honnappa.Nagarahalli


On 2023/11/6 20:39, fengchengwen wrote:
> Hi Huisong,
>
> On 2023/11/6 17:28, lihuisong (C) wrote:
>> On 2023/10/20 18:07, Chengwen Feng wrote:
>>> Add new section of event handling, which documented the ethdev and
>>> device events.
>>>
>>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
>>> ---
>>>    doc/guides/testpmd_app_ug/event_handling.rst | 80 ++++++++++++++++++++
>>>    doc/guides/testpmd_app_ug/index.rst          |  1 +
>>>    2 files changed, 81 insertions(+)
>>>    create mode 100644 doc/guides/testpmd_app_ug/event_handling.rst
>>>
>>> diff --git a/doc/guides/testpmd_app_ug/event_handling.rst b/doc/guides/testpmd_app_ug/event_handling.rst
>>> new file mode 100644
>>> index 0000000000..c116753ad0
>>> --- /dev/null
>>> +++ b/doc/guides/testpmd_app_ug/event_handling.rst
>>> @@ -0,0 +1,80 @@
>>> +..  SPDX-License-Identifier: BSD-3-Clause
>>> +    Copyright(c) 2023 HiSilicon Limited.
>>> +
>>> +Event Handling
>>> +==============
>>> +
>>> +The ``testpmd`` application supports following two type event handling:
>>> +
>>> +ethdev events
>>> +-------------
>>> +
>>> +The ``testpmd`` provide options "--print-event" and "--mask-event" to control
>>> +whether display such as "Port x y event" when received "y" event on port "x".
>>> +This is named as default processing.
>>> +
>>> +This section details the support events, unless otherwise specified, only the
>>> +default processing is support.
>>> +
>>> +- ``RTE_ETH_EVENT_INTR_LSC``:
>>> +  If device started with lsc enabled, the PMD will launch this event when it
>>> +  detect link status changes.
>>> +
>>> +- ``RTE_ETH_EVENT_QUEUE_STATE``:
>>> +  Used only within vhost PMD to report vring whether enabled.
>> Used only within the vhost PMD? It does seem that only vhost uses this event,
>> but the ethdev lib says:
>> /** queue state event (enabled/disabled) */
>>      RTE_ETH_EVENT_QUEUE_STATE,
>> testpmd is also a demo for users, so I suggest changing this comment to avoid confusion.
> OK, I think vhost could serve as an example, e.g.:
> Used when notify queue state event changed, for example: vhost PMD use this event report vring whether enabled.
>
> Thanks
> Chengwen
ok
>
>>> +
>>> +- ``RTE_ETH_EVENT_INTR_RESET``:
>>> +  Used to report reset interrupt happened, this event only reported when the
>>> +  PMD supports ``RTE_ETH_ERROR_HANDLE_MODE_PASSIVE``.
>>> +
>>> +- ``RTE_ETH_EVENT_VF_MBOX``:
>>> +  Used as a PF to process mailbox messages of the VFs to which the PF belongs.
>>> +
>>> +- ``RTE_ETH_EVENT_INTR_RMV``:
>>> +  Used to report device removal event. The ``testpmd`` will remove the port
>>> +  later.
>>> +
>>> +- ``RTE_ETH_EVENT_NEW``:
>>> +  Used to report port was probed event. The ``testpmd`` will setup the port
>>> +  later.
>>> +
>>> +- ``RTE_ETH_EVENT_DESTROY``:
>>> +  Used to report port was released event. The ``testpmd`` will changes the
>>> +  port's status.
>>> +
>>> +- ``RTE_ETH_EVENT_MACSEC``:
>>> +  Used to report MACsec offload related event.
>>> +
>>> +- ``RTE_ETH_EVENT_IPSEC``:
>>> +  Used to report IPsec offload related event.
>>> +
>>> +- ``RTE_ETH_EVENT_FLOW_AGED``:
>>> +  Used to report new aged-out flows was detected. Only valid with mlx5 PMD.
>>> +
>>> +- ``RTE_ETH_EVENT_RX_AVAIL_THRESH``:
>>> +  Used to report available Rx descriptors was smaller than the threshold. Only
>>> +  valid with mlx5 PMD.
>>> +
>>> +- ``RTE_ETH_EVENT_ERR_RECOVERING``:
>>> +  Used to report error happened, and PMD will do recover after report this
>>> +  event. The ``testpmd`` will stop packet forwarding when received the event.
>>> +
>>> +- ``RTE_ETH_EVENT_RECOVERY_SUCCESS``:
>>> +  Used to report error recovery success. The ``testpmd`` will restart packet
>>> +  forwarding when received the event.
>>> +
>>> +- ``RTE_ETH_EVENT_RECOVERY_FAILED``:
>>> +  Used to report error recovery failed. The ``testpmd`` will display one
>>> +  message to show which ports failed.
>>> +
>>> +.. note::
>>> +
>>> +   The ``RTE_ETH_EVENT_ERR_RECOVERING``, ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` and
>>> +   ``RTE_ETH_EVENT_RECOVERY_FAILED`` only reported when the PMD supports
>>> +   ``RTE_ETH_ERROR_HANDLE_MODE_PROACTIVE``.
>>> +
>>> +device events
>>> +-------------
>>> +
>>> +Including two events ``RTE_DEV_EVENT_ADD`` and ``RTE_DEV_EVENT_ADD``, and
>>> +enabled only when the ``testpmd`` stated with options "--hot-plug".
>>> diff --git a/doc/guides/testpmd_app_ug/index.rst b/doc/guides/testpmd_app_ug/index.rst
>>> index 1ac0d25d57..3c09448c4e 100644
>>> --- a/doc/guides/testpmd_app_ug/index.rst
>>> +++ b/doc/guides/testpmd_app_ug/index.rst
>>> @@ -14,3 +14,4 @@ Testpmd Application User Guide
>>>        build_app
>>>        run_app
>>>        testpmd_funcs
>>> +    event_handling
>> .
> .

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH v3 7/7] doc: testpmd support event handling section
  2023-11-06 13:11   ` [PATCH v3 7/7] doc: testpmd support event handling section Chengwen Feng
@ 2023-11-08  3:03     ` lihuisong (C)
  0 siblings, 0 replies; 85+ messages in thread
From: lihuisong (C) @ 2023-11-08  3:03 UTC (permalink / raw)
  To: Chengwen Feng, thomas, ferruh.yigit, konstantin.ananyev,
	ajit.khaparde, Aman Singh, Yuying Zhang
  Cc: dev, andrew.rybchenko, kalesh-anakkur.purayil, Honnappa.Nagarahalli

Acked-by: Huisong Li <lihuisong@huawei.com>

On 2023/11/6 21:11, Chengwen Feng wrote:
> Add new section of event handling, which documented the ethdev and
> device events.
>
> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> ---
>   doc/guides/testpmd_app_ug/event_handling.rst | 81 ++++++++++++++++++++
>   doc/guides/testpmd_app_ug/index.rst          |  1 +
>   2 files changed, 82 insertions(+)
>   create mode 100644 doc/guides/testpmd_app_ug/event_handling.rst
>
> diff --git a/doc/guides/testpmd_app_ug/event_handling.rst b/doc/guides/testpmd_app_ug/event_handling.rst
> new file mode 100644
> index 0000000000..1c39e0c486
> --- /dev/null
> +++ b/doc/guides/testpmd_app_ug/event_handling.rst
> @@ -0,0 +1,81 @@
> +..  SPDX-License-Identifier: BSD-3-Clause
> +    Copyright(c) 2023 HiSilicon Limited.
> +
> +Event Handling
> +==============
> +
> +The ``testpmd`` application supports following two type event handling:
> +
> +ethdev events
> +-------------
> +
> +The ``testpmd`` provide options "--print-event" and "--mask-event" to control
> +whether display such as "Port x y event" when received "y" event on port "x".
> +This is named as default processing.
> +
> +This section details the support events, unless otherwise specified, only the
> +default processing is support.
> +
> +- ``RTE_ETH_EVENT_INTR_LSC``:
> +  If device started with lsc enabled, the PMD will launch this event when it
> +  detect link status changes.
> +
> +- ``RTE_ETH_EVENT_QUEUE_STATE``:
> +  Used when notify queue state event changed, for example: vhost PMD use this
> +  event report whether vring enabled.
> +
> +- ``RTE_ETH_EVENT_INTR_RESET``:
> +  Used to report reset interrupt happened, this event only reported when the
> +  PMD supports ``RTE_ETH_ERROR_HANDLE_MODE_PASSIVE``.
> +
> +- ``RTE_ETH_EVENT_VF_MBOX``:
> +  Used as a PF to process mailbox messages of the VFs to which the PF belongs.
> +
> +- ``RTE_ETH_EVENT_INTR_RMV``:
> +  Used to report device removal event. The ``testpmd`` will remove the port
> +  later.
> +
> +- ``RTE_ETH_EVENT_NEW``:
> +  Used to report port was probed event. The ``testpmd`` will setup the port
> +  later.
> +
> +- ``RTE_ETH_EVENT_DESTROY``:
> +  Used to report port was released event. The ``testpmd`` will changes the
> +  port's status.
> +
> +- ``RTE_ETH_EVENT_MACSEC``:
> +  Used to report MACsec offload related event.
> +
> +- ``RTE_ETH_EVENT_IPSEC``:
> +  Used to report IPsec offload related event.
> +
> +- ``RTE_ETH_EVENT_FLOW_AGED``:
> +  Used to report new aged-out flows was detected. Only valid with mlx5 PMD.
> +
> +- ``RTE_ETH_EVENT_RX_AVAIL_THRESH``:
> +  Used to report available Rx descriptors was smaller than the threshold. Only
> +  valid with mlx5 PMD.
> +
> +- ``RTE_ETH_EVENT_ERR_RECOVERING``:
> +  Used to report error happened, and PMD will do recover after report this
> +  event. The ``testpmd`` will stop packet forwarding when received the event.
> +
> +- ``RTE_ETH_EVENT_RECOVERY_SUCCESS``:
> +  Used to report error recovery success. The ``testpmd`` will restart packet
> +  forwarding when received the event.
> +
> +- ``RTE_ETH_EVENT_RECOVERY_FAILED``:
> +  Used to report error recovery failed. The ``testpmd`` will display one
> +  message to show which ports failed.
> +
> +.. note::
> +
> +   The ``RTE_ETH_EVENT_ERR_RECOVERING``, ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` and
> +   ``RTE_ETH_EVENT_RECOVERY_FAILED`` only reported when the PMD supports
> +   ``RTE_ETH_ERROR_HANDLE_MODE_PROACTIVE``.
> +
> +device events
> +-------------
> +
> +Including two events ``RTE_DEV_EVENT_ADD`` and ``RTE_DEV_EVENT_ADD``, and
> +enabled only when the ``testpmd`` stated with options "--hot-plug".
> diff --git a/doc/guides/testpmd_app_ug/index.rst b/doc/guides/testpmd_app_ug/index.rst
> index 1ac0d25d57..3c09448c4e 100644
> --- a/doc/guides/testpmd_app_ug/index.rst
> +++ b/doc/guides/testpmd_app_ug/index.rst
> @@ -14,3 +14,4 @@ Testpmd Application User Guide
>       build_app
>       run_app
>       testpmd_funcs
> +    event_handling

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH v3 0/7] fix race-condition of proactive error handling mode
  2023-11-06 13:11 ` [PATCH v3 " Chengwen Feng
                     ` (6 preceding siblings ...)
  2023-11-06 13:11   ` [PATCH v3 7/7] doc: testpmd support event handling section Chengwen Feng
@ 2023-12-05  2:30   ` fengchengwen
  2024-01-15  1:44     ` fengchengwen
  7 siblings, 1 reply; 85+ messages in thread
From: fengchengwen @ 2023-12-05  2:30 UTC (permalink / raw)
  To: thomas, ferruh.yigit, konstantin.ananyev, ajit.khaparde
  Cc: dev, andrew.rybchenko, kalesh-anakkur.purayil, Honnappa.Nagarahalli

Hi Ferruh,

I noticed this patchset was delegated to you; could you take a look?

Thanks.

On 2023/11/6 21:11, Chengwen Feng wrote:
> This patch fixes race-condition of proactive error handling mode, the
> discussion thread [1].
> 
> [1] http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-ashok.k.kaladi@intel.com/
> 
> Chengwen Feng (7):
>   ethdev: fix race-condition of proactive error handling mode
>   net/hns3: replace fp ops config function
>   net/bnxt: fix race-condition when report error recovery
>   net/bnxt: use fp ops setup function
>   app/testpmd: add error recovery usage demo
>   app/testpmd: extract event handling to event.c
>   doc: testpmd support event handling section
> 
> ---
> v3:
> - adjust the usage of RTE_ETH_EVENT_QUEUE_STATE in 7/7 commit.
> - add ack-by from Konstantin Ananyev, Ajit Khaparde and Huisong Li.
> v2:
> - extract event handling to event.c and document it, which addresses
>   Ferruh's comment.
> - add ack-by from Konstantin Ananyev and Dongdong Liu.
> 
>  app/test-pmd/event.c                         | 390 +++++++++++++++++++
>  app/test-pmd/meson.build                     |   1 +
>  app/test-pmd/parameters.c                    |  36 +-
>  app/test-pmd/testpmd.c                       | 247 +-----------
>  app/test-pmd/testpmd.h                       |  10 +-
>  doc/guides/prog_guide/poll_mode_drv.rst      |  20 +-
>  doc/guides/testpmd_app_ug/event_handling.rst |  81 ++++
>  doc/guides/testpmd_app_ug/index.rst          |   1 +
>  drivers/net/bnxt/bnxt_cpr.c                  |  18 +-
>  drivers/net/bnxt/bnxt_ethdev.c               |   9 +-
>  drivers/net/hns3/hns3_rxtx.c                 |  21 +-
>  lib/ethdev/ethdev_driver.c                   |   8 +
>  lib/ethdev/ethdev_driver.h                   |  10 +
>  lib/ethdev/rte_ethdev.h                      |  32 +-
>  lib/ethdev/version.map                       |   1 +
>  15 files changed, 552 insertions(+), 333 deletions(-)
>  create mode 100644 app/test-pmd/event.c
>  create mode 100644 doc/guides/testpmd_app_ug/event_handling.rst
> 

* Re: [PATCH v3 0/7] fix race-condition of proactive error handling mode
  2023-12-05  2:30   ` [PATCH v3 0/7] fix race-condition of proactive error handling mode fengchengwen
@ 2024-01-15  1:44     ` fengchengwen
  2024-01-29  1:16       ` fengchengwen
  0 siblings, 1 reply; 85+ messages in thread
From: fengchengwen @ 2024-01-15  1:44 UTC (permalink / raw)
  To: thomas, ferruh.yigit, konstantin.ananyev, ajit.khaparde
  Cc: dev, andrew.rybchenko, kalesh-anakkur.purayil, Honnappa.Nagarahalli

Kindly ping.

On 2023/12/5 10:30, fengchengwen wrote:
> Hi Ferruh,
> 
> I noticed this patchset was delegated to you, so could you take a look?
> 
> Thanks.
> 
> On 2023/11/6 21:11, Chengwen Feng wrote:
>> This patchset fixes a race condition in proactive error handling mode; see the
>> discussion thread [1].
>>
>> [1] http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-ashok.k.kaladi@intel.com/
>>
>> Chengwen Feng (7):
>>   ethdev: fix race-condition of proactive error handling mode
>>   net/hns3: replace fp ops config function
>>   net/bnxt: fix race-condition when report error recovery
>>   net/bnxt: use fp ops setup function
>>   app/testpmd: add error recovery usage demo
>>   app/testpmd: extract event handling to event.c
>>   doc: testpmd support event handling section
>>
>> ---
>> v3:
>> - adjust the usage of RTE_ETH_EVENT_QUEUE_STATE in 7/7 commit.
>> - add ack-by from Konstantin Ananyev, Ajit Khaparde and Huisong Li.
>> v2:
>> - extract event handling to event.c and document it, which addresses
>>   Ferruh's comment.
>> - add ack-by from Konstantin Ananyev and Dongdong Liu.
>>
>>  app/test-pmd/event.c                         | 390 +++++++++++++++++++
>>  app/test-pmd/meson.build                     |   1 +
>>  app/test-pmd/parameters.c                    |  36 +-
>>  app/test-pmd/testpmd.c                       | 247 +-----------
>>  app/test-pmd/testpmd.h                       |  10 +-
>>  doc/guides/prog_guide/poll_mode_drv.rst      |  20 +-
>>  doc/guides/testpmd_app_ug/event_handling.rst |  81 ++++
>>  doc/guides/testpmd_app_ug/index.rst          |   1 +
>>  drivers/net/bnxt/bnxt_cpr.c                  |  18 +-
>>  drivers/net/bnxt/bnxt_ethdev.c               |   9 +-
>>  drivers/net/hns3/hns3_rxtx.c                 |  21 +-
>>  lib/ethdev/ethdev_driver.c                   |   8 +
>>  lib/ethdev/ethdev_driver.h                   |  10 +
>>  lib/ethdev/rte_ethdev.h                      |  32 +-
>>  lib/ethdev/version.map                       |   1 +
>>  15 files changed, 552 insertions(+), 333 deletions(-)
>>  create mode 100644 app/test-pmd/event.c
>>  create mode 100644 doc/guides/testpmd_app_ug/event_handling.rst
>>
> .
> 

* Re: [PATCH v3 0/7] fix race-condition of proactive error handling mode
  2024-01-15  1:44     ` fengchengwen
@ 2024-01-29  1:16       ` fengchengwen
  2024-02-18  3:41         ` fengchengwen
  0 siblings, 1 reply; 85+ messages in thread
From: fengchengwen @ 2024-01-29  1:16 UTC (permalink / raw)
  To: thomas, ferruh.yigit, konstantin.ananyev, ajit.khaparde
  Cc: dev, andrew.rybchenko, kalesh-anakkur.purayil, Honnappa.Nagarahalli

Hi Ferruh,

Kindly ping for review.

Thanks

On 2024/1/15 9:44, fengchengwen wrote:
> Kindly ping.
> 
> On 2023/12/5 10:30, fengchengwen wrote:
>> Hi Ferruh,
>>
>> I noticed this patchset was delegated to you, so could you take a look?
>>
>> Thanks.
>>
>> On 2023/11/6 21:11, Chengwen Feng wrote:
>>> This patchset fixes a race condition in proactive error handling mode; see the
>>> discussion thread [1].
>>>
>>> [1] http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-ashok.k.kaladi@intel.com/
>>>
>>> Chengwen Feng (7):
>>>   ethdev: fix race-condition of proactive error handling mode
>>>   net/hns3: replace fp ops config function
>>>   net/bnxt: fix race-condition when report error recovery
>>>   net/bnxt: use fp ops setup function
>>>   app/testpmd: add error recovery usage demo
>>>   app/testpmd: extract event handling to event.c
>>>   doc: testpmd support event handling section
>>>
>>> ---
>>> v3:
>>> - adjust the usage of RTE_ETH_EVENT_QUEUE_STATE in 7/7 commit.
>>> - add ack-by from Konstantin Ananyev, Ajit Khaparde and Huisong Li.
>>> v2:
>>> - extract event handling to event.c and document it, which addresses
>>>   Ferruh's comment.
>>> - add ack-by from Konstantin Ananyev and Dongdong Liu.
>>>
>>>  app/test-pmd/event.c                         | 390 +++++++++++++++++++
>>>  app/test-pmd/meson.build                     |   1 +
>>>  app/test-pmd/parameters.c                    |  36 +-
>>>  app/test-pmd/testpmd.c                       | 247 +-----------
>>>  app/test-pmd/testpmd.h                       |  10 +-
>>>  doc/guides/prog_guide/poll_mode_drv.rst      |  20 +-
>>>  doc/guides/testpmd_app_ug/event_handling.rst |  81 ++++
>>>  doc/guides/testpmd_app_ug/index.rst          |   1 +
>>>  drivers/net/bnxt/bnxt_cpr.c                  |  18 +-
>>>  drivers/net/bnxt/bnxt_ethdev.c               |   9 +-
>>>  drivers/net/hns3/hns3_rxtx.c                 |  21 +-
>>>  lib/ethdev/ethdev_driver.c                   |   8 +
>>>  lib/ethdev/ethdev_driver.h                   |  10 +
>>>  lib/ethdev/rte_ethdev.h                      |  32 +-
>>>  lib/ethdev/version.map                       |   1 +
>>>  15 files changed, 552 insertions(+), 333 deletions(-)
>>>  create mode 100644 app/test-pmd/event.c
>>>  create mode 100644 doc/guides/testpmd_app_ug/event_handling.rst
>>>
>> .
>>
> .
> 

* Re: [PATCH v3 0/7] fix race-condition of proactive error handling mode
  2024-01-29  1:16       ` fengchengwen
@ 2024-02-18  3:41         ` fengchengwen
  0 siblings, 0 replies; 85+ messages in thread
From: fengchengwen @ 2024-02-18  3:41 UTC (permalink / raw)
  To: thomas, ferruh.yigit, konstantin.ananyev, ajit.khaparde
  Cc: dev, andrew.rybchenko, kalesh-anakkur.purayil, Honnappa.Nagarahalli

Hi Ferruh,

This patchset modifies lib/ethdev/. Could you help review it before RC1?

Thanks

On 2024/1/29 9:16, fengchengwen wrote:
> Hi Ferruh,
> 
> Kindly ping for review.
> 
> Thanks
> 
> On 2024/1/15 9:44, fengchengwen wrote:
>> Kindly ping.
>>
>> On 2023/12/5 10:30, fengchengwen wrote:
>>> Hi Ferruh,
>>>
>>> I noticed this patchset was delegated to you, so could you take a look?
>>>
>>> Thanks.
>>>
>>> On 2023/11/6 21:11, Chengwen Feng wrote:
>>>> This patchset fixes a race condition in proactive error handling mode; see the
>>>> discussion thread [1].
>>>>
>>>> [1] http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-ashok.k.kaladi@intel.com/
>>>>
>>>> Chengwen Feng (7):
>>>>   ethdev: fix race-condition of proactive error handling mode
>>>>   net/hns3: replace fp ops config function
>>>>   net/bnxt: fix race-condition when report error recovery
>>>>   net/bnxt: use fp ops setup function
>>>>   app/testpmd: add error recovery usage demo
>>>>   app/testpmd: extract event handling to event.c
>>>>   doc: testpmd support event handling section
>>>>
>>>> ---
>>>> v3:
>>>> - adjust the usage of RTE_ETH_EVENT_QUEUE_STATE in 7/7 commit.
>>>> - add ack-by from Konstantin Ananyev, Ajit Khaparde and Huisong Li.
>>>> v2:
>>>> - extract event handling to event.c and document it, which addresses
>>>>   Ferruh's comment.
>>>> - add ack-by from Konstantin Ananyev and Dongdong Liu.
>>>>
>>>>  app/test-pmd/event.c                         | 390 +++++++++++++++++++
>>>>  app/test-pmd/meson.build                     |   1 +
>>>>  app/test-pmd/parameters.c                    |  36 +-
>>>>  app/test-pmd/testpmd.c                       | 247 +-----------
>>>>  app/test-pmd/testpmd.h                       |  10 +-
>>>>  doc/guides/prog_guide/poll_mode_drv.rst      |  20 +-
>>>>  doc/guides/testpmd_app_ug/event_handling.rst |  81 ++++
>>>>  doc/guides/testpmd_app_ug/index.rst          |   1 +
>>>>  drivers/net/bnxt/bnxt_cpr.c                  |  18 +-
>>>>  drivers/net/bnxt/bnxt_ethdev.c               |   9 +-
>>>>  drivers/net/hns3/hns3_rxtx.c                 |  21 +-
>>>>  lib/ethdev/ethdev_driver.c                   |   8 +
>>>>  lib/ethdev/ethdev_driver.h                   |  10 +
>>>>  lib/ethdev/rte_ethdev.h                      |  32 +-
>>>>  lib/ethdev/version.map                       |   1 +
>>>>  15 files changed, 552 insertions(+), 333 deletions(-)
>>>>  create mode 100644 app/test-pmd/event.c
>>>>  create mode 100644 doc/guides/testpmd_app_ug/event_handling.rst
>>>>
>>> .
>>>
>> .
>>
> .
> 

Thread overview: 85+ messages
2023-03-01  3:06 [PATCH 0/5] fix race-condition of proactive error handling mode Chengwen Feng
2023-03-01  3:06 ` [PATCH 1/5] ethdev: " Chengwen Feng
2023-03-02 12:08   ` Konstantin Ananyev
2023-03-03 16:51     ` Ferruh Yigit
2023-03-05 14:53       ` Konstantin Ananyev
2023-03-06  8:55         ` Ferruh Yigit
2023-03-06 10:22           ` Konstantin Ananyev
2023-03-06 11:00             ` Ferruh Yigit
2023-03-06 11:05               ` Ajit Khaparde
2023-03-06 11:13                 ` Konstantin Ananyev
2023-03-07  8:25                   ` fengchengwen
2023-03-07  9:52                     ` Konstantin Ananyev
2023-03-07 10:11                       ` Konstantin Ananyev
2023-03-07 12:07                     ` Ferruh Yigit
2023-03-07 12:26                       ` fengchengwen
2023-03-07 12:39                         ` Konstantin Ananyev
2023-03-09  2:05                           ` Ajit Khaparde
2023-03-06  1:41       ` fengchengwen
2023-03-06  8:57         ` Ferruh Yigit
2023-03-06  9:10         ` Ferruh Yigit
2023-03-02 23:30   ` Honnappa Nagarahalli
2023-03-03  0:21     ` Konstantin Ananyev
2023-03-04  5:08       ` Honnappa Nagarahalli
2023-03-05 15:23         ` Konstantin Ananyev
2023-03-07  5:34           ` Honnappa Nagarahalli
2023-03-07  8:39             ` fengchengwen
2023-03-08  1:09               ` Honnappa Nagarahalli
2023-03-09  0:59                 ` fengchengwen
2023-03-09  3:03                   ` Honnappa Nagarahalli
2023-03-09 11:30                     ` fengchengwen
2023-03-10  3:25                       ` Honnappa Nagarahalli
2023-03-07  9:56             ` Konstantin Ananyev
2023-03-01  3:06 ` [PATCH 2/5] net/hns3: replace fp ops config function Chengwen Feng
2023-03-02  6:50   ` Dongdong Liu
2023-03-01  3:06 ` [PATCH 3/5] net/bnxt: fix race-condition when report error recovery Chengwen Feng
2023-03-02 12:23   ` Konstantin Ananyev
2023-03-01  3:06 ` [PATCH 4/5] net/bnxt: use fp ops setup function Chengwen Feng
2023-03-02 12:30   ` Konstantin Ananyev
2023-03-03  0:01     ` Konstantin Ananyev
2023-03-03  1:17       ` Ajit Khaparde
2023-03-03  2:02       ` fengchengwen
2023-03-03  1:38     ` fengchengwen
2023-03-05 15:57       ` Konstantin Ananyev
2023-03-06  2:47         ` Ajit Khaparde
2023-03-01  3:06 ` [PATCH 5/5] app/testpmd: add error recovery usage demo Chengwen Feng
2023-03-02 13:01   ` Konstantin Ananyev
2023-03-03  1:49     ` fengchengwen
2023-03-03 16:59       ` Ferruh Yigit
2023-09-21 11:12 ` [PATCH 0/5] fix race-condition of proactive error handling mode Ferruh Yigit
2023-10-07  2:32   ` fengchengwen
2023-10-20 10:07 ` [PATCH v2 0/7] " Chengwen Feng
2023-10-20 10:07   ` [PATCH v2 1/7] ethdev: " Chengwen Feng
2023-11-01  3:39     ` lihuisong (C)
2023-10-20 10:07   ` [PATCH v2 2/7] net/hns3: replace fp ops config function Chengwen Feng
2023-11-01  3:40     ` lihuisong (C)
2023-11-02 10:34     ` Konstantin Ananyev
2023-10-20 10:07   ` [PATCH v2 3/7] net/bnxt: fix race-condition when report error recovery Chengwen Feng
2023-11-02 16:28     ` Ajit Khaparde
2023-10-20 10:07   ` [PATCH v2 4/7] net/bnxt: use fp ops setup function Chengwen Feng
2023-11-01  3:48     ` lihuisong (C)
2023-11-02 10:34     ` Konstantin Ananyev
2023-11-02 16:29       ` Ajit Khaparde
2023-10-20 10:07   ` [PATCH v2 5/7] app/testpmd: add error recovery usage demo Chengwen Feng
2023-11-01  4:08     ` lihuisong (C)
2023-11-06 13:01       ` fengchengwen
2023-10-20 10:07   ` [PATCH v2 6/7] app/testpmd: extract event handling to event.c Chengwen Feng
2023-11-01  4:09     ` lihuisong (C)
2023-10-20 10:07   ` [PATCH v2 7/7] doc: testpmd support event handling section Chengwen Feng
2023-11-06  9:28     ` lihuisong (C)
2023-11-06 12:39       ` fengchengwen
2023-11-08  3:02         ` lihuisong (C)
2023-11-06  1:35   ` [PATCH v2 0/7] fix race-condition of proactive error handling mode fengchengwen
2023-11-06 13:11 ` [PATCH v3 " Chengwen Feng
2023-11-06 13:11   ` [PATCH v3 1/7] ethdev: " Chengwen Feng
2023-11-06 13:11   ` [PATCH v3 2/7] net/hns3: replace fp ops config function Chengwen Feng
2023-11-06 13:11   ` [PATCH v3 3/7] net/bnxt: fix race-condition when report error recovery Chengwen Feng
2023-11-06 13:11   ` [PATCH v3 4/7] net/bnxt: use fp ops setup function Chengwen Feng
2023-11-06 13:11   ` [PATCH v3 5/7] app/testpmd: add error recovery usage demo Chengwen Feng
2023-11-06 13:11   ` [PATCH v3 6/7] app/testpmd: extract event handling to event.c Chengwen Feng
2023-11-06 13:11   ` [PATCH v3 7/7] doc: testpmd support event handling section Chengwen Feng
2023-11-08  3:03     ` lihuisong (C)
2023-12-05  2:30   ` [PATCH v3 0/7] fix race-condition of proactive error handling mode fengchengwen
2024-01-15  1:44     ` fengchengwen
2024-01-29  1:16       ` fengchengwen
2024-02-18  3:41         ` fengchengwen
