DPDK patches and discussions
From: Rongwei Liu <rongweil@nvidia.com>
To: <dev@dpdk.org>, <matan@nvidia.com>, <viacheslavo@nvidia.com>,
	<orika@nvidia.com>, <thomas@monjalon.net>,
	<jerinjacobk@gmail.com>, <stephen@networkplumber.org>
Cc: <rasland@nvidia.com>, Ferruh Yigit <ferruh.yigit@amd.com>,
	"Andrew Rybchenko" <andrew.rybchenko@oktetlabs.ru>
Subject: [PATCH v4 2/3] ethdev: add standby state for live migration
Date: Wed, 18 Jan 2023 17:44:45 +0200
Message-ID: <20230118154447.595231-3-rongweil@nvidia.com>
In-Reply-To: <20230118154447.595231-1-rongweil@nvidia.com>

When a DPDK application must be upgraded,
the traffic downtime should be shortened as much as possible.
During the migration time, the old application may stay alive
while the new application is starting and being configured.

In order to optimize the switch to the new application,
the old application may need to be aware of the presence
of the new application being prepared.
This is achieved with a new API that lets the user set the
new application's state to standby first and to active later.

The added function tries to apply the new state to all probed
ethdev ports. To keep this API simple and easy to use,
all devices must accept the same flags.

This is the scenario of operations in the old and new applications:
.       device: already configured by the old application
.       new:    start as active
.       new:    probe the same device
.       new:    set as standby
.       new:    configure the device
.       device: has configurations from old and new applications
.       old:    clear its device configuration
.       device: has only 1 configuration from new application
.       new:    set as active
.       device: downtime for connecting all to the new application
.       old:    shutdown

The active role means network handling configurations are programmed
to the HW immediately, with no change in behavior. This is the default state.
The standby role means configurations are queued in the HW.
If no application has the active role,
any configuration is effective immediately.

Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
---
 doc/guides/rel_notes/release_23_03.rst |  7 ++++
 lib/ethdev/ethdev_driver.h             | 20 +++++++++
 lib/ethdev/rte_ethdev.c                | 42 +++++++++++++++++++
 lib/ethdev/rte_ethdev.h                | 56 ++++++++++++++++++++++++++
 lib/ethdev/version.map                 |  3 ++
 5 files changed, 128 insertions(+)

diff --git a/doc/guides/rel_notes/release_23_03.rst b/doc/guides/rel_notes/release_23_03.rst
index b8c5b68d6c..5367123f24 100644
--- a/doc/guides/rel_notes/release_23_03.rst
+++ b/doc/guides/rel_notes/release_23_03.rst
@@ -55,6 +55,13 @@ New Features
      Also, make sure to start the actual text at the margin.
      =======================================================
 
+* **Added process state in ethdev to improve live migration.**
+
+  Hot upgrade of an application may be accelerated by configuring
+  the new application in standby state while the old one is still active.
+  Such double ethdev configuration of the same device is possible
+  with the added process state API.
+
 
 Removed Items
 -------------
diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
index 6a550cfc83..4a098410d5 100644
--- a/lib/ethdev/ethdev_driver.h
+++ b/lib/ethdev/ethdev_driver.h
@@ -219,6 +219,23 @@ typedef int (*eth_dev_reset_t)(struct rte_eth_dev *dev);
 /** @internal Function used to detect an Ethernet device removal. */
 typedef int (*eth_is_removed_t)(struct rte_eth_dev *dev);
 
+/**
+ * @internal
+ * Set the role of the process to active or standby during live migration.
+ *
+ * @param dev
+ *   Port (ethdev) handle.
+ * @param standby
+ *   Role active if false, standby if true.
+ * @param flags
+ *   Role specific flags.
+ *
+ * @return
+ *   Negative value on error, 0 on success.
+ */
+typedef int (*eth_dev_process_set_role_t)(struct rte_eth_dev *dev,
+		bool standby, uint32_t flags);
+
 /**
  * @internal
  * Function used to enable the Rx promiscuous mode of an Ethernet device.
@@ -1186,6 +1203,9 @@ struct eth_dev_ops {
 	/** Check if the device was physically removed */
 	eth_is_removed_t           is_removed;
 
+	/** Set role during live migration */
+	eth_dev_process_set_role_t process_set_role;
+
 	eth_promiscuous_enable_t   promiscuous_enable; /**< Promiscuous ON */
 	eth_promiscuous_disable_t  promiscuous_disable;/**< Promiscuous OFF */
 	eth_allmulticast_enable_t  allmulticast_enable;/**< Rx multicast ON */
diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
index 5d5e18db1e..3a1fb64053 100644
--- a/lib/ethdev/rte_ethdev.c
+++ b/lib/ethdev/rte_ethdev.c
@@ -558,6 +558,48 @@ rte_eth_dev_owner_get(const uint16_t port_id, struct rte_eth_dev_owner *owner)
 	return 0;
 }
 
+int
+rte_eth_process_set_role(bool standby, uint32_t flags)
+{
+	struct rte_eth_dev_info dev_info = {0};
+	struct rte_eth_dev *dev;
+	uint16_t port_id;
+	int ret = 0;
+
+	/* Check if all devices support process role. */
+	RTE_ETH_FOREACH_DEV(port_id) {
+		dev = &rte_eth_devices[port_id];
+		if (*dev->dev_ops->process_set_role != NULL &&
+			*dev->dev_ops->dev_infos_get != NULL &&
+			(*dev->dev_ops->dev_infos_get)(dev, &dev_info) == 0 &&
+			(dev_info.dev_capa & RTE_ETH_DEV_CAPA_PROCESS_ROLE) != 0)
+			continue;
+		rte_errno = ENOTSUP;
+		return -rte_errno;
+	}
+	/* Call the driver callbacks. */
+	RTE_ETH_FOREACH_DEV(port_id) {
+		dev = &rte_eth_devices[port_id];
+		if ((*dev->dev_ops->process_set_role)(dev, standby, flags) < 0)
+			goto failure;
+		ret++;
+	}
+	return ret;
+
+failure:
+	/* Roll back all changed devices in case one failed. */
+	if (ret) {
+		RTE_ETH_FOREACH_DEV(port_id) {
+			dev = &rte_eth_devices[port_id];
+			(*dev->dev_ops->process_set_role)(dev, !standby, flags);
+			if (--ret == 0)
+				break;
+		}
+	}
+	rte_errno = EPERM;
+	return -rte_errno;
+}
+
 int
 rte_eth_dev_socket_id(uint16_t port_id)
 {
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index c129ca1eaf..1505396ced 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -1606,6 +1606,8 @@ struct rte_eth_conf {
 #define RTE_ETH_DEV_CAPA_FLOW_RULE_KEEP         RTE_BIT64(3)
 /** Device supports keeping shared flow objects across restart. */
 #define RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP RTE_BIT64(4)
+/** Device supports process role changing. @see rte_eth_process_set_role */
+#define RTE_ETH_DEV_CAPA_PROCESS_ROLE           RTE_BIT64(5)
 /**@}*/
 
 /*
@@ -2204,6 +2206,60 @@ int rte_eth_dev_owner_delete(const uint64_t owner_id);
 int rte_eth_dev_owner_get(const uint16_t port_id,
 		struct rte_eth_dev_owner *owner);
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Set the role of the process to active or standby,
+ * affecting network traffic handling.
+ *
+ * If one device does not support this operation or fails,
+ * the whole operation fails and is rolled back.
+ *
+ * It is forbidden to have multiple processes with the same role
+ * unless only one of them is configured to handle the traffic.
+ *
+ * The application is active by default.
+ * The configuration from the active process is effective immediately
+ * while the configuration from the standby process is queued by hardware.
+ * Configuring the device from a standby process
+ * has no effect except in the following situations:
+ *   - traffic not handled by the active process configuration
+ *   - no active process
+ *
+ * When a process is changed from a standby to an active role,
+ * all preceding configurations that are queued by hardware
+ * should become effective immediately.
+ * Before role transition, all the traffic handling configurations
+ * set by the active process should be flushed first.
+ *
+ * In summary, the operations are expected to happen in this order
+ * in "old" and "new" applications:
+ *   device: already configured by the old application
+ *   new:    start as active
+ *   new:    probe the same device
+ *   new:    set as standby
+ *   new:    configure the device
+ *   device: has configurations from old and new applications
+ *   old:    clear its device configuration
+ *   device: has only 1 configuration from new application
+ *   new:    set as active
+ *   device: downtime for connecting all to the new application
+ *   old:    shutdown
+ *
+ * @param standby
+ *   Role active if false, standby if true.
+ * @param flags
+ *   Role specific flags.
+ * @return
+ *   Non-negative value on success, -rte_errno value on error:
+ *   - (>= 0) Number of switched devices (0 if no port is probed).
+ *   - (-ENOTSUP) if not supported by a device.
+ *   - (-EPERM) if operation failed with a device.
+ */
+__rte_experimental
+int rte_eth_process_set_role(bool standby, uint32_t flags);
+
 /**
  * Get the number of ports which are usable for the application.
  *
diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
index 17201fbe0f..d5d3ea5421 100644
--- a/lib/ethdev/version.map
+++ b/lib/ethdev/version.map
@@ -298,6 +298,9 @@ EXPERIMENTAL {
 	rte_flow_get_q_aged_flows;
 	rte_mtr_meter_policy_get;
 	rte_mtr_meter_profile_get;
+
+	# added in 23.03
+	rte_eth_process_set_role;
 };
 
 INTERNAL {
-- 
2.27.0

