From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id A243741B9B; Wed, 1 Feb 2023 08:52:47 +0100 (CET) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 414894113F; Wed, 1 Feb 2023 08:52:47 +0100 (CET) Received: from shelob.oktetlabs.ru (shelob.oktetlabs.ru [91.220.146.113]) by mails.dpdk.org (Postfix) with ESMTP id 6718F4021F for ; Wed, 1 Feb 2023 08:52:45 +0100 (CET) Received: from [192.168.38.17] (aros.oktetlabs.ru [192.168.38.17]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by shelob.oktetlabs.ru (Postfix) with ESMTPSA id B316C5D; Wed, 1 Feb 2023 10:52:44 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 shelob.oktetlabs.ru B316C5D DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=oktetlabs.ru; s=default; t=1675237964; bh=I/KwaOEU49O/8di0HGOygXAcxS577w1SCP4wbhqclX4=; h=Date:Subject:To:Cc:References:From:In-Reply-To:From; b=bwfwG3dEn7jYcz6DRqq5nShuStiOsHZpIcTdNlyQL+2dx8giEHatfVFdjDqk2rMrQ cBRAL1p3/k8DuZLBOxUuE+X9izYA7VZd2wBBxXVmS9e4oL6RH0Bl/XL6RE3mPgS0I+ 6+c9DrcXRTsXfazLu280oGtF3F0uNIS9CxKERYvE= Message-ID: <798721cd-92b8-3ee1-0aaa-20a63bb1cd67@oktetlabs.ru> Date: Wed, 1 Feb 2023 10:52:44 +0300 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.6.0 Subject: Re: [PATCH v4 2/3] ethdev: add standby state for live migration Content-Language: en-US To: Rongwei Liu , dev@dpdk.org, matan@nvidia.com, viacheslavo@nvidia.com, orika@nvidia.com, thomas@monjalon.net, jerinjacobk@gmail.com, stephen@networkplumber.org Cc: rasland@nvidia.com, Ferruh Yigit References: <20221221090017.3715030-2-rongweil@nvidia.com> <20230118154447.595231-1-rongweil@nvidia.com> <20230118154447.595231-3-rongweil@nvidia.com> From: Andrew Rybchenko Organization: OKTET Labs In-Reply-To: <20230118154447.595231-3-rongweil@nvidia.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org On 1/18/23 18:44, Rongwei Liu wrote: > When a DPDK application must be upgraded, > the traffic downtime should be shortened as much as possible. > During the migration time, the old application may stay alive > while the new application is starting and being configured. > > In order to optimize the switch to the new application, > the old application may need to be aware of the presence > of the new application being prepared. > This is achieved with a new API allowing the user to change the > new application state to standby and active later. > > The added function is trying to apply the new state to all probed > ethdev ports. To make this API simple and easy to use, > the same flags have to be accepted by all devices. > > This is the scenario of operations in the old and new applications: > . device: already configured by the old application > . new: start as active > . new: probe the same device > . new: set as standby > . new: configure the device > . device: has configurations from old and new applications > . old: clear its device configuration > . device: has only 1 configuration from new application > . new: set as active > . device: downtime for connecting all to the new application > . old: shutdown > > The active role means network handling configurations are programmed > to the HW immediately, and no behavior changed. This is the default state. > The standby role means configurations are queued in the HW. > If there is no application with active role, > any configuration is effective immediately. > > Signed-off-by: Rongwei Liu > --- > doc/guides/rel_notes/release_23_03.rst | 7 ++++ > lib/ethdev/ethdev_driver.h | 20 +++++++++ > lib/ethdev/rte_ethdev.c | 42 +++++++++++++++++++ > lib/ethdev/rte_ethdev.h | 56 ++++++++++++++++++++++++++ > lib/ethdev/version.map | 3 ++ > 5 files changed, 128 insertions(+) > > diff --git a/doc/guides/rel_notes/release_23_03.rst b/doc/guides/rel_notes/release_23_03.rst > index b8c5b68d6c..5367123f24 100644 > --- a/doc/guides/rel_notes/release_23_03.rst > +++ b/doc/guides/rel_notes/release_23_03.rst > @@ -55,6 +55,13 @@ New Features > Also, make sure to start the actual text at the margin. > ======================================================= > > +* **Added process state in ethdev to improve live migration.** > + > + Hot upgrade of an application may be accelerated by configuring > + the new application in standby state while the old one is still active. > + Such double ethdev configuration of the same device is possible > + with the added process state API. > + Will we have the same API for other device classes? Any ideas how to avoid duplication? (I'm afraid we don't have generic place to put such functionality). > > Removed Items > ------------- > diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h > index 6a550cfc83..4a098410d5 100644 > --- a/lib/ethdev/ethdev_driver.h > +++ b/lib/ethdev/ethdev_driver.h > @@ -219,6 +219,23 @@ typedef int (*eth_dev_reset_t)(struct rte_eth_dev *dev); > /** @internal Function used to detect an Ethernet device removal. */ > typedef int (*eth_is_removed_t)(struct rte_eth_dev *dev); > > +/** > + * @internal > + * Set the role of the process to active or standby during live migration. > + * > + * @param dev > + * Port (ethdev) handle. > + * @param standby > + * Role active if false, standby if true. > + * @param flags > + * Role specific flags. > + * > + * @return > + * Negative value on error, 0 on success. > + */ > +typedef int (*eth_dev_process_set_role_t)(struct rte_eth_dev *dev, > + bool standby, uint32_t flags); > + > /** > * @internal > * Function used to enable the Rx promiscuous mode of an Ethernet device. > @@ -1186,6 +1203,9 @@ struct eth_dev_ops { > /** Check if the device was physically removed */ > eth_is_removed_t is_removed; > > + /** Set role during live migration */ > + eth_dev_process_set_role_t process_set_role; > + > eth_promiscuous_enable_t promiscuous_enable; /**< Promiscuous ON */ > eth_promiscuous_disable_t promiscuous_disable;/**< Promiscuous OFF */ > eth_allmulticast_enable_t allmulticast_enable;/**< Rx multicast ON */ > diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c > index 5d5e18db1e..3a1fb64053 100644 > --- a/lib/ethdev/rte_ethdev.c > +++ b/lib/ethdev/rte_ethdev.c > @@ -558,6 +558,48 @@ rte_eth_dev_owner_get(const uint16_t port_id, struct rte_eth_dev_owner *owner) > return 0; > } > > +int > +rte_eth_process_set_role(bool standby, uint32_t flags) > +{ > + struct rte_eth_dev_info dev_info = {0}; > + struct rte_eth_dev *dev; > + uint16_t port_id; > + int ret = 0; > + > + /* Check if all devices support process role. */ > + RTE_ETH_FOREACH_DEV(port_id) { > + dev = &rte_eth_devices[port_id]; > + if (*dev->dev_ops->process_set_role != NULL && > + *dev->dev_ops->dev_infos_get != NULL && > + (*dev->dev_ops->dev_infos_get)(dev, &dev_info) == 0 && > + (dev_info.dev_capa & RTE_ETH_DEV_CAPA_PROCESS_ROLE) != 0) Why do we need both non-NULL callback and dev_capa flag? Isn't just non-NULL callback sufficient? > + continue; > + rte_errno = ENOTSUP; > + return -rte_errno; > + } > + /* Call the driver callbacks. */ > + RTE_ETH_FOREACH_DEV(port_id) { > + dev = &rte_eth_devices[port_id]; > + if ((*dev->dev_ops->process_set_role)(dev, standby, flags) < 0) Please, add error logging on failure. > + goto failure; IMHO it would be useful to have info logs on success. > + ret++; ret variable name is very confusing, please, be more specific in variable naming. > + } > + return ret; > + > +failure: > + /* Rollback all changed devices in case one failed. */ > + if (ret) { Compare vs 0 > + RTE_ETH_FOREACH_DEV(port_id) { > + dev = &rte_eth_devices[port_id]; > + (*dev->dev_ops->process_set_role)(dev, !standby, flags); IMHO, it would be useful to have error logs on rollback failure and info logs on success. > + if (--ret == 0) > + break; > + } > + } > + rte_errno = EPERM; Why is it always EPERM? Isn't it better to forward failure code from driver? > + return -rte_errno; > +} > + > int > rte_eth_dev_socket_id(uint16_t port_id) > { > diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h > index c129ca1eaf..1505396ced 100644 > --- a/lib/ethdev/rte_ethdev.h > +++ b/lib/ethdev/rte_ethdev.h > @@ -1606,6 +1606,8 @@ struct rte_eth_conf { > #define RTE_ETH_DEV_CAPA_FLOW_RULE_KEEP RTE_BIT64(3) > /** Device supports keeping shared flow objects across restart. */ > #define RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP RTE_BIT64(4) > +/** Device supports process role changing. @see rte_eth_process_set_active */ There is no rte_eth_process_set_active() > +#define RTE_ETH_DEV_CAPA_PROCESS_ROLE RTE_BIT64(5) > /**@}*/ > > /* > @@ -2204,6 +2206,60 @@ int rte_eth_dev_owner_delete(const uint64_t owner_id); > int rte_eth_dev_owner_get(const uint16_t port_id, > struct rte_eth_dev_owner *owner); > > +/** > + * @warning > + * @b EXPERIMENTAL: this API may change without prior notice > + * > + * Set the role of the process to active or standby, > + * affecting network traffic handling. > + * > + * If one device does not support this operation or fails, > + * the whole operation is failed and rolled back. > + * > + * It is forbidden to have multiple processes with the same role > + * unless only one of them is configured to handle the traffic. Why is it not allowed to have many processes in standby? > + * > + * The application is active by default. > + * The configuration from the active process is effective immediately > + * while the configuration from the standby process is queued by hardware. I'm afraid error handling could be tricky in this case. I think it makes sense to highlight that configuration propagation failures will be returned on attempt to set the process active. > + * When configuring the device from a standby process, > + * it has no effect except for below situations: > + * - traffic not handled by the active process configuration > + * - no active process > + * > + * When a process is changed from a standby to an active role, > + * all preceding configurations that are queued by hardware > + * should become effective immediately. > + * Before role transition, all the traffic handling configurations > + * set by the active process should be flushed first. > + * > + * In summary, the operations are expected to happen in this order > + * in "old" and "new" applications: > + * device: already configured by the old application > + * new: start as active > + * new: probe the same device > + * new: set as standby > + * new: configure the device > + * device: has configurations from old and new applications > + * old: clear its device configuration > + * device: has only 1 configuration from new application > + * new: set as active > + * device: downtime for connecting all to the new application > + * old: shutdown > + * > + * @param standby > + * Role active if false, standby if true. Typically API with boolean parameters is bad. May be in this particular case it is better to have two functions: rte_eth_process_set_active() and rte_eth_process_set_standby(). > + * @param flags > + * Role specific flags. Please, define namespace for flags. > + * @return > + * Positive value on success, -rte_errno value on error: > + * - (> 0) Number of switched devices. Isn't is success if there is no devices to switch? > + * - (-ENOTSUP) if not supported by a device. > + * - (-EPERM) if operation failed with a device. > + */ > +__rte_experimental > +int rte_eth_process_set_role(bool standby, uint32_t flags); > + > /** > * Get the number of ports which are usable for the application. > * > diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map > index 17201fbe0f..d5d3ea5421 100644 > --- a/lib/ethdev/version.map > +++ b/lib/ethdev/version.map > @@ -298,6 +298,9 @@ EXPERIMENTAL { > rte_flow_get_q_aged_flows; > rte_mtr_meter_policy_get; > rte_mtr_meter_profile_get; > + > + # added in 23.03 > + rte_eth_process_set_role; > }; > > INTERNAL {