From: Rongwei Liu
CC: Ferruh Yigit, "Andrew Rybchenko"
Subject: [PATCH v4 2/3] ethdev: add standby state for live migration
Date: Wed, 18 Jan 2023 17:44:45 +0200
Message-ID: <20230118154447.595231-3-rongweil@nvidia.com>
X-Mailer: git-send-email 2.27.0
In-Reply-To: <20230118154447.595231-1-rongweil@nvidia.com>
References: <20221221090017.3715030-2-rongweil@nvidia.com> <20230118154447.595231-1-rongweil@nvidia.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain
List-Id: DPDK patches and discussions

When a DPDK application must be upgraded,
the traffic downtime should be shortened as much as possible.
During the migration time, the old application may stay alive
while the new application is starting and being configured.

In order to optimize the switch to the new application,
the old application may need to be aware of the presence
of the new application being prepared.
This is achieved with a new API that lets the user set
the new application to the standby state and switch it to active later.

The added function tries to apply the new state to all probed
ethdev ports. To keep this API simple and easy to use,
the same flags have to be accepted by all devices.

This is the scenario of operations in the old and new applications:
. device: already configured by the old application
. new: start as active
. new: probe the same device
. new: set as standby
. new: configure the device
. device: has configurations from old and new applications
. old: clear its device configuration
. device: has only 1 configuration from new application
. new: set as active
. device: downtime for connecting all to the new application
. old: shutdown

The active role means network handling configurations are programmed
to the HW immediately, with no change in behavior.
This is the default state.
The standby role means configurations are queued in the HW.
If there is no application with active role,
any configuration is effective immediately.

Signed-off-by: Rongwei Liu
---
 doc/guides/rel_notes/release_23_03.rst |  7 ++++
 lib/ethdev/ethdev_driver.h             | 20 +++++++++
 lib/ethdev/rte_ethdev.c                | 42 +++++++++++++++++++
 lib/ethdev/rte_ethdev.h                | 56 ++++++++++++++++++++++++++
 lib/ethdev/version.map                 |  3 ++
 5 files changed, 128 insertions(+)

diff --git a/doc/guides/rel_notes/release_23_03.rst b/doc/guides/rel_notes/release_23_03.rst
index b8c5b68d6c..5367123f24 100644
--- a/doc/guides/rel_notes/release_23_03.rst
+++ b/doc/guides/rel_notes/release_23_03.rst
@@ -55,6 +55,13 @@ New Features
      Also, make sure to start the actual text at the margin.
      =======================================================
 
+* **Added process state in ethdev to improve live migration.**
+
+  Hot upgrade of an application may be accelerated by configuring
+  the new application in standby state while the old one is still active.
+  Such double ethdev configuration of the same device is possible
+  with the added process state API.
+
 
 Removed Items
 -------------
diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
index 6a550cfc83..4a098410d5 100644
--- a/lib/ethdev/ethdev_driver.h
+++ b/lib/ethdev/ethdev_driver.h
@@ -219,6 +219,23 @@ typedef int (*eth_dev_reset_t)(struct rte_eth_dev *dev);
 /** @internal Function used to detect an Ethernet device removal. */
 typedef int (*eth_is_removed_t)(struct rte_eth_dev *dev);
 
+/**
+ * @internal
+ * Set the role of the process to active or standby during live migration.
+ *
+ * @param dev
+ *   Port (ethdev) handle.
+ * @param standby
+ *   Role active if false, standby if true.
+ * @param flags
+ *   Role specific flags.
+ *
+ * @return
+ *   Negative value on error, 0 on success.
+ */
+typedef int (*eth_dev_process_set_role_t)(struct rte_eth_dev *dev,
+					  bool standby, uint32_t flags);
+
 /**
  * @internal
  * Function used to enable the Rx promiscuous mode of an Ethernet device.
@@ -1186,6 +1203,9 @@ struct eth_dev_ops {
 	/** Check if the device was physically removed */
 	eth_is_removed_t           is_removed;
 
+	/** Set role during live migration */
+	eth_dev_process_set_role_t process_set_role;
+
 	eth_promiscuous_enable_t   promiscuous_enable; /**< Promiscuous ON */
 	eth_promiscuous_disable_t  promiscuous_disable;/**< Promiscuous OFF */
 	eth_allmulticast_enable_t  allmulticast_enable;/**< Rx multicast ON */
diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
index 5d5e18db1e..3a1fb64053 100644
--- a/lib/ethdev/rte_ethdev.c
+++ b/lib/ethdev/rte_ethdev.c
@@ -558,6 +558,48 @@ rte_eth_dev_owner_get(const uint16_t port_id, struct rte_eth_dev_owner *owner)
 	return 0;
 }
 
+int
+rte_eth_process_set_role(bool standby, uint32_t flags)
+{
+	struct rte_eth_dev_info dev_info = {0};
+	struct rte_eth_dev *dev;
+	uint16_t port_id;
+	int ret = 0;
+
+	/* Check if all devices support process role. */
+	RTE_ETH_FOREACH_DEV(port_id) {
+		dev = &rte_eth_devices[port_id];
+		if (*dev->dev_ops->process_set_role != NULL &&
+		    *dev->dev_ops->dev_infos_get != NULL &&
+		    (*dev->dev_ops->dev_infos_get)(dev, &dev_info) == 0 &&
+		    (dev_info.dev_capa & RTE_ETH_DEV_CAPA_PROCESS_ROLE) != 0)
+			continue;
+		rte_errno = ENOTSUP;
+		return -rte_errno;
+	}
+	/* Call the driver callbacks. */
+	RTE_ETH_FOREACH_DEV(port_id) {
+		dev = &rte_eth_devices[port_id];
+		if ((*dev->dev_ops->process_set_role)(dev, standby, flags) < 0)
+			goto failure;
+		ret++;
+	}
+	return ret;
+
+failure:
+	/* Rollback all changed devices in case one failed. */
+	if (ret) {
+		RTE_ETH_FOREACH_DEV(port_id) {
+			dev = &rte_eth_devices[port_id];
+			(*dev->dev_ops->process_set_role)(dev, !standby, flags);
+			if (--ret == 0)
+				break;
+		}
+	}
+	rte_errno = EPERM;
+	return -rte_errno;
+}
+
 int
 rte_eth_dev_socket_id(uint16_t port_id)
 {
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index c129ca1eaf..1505396ced 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -1606,6 +1606,8 @@ struct rte_eth_conf {
 #define RTE_ETH_DEV_CAPA_FLOW_RULE_KEEP         RTE_BIT64(3)
 /** Device supports keeping shared flow objects across restart. */
 #define RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP RTE_BIT64(4)
+/** Device supports process role changing. @see rte_eth_process_set_role */
+#define RTE_ETH_DEV_CAPA_PROCESS_ROLE RTE_BIT64(5)
 /**@}*/
 
 /*
@@ -2204,6 +2206,60 @@ int rte_eth_dev_owner_delete(const uint64_t owner_id);
 int rte_eth_dev_owner_get(const uint16_t port_id,
 		struct rte_eth_dev_owner *owner);
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Set the role of the process to active or standby,
+ * affecting network traffic handling.
+ *
+ * If one device does not support this operation or fails,
+ * the whole operation fails and is rolled back.
+ *
+ * It is forbidden to have multiple processes with the same role
+ * unless only one of them is configured to handle the traffic.
+ *
+ * The application is active by default.
+ * The configuration from the active process is effective immediately
+ * while the configuration from the standby process is queued by hardware.
+ * When configuring the device from a standby process,
+ * it has no effect except in the following situations:
+ *   - traffic not handled by the active process configuration
+ *   - no active process
+ *
+ * When a process is changed from a standby to an active role,
+ * all preceding configurations that are queued by hardware
+ * should become effective immediately.
+ * Before role transition, all the traffic handling configurations
+ * set by the active process should be flushed first.
+ *
+ * In summary, the operations are expected to happen in this order
+ * in "old" and "new" applications:
+ *   device: already configured by the old application
+ *   new:    start as active
+ *   new:    probe the same device
+ *   new:    set as standby
+ *   new:    configure the device
+ *   device: has configurations from old and new applications
+ *   old:    clear its device configuration
+ *   device: has only 1 configuration from new application
+ *   new:    set as active
+ *   device: downtime for connecting all to the new application
+ *   old:    shutdown
+ *
+ * @param standby
+ *   Role active if false, standby if true.
+ * @param flags
+ *   Role specific flags.
+ * @return
+ *   Positive value on success, -rte_errno value on error:
+ *   - (> 0) Number of switched devices.
+ *   - (-ENOTSUP) if not supported by a device.
+ *   - (-EPERM) if operation failed with a device.
+ */
+__rte_experimental
+int rte_eth_process_set_role(bool standby, uint32_t flags);
+
 /**
  * Get the number of ports which are usable for the application.
  *
diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
index 17201fbe0f..d5d3ea5421 100644
--- a/lib/ethdev/version.map
+++ b/lib/ethdev/version.map
@@ -298,6 +298,9 @@ EXPERIMENTAL {
 	rte_flow_get_q_aged_flows;
 	rte_mtr_meter_policy_get;
 	rte_mtr_meter_profile_get;
+
+	# added in 23.03
+	rte_eth_process_set_role;
 };
 
 INTERNAL {
-- 
2.27.0
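
For readers following the all-or-nothing semantics of rte_eth_process_set_role(), here is a minimal standalone model of the same two-pass logic: check support on every port first, then apply, and roll back the ports already switched if one fails. It is only a sketch and does not use DPDK: struct mock_port, mock_set_role() and mock_process_set_role() are hypothetical names invented for illustration, and the supports_role field merely stands in for the RTE_ETH_DEV_CAPA_PROCESS_ROLE capability check.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an ethdev port; not a DPDK type. */
struct mock_port {
	bool supports_role; /* models the RTE_ETH_DEV_CAPA_PROCESS_ROLE check */
	bool fail_on_set;   /* forces the mock driver callback to fail */
	bool standby;       /* role currently applied to this port */
};

/* Models a driver's process_set_role callback (assumption, not PMD code). */
static int
mock_set_role(struct mock_port *p, bool standby, uint32_t flags)
{
	(void)flags; /* flags are passed through but unused in this model */
	if (p->fail_on_set)
		return -EIO;
	p->standby = standby;
	return 0;
}

/*
 * All-or-nothing role switch over n ports.
 * Returns the number of switched ports, -ENOTSUP if a port lacks
 * support, or -EPERM if a callback failed (after rolling back).
 */
static int
mock_process_set_role(struct mock_port *ports, size_t n,
		      bool standby, uint32_t flags)
{
	size_t i, done = 0;

	/* Pass 1: refuse before touching anything if a port lacks support. */
	for (i = 0; i < n; i++)
		if (!ports[i].supports_role)
			return -ENOTSUP;

	/* Pass 2: apply the role, stopping at the first failure. */
	for (i = 0; i < n; i++) {
		if (mock_set_role(&ports[i], standby, flags) < 0)
			break;
		done++;
	}
	if (done == n)
		return (int)n;

	/* Rollback: restore the opposite role on the ports already switched. */
	for (i = 0; i < done; i++)
		(void)mock_set_role(&ports[i], !standby, flags);
	return -EPERM;
}
```

As in the patch, the rollback assumes each switched port previously held the opposite role. In the scenario from the commit message, the new application would make the equivalent of the call with standby set to true before configuring the device, and the call with standby set to false once the old application has cleared its configuration.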