From mboxrd@z Thu Jan 1 00:00:00 1970
From: Xueming Li
To: Dariusz Sosnowski
CC: Ori Kam, Bing Zhao, dpdk stable
Subject: patch 'net/mlx5: fix flow counter cache starvation' has been queued
 to stable release 23.11.1
Date: Sat, 13 Apr 2024 20:48:59 +0800
Message-ID: <20240413125005.725659-59-xuemingl@nvidia.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240413125005.725659-1-xuemingl@nvidia.com>
References: <20240305094757.439387-1-xuemingl@nvidia.com>
 <20240413125005.725659-1-xuemingl@nvidia.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain
List-Id: patches for DPDK stable branches
Errors-To: stable-bounces@dpdk.org

Hi,

FYI, your patch has been queued to stable release 23.11.1

Note it hasn't been pushed to
http://dpdk.org/browse/dpdk-stable yet.
It will be pushed if I get no objections before 04/15/24. So please
shout if anyone has objections.

Also note that after the patch there's a diff of the upstream commit vs the
patch applied to the branch. This will indicate if there was any rebasing
needed to apply to the stable branch. If there were code changes for rebasing
(ie: not only metadata diffs), please double check that the rebase was
correctly done.

Queued patches are on a temporary branch at:
	https://git.dpdk.org/dpdk-stable/log/?h=23.11-staging

This queued commit can be viewed at:
	https://git.dpdk.org/dpdk-stable/commit/?h=23.11-staging&id=091234f3cb541f513fb086fb511cb0ce390e7d82

Thanks.

Xueming Li

---
>From 091234f3cb541f513fb086fb511cb0ce390e7d82 Mon Sep 17 00:00:00 2001
From: Dariusz Sosnowski
Date: Wed, 28 Feb 2024 20:06:06 +0100
Subject: [PATCH] net/mlx5: fix flow counter cache starvation
Cc: Xueming Li

[ upstream commit d755221b77c29549be4025e547cf907ad3a0abcf ]

The mlx5 PMD maintains a global counter pool and per-queue counter caches,
which are used to allocate COUNT flow action objects.
Whenever an empty cache is accessed, it is replenished
with a pre-defined number of counters.

If the number of configured counters was sufficiently small, it could
happen that the caches associated with some queues were starved, because
all counters had been fetched into the caches of other queues.

This patch fixes that by disabling the cache at runtime if the number
of configured counters is not sufficient to avoid such starvation.
Fixes: 4d368e1da3a4 ("net/mlx5: support flow counter action for HWS")

Signed-off-by: Dariusz Sosnowski
Acked-by: Ori Kam
Acked-by: Bing Zhao
---
 drivers/net/mlx5/mlx5_flow_hw.c |  6 +--
 drivers/net/mlx5/mlx5_hws_cnt.c | 72 ++++++++++++++++++++++++---------
 drivers/net/mlx5/mlx5_hws_cnt.h | 25 +++++++++---
 3 files changed, 74 insertions(+), 29 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow_hw.c b/drivers/net/mlx5/mlx5_flow_hw.c
index 93035c8548..7cef9bd3ff 100644
--- a/drivers/net/mlx5/mlx5_flow_hw.c
+++ b/drivers/net/mlx5/mlx5_flow_hw.c
@@ -3123,8 +3123,7 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
 				break;
 			/* Fall-through. */
 		case RTE_FLOW_ACTION_TYPE_COUNT:
-			/* If the port is engaged in resource sharing, do not use queue cache. */
-			cnt_queue = mlx5_hws_cnt_is_pool_shared(priv) ? NULL : &queue;
+			cnt_queue = mlx5_hws_cnt_get_queue(priv, &queue);
 			ret = mlx5_hws_cnt_pool_get(priv->hws_cpool, cnt_queue, &cnt_id, age_idx);
 			if (ret != 0)
 				return ret;
@@ -3722,8 +3721,7 @@ flow_hw_age_count_release(struct mlx5_priv *priv, uint32_t queue,
 		}
 		return;
 	}
-	/* If the port is engaged in resource sharing, do not use queue cache. */
-	cnt_queue = mlx5_hws_cnt_is_pool_shared(priv) ? NULL : &queue;
+	cnt_queue = mlx5_hws_cnt_get_queue(priv, &queue);
 	/* Put the counter first to reduce the race risk in BG thread. */
 	mlx5_hws_cnt_pool_put(priv->hws_cpool, cnt_queue, &flow->cnt_id);
 	flow->cnt_id = 0;
diff --git a/drivers/net/mlx5/mlx5_hws_cnt.c b/drivers/net/mlx5/mlx5_hws_cnt.c
index a3bea94811..c31f2f380b 100644
--- a/drivers/net/mlx5/mlx5_hws_cnt.c
+++ b/drivers/net/mlx5/mlx5_hws_cnt.c
@@ -340,6 +340,55 @@ mlx5_hws_cnt_pool_deinit(struct mlx5_hws_cnt_pool * const cntp)
 	mlx5_free(cntp);
 }
 
+static bool
+mlx5_hws_cnt_should_enable_cache(const struct mlx5_hws_cnt_pool_cfg *pcfg,
+				 const struct mlx5_hws_cache_param *ccfg)
+{
+	/*
+	 * Enable cache if and only if there are enough counters requested
+	 * to populate all of the caches.
+	 */
+	return pcfg->request_num >= ccfg->q_num * ccfg->size;
+}
+
+static struct mlx5_hws_cnt_pool_caches *
+mlx5_hws_cnt_cache_init(const struct mlx5_hws_cnt_pool_cfg *pcfg,
+			const struct mlx5_hws_cache_param *ccfg)
+{
+	struct mlx5_hws_cnt_pool_caches *cache;
+	char mz_name[RTE_MEMZONE_NAMESIZE];
+	uint32_t qidx;
+
+	/* If counter pool is big enough, setup the counter pool cache. */
+	cache = mlx5_malloc(MLX5_MEM_ANY | MLX5_MEM_ZERO,
+			sizeof(*cache) +
+			sizeof(((struct mlx5_hws_cnt_pool_caches *)0)->qcache[0])
+				* ccfg->q_num, 0, SOCKET_ID_ANY);
+	if (cache == NULL)
+		return NULL;
+	/* Store the necessary cache parameters. */
+	cache->fetch_sz = ccfg->fetch_sz;
+	cache->preload_sz = ccfg->preload_sz;
+	cache->threshold = ccfg->threshold;
+	cache->q_num = ccfg->q_num;
+	for (qidx = 0; qidx < ccfg->q_num; qidx++) {
+		snprintf(mz_name, sizeof(mz_name), "%s_qc/%x", pcfg->name, qidx);
+		cache->qcache[qidx] = rte_ring_create(mz_name, ccfg->size,
+				SOCKET_ID_ANY,
+				RING_F_SP_ENQ | RING_F_SC_DEQ |
+				RING_F_EXACT_SZ);
+		if (cache->qcache[qidx] == NULL)
+			goto error;
+	}
+	return cache;
+
+error:
+	while (qidx--)
+		rte_ring_free(cache->qcache[qidx]);
+	mlx5_free(cache);
+	return NULL;
+}
+
 static struct mlx5_hws_cnt_pool *
 mlx5_hws_cnt_pool_init(struct mlx5_dev_ctx_shared *sh,
 		       const struct mlx5_hws_cnt_pool_cfg *pcfg,
@@ -348,7 +397,6 @@ mlx5_hws_cnt_pool_init(struct mlx5_dev_ctx_shared *sh,
 	char mz_name[RTE_MEMZONE_NAMESIZE];
 	struct mlx5_hws_cnt_pool *cntp;
 	uint64_t cnt_num = 0;
-	uint32_t qidx;
 
 	MLX5_ASSERT(pcfg);
 	MLX5_ASSERT(ccfg);
@@ -360,17 +408,6 @@ mlx5_hws_cnt_pool_init(struct mlx5_dev_ctx_shared *sh,
 	cntp->cfg = *pcfg;
 	if (cntp->cfg.host_cpool)
 		return cntp;
-	cntp->cache = mlx5_malloc(MLX5_MEM_ANY | MLX5_MEM_ZERO,
-			sizeof(*cntp->cache) +
-			sizeof(((struct mlx5_hws_cnt_pool_caches *)0)->qcache[0])
-				* ccfg->q_num, 0, SOCKET_ID_ANY);
-	if (cntp->cache == NULL)
-		goto error;
-	/* store the necessary cache parameters. */
-	cntp->cache->fetch_sz = ccfg->fetch_sz;
-	cntp->cache->preload_sz = ccfg->preload_sz;
-	cntp->cache->threshold = ccfg->threshold;
-	cntp->cache->q_num = ccfg->q_num;
 	if (pcfg->request_num > sh->hws_max_nb_counters) {
 		DRV_LOG(ERR, "Counter number %u "
 			"is greater than the maximum supported (%u).",
@@ -418,13 +455,10 @@ mlx5_hws_cnt_pool_init(struct mlx5_dev_ctx_shared *sh,
 		DRV_LOG(ERR, "failed to create reuse list ring");
 		goto error;
 	}
-	for (qidx = 0; qidx < ccfg->q_num; qidx++) {
-		snprintf(mz_name, sizeof(mz_name), "%s_qc/%x", pcfg->name, qidx);
-		cntp->cache->qcache[qidx] = rte_ring_create(mz_name, ccfg->size,
-				SOCKET_ID_ANY,
-				RING_F_SP_ENQ | RING_F_SC_DEQ |
-				RING_F_EXACT_SZ);
-		if (cntp->cache->qcache[qidx] == NULL)
+	/* Allocate counter cache only if needed. */
+	if (mlx5_hws_cnt_should_enable_cache(pcfg, ccfg)) {
+		cntp->cache = mlx5_hws_cnt_cache_init(pcfg, ccfg);
+		if (cntp->cache == NULL)
 			goto error;
 	}
 	/* Initialize the time for aging-out calculation. */
diff --git a/drivers/net/mlx5/mlx5_hws_cnt.h b/drivers/net/mlx5/mlx5_hws_cnt.h
index 585b5a83ad..e00596088f 100644
--- a/drivers/net/mlx5/mlx5_hws_cnt.h
+++ b/drivers/net/mlx5/mlx5_hws_cnt.h
@@ -557,19 +557,32 @@ mlx5_hws_cnt_pool_get(struct mlx5_hws_cnt_pool *cpool, uint32_t *queue,
 }
 
 /**
- * Check if counter pool allocated for HWS is shared between ports.
+ * Decide if the given queue can be used to perform counter allocation/deallcation
+ * based on counter configuration
  *
  * @param[in] priv
  *   Pointer to the port private data structure.
+ * @param[in] queue
+ *   Pointer to the queue index.
  *
  * @return
- *   True if counter pools is shared between ports. False otherwise.
+ *   @p queue if cache related to the queue can be used. NULL otherwise.
  */
-static __rte_always_inline bool
-mlx5_hws_cnt_is_pool_shared(struct mlx5_priv *priv)
+static __rte_always_inline uint32_t *
+mlx5_hws_cnt_get_queue(struct mlx5_priv *priv, uint32_t *queue)
 {
-	return priv && priv->hws_cpool &&
-	    (priv->shared_refcnt || priv->hws_cpool->cfg.host_cpool != NULL);
+	if (priv && priv->hws_cpool) {
+		/* Do not use queue cache if counter pool is shared. */
+		if (priv->shared_refcnt || priv->hws_cpool->cfg.host_cpool != NULL)
+			return NULL;
+		/* Do not use queue cache if counter cache is disabled. */
+		if (priv->hws_cpool->cache == NULL)
+			return NULL;
+		return queue;
+	}
+	/* This case should not be reached if counter pool was successfully configured. */
+	MLX5_ASSERT(false);
+	return NULL;
+}
 
 static __rte_always_inline unsigned int
-- 
2.34.1

---
  Diff of the applied patch vs upstream commit (please double-check if non-empty:
---
--- -	2024-04-13 20:43:06.868225888 +0800
+++ 0059-net-mlx5-fix-flow-counter-cache-starvation.patch	2024-04-13 20:43:04.997753931 +0800
@@ -1 +1 @@
-From d755221b77c29549be4025e547cf907ad3a0abcf Mon Sep 17 00:00:00 2001
+From 091234f3cb541f513fb086fb511cb0ce390e7d82 Mon Sep 17 00:00:00 2001
@@ -4,0 +5,3 @@
+Cc: Xueming Li
+
+[ upstream commit d755221b77c29549be4025e547cf907ad3a0abcf ]
@@ -20 +22,0 @@
-Cc: stable@dpdk.org
@@ -32 +34 @@
-index 9620b7f576..c1dbdc5f19 100644
+index 93035c8548..7cef9bd3ff 100644
@@ -35 +37 @@
-@@ -3131,8 +3131,7 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
+@@ -3123,8 +3123,7 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
@@ -45 +47 @@
-@@ -3776,8 +3775,7 @@ flow_hw_age_count_release(struct mlx5_priv *priv, uint32_t queue,
+@@ -3722,8 +3721,7 @@ flow_hw_age_count_release(struct mlx5_priv *priv, uint32_t queue,