From: Xueming Li
To: Artemy Kovalyov
CC: Dmitry Kozlyuk, Jonathan Erb, dpdk stable
Subject: patch 'mem: fix deadlock with multiprocess' has been queued to stable release 22.11.4
Date: Sun, 22 Oct 2023 22:20:51 +0800
Message-ID: <20231022142250.10324-23-xuemingl@nvidia.com>
In-Reply-To: <20231022142250.10324-1-xuemingl@nvidia.com>
References: <20231022142250.10324-1-xuemingl@nvidia.com>
List-Id: patches for DPDK stable branches

Hi,

FYI, your patch has been queued to stable release 22.11.4

Note it hasn't been pushed to http://dpdk.org/browse/dpdk-stable yet.
It will be pushed if I get no objections before 11/15/23. So please
shout if anyone has objections.

Also note that after the patch there's a diff of the upstream commit vs the
patch applied to the branch. This will indicate if there was any rebasing
needed to apply to the stable branch.
If there were code changes for rebasing (ie: not only metadata diffs), please
double check that the rebase was correctly done.

Queued patches are on a temporary branch at:
https://git.dpdk.org/dpdk-stable/log/?h=22.11-staging

This queued commit can be viewed at:
https://git.dpdk.org/dpdk-stable/commit/?h=22.11-staging&id=421c47495c84a17da80e90777da557315f946c2d

Thanks.

Xueming Li

---
From 421c47495c84a17da80e90777da557315f946c2d Mon Sep 17 00:00:00 2001
From: Artemy Kovalyov
Date: Fri, 8 Sep 2023 16:17:35 +0300
Subject: [PATCH] mem: fix deadlock with multiprocess
Cc: Xueming Li

[ upstream commit f82c02d3791ca1cd4ea1baa549a7d25dc2aef767 ]

The issue arose due to changes in the DPDK read-write lock
implementation. Following these changes, the RW-lock no longer supports
recursion, implying that a single thread shouldn't obtain a read lock if
it already possesses one. The problem arises during initialization: the
rte_eal_init() function acquires the memory_hotplug_lock, and later on,
there are sequences of calls that acquire it again without releasing it.
* rte_eal_memory_init() -> eal_memalloc_init() -> rte_memseg_list_walk()
* rte_eal_memory_init() -> rte_eal_hugepage_init() ->
  eal_dynmem_hugepage_init() -> rte_memseg_list_walk()
This scenario introduces the risk of a potential deadlock when concurrent
write locks are applied to the same memory_hotplug_lock. To address this
locally, we resolved the issue by replacing rte_memseg_list_walk() with
rte_memseg_list_walk_thread_unsafe().

Bugzilla ID: 1277
Fixes: 832cecc03d77 ("rwlock: prevent readers from starving writers")

Signed-off-by: Artemy Kovalyov
Acked-by: Dmitry Kozlyuk
Tested-by: Jonathan Erb
---
 .mailmap                             | 3 ++-
 lib/eal/common/eal_common_dynmem.c   | 5 ++++-
 lib/eal/include/generic/rte_rwlock.h | 4 ++++
 lib/eal/linux/eal_memalloc.c         | 7 +++++--
 4 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/.mailmap b/.mailmap
index e2855454c2..361e2dac4f 100644
--- a/.mailmap
+++ b/.mailmap
@@ -120,6 +120,7 @@ Arkadiusz Kubalewski
 Arkadiusz Kusztal
 Arnon Warshavsky
 Arshdeep Kaur
+Artemy Kovalyov
 Artem V. Andreev
 Artur Rojek
 Artur Trybula
@@ -649,7 +650,7 @@ John Ousterhout
 John Romein
 John W. Linville
 Jonas Pfefferle
-Jonathan Erb
+Jonathan Erb
 Jon DeVree
 Jon Loeliger
 Joongi Kim
diff --git a/lib/eal/common/eal_common_dynmem.c b/lib/eal/common/eal_common_dynmem.c
index bdbbe233a0..95da55d9b0 100644
--- a/lib/eal/common/eal_common_dynmem.c
+++ b/lib/eal/common/eal_common_dynmem.c
@@ -251,7 +251,10 @@ eal_dynmem_hugepage_init(void)
 		 */
 		memset(&dummy, 0, sizeof(dummy));
 		dummy.hugepage_sz = hpi->hugepage_sz;
-		if (rte_memseg_list_walk(hugepage_count_walk, &dummy) < 0)
+		/* memory_hotplug_lock is held during initialization, so it's
+		 * safe to call thread-unsafe version.
+		 */
+		if (rte_memseg_list_walk_thread_unsafe(hugepage_count_walk, &dummy) < 0)
 			return -1;
 
 		for (i = 0; i < RTE_DIM(dummy.num_pages); i++) {
diff --git a/lib/eal/include/generic/rte_rwlock.h b/lib/eal/include/generic/rte_rwlock.h
index 233d4262be..e479daa867 100644
--- a/lib/eal/include/generic/rte_rwlock.h
+++ b/lib/eal/include/generic/rte_rwlock.h
@@ -79,6 +79,10 @@ rte_rwlock_init(rte_rwlock_t *rwl)
 /**
  * Take a read lock. Loop until the lock is held.
  *
+ * @note The RW lock isn't recursive, so calling this function on the same
+ * lock twice without releasing it could potentially result in a deadlock
+ * scenario when a write lock is involved.
+ *
  * @param rwl
  *   A pointer to a rwlock structure.
  */
diff --git a/lib/eal/linux/eal_memalloc.c b/lib/eal/linux/eal_memalloc.c
index f8b1588cae..9853ec78a2 100644
--- a/lib/eal/linux/eal_memalloc.c
+++ b/lib/eal/linux/eal_memalloc.c
@@ -1740,7 +1740,10 @@ eal_memalloc_init(void)
 		eal_get_internal_configuration();
 
 	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
-		if (rte_memseg_list_walk(secondary_msl_create_walk, NULL) < 0)
+		/* memory_hotplug_lock is held during initialization, so it's
+		 * safe to call thread-unsafe version.
+		 */
+		if (rte_memseg_list_walk_thread_unsafe(secondary_msl_create_walk, NULL) < 0)
 			return -1;
 	if (rte_eal_process_type() == RTE_PROC_PRIMARY &&
 			internal_conf->in_memory) {
@@ -1778,7 +1781,7 @@ eal_memalloc_init(void)
 	}
 
 	/* initialize all of the fd lists */
-	if (rte_memseg_list_walk(fd_list_create_walk, NULL))
+	if (rte_memseg_list_walk_thread_unsafe(fd_list_create_walk, NULL))
 		return -1;
 	return 0;
 }
-- 
2.25.1

---
  Diff of the applied patch vs upstream commit (please double-check if non-empty:
---
--- -	2023-10-22 22:17:35.254253700 +0800
+++ 0022-mem-fix-deadlock-with-multiprocess.patch	2023-10-22 22:17:34.156723700 +0800
@@ -1 +1 @@
-From f82c02d3791ca1cd4ea1baa549a7d25dc2aef767 Mon Sep 17 00:00:00 2001
+From 421c47495c84a17da80e90777da557315f946c2d Mon Sep 17 00:00:00 2001
@@ -4,0 +5,3 @@
+Cc: Xueming Li
+
+[ upstream commit f82c02d3791ca1cd4ea1baa549a7d25dc2aef767 ]
@@ -22 +24,0 @@
-Cc: stable@dpdk.org
@@ -35 +37 @@
-index 755a4da2cd..bcb36bb5a5 100644
+index e2855454c2..361e2dac4f 100644
@@ -38 +40,2 @@
-@@ -126,6 +126,7 @@ Arnaud Fiorini
+@@ -120,6 +120,7 @@ Arkadiusz Kubalewski
+ Arkadiusz Kusztal
@@ -41 +43,0 @@
- Artemii Morozov
@@ -46 +48 @@
-@@ -667,7 +668,7 @@ John Ousterhout
+@@ -649,7 +650,7 @@ John Ousterhout
@@ -72 +74 @@
-index 75f2b75782..5f939be98c 100644
+index 233d4262be..e479daa867 100644
@@ -75 +77 @@
-@@ -81,6 +81,10 @@ rte_rwlock_init(rte_rwlock_t *rwl)
+@@ -79,6 +79,10 @@ rte_rwlock_init(rte_rwlock_t *rwl)
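
Illustration of the failure mode described in the commit message: the
sketch below is a toy, single-threaded model of a writer-preferring RW lock,
showing why a nested read lock can deadlock once a writer is waiting. All
names (toy_rwlock, toy_read_trylock, and so on) are hypothetical and chosen
for this example only; this is not DPDK's rte_rwlock implementation, and the
try-lock calls return false at the point where a real blocking lock would
spin.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical names, not DPDK's rte_rwlock. */
struct toy_rwlock {
	int readers;         /* number of read locks currently held */
	bool writer_pending; /* a writer has announced that it is waiting */
	bool writer_active;  /* a writer currently holds the lock */
};

/* Read attempt; returns false where a blocking lock would keep spinning. */
static inline bool
toy_read_trylock(struct toy_rwlock *l)
{
	/* New readers back off while a writer waits or holds the lock,
	 * which is what keeps readers from starving writers.
	 */
	if (l->writer_pending || l->writer_active)
		return false;
	l->readers++;
	return true;
}

static inline void
toy_read_unlock(struct toy_rwlock *l)
{
	assert(l->readers > 0);
	l->readers--;
}

/* Write attempt; on failure the writer is recorded as pending. */
static inline bool
toy_write_trylock(struct toy_rwlock *l)
{
	if (l->readers > 0 || l->writer_active) {
		l->writer_pending = true;
		return false;
	}
	l->writer_pending = false;
	l->writer_active = true;
	return true;
}

/* Replays the nested-read scenario; true means blocking calls would deadlock. */
static inline bool
toy_nested_read_would_deadlock(void)
{
	struct toy_rwlock l = {0};
	bool outer = toy_read_trylock(&l);   /* thread A takes the read lock */
	bool writer = toy_write_trylock(&l); /* thread B starts waiting to write */
	bool inner = toy_read_trylock(&l);   /* thread A tries a nested read lock */

	/* A waits behind the pending writer, B waits for A's read lock. */
	return outer && !writer && !inner;
}
```

In this model toy_nested_read_would_deadlock() returns true: the nested read
attempt is refused because a writer is pending, while the writer cannot
proceed until the outer read lock is dropped. Skipping the inner lock
acquisition while the outer lock is already held, which is the approach the
patch takes with rte_memseg_list_walk_thread_unsafe(), removes that cycle.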