From: Artemy Kovalyov
CC: Thomas Monjalon, Ophir Munk, Anatoly Burakov, Stephen Hemminger, Morten Brørup
Subject: [PATCH] eal: fix memory initialization deadlock
Date: Wed, 30 Aug 2023 13:33:03 +0300
Message-ID: <20230830103303.2428995-1-artemyko@nvidia.com>
X-Mailer: git-send-email 2.25.1
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain
List-Id: DPDK patches and discussions

The issue arose from a change in the DPDK read-write lock implementation.
That change added a new flag, RTE_RWLOCK_WAIT, designed to prevent new
read locks while a write lock is queued. As a side effect, a recursive
read lock, where the same execution thread acquires the lock twice, can
start a sequence of events that ends in a deadlock:

Process 1 takes the first read lock.

Process 2 attempts to take a write lock. Because a read lock is held, it
sets RTE_RWLOCK_WAIT and enters a wait loop until the read lock is
released.

Process 1 tries to take a second read lock. Because a write lock is
waiting (RTE_RWLOCK_WAIT is set), it also enters a wait loop until the
write lock is acquired and then released.

Both processes end up blocked, each waiting for the other, so neither
can proceed.

Following that change, the RW-lock no longer supports recursion: a
single thread must not take a read lock it already holds.

The problem arises during initialization: rte_eal_init() acquires
memory_hotplug_lock, and later the call sequence
rte_eal_memory_init() -> eal_memalloc_init() -> rte_memseg_list_walk()
acquires it again without releasing it. This can deadlock when a write
lock is requested concurrently on the same memory_hotplug_lock.

To address this, rte_memseg_list_walk() is replaced with
rte_memseg_list_walk_thread_unsafe().
Bugzilla ID: 1277
Fixes: 832cecc03d77 ("rwlock: prevent readers from starving writers")
Cc: stable@dpdk.org

Signed-off-by: Artemy Kovalyov
---
 lib/eal/include/generic/rte_rwlock.h | 4 ++++
 lib/eal/linux/eal_memalloc.c         | 7 +++++--
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/lib/eal/include/generic/rte_rwlock.h b/lib/eal/include/generic/rte_rwlock.h
index 9e083bbc61..c98fc7d083 100644
--- a/lib/eal/include/generic/rte_rwlock.h
+++ b/lib/eal/include/generic/rte_rwlock.h
@@ -80,6 +80,10 @@ rte_rwlock_init(rte_rwlock_t *rwl)
 /**
  * Take a read lock. Loop until the lock is held.
  *
+ * @note The RW lock isn't recursive, so calling this function on the same
+ * lock twice without releasing it could potentially result in a deadlock
+ * scenario when a write lock is involved.
+ *
  * @param rwl
  *   A pointer to a rwlock structure.
  */
diff --git a/lib/eal/linux/eal_memalloc.c b/lib/eal/linux/eal_memalloc.c
index f8b1588cae..3705b41f5f 100644
--- a/lib/eal/linux/eal_memalloc.c
+++ b/lib/eal/linux/eal_memalloc.c
@@ -1740,7 +1740,10 @@ eal_memalloc_init(void)
 		eal_get_internal_configuration();
 
 	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
-		if (rte_memseg_list_walk(secondary_msl_create_walk, NULL) < 0)
+		/* memory_hotplug_lock is taken in rte_eal_init(), so it's
+		 * safe to call thread-unsafe version.
+		 */
+		if (rte_memseg_list_walk_thread_unsafe(secondary_msl_create_walk, NULL) < 0)
 			return -1;
 	if (rte_eal_process_type() == RTE_PROC_PRIMARY &&
 			internal_conf->in_memory) {
@@ -1778,7 +1781,7 @@ eal_memalloc_init(void)
 	}
 
 	/* initialize all of the fd lists */
-	if (rte_memseg_list_walk(fd_list_create_walk, NULL))
+	if (rte_memseg_list_walk_thread_unsafe(fd_list_create_walk, NULL))
 		return -1;
 	return 0;
 }
-- 
2.25.1