From mboxrd@z Thu Jan 1 00:00:00 1970
From: Kevin Traynor
To: Wathsala Vithanage
Cc: Ola Liljedahl, dpdk stable <stable@dpdk.org>
Subject: patch 'mcslock: fix memory ordering' has been queued to stable release 24.11.4
Date: Fri, 21 Nov 2025 11:20:48 +0000
Message-ID: <20251121112128.485623-64-ktraynor@redhat.com>
In-Reply-To: <20251121112128.485623-1-ktraynor@redhat.com>
References: <20251121112128.485623-1-ktraynor@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Hi,

FYI, your patch has been queued to stable release 24.11.4

Note it hasn't been pushed to http://dpdk.org/browse/dpdk-stable yet.
It will be pushed if I get no objections before 11/26/25. So please
shout if anyone has objections.

Also note that after the patch there's a diff of the upstream commit vs the
patch applied to the branch. This will indicate if there was any rebasing
needed to apply to the stable branch.
If there were code changes for rebasing (i.e. not only metadata diffs),
please double-check that the rebase was correctly done.

Queued patches are on a temporary branch at:
https://github.com/kevintraynor/dpdk-stable

This queued commit can be viewed at:
https://github.com/kevintraynor/dpdk-stable/commit/c9e3afaa776e94a0d8671bc547806fdc1e657269

Thanks.

Kevin

---
>From c9e3afaa776e94a0d8671bc547806fdc1e657269 Mon Sep 17 00:00:00 2001
From: Wathsala Vithanage
Date: Tue, 11 Nov 2025 17:38:50 +0000
Subject: [PATCH] mcslock: fix memory ordering
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

[ upstream commit 8357af1cb3a359810bd56eab78ed104495c8094f ]

Fix incorrect memory ordering in the MCS lock implementation by adding
proper synchronizing edges to establish clear happens-before
relationships between threads invoking lock() and unlock(). These
synchronizing edges prevent potential deadlocks caused by improper
ordering and are documented in detail through in-code comments.

The previously relaxed load of the successor’s lock object pointer in
unlock() has been upgraded to a load-acquire operation. This change
ensures that the successor’s initialization does not overwrite the
current lock holder’s update to the locked field, which could otherwise
lead to deadlocks.

Remove two unnecessary fences:

The acquire fence in unlock() had no matching release fence, making it
ineffective for enforcing memory order. The associated comment suggested
it prevented speculative reordering, but such fences (data memory
barriers) only establish memory ordering and do not control instruction
speculation.

The release-acquire fence pair in lock() was previously justified as
preventing reordering between the load-acquire loop of me->locked and
the store-release of prev->next. This is no longer needed, as the new
synchronizing edges ensure a chain of happens-before relationships
between memory operations of threads calling lock() and unlock().

Fixes: 2173f3333b61d ("mcslock: add MCS queued lock implementation")

Signed-off-by: Wathsala Vithanage
Reviewed-by: Ola Liljedahl
---
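[ Reviewer aid, not part of the upstream commit: the synchronizing edges the
patch documents (A0/R0, R1, A1, A2, R2, A3, R3) can be sketched as a
standalone C11 model. This is an illustration under simplifying assumptions:
mcs_node, mcs_lock and mcs_unlock are made-up names, plain spin loops stand
in for the RTE wait primitives, and likely()/trylock are omitted. ]

#include <stdatomic.h>
#include <stddef.h>

typedef struct mcs_node {
	_Atomic(struct mcs_node *) next;
	atomic_uint locked;
} mcs_node;

static void
mcs_lock(_Atomic(mcs_node *) *tail, mcs_node *me)
{
	atomic_store_explicit(&me->locked, 1, memory_order_relaxed);
	atomic_store_explicit(&me->next, NULL, memory_order_relaxed);

	/* A0/R0: acquire pairs with R0/R2, release publishes our init. */
	mcs_node *prev = atomic_exchange_explicit(tail, me, memory_order_acq_rel);
	if (prev == NULL)
		return; /* queue was empty: lock taken */

	/* R1: link in; pairs with the predecessor's A2/A3 acquire loads. */
	atomic_store_explicit(&prev->next, me, memory_order_release);

	/* A1: pairs with R3; makes the holder's writes visible to us. */
	while (atomic_load_explicit(&me->locked, memory_order_acquire) != 0)
		; /* spin */
}

static void
mcs_unlock(_Atomic(mcs_node *) *tail, mcs_node *me)
{
	/* A2: pairs with R1 so the successor's init is complete. */
	if (atomic_load_explicit(&me->next, memory_order_acquire) == NULL) {
		mcs_node *expected = me;

		/* R2: pairs with A0; publishes critical-section writes. */
		if (atomic_compare_exchange_strong_explicit(tail, &expected,
				NULL, memory_order_release, memory_order_relaxed))
			return; /* no successor: lock released */

		/* A3: pairs with R1; wait for the successor to link in. */
		while (atomic_load_explicit(&me->next, memory_order_acquire) == NULL)
			; /* spin */
	}

	/* R3: pairs with A1; hand the lock to the successor. */
	mcs_node *succ = atomic_load_explicit(&me->next, memory_order_relaxed);
	atomic_store_explicit(&succ->locked, 0, memory_order_release);
}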
 lib/eal/include/rte_mcslock.h | 100 ++++++++++++++++++++++------------
 1 file changed, 64 insertions(+), 36 deletions(-)

diff --git a/lib/eal/include/rte_mcslock.h b/lib/eal/include/rte_mcslock.h
index bb218d2e50..0af7a94a06 100644
--- a/lib/eal/include/rte_mcslock.h
+++ b/lib/eal/include/rte_mcslock.h
@@ -58,9 +58,19 @@ rte_mcslock_lock(RTE_ATOMIC(rte_mcslock_t *) *msl, rte_mcslock_t *me)
 	rte_atomic_store_explicit(&me->next, NULL, rte_memory_order_relaxed);

-	/* If the queue is empty, the exchange operation is enough to acquire
-	 * the lock. Hence, the exchange operation requires acquire semantics.
-	 * The store to me->next above should complete before the node is
-	 * visible to other CPUs/threads. Hence, the exchange operation requires
-	 * release semantics as well.
+	/*
+	 * A0/R0: The queue might be empty, so perform the exchange (RMW) with both
+	 * acquire and release semantics:
+	 * A0: Acquire — synchronizes with both R0 and R2.
+	 * Must synchronize with R0 to ensure that this thread observes the predecessor's
+	 * initialization of its lock object; otherwise the predecessor could overwrite
+	 * this thread's update to the next field of that object via its store to prev->next.
+	 *
+	 * Must synchronize with R2, the releasing CAS in unlock(); this ensures
+	 * that all prior critical-section writes become visible to this thread.
+	 *
+	 * R0: Release — ensures the successor observes our initialization of me->next;
+	 * without it, me->next could be overwritten to NULL after the successor
+	 * sets its own address, causing deadlock. This release synchronizes with
+	 * A0 above.
 	 */
 	prev = rte_atomic_exchange_explicit(msl, me, rte_memory_order_acq_rel);
@@ -71,22 +81,24 @@
 		return;
 	}
-	/* The store to me->next above should also complete before the node is
-	 * visible to predecessor thread releasing the lock. Hence, the store
-	 * prev->next also requires release semantics. Note that, for example,
-	 * on ARM, the release semantics in the exchange operation is not
-	 * strong as a release fence and is not sufficient to enforce the
-	 * desired order here.
+
+	/*
+	 * R1: With the relaxed memory model of C/C++, it's essential that after
+	 * we link ourselves by storing prev->next = me, the owner of prev must
+	 * observe our prior initialization of me->locked. Otherwise it could
+	 * clear me->locked before we set it to 1, which may deadlock.
+	 * Perform a releasing store so the predecessor's acquire loads A2 and A3
+	 * observe our initialization, establishing a happens-before from those
+	 * writes.
 	 */
 	rte_atomic_store_explicit(&prev->next, me, rte_memory_order_release);

-	/* The while-load of me->locked should not move above the previous
-	 * store to prev->next. Otherwise it will cause a deadlock. Need a
-	 * store-load barrier.
-	 */
-	rte_atomic_thread_fence(rte_memory_order_acq_rel);
-	/* If the lock has already been acquired, it first atomically
+	/*
+	 * A1: If the lock has already been acquired, it first atomically
 	 * places the node at the end of the queue and then proceeds
 	 * to spin on me->locked until the previous lock holder resets
-	 * the me->locked using mcslock_unlock().
+	 * the me->locked in rte_mcslock_unlock().
+	 * This load must synchronize with the store-release R3 to ensure that
+	 * all updates to the critical section by the previous lock holder are
+	 * visible to this thread after acquiring the lock.
 	 */
 	rte_wait_until_equal_32((uint32_t *)(uintptr_t)&me->locked, 0,
 			rte_memory_order_acquire);
@@ -104,28 +116,44 @@ static inline void
 rte_mcslock_unlock(RTE_ATOMIC(rte_mcslock_t *) *msl, RTE_ATOMIC(rte_mcslock_t *) me)
 {
-	/* Check if there are more nodes in the queue. */
-	if (likely(rte_atomic_load_explicit(&me->next, rte_memory_order_relaxed) == NULL)) {
+	/*
+	 * A2: Check whether a successor is already queued.
+	 * Load me->next with acquire semantics so it can synchronize with the
+	 * successor’s release store R1. This guarantees that the successor’s
+	 * initialization of its lock object (me) is completed before we observe
+	 * it here, preventing a race between this thread’s store-release to
+	 * me->next->locked and the successor’s store to me->locked.
+	 */
+	if (likely(rte_atomic_load_explicit(&me->next, rte_memory_order_acquire) == NULL)) {
 		/* No, last member in the queue. */
-		rte_mcslock_t *save_me = rte_atomic_load_explicit(&me, rte_memory_order_relaxed);
+		rte_mcslock_t *save_me = me;

-		/* Release the lock by setting it to NULL */
+		/*
+		 * R2: Try to release the lock by swinging *msl from save_me to NULL.
+		 * Use release semantics so all critical-section writes become
+		 * visible to the next lock acquirer.
+		 */
 		if (likely(rte_atomic_compare_exchange_strong_explicit(msl, &save_me, NULL,
 				rte_memory_order_release, rte_memory_order_relaxed)))
			return;

-		/* Speculative execution would be allowed to read in the
-		 * while-loop first. This has the potential to cause a
-		 * deadlock. Need a load barrier.
-		 */
-		rte_atomic_thread_fence(rte_memory_order_acquire);
-		/* More nodes added to the queue by other CPUs.
-		 * Wait until the next pointer is set.
+		/*
+		 * A3: Another thread was enqueued concurrently, so the CAS and the lock
+		 * release failed. Wait until the successor sets our 'next' pointer.
+		 * This load must synchronize with the successor’s release store (R1) to
+		 * ensure that the successor’s initialization completes before we observe
+		 * it here. This ordering prevents a race between this thread’s later
+		 * store-release to me->next->locked and the successor’s store to me->locked.
 		 */
 		RTE_ATOMIC(uintptr_t) *next;
 		next = (__rte_atomic uintptr_t *)&me->next;
-		RTE_WAIT_UNTIL_MASKED(next, UINTPTR_MAX, !=, 0, rte_memory_order_relaxed);
+		RTE_WAIT_UNTIL_MASKED(next, UINTPTR_MAX, !=, 0, rte_memory_order_acquire);
 	}
-	/* Pass lock to next waiter. */
+	/*
+	 * R3: Pass the lock to the successor.
+	 * Use a release store to synchronize with A1 when clearing me->next->locked
+	 * so the successor observes our critical-section writes after it sees locked
+	 * become 0.
+	 */
 	rte_atomic_store_explicit(&me->next->locked, 0, rte_memory_order_release);
 }
@@ -150,9 +178,9 @@ rte_mcslock_trylock(RTE_ATOMIC(rte_mcslock_t *) *msl, rte_mcslock_t *me)
 	rte_mcslock_t *expected = NULL;

-	/* The lock can be taken only when the queue is empty. Hence,
-	 * the compare-exchange operation requires acquire semantics.
-	 * The store to me->next above should complete before the node
-	 * is visible to other CPUs/threads. Hence, the compare-exchange
-	 * operation requires release semantics as well.
+	/*
+	 * A4/R4: The lock can be acquired only when the queue is empty.
+	 * The compare-and-exchange operation must use acquire and release
+	 * semantics for the same reasons described in the rte_mcslock_lock()
+	 * function’s empty-queue case (see A0/R0 for details).
 	 */
 	return rte_atomic_compare_exchange_strong_explicit(msl, &expected, me,
-- 
2.51.0

---
  Diff of the applied patch vs upstream commit (please double-check if non-empty):
---
--- -	2025-11-21 11:05:11.588297999 +0000
+++ 0064-mcslock-fix-memory-ordering.patch	2025-11-21 11:05:09.540201498 +0000
@@ -1 +1 @@
-From 8357af1cb3a359810bd56eab78ed104495c8094f Mon Sep 17 00:00:00 2001
+From c9e3afaa776e94a0d8671bc547806fdc1e657269 Mon Sep 17 00:00:00 2001
@@ -8,0 +9,2 @@
+[ upstream commit 8357af1cb3a359810bd56eab78ed104495c8094f ]
+
@@ -37 +38,0 @@
-Cc: stable@dpdk.org
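[ Reviewer aid, not part of the patch: a minimal usage sketch of the affected
API. do_work and ml_tail are made-up names; EAL/thread setup and error
handling are omitted. The key rule is that each locking thread passes its own
rte_mcslock_t node, which must stay valid until the matching unlock returns. ]

#include <rte_mcslock.h>

/* Shared queue tail; NULL means the lock is free. */
static RTE_ATOMIC(rte_mcslock_t *) ml_tail;

static void
do_work(void)
{
	rte_mcslock_t node; /* this thread's queue node */

	rte_mcslock_lock(&ml_tail, &node);
	/* ... critical section ... */
	rte_mcslock_unlock(&ml_tail, &node);
}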