From: Wathsala Vithanage <wathsala.vithanage@arm.com>
To: luca.boccassi@gmail.com
Cc: Ola Liljedahl <ola.liljedahl@arm.com>,
Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>,
Dhruv Tripathi <dhruv.tripathi@arm.com>,
Konstantin Ananyev <konstantin.ananyev@huawei.com>,
dpdk stable <stable@dpdk.org>
Subject: Re: patch 'ring: establish safe partial order in default mode' has been queued to stable release 22.11.11
Date: Wed, 12 Nov 2025 13:12:30 -0600
Message-ID: <4f026a78-346e-4504-926a-24411dce7e53@arm.com>
In-Reply-To: <20251112165308.1618107-48-luca.boccassi@gmail.com>
Hi Luca,
On 11/12/25 10:53, luca.boccassi@gmail.com wrote:
> Hi,
>
> FYI, your patch has been queued to stable release 22.11.11
>
> Note it hasn't been pushed to http://dpdk.org/browse/dpdk-stable yet.
> It will be pushed if I get no objections before 11/14/25. So please
> shout if anyone has objections.
>
Looked like it needed some work, so I just posted a patch.
Let me know if you need any help with RTS and HTS patches as well.
Thanks
--wathsala
> Also note that after the patch there's a diff of the upstream commit vs the
> patch applied to the branch. This will indicate if there was any rebasing
> needed to apply to the stable branch. If there were code changes for rebasing
> (ie: not only metadata diffs), please double check that the rebase was
> correctly done.
>
> Queued patches are on a temporary branch at:
> https://github.com/bluca/dpdk-stable
>
> This queued commit can be viewed at:
> https://github.com/bluca/dpdk-stable/commit/8e64e64659fe628f6b7ce903b67a6c8d271da524
>
> Thanks.
>
> Luca Boccassi
>
> ---
> From 8e64e64659fe628f6b7ce903b67a6c8d271da524 Mon Sep 17 00:00:00 2001
> From: Wathsala Vithanage <wathsala.vithanage@arm.com>
> Date: Tue, 11 Nov 2025 18:37:17 +0000
> Subject: [PATCH] ring: establish safe partial order in default mode
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> [ upstream commit a4ad0eba9def1d1d071da8afe5e96eb2a2e0d71f ]
>
> The function __rte_ring_headtail_move_head() assumes that the barrier
> (fence) between the load of the head and the load-acquire of the
> opposing tail guarantees the following: if a first thread reads tail
> and then writes head and a second thread reads the new value of head
> and then reads tail, then it should observe the same (or a later)
> value of tail.
>
> This assumption is incorrect under the C11 memory model. If the barrier
> (fence) is intended to establish a total ordering of ring operations,
> it fails to do so. Instead, the current implementation only enforces a
> partial ordering, which can lead to unsafe interleavings. In particular,
> some partial orders can cause underflows in free slot or available
> element computations, potentially resulting in data corruption.
>
> The issue manifests when a CPU first acts as a producer and later as a
> consumer. In this scenario, the barrier assumption may fail when another
> core takes the consumer role. A Herd7 litmus test in C11 can demonstrate
> this violation. The problem has not been widely observed so far because:
> (a) on strong memory models (e.g., x86-64) the assumption holds, and
> (b) on relaxed models with RCsc semantics the ordering is still strong
> enough to prevent hazards.
> The problem becomes visible only on weaker models, when load-acquire is
> implemented with RCpc semantics (e.g. some AArch64 CPUs which support
> the LDAPR and LDAPUR instructions).
>
> Three possible solutions exist:
> 1. Strengthen ordering by upgrading release/acquire semantics to
> sequential consistency. This requires using seq-cst for stores,
> loads, and CAS operations. However, this approach introduces a
> significant performance penalty on relaxed-memory architectures.
>
> 2. Establish a safe partial order by enforcing a pair-wise
> happens-before relationship between threads of the same role:
> convert the CAS on the head to release and the preceding load of
> the head to acquire. This approach makes the original barrier
> assumption unnecessary and allows its removal.
>
> 3. Retain partial ordering but ensure only safe partial orders are
> committed. This can be done by detecting underflow conditions
> (producer < consumer) and quashing the update in such cases.
> This approach makes the original barrier assumption unnecessary
> and allows its removal.
>
> This patch implements solution (2) to preserve the “enqueue always
> succeeds” contract expected by dependent libraries (e.g., mempool).
> While solution (3) offers higher performance, adopting it now would
> break that assumption.
>
> Fixes: 49594a63147a9 ("ring/c11: relax ordering for load and store of the head")
>
> Signed-off-by: Wathsala Vithanage <wathsala.vithanage@arm.com>
> Signed-off-by: Ola Liljedahl <ola.liljedahl@arm.com>
> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> Reviewed-by: Dhruv Tripathi <dhruv.tripathi@arm.com>
> Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
> Tested-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
> ---
> lib/ring/rte_ring_c11_pvt.h | 37 +++++++++++++++++++++++++++++--------
> 1 file changed, 29 insertions(+), 8 deletions(-)
>
> diff --git a/lib/ring/rte_ring_c11_pvt.h b/lib/ring/rte_ring_c11_pvt.h
> index f895950df4..5c04a001e1 100644
> --- a/lib/ring/rte_ring_c11_pvt.h
> +++ b/lib/ring/rte_ring_c11_pvt.h
> @@ -24,6 +24,11 @@ __rte_ring_update_tail(struct rte_ring_headtail *ht, uint32_t old_val,
> if (!single)
> rte_wait_until_equal_32(&ht->tail, old_val, __ATOMIC_RELAXED);
>
> + /*
> + * R0: Establishes a synchronizing edge with load-acquire of tail at A1.
> + * Ensures that memory effects by this thread on ring elements array
> + * is observed by a different thread of the other type.
> + */
> __atomic_store_n(&ht->tail, new_val, __ATOMIC_RELEASE);
> }
>
> @@ -61,16 +66,23 @@ __rte_ring_move_prod_head(struct rte_ring *r, unsigned int is_sp,
> unsigned int max = n;
> int success;
>
> - *old_head = __atomic_load_n(&r->prod.head, __ATOMIC_RELAXED);
> + /*
> + * A0: Establishes a synchronizing edge with R1.
> + * Ensure that this thread observes same values
> + * to stail observed by the thread that updated
> + * d->head.
> + * If not, an unsafe partial order may ensue.
> + */
> + *old_head = __atomic_load_n(&r->prod.head, __ATOMIC_ACQUIRE);
> do {
> /* Reset n to the initial burst count */
> n = max;
>
> - /* Ensure the head is read before tail */
> - __atomic_thread_fence(__ATOMIC_ACQUIRE);
> -
> - /* load-acquire synchronize with store-release of ht->tail
> - * in update_tail.
> + /*
> + * A1: Establishes a synchronizing edge with R0.
> + * Ensures that other thread's memory effects on
> + * ring elements array is observed by the time
> + * this thread observes its tail update.
> */
> cons_tail = __atomic_load_n(&r->cons.tail,
> __ATOMIC_ACQUIRE);
> @@ -170,10 +182,19 @@ __rte_ring_move_cons_head(struct rte_ring *r, int is_sc,
> r->cons.head = *new_head, success = 1;
> else
> /* on failure, *old_head will be updated */
> + /*
> + * R1/A2.
> + * R1: Establishes a synchronizing edge with A0 of a
> + * different thread.
> + * A2: Establishes a synchronizing edge with R1 of a
> + * different thread to observe same value for stail
> + * observed by that thread on CAS failure (to retry
> + * with an updated *old_head).
> + */
> success = __atomic_compare_exchange_n(&r->cons.head,
> old_head, *new_head,
> - 0, __ATOMIC_RELAXED,
> - __ATOMIC_RELAXED);
> + 0, __ATOMIC_RELEASE,
> + __ATOMIC_ACQUIRE);
> } while (unlikely(success == 0));
> return n;
> }
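
For readers following the ordering argument, here is a minimal sketch of the pairing that solution (2) describes: an acquire load of the head (A0), an acquire load of the opposing tail (A1), and a CAS that is release on success and acquire on failure (R1/A2). It is not the DPDK code. The struct `sketch_ring`, the function `sketch_move_prod_head`, and the "reserve as many as fit" policy are assumptions made for illustration, and it relies on the GCC/Clang `__atomic` builtins that the quoted diff also uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified ring indices; not DPDK's struct rte_ring. */
struct sketch_ring {
	uint32_t capacity;   /* number of usable slots */
	uint32_t prod_head;  /* next slot a producer will reserve */
	uint32_t cons_tail;  /* last slot released by consumers */
};

/*
 * Reserve up to n slots for a producer. Returns the number reserved and
 * stores the head value the reservation started from in *old_head.
 */
static uint32_t
sketch_move_prod_head(struct sketch_ring *r, uint32_t n, uint32_t *old_head)
{
	const uint32_t max = n;
	uint32_t new_head;

	assert(r != NULL && old_head != NULL);

	/* A0: acquire load, pairs with the release CAS (R1) of another producer. */
	*old_head = __atomic_load_n(&r->prod_head, __ATOMIC_ACQUIRE);
	do {
		/* Reset n to the initial request on every retry. */
		n = max;

		/* A1: acquire load, pairs with the consumer's release store of tail. */
		uint32_t cons_tail = __atomic_load_n(&r->cons_tail, __ATOMIC_ACQUIRE);

		/* Unsigned arithmetic, so index wrap-around is harmless. */
		uint32_t free_entries = r->capacity + cons_tail - *old_head;

		if (n > free_entries)
			n = free_entries;
		if (n == 0)
			return 0;

		new_head = *old_head + n;
		/*
		 * R1/A2: release on success so the next producer's A0 sees what
		 * this thread saw; acquire on failure because *old_head is
		 * refreshed from another producer's head update before retrying.
		 */
	} while (!__atomic_compare_exchange_n(&r->prod_head, old_head, new_head,
			0, __ATOMIC_RELEASE, __ATOMIC_ACQUIRE));

	return n;
}
```

Called from a single thread, the sketch only exercises the index arithmetic. The ordering annotations matter when several producers race on `prod_head` while a consumer advances `cons_tail` with a release store.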