From: Ola Liljedahl <Ola.Liljedahl@arm.com>
To: Konstantin Ananyev <konstantin.ananyev@huawei.com>,
Wathsala Vithanage <wathsala.vithanage@arm.com>,
Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>,
Dhruv Tripathi <Dhruv.Tripathi@arm.com>,
Bruce Richardson <bruce.richardson@intel.com>
Subject: Re: [PATCH 1/1] ring: safe partial ordering for head/tail update
Date: Wed, 17 Sep 2025 09:05:30 +0000
Message-ID: <4173E44D-BB31-45B9-AF6A-553B1E755604@arm.com>
In-Reply-To: <61a9d5b157be4816aa296194f9c0eabe@huawei.com>
> On 2025-09-17, 09:58, "Konstantin Ananyev" <konstantin.ananyev@huawei.com> wrote:
>
> To avoid information loss I combined my replies to Wathsala's two messages into one.
>
>
> > > > The function __rte_ring_headtail_move_head() assumes that the barrier
> > > > (fence) between the load of the head and the load-acquire of the
> > > > opposing tail guarantees the following: if a first thread reads tail
> > > > and then writes head and a second thread reads the new value of head
> > > > and then reads tail, then it should observe the same (or a later)
> > > > value of tail.
> > > >
> > > > This assumption is incorrect under the C11 memory model. If the barrier
> > > > (fence) is intended to establish a total ordering of ring operations,
> > > > it fails to do so. Instead, the current implementation only enforces a
> > > > partial ordering, which can lead to unsafe interleavings. In particular,
> > > > some partial orders can cause underflows in free slot or available
> > > > element computations, potentially resulting in data corruption.
> > >
> > > Hmm... sounds exactly like the problem from the patch we discussed
> > > earlier this year:
> > > https://patchwork.dpdk.org/project/dpdk/patch/20250521111432.207936-4-konstantin.ananyev@huawei.com/
> > > In short:
> > > "... thread can see 'latest' 'cons.head' value, with 'previous' value
> > > for 'prod.tail' or vice versa.
> > > In other words: 'cons.head' value depends on 'prod.tail', so before
> > > making latest 'cons.head' value visible to other threads, we need to
> > > ensure that latest 'prod.tail' is also visible."
> > > Is that the one?
>
>
> > Yes, the behavior occurs under RCpc (LDAPR) but not under RCsc (LDAR),
> > which is why we didn’t catch it earlier. A fuller explanation, with
> > Herd7 simulations, is in the blog post linked in the cover letter.
> >
> > https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/when-a-barrier-does-not-block-the-pitfalls-of-partial-order
>
>
> I see, so now it is reproducible with core rte_ring on real HW.
>
>
> > >
> > > > The issue manifests when a CPU first acts as a producer and later as a
> > > > consumer. In this scenario, the barrier assumption may fail when another
> > > > core takes the consumer role. A Herd7 litmus test in C11 can demonstrate
> > > > this violation. The problem has not been widely observed so far because:
> > > > (a) on strong memory models (e.g., x86-64) the assumption holds, and
> > > > (b) on relaxed models with RCsc semantics the ordering is still strong
> > > > enough to prevent hazards.
> > > > The problem becomes visible only on weaker models, when load-acquire is
> > > > implemented with RCpc semantics (e.g. some AArch64 CPUs which support
> > > > the LDAPR and LDAPUR instructions).
> > > >
> > > > Three possible solutions exist:
> > > > 1. Strengthen ordering by upgrading release/acquire semantics to
> > > > sequential consistency. This requires using seq-cst for stores,
> > > > loads, and CAS operations. However, this approach introduces a
> > > > significant performance penalty on relaxed-memory architectures.
> > > >
> > > > 2. Establish a safe partial order by enforcing a pair-wise
> > > > happens-before relationship between threads of the same role, by
> > > > converting the CAS and the preceding load of the head to release
> > > > and acquire respectively. This approach makes the original
> > > > barrier assumption unnecessary and allows its removal.
> > >
> > > For the sake of clarity, can you outline what the exact code changes
> > > would be for approach #2? Same as in that patch:
> > > https://patchwork.dpdk.org/project/dpdk/patch/20250521111432.207936-4-konstantin.ananyev@huawei.com/
> > > Or something different?
> >
> > Sorry, I missed the latter half of your comment before.
> > Yes, you have proposed the same solution there.
>
>
> Ok, thanks for confirmation.
>
>
> > >
> > >
> > > > 3. Retain partial ordering but ensure only safe partial orders are
> > > > committed. This can be done by detecting underflow conditions
> > > > (producer < consumer) and quashing the update in such cases.
> > > > This approach makes the original barrier assumption unnecessary
> > > > and allows its removal.
> > >
> > > > This patch implements solution (3) for performance reasons.
> > > >
> > > > Signed-off-by: Wathsala Vithanage <wathsala.vithanage@arm.com>
> > > > Signed-off-by: Ola Liljedahl <ola.liljedahl@arm.com>
> > > > Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> > > > Reviewed-by: Dhruv Tripathi <dhruv.tripathi@arm.com>
> > > > ---
> > > > lib/ring/rte_ring_c11_pvt.h | 10 +++++++---
> > > > 1 file changed, 7 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/lib/ring/rte_ring_c11_pvt.h b/lib/ring/rte_ring_c11_pvt.h
> > > > index b9388af0da..e5ac1f6b9e 100644
> > > > --- a/lib/ring/rte_ring_c11_pvt.h
> > > > +++ b/lib/ring/rte_ring_c11_pvt.h
> > > > @@ -83,9 +83,6 @@ __rte_ring_headtail_move_head(struct rte_ring_headtail *d,
> > > > /* Reset n to the initial burst count */
> > > > n = max;
> > > >
> > > > - /* Ensure the head is read before tail */
> > > > - rte_atomic_thread_fence(rte_memory_order_acquire);
> > > > -
> > > > /* load-acquire synchronize with store-release of ht->tail
> > > > * in update_tail.
> > > > */
> > >
> > > But then cons.head can be read before prod.tail (and vice versa),
> > > right?
> >
> > Right, we let it happen but eliminate any resulting states that are
> > semantically incorrect at the end.
>
>
> Two comments here:
> 1) I think it is probably safer to do the check like this:
> if (*entries > ring->capacity) ...
Yes, that might be another way of handling underflow situations. We could study it.
I have used the check for a negative value without problems in my own ring buffer implementation
https://github.com/ARM-software/progress64/blob/master/src/p64_ringbuf.c
but I can't say that it has been battle-tested.
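To make the two candidate checks concrete, here is a minimal sketch in C. The helper names (underflow_by_sign, underflow_by_capacity, underflow_checks_demo) are made up for illustration and are not taken from rte_ring or progress64. It assumes 32-bit unsigned indexes that wrap naturally, and a narrowing conversion to int32_t that behaves as two's complement, as it does on mainstream compilers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: flag underflow by reinterpreting the unsigned
 * difference as signed (the "check for negative" variant). */
static bool underflow_by_sign(uint32_t prod_tail, uint32_t cons_head)
{
	return (int32_t)(prod_tail - cons_head) < 0;
}

/* Hypothetical helper: flag any entry count larger than the ring can
 * hold (the "entries > capacity" variant). */
static bool underflow_by_capacity(uint32_t prod_tail, uint32_t cons_head,
				  uint32_t capacity)
{
	return (prod_tail - cons_head) > capacity;
}

static void underflow_checks_demo(void)
{
	const uint32_t capacity = 1024;

	/* Consistent snapshot: 5 entries available, neither check fires. */
	assert(!underflow_by_sign(105, 100));
	assert(!underflow_by_capacity(105, 100, capacity));

	/* Consistent snapshot across index wrap-around: still 5 entries. */
	assert(!underflow_by_sign(3, UINT32_MAX - 1));
	assert(!underflow_by_capacity(3, UINT32_MAX - 1, capacity));

	/* Stale prod_tail behind cons_head: the unsigned difference wraps
	 * to 0xFFFFFFFE, and both checks catch it. */
	assert(underflow_by_sign(10, 12));
	assert(underflow_by_capacity(10, 12, capacity));

	/* Positive but impossible count: only the capacity check fires. */
	assert(!underflow_by_sign(100 + capacity + 1, 100));
	assert(underflow_by_capacity(100 + capacity + 1, 100, capacity));
}
```

In this sketch both checks reject the stale-tail case. The capacity comparison additionally rejects counts that are positive but larger than the ring could ever hold, which may be what makes it the more conservative option.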
> 2) My concern is that without forcing a proper read ordering
> (cons.head first, then prod.tail) we re-introduce a window for all sorts of
> ABA-like problems.
Head and tail indexes are monotonically increasing, so I don't see a risk of ABA-like problems.
Indeed, adding a monotonically increasing tag to pointers is a common way of avoiding ABA
problems in lock-free designs.
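As a rough illustration of that argument, the sketch below uses C11 atomics. The names RING_SIZE, try_move_head and aba_demo are hypothetical and this is not the rte_ring code; it only assumes that the head is a free-running 32-bit counter which is masked when selecting a slot. After the head has advanced a full lap it refers to the same slot again, but the counter value differs, so a CAS that still expects the old value fails.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 8u                 /* hypothetical power-of-two size */
#define RING_MASK (RING_SIZE - 1u)

/* Hypothetical helper: try to advance head from the value we saw. */
static bool try_move_head(_Atomic uint32_t *head, uint32_t seen, uint32_t n)
{
	return atomic_compare_exchange_strong_explicit(head, &seen, seen + n,
			memory_order_relaxed, memory_order_relaxed);
}

static void aba_demo(void)
{
	_Atomic uint32_t head = 3;

	/* A slow thread samples head and is then delayed. */
	uint32_t stale = atomic_load_explicit(&head, memory_order_relaxed);

	/* Meanwhile other threads advance head by one full lap. */
	atomic_fetch_add_explicit(&head, RING_SIZE, memory_order_relaxed);
	uint32_t now = atomic_load_explicit(&head, memory_order_relaxed);

	/* Same slot position, but a different counter value. */
	assert((now & RING_MASK) == (stale & RING_MASK));
	assert(now != stale);

	/* The delayed thread's CAS fails instead of silently succeeding. */
	assert(!try_move_head(&head, stale, 1));

	/* A CAS based on the current value succeeds. */
	assert(try_move_head(&head, now, 1));
}
```

The counter is finite, so a stale value could in principle match again after exactly 2^32 increments; with 32-bit indexes that requires a thread to stall for a full wrap of the counter.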
- Ola