From: Ola Liljedahl <Ola.Liljedahl@arm.com>
To: Konstantin Ananyev <konstantin.ananyev@huawei.com>,
Wathsala Vithanage <wathsala.vithanage@arm.com>,
Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>,
Dhruv Tripathi <Dhruv.Tripathi@arm.com>,
Bruce Richardson <bruce.richardson@intel.com>
Subject: Re: [PATCH 1/1] ring: safe partial ordering for head/tail update
Date: Wed, 24 Sep 2025 13:28:50 +0000 [thread overview]
Message-ID: <BD3A1BB1-B778-46DE-94BA-6E27DA4B0B08@arm.com> (raw)
In-Reply-To: <73287d6c09d049aa994a1d17962130b9@huawei.com>
> On 2025-09-24, 13:51, "Konstantin Ananyev" <konstantin.ananyev@huawei.com> wrote:
>
> > > > > > Sure, I am talking about the MT scenario.
> > > > > > I think I already provided an example: the DPDK mempool library (see below).
> > > > > > In brief, it works like this:
> > > > > > At init it allocates ring of N memory buffers and ring big enough to hold all of them.
> > > > >
> > > > > Sorry, I meant to say: "it allocates N memory buffers and a ring big enough to hold all of them".
> > > > >
> > > > > > Then it enqueues all allocated memory buffers into the ring.
> > > > > > mempool_get - retrieves (dequeues) buffers from the ring.
> > > > > > mempool_put - puts them back (enqueues) to the ring.
> > > > > > get() might fail (ENOMEM), while put() is expected to always succeed.
> > > > But how does the thread which calls mempool_put() get hold of the memory buffers
> > > > that were obtained using mempool_get() by some other thread? Or is this not the
> > > > scenario you are worrying about?
> > > > Or is it rather that multiple threads independently call mempool_get() and then
> > > > mempool_put() on their own buffers? And you are worried that a thread will fail to
> > > > return (mempool_put) a buffer that it earlier allocated (mempool_get)? We could
> > > > create a litmus test for that.
> > >
> > > Both scenarios are possible.
> > > For the run-to-completion model, each thread usually does: allocate/use/free a group of mbufs.
> > > For the pipeline model, one thread can allocate a bunch of mbufs, then pass them to
> > > another thread (via another ring, for example) for further processing and eventual release.
> > In the pipeline model, if the last stage (thread) frees (enqueues) buffers onto some
> > ring buffer and the first stage (thread) allocates (dequeues) buffers from the same
> > ring buffer, but there isn't any other type of synchronization between the threads,
> > we can never guarantee that the first thread will be able to dequeue buffers,
> > because it doesn't know whether the last thread has enqueued any buffers.
>
>
> Yes, as I said above - for mempool use-case: dequeue can fail, enqueue should always succeed.
> The closest analogy: malloc() can fail, free() should never fail.
>
>
> >
> > However, enqueue ought to always succeed. We should be able to create a litmus
> > test for that.
> > Ring 1 is used as mempool, it initially contains capacity elements (full).
> > Ring 2 is used as pipe between stages 1 and 2, it initially contains 0 elements (empty).
> > Thread 1 allocates/dequeues a buffer from ring 1.
> > Thread 1 enqueues that buffer onto ring 2.
> > Thread 2 dequeues a buffer from ring 2.
> > Thread 2 frees/enqueues that buffer onto ring 1. <<< this must succeed!
> > Does this reflect the situation you worry about?
>
>
> This is one of the possible scenarios.
> As I said above - mempool_put() is expected to always be able to enqueue an element to the ring.
> TBH, I am not sure what you are trying to prove with the litmus test.
With a litmus test, we can prove that a specific outcome can or cannot occur when
executing the specified memory accesses in multiple threads. By experimenting with
different memory access sequences and orderings, we can find the most relaxed
sequence that still prohibits the undesired outcomes.
So with a litmus test, I could prove that the enqueue in thread 2 above will succeed
or, alternatively, that it may fail (which is undesired). And I could find out which
orderings are required to guarantee success.
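As a concrete illustration (my own sketch, not anything from the patch), the
message-passing pattern underlying the enqueue/dequeue pairing can be written as a
herd7 C litmus test; herd7/litmus7 then enumerate all outcomes the C11 memory model
permits. The locations x (element slot) and y (published tail) are hypothetical
simplifications of the ring metadata:

```
C mempool-mp

(* P0 models thread 1: write the element, then publish with release.
   P1 models thread 2: observe the publication with acquire, then read. *)
{}

P0 (atomic_int* x, atomic_int* y) {
  atomic_store_explicit(x, 1, memory_order_relaxed);
  atomic_store_explicit(y, 1, memory_order_release);
}

P1 (atomic_int* x, atomic_int* y) {
  int r0 = atomic_load_explicit(y, memory_order_acquire);
  int r1 = atomic_load_explicit(x, memory_order_relaxed);
}

exists (1:r0=1 /\ 1:r1=0)
```

If herd7 reports the exists clause unreachable, a dequeuer that observes the updated
tail is guaranteed to also observe the element; weakening the release or the acquire
to relaxed makes the bad outcome reachable.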
> Looking at the changes you proposed:
>
> +	/*
> +	 * Ensure the entries calculation was not based on a stale
> +	 * and unsafe stail observation that causes underflow.
> +	 */
> +	if ((int)*entries < 0)
> +		*entries = 0;
> +
>
> 	/* check that we have enough room in ring */
> 	if (unlikely(n > *entries))
> 		n = (behavior == RTE_RING_QUEUE_FIXED) ?
> 				0 : *entries;
>
> 	*new_head = *old_head + n;
> 	if (n == 0)
> 		return 0;
>
> It is clear that with these changes enqueue/dequeue might fail even
> when there are available entries in the ring.
Without explicit synchronization between the threads, they will still race and
a consumer cannot be guaranteed to succeed with its dequeue operation, e.g.
the dequeue operation could occur before the (supposedly) matching enqueue
operation in another thread.
Using acquire/release for all ring buffer metadata accesses doesn't inform the
thread that its operation is guaranteed to succeed; it just ensures that any data is
transferred (or ownership passed) in a safe way, i.e. it establishes a happens-before
relation. But the ring buffer implementation itself cannot ensure that the
enqueue in thread P happens before the dequeue in thread C. The dequeue
can still fail and return 0 elements.
How do you define "there are available entries in the ring"? Just reading one
metadata variable doesn't tell you this. And I assume users usually don't access
internal ring buffer metadata. So how does a thread know there are available
entries in the ring that can be dequeued?
It seems to me that your perspective is still very much that of sequential
consistency and total order. But even so, you need synchronization (usually
based on memory accesses but there are other ways, e.g. using system calls)
to ensure that operation D in thread 2 is only initiated after operation E in
thread 1 has completed. Otherwise, the operations will race and the outcome
is not guaranteed.
- Ola
> One simple way that probably will introduce a loop instead of the 'if':
> (keep reading head and tail values till we get a valid result)
> but again I am not sure it is a good way.
> Konstantin
Thread overview: 25+ messages
2025-09-15 18:54 [PATCH 0/1] ring: correct ordering issue in " Wathsala Vithanage
2025-09-15 18:54 ` [PATCH 1/1] ring: safe partial ordering for " Wathsala Vithanage
2025-09-16 15:42 ` Bruce Richardson
2025-09-16 18:19 ` Ola Liljedahl
2025-09-17 7:47 ` Bruce Richardson
2025-09-17 15:06 ` Stephen Hemminger
2025-09-18 17:40 ` Wathsala Vithanage
2025-09-16 22:57 ` Konstantin Ananyev
2025-09-16 23:08 ` Konstantin Ananyev
[not found] ` <2a611c3cf926d752a54b7655c27d6df874a2d0de.camel@arm.com>
2025-09-17 7:58 ` Konstantin Ananyev
2025-09-17 9:05 ` Ola Liljedahl
2025-09-20 12:01 ` Konstantin Ananyev
[not found] ` <cf7e14d4ba5e9d78fddf083b6c92d75942447931.camel@arm.com>
2025-09-22 7:12 ` Konstantin Ananyev
2025-09-23 21:57 ` Ola Liljedahl
2025-09-24 6:56 ` Konstantin Ananyev
2025-09-24 7:50 ` Konstantin Ananyev
2025-09-24 8:51 ` Ola Liljedahl
2025-09-24 10:08 ` Konstantin Ananyev
2025-09-24 11:27 ` Ola Liljedahl
2025-09-24 11:50 ` Konstantin Ananyev
2025-09-24 13:28 ` Ola Liljedahl [this message]
2025-09-24 15:03 ` Konstantin Ananyev
2025-09-25 4:29 ` Morten Brørup
2025-09-25 7:11 ` Konstantin Ananyev
2025-09-24 15:24 ` Stephen Hemminger