DPDK patches and discussions
 help / color / mirror / Atom feed
From: "Morten Brørup" <mb@smartsharesystems.com>
To: "Honnappa Nagarahalli" <Honnappa.Nagarahalli@arm.com>,
	"Olivier Matz" <olivier.matz@6wind.com>
Cc: <dev@dpdk.org>,
	"Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
	"nd" <nd@arm.com>
Subject: Re: [dpdk-dev] [RFC] ring: count and empty optimizations
Date: Thu, 30 Apr 2020 11:19:18 +0200	[thread overview]
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35C60F98@smartserver.smartshare.dk> (raw)
In-Reply-To: <DBBPR08MB4646B574E03820EC2A8ACD4998AA0@DBBPR08MB4646.eurprd08.prod.outlook.com>

> From: Honnappa Nagarahalli [mailto:Honnappa.Nagarahalli@arm.com]
> Sent: Thursday, April 30, 2020 3:12 AM
> 
> <snip>
> 
> >
> > Hi Morten,
> >
> > On Tue, Apr 28, 2020 at 03:53:15PM +0200, Morten Brørup wrote:
> > > Olivier (maintainer of the Ring),
> >
> > I'm not anymore, CC'ing Konstantin and Honnappa.
> >
> > > I would like to suggest a couple of minor optimizations to the ring
> library.
> > >
> > >
> > > 1. Testing if the ring is empty is as simple as comparing the
> producer and
> > consumer pointers:
> > >
> > > static inline int
> > > rte_ring_empty(const struct rte_ring *r) {
> > > -	return rte_ring_count(r) == 0;
> > > +	uint32_t prod_tail = r->prod.tail;
> > > +	uint32_t cons_tail = r->cons.tail;
> > > +	return cons_tail == prod_tail;
> > > }
> > >
> > > In theory, this optimization reduces the number of potential cache
> misses
> > from 3 to 2 by not having to read r->mask in rte_ring_count().
> >
> > This one looks correct to me.
> >
> >
> > > 2. It is not possible to enqueue more elements than the capacity of
> a ring,
> > so the count function does not need to test if the capacity is
> exceeded:
> > >
> > > static inline unsigned
> > > rte_ring_count(const struct rte_ring *r) {
> > > 	uint32_t prod_tail = r->prod.tail;
> > > 	uint32_t cons_tail = r->cons.tail;
> > > 	uint32_t count = (prod_tail - cons_tail) & r->mask;
> > > -	return (count > r->capacity) ? r->capacity : count;
> > > + 	return count;
> > > }
> > >
> > > I cannot even come up with a race condition in this function where
> the
> > count would exceed the capacity. Maybe I missed something?
> >
> > Since there is no memory barrier, the order in which prod_tail and
> cons_tail
> > are fetched is not guaranteed. Or the thread could be interrupted by
> the
> > kernel in between.
> The '__rte_ring_move_prod_head' function ensures that the distance
> between prod.head and cons.tail is always within the capacity
> irrespective of whether the consumers/producers are sleeping.

Yes, this is exactly what I was thinking.

The tails are the pointers after any updates, which is shown very nicely in the documentation.
And certainly prod.tail will not move further ahead than prod.head.

So it makes sense using the tails outside the functions that move the pointers.

Olivier found the race I couldn't find:
1. The thread calls rte_ring_count(), and since there is no memory barrier it reads cons.tail, and has not yet read prod.tail.
2. Other threads use the ring simultaneously. A consumer thread moves cons.tail ahead and a producer thread then moves prod.tail ahead. Note: Without first moving cons.tail ahead, prod.tail cannot move too far ahead.
3. The thread proceeds to read prod.tail. Now the count is completely incorrect, and may even exceed the capacity.

Olivier pointed out that this could happen if the rte_ring_count thread is interrupted by the kernel, and I agree. However, intuitively I don't think that it can happen in a non-EAL thread, because the consumer thread needs to finish moving cons.tail before the producer thread can move prod.tail too far ahead. And by then the rte_ring_count thread has had plenty of time to read prod.tail. But it could happen in theory.

> > This function may probably return invalid results in MC/MP cases.
> > We just ensure that the result is between 0 and r->capacity.

So should we update the documentation to say that it might return an incorrect count (if there is a race), or should we fix the function to always provide a correct value?

Furthermore, the same race condition probably affects rte_ring_empty() similarly, even in my improved version.

And do these functions need to support non-EAL threads? I don't think so. What do you think?


-Morten


  reply	other threads:[~2020-04-30  9:19 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-28 13:53 Morten Brørup
2020-04-29 13:38 ` Olivier Matz
2020-04-30  1:12   ` Honnappa Nagarahalli
2020-04-30  9:19     ` Morten Brørup [this message]
2020-04-30 15:36       ` Ananyev, Konstantin
2020-04-30 21:38         ` Honnappa Nagarahalli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=98CBD80474FA8B44BF855DF32C47DC35C60F98@smartserver.smartshare.dk \
    --to=mb@smartsharesystems.com \
    --cc=Honnappa.Nagarahalli@arm.com \
    --cc=dev@dpdk.org \
    --cc=konstantin.ananyev@intel.com \
    --cc=nd@arm.com \
    --cc=olivier.matz@6wind.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).