DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev] [RFC] ring: count and empty optimizations
@ 2020-04-28 13:53 Morten Brørup
  2020-04-29 13:38 ` Olivier Matz
  0 siblings, 1 reply; 6+ messages in thread
From: Morten Brørup @ 2020-04-28 13:53 UTC (permalink / raw)
  To: Olivier Matz; +Cc: dev

Olivier (maintainer of the Ring),

I would like to suggest a couple of minor optimizations to the ring library.


1. Testing if the ring is empty is as simple as comparing the producer and consumer pointers:

static inline int
rte_ring_empty(const struct rte_ring *r)
{
-	return rte_ring_count(r) == 0;
+	uint32_t prod_tail = r->prod.tail;
+	uint32_t cons_tail = r->cons.tail;
+	return cons_tail == prod_tail;
}

In theory, this optimization reduces the number of potential cache misses from 3 to 2 by not having to read r->mask in rte_ring_count().


2. It is not possible to enqueue more elements than the capacity of a ring, so the count function does not need to test if the capacity is exceeded:

static inline unsigned
rte_ring_count(const struct rte_ring *r)
{
	uint32_t prod_tail = r->prod.tail;
	uint32_t cons_tail = r->cons.tail;
	uint32_t count = (prod_tail - cons_tail) & r->mask;
-	return (count > r->capacity) ? r->capacity : count;
+ 	return count;
}

I cannot even come up with a race condition in this function where the count would exceed the capacity. Maybe I missed something?


Med venlig hilsen / kind regards
- Morten Brørup


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [dpdk-dev] [RFC] ring: count and empty optimizations
  2020-04-28 13:53 [dpdk-dev] [RFC] ring: count and empty optimizations Morten Brørup
@ 2020-04-29 13:38 ` Olivier Matz
  2020-04-30  1:12   ` Honnappa Nagarahalli
  0 siblings, 1 reply; 6+ messages in thread
From: Olivier Matz @ 2020-04-29 13:38 UTC (permalink / raw)
  To: Morten Brørup; +Cc: dev, Ananyev, Konstantin, Honnappa Nagarahalli

Hi Morten,

On Tue, Apr 28, 2020 at 03:53:15PM +0200, Morten Brørup wrote:
> Olivier (maintainer of the Ring),

I'm not anymore, CC'ing Konstantin and Honnappa.

> I would like to suggest a couple of minor optimizations to the ring library.
> 
> 
> 1. Testing if the ring is empty is as simple as comparing the producer and consumer pointers:
> 
> static inline int
> rte_ring_empty(const struct rte_ring *r)
> {
> -	return rte_ring_count(r) == 0;
> +	uint32_t prod_tail = r->prod.tail;
> +	uint32_t cons_tail = r->cons.tail;
> +	return cons_tail == prod_tail;
> }
> 
> In theory, this optimization reduces the number of potential cache misses from 3 to 2 by not having to read r->mask in rte_ring_count().

This one looks correct to me.


> 2. It is not possible to enqueue more elements than the capacity of a ring, so the count function does not need to test if the capacity is exceeded:
> 
> static inline unsigned
> rte_ring_count(const struct rte_ring *r)
> {
> 	uint32_t prod_tail = r->prod.tail;
> 	uint32_t cons_tail = r->cons.tail;
> 	uint32_t count = (prod_tail - cons_tail) & r->mask;
> -	return (count > r->capacity) ? r->capacity : count;
> + 	return count;
> }
> 
> I cannot even come up with a race condition in this function where the count would exceed the capacity. Maybe I missed something?

Since there is no memory barrier, the order in which prod_tail and
cons_tail are fetched is not guaranteed. Or the thread could be
interrupted by the kernel in between.

This function may probably return invalid results in MC/MP cases.
We just ensure that the result is between 0 and r->capacity.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [dpdk-dev] [RFC] ring: count and empty optimizations
  2020-04-29 13:38 ` Olivier Matz
@ 2020-04-30  1:12   ` Honnappa Nagarahalli
  2020-04-30  9:19     ` Morten Brørup
  0 siblings, 1 reply; 6+ messages in thread
From: Honnappa Nagarahalli @ 2020-04-30  1:12 UTC (permalink / raw)
  To: Olivier Matz, Morten Brørup
  Cc: dev, Ananyev, Konstantin, nd, Honnappa Nagarahalli, nd

<snip>

> 
> Hi Morten,
> 
> On Tue, Apr 28, 2020 at 03:53:15PM +0200, Morten Brørup wrote:
> > Olivier (maintainer of the Ring),
> 
> I'm not anymore, CC'ing Konstantin and Honnappa.
> 
> > I would like to suggest a couple of minor optimizations to the ring library.
> >
> >
> > 1. Testing if the ring is empty is as simple as comparing the producer and
> consumer pointers:
> >
> > static inline int
> > rte_ring_empty(const struct rte_ring *r) {
> > -	return rte_ring_count(r) == 0;
> > +	uint32_t prod_tail = r->prod.tail;
> > +	uint32_t cons_tail = r->cons.tail;
> > +	return cons_tail == prod_tail;
> > }
> >
> > In theory, this optimization reduces the number of potential cache misses
> from 3 to 2 by not having to read r->mask in rte_ring_count().
> 
> This one looks correct to me.
> 
> 
> > 2. It is not possible to enqueue more elements than the capacity of a ring,
> so the count function does not need to test if the capacity is exceeded:
> >
> > static inline unsigned
> > rte_ring_count(const struct rte_ring *r) {
> > 	uint32_t prod_tail = r->prod.tail;
> > 	uint32_t cons_tail = r->cons.tail;
> > 	uint32_t count = (prod_tail - cons_tail) & r->mask;
> > -	return (count > r->capacity) ? r->capacity : count;
> > + 	return count;
> > }
> >
> > I cannot even come up with a race condition in this function where the
> count would exceed the capacity. Maybe I missed something?
> 
> Since there is no memory barrier, the order in which prod_tail and cons_tail
> are fetched is not guaranteed. Or the thread could be interrupted by the
> kernel in between.
The '__rte_ring_move_prod_head' function ensures that the distance between prod.head and cons.tail is always within the capacity irrespective of whether the consumers/producers are sleeping.

> 
> This function may probably return invalid results in MC/MP cases.
> We just ensure that the result is between 0 and r->capacity.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [dpdk-dev] [RFC] ring: count and empty optimizations
  2020-04-30  1:12   ` Honnappa Nagarahalli
@ 2020-04-30  9:19     ` Morten Brørup
  2020-04-30 15:36       ` Ananyev, Konstantin
  0 siblings, 1 reply; 6+ messages in thread
From: Morten Brørup @ 2020-04-30  9:19 UTC (permalink / raw)
  To: Honnappa Nagarahalli, Olivier Matz; +Cc: dev, Ananyev, Konstantin, nd

> From: Honnappa Nagarahalli [mailto:Honnappa.Nagarahalli@arm.com]
> Sent: Thursday, April 30, 2020 3:12 AM
> 
> <snip>
> 
> >
> > Hi Morten,
> >
> > On Tue, Apr 28, 2020 at 03:53:15PM +0200, Morten Brørup wrote:
> > > Olivier (maintainer of the Ring),
> >
> > I'm not anymore, CC'ing Konstantin and Honnappa.
> >
> > > I would like to suggest a couple of minor optimizations to the ring
> library.
> > >
> > >
> > > 1. Testing if the ring is empty is as simple as comparing the
> producer and
> > consumer pointers:
> > >
> > > static inline int
> > > rte_ring_empty(const struct rte_ring *r) {
> > > -	return rte_ring_count(r) == 0;
> > > +	uint32_t prod_tail = r->prod.tail;
> > > +	uint32_t cons_tail = r->cons.tail;
> > > +	return cons_tail == prod_tail;
> > > }
> > >
> > > In theory, this optimization reduces the number of potential cache
> misses
> > from 3 to 2 by not having to read r->mask in rte_ring_count().
> >
> > This one looks correct to me.
> >
> >
> > > 2. It is not possible to enqueue more elements than the capacity of
> a ring,
> > so the count function does not need to test if the capacity is
> exceeded:
> > >
> > > static inline unsigned
> > > rte_ring_count(const struct rte_ring *r) {
> > > 	uint32_t prod_tail = r->prod.tail;
> > > 	uint32_t cons_tail = r->cons.tail;
> > > 	uint32_t count = (prod_tail - cons_tail) & r->mask;
> > > -	return (count > r->capacity) ? r->capacity : count;
> > > + 	return count;
> > > }
> > >
> > > I cannot even come up with a race condition in this function where
> the
> > count would exceed the capacity. Maybe I missed something?
> >
> > Since there is no memory barrier, the order in which prod_tail and
> cons_tail
> > are fetched is not guaranteed. Or the thread could be interrupted by
> the
> > kernel in between.
> The '__rte_ring_move_prod_head' function ensures that the distance
> between prod.head and cons.tail is always within the capacity
> irrespective of whether the consumers/producers are sleeping.

Yes, this is exactly what I was thinking.

The tails are the pointers after any updates, which is shown very nicely in the documentation.
And certainly prod.tail will not move further ahead than prod.head.

So it makes sense using the tails outside the functions that move the pointers.

Olivier found the race I couldn't find:
1. The thread calls rte_ring_count(), and since there is no memory barrier it reads cons.tail, and has not yet read prod.tail.
2. Other threads use the ring simultaneously. A consumer thread moves cons.tail ahead and a producer thread then moves prod.tail ahead. Note: Without first moving cons.tail ahead, prod.tail cannot move too far ahead.
3. The thread proceeds to read prod.tail. Now the count is completely incorrect, and may even exceed the capacity.

Olivier pointed out that this could happen if the rte_ring_count thread is interrupted by the kernel, and I agree. However, intuitively I don't think that it can happen in a non-EAL thread, because the consumer thread needs to finish moving cons.tail before the producer thread can move prod.tail too far ahead. And by then the rte_ring_count thread has had plenty of time to read prod.tail. But it could happen in theory.

> > This function may probably return invalid results in MC/MP cases.
> > We just ensure that the result is between 0 and r->capacity.

So should we update the documentation to say that it might return an incorrect count (if there is a race), or should we fix the function to always provide a correct value?

Furthermore, the same race condition probably affects rte_ring_empty() similarly, even in my improved version.

And do these functions need to support non-EAL threads? I don't think so. What do you think?


-Morten


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [dpdk-dev] [RFC] ring: count and empty optimizations
  2020-04-30  9:19     ` Morten Brørup
@ 2020-04-30 15:36       ` Ananyev, Konstantin
  2020-04-30 21:38         ` Honnappa Nagarahalli
  0 siblings, 1 reply; 6+ messages in thread
From: Ananyev, Konstantin @ 2020-04-30 15:36 UTC (permalink / raw)
  To: Morten Brørup, Honnappa Nagarahalli, Olivier Matz; +Cc: dev, nd

> 
> > From: Honnappa Nagarahalli [mailto:Honnappa.Nagarahalli@arm.com]
> > Sent: Thursday, April 30, 2020 3:12 AM
> >
> > <snip>
> >
> > >
> > > Hi Morten,
> > >
> > > On Tue, Apr 28, 2020 at 03:53:15PM +0200, Morten Brørup wrote:
> > > > Olivier (maintainer of the Ring),
> > >
> > > I'm not anymore, CC'ing Konstantin and Honnappa.
> > >
> > > > I would like to suggest a couple of minor optimizations to the ring
> > library.
> > > >
> > > >
> > > > 1. Testing if the ring is empty is as simple as comparing the
> > producer and
> > > consumer pointers:
> > > >
> > > > static inline int
> > > > rte_ring_empty(const struct rte_ring *r) {
> > > > -	return rte_ring_count(r) == 0;
> > > > +	uint32_t prod_tail = r->prod.tail;
> > > > +	uint32_t cons_tail = r->cons.tail;
> > > > +	return cons_tail == prod_tail;
> > > > }
> > > >
> > > > In theory, this optimization reduces the number of potential cache
> > misses
> > > from 3 to 2 by not having to read r->mask in rte_ring_count().
> > >
> > > This one looks correct to me.
> > >
> > >
> > > > 2. It is not possible to enqueue more elements than the capacity of
> > a ring,
> > > so the count function does not need to test if the capacity is
> > exceeded:
> > > >
> > > > static inline unsigned
> > > > rte_ring_count(const struct rte_ring *r) {
> > > > 	uint32_t prod_tail = r->prod.tail;
> > > > 	uint32_t cons_tail = r->cons.tail;
> > > > 	uint32_t count = (prod_tail - cons_tail) & r->mask;
> > > > -	return (count > r->capacity) ? r->capacity : count;
> > > > + 	return count;
> > > > }
> > > >
> > > > I cannot even come up with a race condition in this function where
> > the
> > > count would exceed the capacity. Maybe I missed something?
> > >
> > > Since there is no memory barrier, the order in which prod_tail and
> > cons_tail
> > > are fetched is not guaranteed. Or the thread could be interrupted by
> > the
> > > kernel in between.
> > The '__rte_ring_move_prod_head' function ensures that the distance
> > between prod.head and cons.tail is always within the capacity
> > irrespective of whether the consumers/producers are sleeping.
> 
> Yes, this is exactly what I was thinking.
> 
> The tails are the pointers after any updates, which is shown very nicely in the documentation.
> And certainly prod.tail will not move further ahead than prod.head.
> 
> So it makes sense using the tails outside the functions that move the pointers.
> 
> Olivier found the race I couldn't find:
> 1. The thread calls rte_ring_count(), and since there is no memory barrier it reads cons.tail, and has not yet read prod.tail.
> 2. Other threads use the ring simultaneously. A consumer thread moves cons.tail ahead and a producer thread then moves prod.tail ahead.
> Note: Without first moving cons.tail ahead, prod.tail cannot move too far ahead.
> 3. The thread proceeds to read prod.tail. Now the count is completely incorrect, and may even exceed the capacity.
> 
> Olivier pointed out that this could happen if the rte_ring_count thread is interrupted by the kernel, and I agree. However, intuitively I don't
> think that it can happen in a non-EAL thread, because the consumer thread needs to finish moving cons.tail before the producer thread can
> move prod.tail too far ahead. And by then the rte_ring_count thread has had plenty of time to read prod.tail. But it could happen in theory.
> 
> > > This function may probably return invalid results in MC/MP cases.
> > > We just ensure that the result is between 0 and r->capacity.
> 
> So should we update the documentation to say that it might return an incorrect count (if there is a race), or should we fix the function to
> always provide a correct value?

As long as you invoke rte_ring_count() while there are other
active producers/consumers for that ring -
it's return value can always be outdated, and not reflect current ring state.
So I think just updating the doc is enough.

> 
> Furthermore, the same race condition probably affects rte_ring_empty() similarly, even in my improved version.
> 
> And do these functions need to support non-EAL threads? I don't think so. What do you think?

Not sure why you differ EAL and non EAL threads here.
The only difference between them from scheduler point of view -
EAL threads have cpu affinity set by rte_eal_init().
But nothing prevents user from setting/updating cpu affinity
for any thread in his process in a way he likes.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [dpdk-dev] [RFC] ring: count and empty optimizations
  2020-04-30 15:36       ` Ananyev, Konstantin
@ 2020-04-30 21:38         ` Honnappa Nagarahalli
  0 siblings, 0 replies; 6+ messages in thread
From: Honnappa Nagarahalli @ 2020-04-30 21:38 UTC (permalink / raw)
  To: Ananyev, Konstantin, Morten Brørup, Olivier Matz
  Cc: dev, nd, Honnappa Nagarahalli, nd

> > > <snip>
> > >
> > > >
> > > > Hi Morten,
> > > >
> > > > On Tue, Apr 28, 2020 at 03:53:15PM +0200, Morten Brørup wrote:
> > > > > Olivier (maintainer of the Ring),
> > > >
> > > > I'm not anymore, CC'ing Konstantin and Honnappa.
> > > >
> > > > > I would like to suggest a couple of minor optimizations to the
> > > > > ring
> > > library.
> > > > >
> > > > >
> > > > > 1. Testing if the ring is empty is as simple as comparing the
> > > producer and
> > > > consumer pointers:
> > > > >
> > > > > static inline int
> > > > > rte_ring_empty(const struct rte_ring *r) {
> > > > > -	return rte_ring_count(r) == 0;
> > > > > +	uint32_t prod_tail = r->prod.tail;
> > > > > +	uint32_t cons_tail = r->cons.tail;
> > > > > +	return cons_tail == prod_tail;
> > > > > }
> > > > >
> > > > > In theory, this optimization reduces the number of potential
> > > > > cache
> > > misses
> > > > from 3 to 2 by not having to read r->mask in rte_ring_count().
> > > >
> > > > This one looks correct to me.
> > > >
> > > >
> > > > > 2. It is not possible to enqueue more elements than the capacity
> > > > > of
> > > a ring,
> > > > so the count function does not need to test if the capacity is
> > > exceeded:
> > > > >
> > > > > static inline unsigned
> > > > > rte_ring_count(const struct rte_ring *r) {
> > > > > 	uint32_t prod_tail = r->prod.tail;
> > > > > 	uint32_t cons_tail = r->cons.tail;
> > > > > 	uint32_t count = (prod_tail - cons_tail) & r->mask;
> > > > > -	return (count > r->capacity) ? r->capacity : count;
> > > > > + 	return count;
> > > > > }
> > > > >
> > > > > I cannot even come up with a race condition in this function
> > > > > where
> > > the
> > > > count would exceed the capacity. Maybe I missed something?
> > > >
> > > > Since there is no memory barrier, the order in which prod_tail and
> > > cons_tail
> > > > are fetched is not guaranteed. Or the thread could be interrupted
> > > > by
> > > the
> > > > kernel in between.
> > > The '__rte_ring_move_prod_head' function ensures that the distance
> > > between prod.head and cons.tail is always within the capacity
> > > irrespective of whether the consumers/producers are sleeping.
> >
> > Yes, this is exactly what I was thinking.
> >
> > The tails are the pointers after any updates, which is shown very nicely in
> the documentation.
> > And certainly prod.tail will not move further ahead than prod.head.
> >
> > So it makes sense using the tails outside the functions that move the
> pointers.
> >
> > Olivier found the race I couldn't find:
> > 1. The thread calls rte_ring_count(), and since there is no memory barrier it
> reads cons.tail, and has not yet read prod.tail.
> > 2. Other threads use the ring simultaneously. A consumer thread moves
> cons.tail ahead and a producer thread then moves prod.tail ahead.
> > Note: Without first moving cons.tail ahead, prod.tail cannot move too far
> ahead.
> > 3. The thread proceeds to read prod.tail. Now the count is completely
> incorrect, and may even exceed the capacity.
> >
> > Olivier pointed out that this could happen if the rte_ring_count
> > thread is interrupted by the kernel, and I agree. However, intuitively
> > I don't think that it can happen in a non-EAL thread, because the consumer
> thread needs to finish moving cons.tail before the producer thread can move
> prod.tail too far ahead. And by then the rte_ring_count thread has had plenty
> of time to read prod.tail. But it could happen in theory.
The issues associated with using rte_ring on non-EAL threads are documented at [1]. Knowing these issues, I think these functions should not pretend to keep the values correct which will save some cycles.

[1] https://doc.dpdk.org/guides/prog_guide/env_abstraction_layer.html#known-issues

> >
> > > > This function may probably return invalid results in MC/MP cases.
> > > > We just ensure that the result is between 0 and r->capacity.
> >
> > So should we update the documentation to say that it might return an
> > incorrect count (if there is a race), or should we fix the function to always
> provide a correct value?
I cannot think of a way to fix these functions for non-EAL observer threads (threads from where these functions are called).
For EAL observer threads, the data is correct at the point of observation in the code (may be not at the point of use of this data)

> 
> As long as you invoke rte_ring_count() while there are other active
> producers/consumers for that ring - it's return value can always be outdated,
> and not reflect current ring state.
> So I think just updating the doc is enough.
> 
> >
> > Furthermore, the same race condition probably affects rte_ring_empty()
> similarly, even in my improved version.
Yes

> >
> > And do these functions need to support non-EAL threads? I don't think so.
> What do you think?
> 
> Not sure why you differ EAL and non EAL threads here.
> The only difference between them from scheduler point of view - EAL threads
> have cpu affinity set by rte_eal_init().
> But nothing prevents user from setting/updating cpu affinity for any thread in
> his process in a way he likes.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-04-30 21:38 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-28 13:53 [dpdk-dev] [RFC] ring: count and empty optimizations Morten Brørup
2020-04-29 13:38 ` Olivier Matz
2020-04-30  1:12   ` Honnappa Nagarahalli
2020-04-30  9:19     ` Morten Brørup
2020-04-30 15:36       ` Ananyev, Konstantin
2020-04-30 21:38         ` Honnappa Nagarahalli

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).