* RE: [PATCH 1/1] ring: fix off by 1 mistake
2022-01-03 14:22 ` [PATCH 1/1] ring: fix off by 1 mistake Andrzej Ostruszka
@ 2022-01-03 14:56 ` Morten Brørup
2022-01-06 10:45 ` Olivier Matz
` (2 subsequent siblings)
3 siblings, 0 replies; 11+ messages in thread
From: Morten Brørup @ 2022-01-03 14:56 UTC (permalink / raw)
To: Andrzej Ostruszka, Honnappa Nagarahalli, Konstantin Ananyev; +Cc: upstream, dev
+Ring queue maintainers: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>, Konstantin Ananyev <konstantin.ananyev@intel.com>
> From: Andrzej Ostruszka [mailto:amo@semihalf.com]
> Sent: Monday, 3 January 2022 15.22
>
> When enqueueing/dequeueing to/from the ring we try to optimize by manual
> loop unrolling. The check for this optimization looks like:
>
> if (likely(idx + n < size)) {
>
> where 'idx' points to the first usable element (empty slot for enqueue,
> data for dequeue). The correct comparison here should be '<=' instead
> of '<'.
>
> This is not a functional error since we fall back to the loop with
> correct checks on indexes. Just a minor suboptimal behaviour for the
> case when we want to enqueue/dequeue exactly the number of elements that
> we have in the ring before wrapping to its beginning.
>
> Signed-off-by: Andrzej Ostruszka <amo@semihalf.com>
> ---
> lib/ring/rte_ring_elem_pvt.h | 12 ++++++------
> 1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/lib/ring/rte_ring_elem_pvt.h b/lib/ring/rte_ring_elem_pvt.h
> index 275ec55393..83788c56e6 100644
> --- a/lib/ring/rte_ring_elem_pvt.h
> +++ b/lib/ring/rte_ring_elem_pvt.h
> @@ -17,7 +17,7 @@ __rte_ring_enqueue_elems_32(struct rte_ring *r, const uint32_t size,
> unsigned int i;
> uint32_t *ring = (uint32_t *)&r[1];
> const uint32_t *obj = (const uint32_t *)obj_table;
> - if (likely(idx + n < size)) {
> + if (likely(idx + n <= size)) {
> for (i = 0; i < (n & ~0x7); i += 8, idx += 8) {
> ring[idx] = obj[i];
> ring[idx + 1] = obj[i + 1];
> @@ -62,7 +62,7 @@ __rte_ring_enqueue_elems_64(struct rte_ring *r, uint32_t prod_head,
> uint32_t idx = prod_head & r->mask;
> uint64_t *ring = (uint64_t *)&r[1];
> const unaligned_uint64_t *obj = (const unaligned_uint64_t *)obj_table;
> - if (likely(idx + n < size)) {
> + if (likely(idx + n <= size)) {
> for (i = 0; i < (n & ~0x3); i += 4, idx += 4) {
> ring[idx] = obj[i];
> ring[idx + 1] = obj[i + 1];
> @@ -95,7 +95,7 @@ __rte_ring_enqueue_elems_128(struct rte_ring *r, uint32_t prod_head,
> uint32_t idx = prod_head & r->mask;
> rte_int128_t *ring = (rte_int128_t *)&r[1];
> const rte_int128_t *obj = (const rte_int128_t *)obj_table;
> - if (likely(idx + n < size)) {
> + if (likely(idx + n <= size)) {
> for (i = 0; i < (n & ~0x1); i += 2, idx += 2)
> memcpy((void *)(ring + idx),
> (const void *)(obj + i), 32);
> @@ -151,7 +151,7 @@ __rte_ring_dequeue_elems_32(struct rte_ring *r, const uint32_t size,
> unsigned int i;
> uint32_t *ring = (uint32_t *)&r[1];
> uint32_t *obj = (uint32_t *)obj_table;
> - if (likely(idx + n < size)) {
> + if (likely(idx + n <= size)) {
> for (i = 0; i < (n & ~0x7); i += 8, idx += 8) {
> obj[i] = ring[idx];
> obj[i + 1] = ring[idx + 1];
> @@ -196,7 +196,7 @@ __rte_ring_dequeue_elems_64(struct rte_ring *r, uint32_t prod_head,
> uint32_t idx = prod_head & r->mask;
> uint64_t *ring = (uint64_t *)&r[1];
> unaligned_uint64_t *obj = (unaligned_uint64_t *)obj_table;
> - if (likely(idx + n < size)) {
> + if (likely(idx + n <= size)) {
> for (i = 0; i < (n & ~0x3); i += 4, idx += 4) {
> obj[i] = ring[idx];
> obj[i + 1] = ring[idx + 1];
> @@ -229,7 +229,7 @@ __rte_ring_dequeue_elems_128(struct rte_ring *r, uint32_t prod_head,
> uint32_t idx = prod_head & r->mask;
> rte_int128_t *ring = (rte_int128_t *)&r[1];
> rte_int128_t *obj = (rte_int128_t *)obj_table;
> - if (likely(idx + n < size)) {
> + if (likely(idx + n <= size)) {
> for (i = 0; i < (n & ~0x1); i += 2, idx += 2)
> memcpy((void *)(obj + i), (void *)(ring + idx), 32);
> switch (n & 0x1) {
> --
> 2.34.1.448.ga2b2bfdf31-goog
>
Well spotted! I took a very good look at it and came to the same conclusion: it is not a functional bug; the only consequence is that the optimized code path may not be taken in a situation where it could be. But it should be fixed as suggested in your patch.
Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/1] ring: fix off by 1 mistake
2022-01-03 14:22 ` [PATCH 1/1] ring: fix off by 1 mistake Andrzej Ostruszka
2022-01-03 14:56 ` Morten Brørup
@ 2022-01-06 10:45 ` Olivier Matz
2022-01-10 15:09 ` Ananyev, Konstantin
2022-01-11 11:37 ` [PATCH] ring: optimize corner case for enqueue/dequeue Andrzej Ostruszka
3 siblings, 0 replies; 11+ messages in thread
From: Olivier Matz @ 2022-01-06 10:45 UTC (permalink / raw)
To: Andrzej Ostruszka; +Cc: dev, upstream
Hi Andrzej,
On Mon, Jan 03, 2022 at 03:22:01PM +0100, Andrzej Ostruszka wrote:
> ring: fix off by 1 mistake
I suggest something less scary for the title:
ring: optimize corner case for enqueue/dequeue
> When enqueueing/dequeueing to/from the ring we try to optimize by manual
> loop unrolling. The check for this optimization looks like:
>
> if (likely(idx + n < size)) {
>
> where 'idx' points to the first usable element (empty slot for enqueue,
> data for dequeue). The correct comparison here should be '<=' instead
> of '<'.
>
> This is not a functional error since we fall back to the loop with
> correct checks on indexes. Just a minor suboptimal behaviour for the
> case when we want to enqueue/dequeue exactly the number of elements that
> we have in the ring before wrapping to its beginning.
>
> Signed-off-by: Andrzej Ostruszka <amo@semihalf.com>
Reviewed-by: Olivier Matz <olivier.matz@6wind.com>
I would tend to add:
Fixes: cc4b218790f6 ("ring: support configurable element size")
But the same error was in the ENQUEUE_PTRS()/DEQUEUE_PTRS() macros
since the beginning, so we may also add:
Fixes: 286bd05bf70d ("ring: optimisations")
This macro was removed in commit 2d6ed071a8b9 ("ring: use custom element
for fixed size API")
Thanks!
^ permalink raw reply [flat|nested] 11+ messages in thread
* RE: [PATCH 1/1] ring: fix off by 1 mistake
2022-01-03 14:22 ` [PATCH 1/1] ring: fix off by 1 mistake Andrzej Ostruszka
2022-01-03 14:56 ` Morten Brørup
2022-01-06 10:45 ` Olivier Matz
@ 2022-01-10 15:09 ` Ananyev, Konstantin
2022-01-11 11:49 ` Andrzej Ostruszka
2022-01-11 11:37 ` [PATCH] ring: optimize corner case for enqueue/dequeue Andrzej Ostruszka
3 siblings, 1 reply; 11+ messages in thread
From: Ananyev, Konstantin @ 2022-01-10 15:09 UTC (permalink / raw)
To: Andrzej Ostruszka, dev; +Cc: upstream
> When enqueueing/dequeueing to/from the ring we try to optimize by manual
> loop unrolling. The check for this optimization looks like:
>
> if (likely(idx + n < size)) {
>
> where 'idx' points to the first usable element (empty slot for enqueue,
> data for dequeue). The correct comparison here should be '<=' instead
> of '<'.
>
> This is not a functional error since we fall back to the loop with
> correct checks on indexes. Just a minor suboptimal behaviour for the
> case when we want to enqueue/dequeue exactly the number of elements that
> we have in the ring before wrapping to its beginning.
>
> Signed-off-by: Andrzej Ostruszka <amo@semihalf.com>
> ---
> lib/ring/rte_ring_elem_pvt.h | 12 ++++++------
> 1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/lib/ring/rte_ring_elem_pvt.h b/lib/ring/rte_ring_elem_pvt.h
> index 275ec55393..83788c56e6 100644
> --- a/lib/ring/rte_ring_elem_pvt.h
> +++ b/lib/ring/rte_ring_elem_pvt.h
> @@ -17,7 +17,7 @@ __rte_ring_enqueue_elems_32(struct rte_ring *r, const uint32_t size,
> unsigned int i;
> uint32_t *ring = (uint32_t *)&r[1];
> const uint32_t *obj = (const uint32_t *)obj_table;
> - if (likely(idx + n < size)) {
> + if (likely(idx + n <= size)) {
> for (i = 0; i < (n & ~0x7); i += 8, idx += 8) {
> ring[idx] = obj[i];
> ring[idx + 1] = obj[i + 1];
> @@ -62,7 +62,7 @@ __rte_ring_enqueue_elems_64(struct rte_ring *r, uint32_t prod_head,
> uint32_t idx = prod_head & r->mask;
> uint64_t *ring = (uint64_t *)&r[1];
> const unaligned_uint64_t *obj = (const unaligned_uint64_t *)obj_table;
> - if (likely(idx + n < size)) {
> + if (likely(idx + n <= size)) {
> for (i = 0; i < (n & ~0x3); i += 4, idx += 4) {
> ring[idx] = obj[i];
> ring[idx + 1] = obj[i + 1];
> @@ -95,7 +95,7 @@ __rte_ring_enqueue_elems_128(struct rte_ring *r, uint32_t prod_head,
> uint32_t idx = prod_head & r->mask;
> rte_int128_t *ring = (rte_int128_t *)&r[1];
> const rte_int128_t *obj = (const rte_int128_t *)obj_table;
> - if (likely(idx + n < size)) {
> + if (likely(idx + n <= size)) {
> for (i = 0; i < (n & ~0x1); i += 2, idx += 2)
> memcpy((void *)(ring + idx),
> (const void *)(obj + i), 32);
> @@ -151,7 +151,7 @@ __rte_ring_dequeue_elems_32(struct rte_ring *r, const uint32_t size,
> unsigned int i;
> uint32_t *ring = (uint32_t *)&r[1];
> uint32_t *obj = (uint32_t *)obj_table;
> - if (likely(idx + n < size)) {
> + if (likely(idx + n <= size)) {
> for (i = 0; i < (n & ~0x7); i += 8, idx += 8) {
> obj[i] = ring[idx];
> obj[i + 1] = ring[idx + 1];
> @@ -196,7 +196,7 @@ __rte_ring_dequeue_elems_64(struct rte_ring *r, uint32_t prod_head,
> uint32_t idx = prod_head & r->mask;
> uint64_t *ring = (uint64_t *)&r[1];
> unaligned_uint64_t *obj = (unaligned_uint64_t *)obj_table;
> - if (likely(idx + n < size)) {
> + if (likely(idx + n <= size)) {
> for (i = 0; i < (n & ~0x3); i += 4, idx += 4) {
> obj[i] = ring[idx];
> obj[i + 1] = ring[idx + 1];
> @@ -229,7 +229,7 @@ __rte_ring_dequeue_elems_128(struct rte_ring *r, uint32_t prod_head,
> uint32_t idx = prod_head & r->mask;
> rte_int128_t *ring = (rte_int128_t *)&r[1];
> rte_int128_t *obj = (rte_int128_t *)obj_table;
> - if (likely(idx + n < size)) {
> + if (likely(idx + n <= size)) {
> for (i = 0; i < (n & ~0x1); i += 2, idx += 2)
> memcpy((void *)(obj + i), (void *)(ring + idx), 32);
> switch (n & 0x1) {
> --
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
> 2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH] ring: optimize corner case for enqueue/dequeue
2022-01-03 14:22 ` [PATCH 1/1] ring: fix off by 1 mistake Andrzej Ostruszka
` (2 preceding siblings ...)
2022-01-10 15:09 ` Ananyev, Konstantin
@ 2022-01-11 11:37 ` Andrzej Ostruszka
2022-01-11 12:00 ` Morten Brørup
2022-02-05 16:48 ` David Marchand
3 siblings, 2 replies; 11+ messages in thread
From: Andrzej Ostruszka @ 2022-01-11 11:37 UTC (permalink / raw)
To: dev; +Cc: Andrzej Ostruszka, Olivier Matz, Konstantin Ananyev
When enqueueing/dequeueing to/from the ring we try to optimize by manual
loop unrolling. The check for this optimization looks like:
if (likely(idx + n < size)) {
where 'idx' points to the first usable element (empty slot for enqueue,
data for dequeue). The correct comparison here should be '<=' instead
of '<'.
This is not a functional error since we fall back to the loop with
correct checks on indexes. Just a minor suboptimal behaviour for the
case when we want to enqueue/dequeue exactly the number of elements that
we have in the ring before wrapping to its beginning.
Fixes: cc4b218790f6 ("ring: support configurable element size")
Fixes: 286bd05bf70d ("ring: optimisations")
Signed-off-by: Andrzej Ostruszka <amo@semihalf.com>
Reviewed-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
lib/ring/rte_ring_elem_pvt.h | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/lib/ring/rte_ring_elem_pvt.h b/lib/ring/rte_ring_elem_pvt.h
index 275ec55393..83788c56e6 100644
--- a/lib/ring/rte_ring_elem_pvt.h
+++ b/lib/ring/rte_ring_elem_pvt.h
@@ -17,7 +17,7 @@ __rte_ring_enqueue_elems_32(struct rte_ring *r, const uint32_t size,
unsigned int i;
uint32_t *ring = (uint32_t *)&r[1];
const uint32_t *obj = (const uint32_t *)obj_table;
- if (likely(idx + n < size)) {
+ if (likely(idx + n <= size)) {
for (i = 0; i < (n & ~0x7); i += 8, idx += 8) {
ring[idx] = obj[i];
ring[idx + 1] = obj[i + 1];
@@ -62,7 +62,7 @@ __rte_ring_enqueue_elems_64(struct rte_ring *r, uint32_t prod_head,
uint32_t idx = prod_head & r->mask;
uint64_t *ring = (uint64_t *)&r[1];
const unaligned_uint64_t *obj = (const unaligned_uint64_t *)obj_table;
- if (likely(idx + n < size)) {
+ if (likely(idx + n <= size)) {
for (i = 0; i < (n & ~0x3); i += 4, idx += 4) {
ring[idx] = obj[i];
ring[idx + 1] = obj[i + 1];
@@ -95,7 +95,7 @@ __rte_ring_enqueue_elems_128(struct rte_ring *r, uint32_t prod_head,
uint32_t idx = prod_head & r->mask;
rte_int128_t *ring = (rte_int128_t *)&r[1];
const rte_int128_t *obj = (const rte_int128_t *)obj_table;
- if (likely(idx + n < size)) {
+ if (likely(idx + n <= size)) {
for (i = 0; i < (n & ~0x1); i += 2, idx += 2)
memcpy((void *)(ring + idx),
(const void *)(obj + i), 32);
@@ -151,7 +151,7 @@ __rte_ring_dequeue_elems_32(struct rte_ring *r, const uint32_t size,
unsigned int i;
uint32_t *ring = (uint32_t *)&r[1];
uint32_t *obj = (uint32_t *)obj_table;
- if (likely(idx + n < size)) {
+ if (likely(idx + n <= size)) {
for (i = 0; i < (n & ~0x7); i += 8, idx += 8) {
obj[i] = ring[idx];
obj[i + 1] = ring[idx + 1];
@@ -196,7 +196,7 @@ __rte_ring_dequeue_elems_64(struct rte_ring *r, uint32_t prod_head,
uint32_t idx = prod_head & r->mask;
uint64_t *ring = (uint64_t *)&r[1];
unaligned_uint64_t *obj = (unaligned_uint64_t *)obj_table;
- if (likely(idx + n < size)) {
+ if (likely(idx + n <= size)) {
for (i = 0; i < (n & ~0x3); i += 4, idx += 4) {
obj[i] = ring[idx];
obj[i + 1] = ring[idx + 1];
@@ -229,7 +229,7 @@ __rte_ring_dequeue_elems_128(struct rte_ring *r, uint32_t prod_head,
uint32_t idx = prod_head & r->mask;
rte_int128_t *ring = (rte_int128_t *)&r[1];
rte_int128_t *obj = (rte_int128_t *)obj_table;
- if (likely(idx + n < size)) {
+ if (likely(idx + n <= size)) {
for (i = 0; i < (n & ~0x1); i += 2, idx += 2)
memcpy((void *)(obj + i), (void *)(ring + idx), 32);
switch (n & 0x1) {
--
2.34.1.575.g55b058a8bb-goog
^ permalink raw reply [flat|nested] 11+ messages in thread
* RE: [PATCH] ring: optimize corner case for enqueue/dequeue
2022-01-11 11:37 ` [PATCH] ring: optimize corner case for enqueue/dequeue Andrzej Ostruszka
@ 2022-01-11 12:00 ` Morten Brørup
2022-01-11 13:46 ` Andrzej Ostruszka
2022-02-05 16:48 ` David Marchand
1 sibling, 1 reply; 11+ messages in thread
From: Morten Brørup @ 2022-01-11 12:00 UTC (permalink / raw)
To: Andrzej Ostruszka, dev; +Cc: Olivier Matz, Konstantin Ananyev
> From: Andrzej Ostruszka [mailto:amo@semihalf.com]
> Sent: Tuesday, 11 January 2022 12.38
>
> When enqueueing/dequeueing to/from the ring we try to optimize by manual
> loop unrolling. The check for this optimization looks like:
>
> if (likely(idx + n < size)) {
>
> where 'idx' points to the first usable element (empty slot for enqueue,
> data for dequeue). The correct comparison here should be '<=' instead
> of '<'.
>
> This is not a functional error since we fall back to the loop with
> correct checks on indexes. Just a minor suboptimal behaviour for the
> case when we want to enqueue/dequeue exactly the number of elements that
> we have in the ring before wrapping to its beginning.
>
> Fixes: cc4b218790f6 ("ring: support configurable element size")
> Fixes: 286bd05bf70d ("ring: optimisations")
>
> Signed-off-by: Andrzej Ostruszka <amo@semihalf.com>
> Reviewed-by: Olivier Matz <olivier.matz@6wind.com>
> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
> ---
> lib/ring/rte_ring_elem_pvt.h | 12 ++++++------
> 1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/lib/ring/rte_ring_elem_pvt.h b/lib/ring/rte_ring_elem_pvt.h
> index 275ec55393..83788c56e6 100644
> --- a/lib/ring/rte_ring_elem_pvt.h
> +++ b/lib/ring/rte_ring_elem_pvt.h
> @@ -17,7 +17,7 @@ __rte_ring_enqueue_elems_32(struct rte_ring *r, const uint32_t size,
> unsigned int i;
> uint32_t *ring = (uint32_t *)&r[1];
> const uint32_t *obj = (const uint32_t *)obj_table;
> - if (likely(idx + n < size)) {
> + if (likely(idx + n <= size)) {
> for (i = 0; i < (n & ~0x7); i += 8, idx += 8) {
> ring[idx] = obj[i];
> ring[idx + 1] = obj[i + 1];
> @@ -62,7 +62,7 @@ __rte_ring_enqueue_elems_64(struct rte_ring *r, uint32_t prod_head,
> uint32_t idx = prod_head & r->mask;
> uint64_t *ring = (uint64_t *)&r[1];
> const unaligned_uint64_t *obj = (const unaligned_uint64_t *)obj_table;
> - if (likely(idx + n < size)) {
> + if (likely(idx + n <= size)) {
> for (i = 0; i < (n & ~0x3); i += 4, idx += 4) {
> ring[idx] = obj[i];
> ring[idx + 1] = obj[i + 1];
> @@ -95,7 +95,7 @@ __rte_ring_enqueue_elems_128(struct rte_ring *r, uint32_t prod_head,
> uint32_t idx = prod_head & r->mask;
> rte_int128_t *ring = (rte_int128_t *)&r[1];
> const rte_int128_t *obj = (const rte_int128_t *)obj_table;
> - if (likely(idx + n < size)) {
> + if (likely(idx + n <= size)) {
> for (i = 0; i < (n & ~0x1); i += 2, idx += 2)
> memcpy((void *)(ring + idx),
> (const void *)(obj + i), 32);
> @@ -151,7 +151,7 @@ __rte_ring_dequeue_elems_32(struct rte_ring *r, const uint32_t size,
> unsigned int i;
> uint32_t *ring = (uint32_t *)&r[1];
> uint32_t *obj = (uint32_t *)obj_table;
> - if (likely(idx + n < size)) {
> + if (likely(idx + n <= size)) {
> for (i = 0; i < (n & ~0x7); i += 8, idx += 8) {
> obj[i] = ring[idx];
> obj[i + 1] = ring[idx + 1];
> @@ -196,7 +196,7 @@ __rte_ring_dequeue_elems_64(struct rte_ring *r, uint32_t prod_head,
> uint32_t idx = prod_head & r->mask;
> uint64_t *ring = (uint64_t *)&r[1];
> unaligned_uint64_t *obj = (unaligned_uint64_t *)obj_table;
> - if (likely(idx + n < size)) {
> + if (likely(idx + n <= size)) {
> for (i = 0; i < (n & ~0x3); i += 4, idx += 4) {
> obj[i] = ring[idx];
> obj[i + 1] = ring[idx + 1];
> @@ -229,7 +229,7 @@ __rte_ring_dequeue_elems_128(struct rte_ring *r, uint32_t prod_head,
> uint32_t idx = prod_head & r->mask;
> rte_int128_t *ring = (rte_int128_t *)&r[1];
> rte_int128_t *obj = (rte_int128_t *)obj_table;
> - if (likely(idx + n < size)) {
> + if (likely(idx + n <= size)) {
> for (i = 0; i < (n & ~0x1); i += 2, idx += 2)
> memcpy((void *)(obj + i), (void *)(ring + idx), 32);
> switch (n & 0x1) {
> --
> 2.34.1.575.g55b058a8bb-goog
>
Also this version of the patch:
Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] ring: optimize corner case for enqueue/dequeue
2022-01-11 12:00 ` Morten Brørup
@ 2022-01-11 13:46 ` Andrzej Ostruszka
2022-01-11 13:53 ` Morten Brørup
0 siblings, 1 reply; 11+ messages in thread
From: Andrzej Ostruszka @ 2022-01-11 13:46 UTC (permalink / raw)
To: Morten Brørup; +Cc: dev
On Tue, Jan 11, 2022 at 01:00:25PM +0100, Morten Brørup wrote:
[...]
> Also this version of the patch:
>
> Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
My apologies, Morten; I don't know how I missed your tag.
With regards
Andrzej Ostruszka
^ permalink raw reply [flat|nested] 11+ messages in thread
* RE: [PATCH] ring: optimize corner case for enqueue/dequeue
2022-01-11 13:46 ` Andrzej Ostruszka
@ 2022-01-11 13:53 ` Morten Brørup
0 siblings, 0 replies; 11+ messages in thread
From: Morten Brørup @ 2022-01-11 13:53 UTC (permalink / raw)
To: Andrzej Ostruszka; +Cc: dev
> From: Andrzej Ostruszka [mailto:amo@semihalf.com]
> Sent: Tuesday, 11 January 2022 14.46
>
> On Tue, Jan 11, 2022 at 01:00:25PM +0100, Morten Brørup wrote:
> [...]
> > Also this version of the patch:
> >
> > Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
>
> My apologies Morten, don't know how I missed your tag.
>
> With regards
> Andrzej Ostruszka
No worries. :-)
I added the tag again, so that Patchwork shows a sufficient level of acceptance when the maintainer considers the patch for merging.
-Morten
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] ring: optimize corner case for enqueue/dequeue
2022-01-11 11:37 ` [PATCH] ring: optimize corner case for enqueue/dequeue Andrzej Ostruszka
2022-01-11 12:00 ` Morten Brørup
@ 2022-02-05 16:48 ` David Marchand
1 sibling, 0 replies; 11+ messages in thread
From: David Marchand @ 2022-02-05 16:48 UTC (permalink / raw)
To: Andrzej Ostruszka; +Cc: dev, Olivier Matz, Konstantin Ananyev
On Tue, Jan 11, 2022 at 12:38 PM Andrzej Ostruszka <amo@semihalf.com> wrote:
> When enqueueing/dequeueing to/from the ring we try to optimize by manual
> loop unrolling. The check for this optimization looks like:
>
> if (likely(idx + n < size)) {
>
> where 'idx' points to the first usable element (empty slot for enqueue,
> data for dequeue). The correct comparison here should be '<=' instead
> of '<'.
>
> This is not a functional error since we fall back to the loop with
> correct checks on indexes. Just a minor suboptimal behaviour for the
> case when we want to enqueue/dequeue exactly the number of elements that
> we have in the ring before wrapping to its beginning.
>
> Fixes: cc4b218790f6 ("ring: support configurable element size")
> Fixes: 286bd05bf70d ("ring: optimisations")
>
> Signed-off-by: Andrzej Ostruszka <amo@semihalf.com>
> Reviewed-by: Olivier Matz <olivier.matz@6wind.com>
> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
Applied, thanks.
--
David Marchand
^ permalink raw reply [flat|nested] 11+ messages in thread