DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev] [PATCH v1 0/2] Use WFE for spinlock and ring
@ 2020-04-24  7:07 Gavin Hu
  2020-04-24  7:07 ` [dpdk-dev] [PATCH v1 1/2] spinlock: use wfe to reduce contention on aarch64 Gavin Hu
                   ` (7 more replies)
  0 siblings, 8 replies; 60+ messages in thread
From: Gavin Hu @ 2020-04-24  7:07 UTC (permalink / raw)
  To: dev
  Cc: nd, david.marchand, thomas, jerinj, Honnappa.Nagarahalli,
	Ruifeng.Wang, Phil.Yang, Joyce.Kong

The rte_wait_until_equal_xxx APIs abstract the functionality of 'polling
for a memory location to become equal to a given value'[1].

Use the API for the rte spinlock and ring implementations.

[1] http://patches.dpdk.org/cover/62703/

Gavin Hu (2):
  spinlock: use wfe to reduce contention on aarch64
  ring: use wfe to wait for ring tail update on aarch64

 lib/librte_eal/include/generic/rte_spinlock.h | 5 ++---
 lib/librte_ring/rte_ring_c11_mem.h            | 4 ++--
 lib/librte_ring/rte_ring_generic.h            | 3 +--
 3 files changed, 5 insertions(+), 7 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [dpdk-dev] [PATCH v1 1/2] spinlock: use wfe to reduce contention on aarch64
  2020-04-24  7:07 [dpdk-dev] [PATCH v1 0/2] Use WFE for spinlock and ring Gavin Hu
@ 2020-04-24  7:07 ` Gavin Hu
  2020-04-24  7:07 ` [dpdk-dev] [PATCH v1 2/2] ring: use wfe to wait for ring tail update " Gavin Hu
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 60+ messages in thread
From: Gavin Hu @ 2020-04-24  7:07 UTC (permalink / raw)
  To: dev
  Cc: nd, david.marchand, thomas, jerinj, Honnappa.Nagarahalli,
	Ruifeng.Wang, Phil.Yang, Joyce.Kong

In acquiring a spinlock, cores repeatedly poll the lock variable.
This is replaced by rte_wait_until_equal API.

Running the micro benchmarking and the testpmd and l3fwd traffic tests
on ThunderX2, Ampere eMAG80 and Arm N1SDP, everything went well and no
notable performance gain nor degradation was measured.

Signed-off-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Tested-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
---
 lib/librte_eal/include/generic/rte_spinlock.h | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/include/generic/rte_spinlock.h b/lib/librte_eal/include/generic/rte_spinlock.h
index 87ae7a4f1..5cc123247 100644
--- a/lib/librte_eal/include/generic/rte_spinlock.h
+++ b/lib/librte_eal/include/generic/rte_spinlock.h
@@ -28,7 +28,7 @@
  * The rte_spinlock_t type.
  */
 typedef struct {
-	volatile int locked; /**< lock status 0 = unlocked, 1 = locked */
+	volatile uint32_t locked; /**< lock status 0 = unlocked, 1 = locked */
 } rte_spinlock_t;
 
 /**
@@ -65,8 +65,7 @@ rte_spinlock_lock(rte_spinlock_t *sl)
 
 	while (!__atomic_compare_exchange_n(&sl->locked, &exp, 1, 0,
 				__ATOMIC_ACQUIRE, __ATOMIC_RELAXED)) {
-		while (__atomic_load_n(&sl->locked, __ATOMIC_RELAXED))
-			rte_pause();
+		rte_wait_until_equal_32(&sl->locked, 0, __ATOMIC_RELAXED);
 		exp = 0;
 	}
 }
-- 
2.17.1


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [dpdk-dev] [PATCH v1 2/2] ring: use wfe to wait for ring tail update on aarch64
  2020-04-24  7:07 [dpdk-dev] [PATCH v1 0/2] Use WFE for spinlock and ring Gavin Hu
  2020-04-24  7:07 ` [dpdk-dev] [PATCH v1 1/2] spinlock: use wfe to reduce contention on aarch64 Gavin Hu
@ 2020-04-24  7:07 ` Gavin Hu
  2020-04-24 13:23   ` Ananyev, Konstantin
  2020-04-24 14:46   ` Honnappa Nagarahalli
  2020-04-26  8:39 ` [dpdk-dev] [PATCH v2 0/2] Use WFE for spinlock and ring Gavin Hu
                   ` (5 subsequent siblings)
  7 siblings, 2 replies; 60+ messages in thread
From: Gavin Hu @ 2020-04-24  7:07 UTC (permalink / raw)
  To: dev
  Cc: nd, david.marchand, thomas, jerinj, Honnappa.Nagarahalli,
	Ruifeng.Wang, Phil.Yang, Joyce.Kong

Instead of polling for tail to be updated, use wfe instruction.

Signed-off-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
---
 lib/librte_ring/rte_ring_c11_mem.h | 4 ++--
 lib/librte_ring/rte_ring_generic.h | 3 +--
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/lib/librte_ring/rte_ring_c11_mem.h b/lib/librte_ring/rte_ring_c11_mem.h
index 0fb73a337..764d8f186 100644
--- a/lib/librte_ring/rte_ring_c11_mem.h
+++ b/lib/librte_ring/rte_ring_c11_mem.h
@@ -2,6 +2,7 @@
  *
  * Copyright (c) 2017,2018 HXT-semitech Corporation.
  * Copyright (c) 2007-2009 Kip Macy kmacy@freebsd.org
+ * Copyright (c) 2019 Arm Limited
  * All rights reserved.
  * Derived from FreeBSD's bufring.h
  * Used as BSD-3 Licensed with permission from Kip Macy.
@@ -21,8 +22,7 @@ update_tail(struct rte_ring_headtail *ht, uint32_t old_val, uint32_t new_val,
 	 * we need to wait for them to complete
 	 */
 	if (!single)
-		while (unlikely(ht->tail != old_val))
-			rte_pause();
+		rte_wait_until_equal_relaxed_32(&ht->tail, old_val);
 
 	__atomic_store_n(&ht->tail, new_val, __ATOMIC_RELEASE);
 }
diff --git a/lib/librte_ring/rte_ring_generic.h b/lib/librte_ring/rte_ring_generic.h
index 953cdbbd5..682852783 100644
--- a/lib/librte_ring/rte_ring_generic.h
+++ b/lib/librte_ring/rte_ring_generic.h
@@ -23,8 +23,7 @@ update_tail(struct rte_ring_headtail *ht, uint32_t old_val, uint32_t new_val,
 	 * we need to wait for them to complete
 	 */
 	if (!single)
-		while (unlikely(ht->tail != old_val))
-			rte_pause();
+		rte_wait_until_equal_relaxed_32(&ht->tail, old_val);
 
 	ht->tail = new_val;
 }
-- 
2.17.1


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [dpdk-dev] [PATCH v1 2/2] ring: use wfe to wait for ring tail update on aarch64
  2020-04-24  7:07 ` [dpdk-dev] [PATCH v1 2/2] ring: use wfe to wait for ring tail update " Gavin Hu
@ 2020-04-24 13:23   ` Ananyev, Konstantin
  2020-04-26  8:41     ` Gavin Hu
  2020-04-24 14:46   ` Honnappa Nagarahalli
  1 sibling, 1 reply; 60+ messages in thread
From: Ananyev, Konstantin @ 2020-04-24 13:23 UTC (permalink / raw)
  To: Gavin Hu, dev
  Cc: nd, david.marchand, thomas, jerinj, Honnappa.Nagarahalli,
	Ruifeng.Wang, Phil.Yang, Joyce.Kong


> 
> Instead of polling for tail to be updated, use wfe instruction.
> 
> Signed-off-by: Gavin Hu <gavin.hu@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> Reviewed-by: Steve Capper <steve.capper@arm.com>
> Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> ---
>  lib/librte_ring/rte_ring_c11_mem.h | 4 ++--
>  lib/librte_ring/rte_ring_generic.h | 3 +--
>  2 files changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/lib/librte_ring/rte_ring_c11_mem.h b/lib/librte_ring/rte_ring_c11_mem.h
> index 0fb73a337..764d8f186 100644
> --- a/lib/librte_ring/rte_ring_c11_mem.h
> +++ b/lib/librte_ring/rte_ring_c11_mem.h
> @@ -2,6 +2,7 @@
>   *
>   * Copyright (c) 2017,2018 HXT-semitech Corporation.
>   * Copyright (c) 2007-2009 Kip Macy kmacy@freebsd.org
> + * Copyright (c) 2019 Arm Limited
>   * All rights reserved.
>   * Derived from FreeBSD's bufring.h
>   * Used as BSD-3 Licensed with permission from Kip Macy.
> @@ -21,8 +22,7 @@ update_tail(struct rte_ring_headtail *ht, uint32_t old_val, uint32_t new_val,
>  	 * we need to wait for them to complete
>  	 */
>  	if (!single)
> -		while (unlikely(ht->tail != old_val))
> -			rte_pause();
> +		rte_wait_until_equal_relaxed_32(&ht->tail, old_val);
> 
>  	__atomic_store_n(&ht->tail, new_val, __ATOMIC_RELEASE);
>  }
> diff --git a/lib/librte_ring/rte_ring_generic.h b/lib/librte_ring/rte_ring_generic.h
> index 953cdbbd5..682852783 100644
> --- a/lib/librte_ring/rte_ring_generic.h
> +++ b/lib/librte_ring/rte_ring_generic.h
> @@ -23,8 +23,7 @@ update_tail(struct rte_ring_headtail *ht, uint32_t old_val, uint32_t new_val,
>  	 * we need to wait for them to complete
>  	 */
>  	if (!single)
> -		while (unlikely(ht->tail != old_val))
> -			rte_pause();
> +		rte_wait_until_equal_relaxed_32(&ht->tail, old_val);

./lib/librte_ring/rte_ring_generic.h:26:3: error: implicit declaration of function 'rte_wait_until_equal_relaxed_32'; did you mean 'rte_wait_until_equal_32'? [-Werror=implicit-function-declaration]
   rte_wait_until_equal_relaxed_32(&ht->tail, old_val);
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


BTW, after patch #1 compilation fails also:
../lib/librte_eal/x86/include/rte_spinlock.h:125:24: error: pointer targets in passing argument 1 of 'rte_try_tm' differ in signedness [-Werror=pointer-sign]
  if (likely(rte_try_tm(&sl->locked)))
                        ^



> 
>  	ht->tail = new_val;
>  }
> --
> 2.17.1


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [dpdk-dev] [PATCH v1 2/2] ring: use wfe to wait for ring tail update on aarch64
  2020-04-24  7:07 ` [dpdk-dev] [PATCH v1 2/2] ring: use wfe to wait for ring tail update " Gavin Hu
  2020-04-24 13:23   ` Ananyev, Konstantin
@ 2020-04-24 14:46   ` Honnappa Nagarahalli
  2020-04-26  8:33     ` Gavin Hu
  1 sibling, 1 reply; 60+ messages in thread
From: Honnappa Nagarahalli @ 2020-04-24 14:46 UTC (permalink / raw)
  To: Gavin Hu, dev
  Cc: nd, david.marchand, thomas, jerinj, Ruifeng Wang, Phil Yang,
	Joyce Kong, Honnappa Nagarahalli, nd

<snip>

Hi Gavin,
	There are other sync modes added to rte_ring library. Can you please extend this to other rte_pause() calls in the library?

Thanks,
Honnappa

> Subject: [PATCH v1 2/2] ring: use wfe to wait for ring tail update on aarch64
> 
> Instead of polling for tail to be updated, use wfe instruction.
> 
> Signed-off-by: Gavin Hu <gavin.hu@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> Reviewed-by: Steve Capper <steve.capper@arm.com>
> Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> ---
>  lib/librte_ring/rte_ring_c11_mem.h | 4 ++--
> lib/librte_ring/rte_ring_generic.h | 3 +--
>  2 files changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/lib/librte_ring/rte_ring_c11_mem.h
> b/lib/librte_ring/rte_ring_c11_mem.h
> index 0fb73a337..764d8f186 100644
> --- a/lib/librte_ring/rte_ring_c11_mem.h
> +++ b/lib/librte_ring/rte_ring_c11_mem.h
> @@ -2,6 +2,7 @@
>   *
>   * Copyright (c) 2017,2018 HXT-semitech Corporation.
>   * Copyright (c) 2007-2009 Kip Macy kmacy@freebsd.org
> + * Copyright (c) 2019 Arm Limited
>   * All rights reserved.
>   * Derived from FreeBSD's bufring.h
>   * Used as BSD-3 Licensed with permission from Kip Macy.
> @@ -21,8 +22,7 @@ update_tail(struct rte_ring_headtail *ht, uint32_t
> old_val, uint32_t new_val,
>  	 * we need to wait for them to complete
>  	 */
>  	if (!single)
> -		while (unlikely(ht->tail != old_val))
> -			rte_pause();
> +		rte_wait_until_equal_relaxed_32(&ht->tail, old_val);
> 
>  	__atomic_store_n(&ht->tail, new_val, __ATOMIC_RELEASE);  } diff --
> git a/lib/librte_ring/rte_ring_generic.h b/lib/librte_ring/rte_ring_generic.h
> index 953cdbbd5..682852783 100644
> --- a/lib/librte_ring/rte_ring_generic.h
> +++ b/lib/librte_ring/rte_ring_generic.h
> @@ -23,8 +23,7 @@ update_tail(struct rte_ring_headtail *ht, uint32_t
> old_val, uint32_t new_val,
>  	 * we need to wait for them to complete
>  	 */
>  	if (!single)
> -		while (unlikely(ht->tail != old_val))
> -			rte_pause();
> +		rte_wait_until_equal_relaxed_32(&ht->tail, old_val);
> 
>  	ht->tail = new_val;
>  }
> --
> 2.17.1


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [dpdk-dev] [PATCH v1 2/2] ring: use wfe to wait for ring tail update on aarch64
  2020-04-24 14:46   ` Honnappa Nagarahalli
@ 2020-04-26  8:33     ` Gavin Hu
  0 siblings, 0 replies; 60+ messages in thread
From: Gavin Hu @ 2020-04-26  8:33 UTC (permalink / raw)
  To: Honnappa Nagarahalli, dev
  Cc: nd, david.marchand, thomas, jerinj, Ruifeng Wang, Phil Yang,
	Joyce Kong, nd, nd

Hi Honnappa,

> -----Original Message-----
> From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> Sent: Friday, April 24, 2020 10:46 PM
> To: Gavin Hu <Gavin.Hu@arm.com>; dev@dpdk.org
> Cc: nd <nd@arm.com>; david.marchand@redhat.com;
> thomas@monjalon.net; jerinj@marvell.com; Ruifeng Wang
> <Ruifeng.Wang@arm.com>; Phil Yang <Phil.Yang@arm.com>; Joyce Kong
> <Joyce.Kong@arm.com>; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; nd <nd@arm.com>
> Subject: RE: [PATCH v1 2/2] ring: use wfe to wait for ring tail update on
> aarch64
> 
> <snip>
> 
> Hi Gavin,
> 	There are other sync modes added to rte_ring library. Can you
> please extend this to other rte_pause() calls in the library?
I looked into the other two calls to rte_cause, they can't use the rte_wait_until_equal_XX APIs simply.
Maybe new APIs are required? 
Here is the 32-bit API for your reference:
rte_wait_until_equal_32(volatile uint32_t *addr, uint32_t expected, int memorder)
> 
> Thanks,
> Honnappa
> 
> > Subject: [PATCH v1 2/2] ring: use wfe to wait for ring tail update on
> aarch64
> >
> > Instead of polling for tail to be updated, use wfe instruction.
> >
> > Signed-off-by: Gavin Hu <gavin.hu@arm.com>
> > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > Reviewed-by: Steve Capper <steve.capper@arm.com>
> > Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
> > Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> > ---
> >  lib/librte_ring/rte_ring_c11_mem.h | 4 ++--
> > lib/librte_ring/rte_ring_generic.h | 3 +--
> >  2 files changed, 3 insertions(+), 4 deletions(-)
> >
> > diff --git a/lib/librte_ring/rte_ring_c11_mem.h
> > b/lib/librte_ring/rte_ring_c11_mem.h
> > index 0fb73a337..764d8f186 100644
> > --- a/lib/librte_ring/rte_ring_c11_mem.h
> > +++ b/lib/librte_ring/rte_ring_c11_mem.h
> > @@ -2,6 +2,7 @@
> >   *
> >   * Copyright (c) 2017,2018 HXT-semitech Corporation.
> >   * Copyright (c) 2007-2009 Kip Macy kmacy@freebsd.org
> > + * Copyright (c) 2019 Arm Limited
> >   * All rights reserved.
> >   * Derived from FreeBSD's bufring.h
> >   * Used as BSD-3 Licensed with permission from Kip Macy.
> > @@ -21,8 +22,7 @@ update_tail(struct rte_ring_headtail *ht, uint32_t
> > old_val, uint32_t new_val,
> >  	 * we need to wait for them to complete
> >  	 */
> >  	if (!single)
> > -		while (unlikely(ht->tail != old_val))
> > -			rte_pause();
> > +		rte_wait_until_equal_relaxed_32(&ht->tail, old_val);
> >
> >  	__atomic_store_n(&ht->tail, new_val, __ATOMIC_RELEASE);  } diff --
> > git a/lib/librte_ring/rte_ring_generic.h b/lib/librte_ring/rte_ring_generic.h
> > index 953cdbbd5..682852783 100644
> > --- a/lib/librte_ring/rte_ring_generic.h
> > +++ b/lib/librte_ring/rte_ring_generic.h
> > @@ -23,8 +23,7 @@ update_tail(struct rte_ring_headtail *ht, uint32_t
> > old_val, uint32_t new_val,
> >  	 * we need to wait for them to complete
> >  	 */
> >  	if (!single)
> > -		while (unlikely(ht->tail != old_val))
> > -			rte_pause();
> > +		rte_wait_until_equal_relaxed_32(&ht->tail, old_val);
> >
> >  	ht->tail = new_val;
> >  }
> > --
> > 2.17.1


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [dpdk-dev] [PATCH v2 0/2] Use WFE for spinlock and ring
  2020-04-24  7:07 [dpdk-dev] [PATCH v1 0/2] Use WFE for spinlock and ring Gavin Hu
  2020-04-24  7:07 ` [dpdk-dev] [PATCH v1 1/2] spinlock: use wfe to reduce contention on aarch64 Gavin Hu
  2020-04-24  7:07 ` [dpdk-dev] [PATCH v1 2/2] ring: use wfe to wait for ring tail update " Gavin Hu
@ 2020-04-26  8:39 ` Gavin Hu
  2020-05-01  9:47   ` Ananyev, Konstantin
  2020-05-03 14:54   ` David Marchand
  2020-04-26  8:39 ` [dpdk-dev] [PATCH v2 1/2] spinlock: use wfe to reduce contention on aarch64 Gavin Hu
                   ` (4 subsequent siblings)
  7 siblings, 2 replies; 60+ messages in thread
From: Gavin Hu @ 2020-04-26  8:39 UTC (permalink / raw)
  To: dev
  Cc: nd, david.marchand, thomas, jerinj, Honnappa.Nagarahalli,
	Ruifeng.Wang, Phil.Yang, Joyce.Kong

The rte_wait_until_equal_xxx APIs abstract the functionality of 'polling
for a memory location to become equal to a given value'[1].

Use the API for the rte spinlock and ring implementations.

[1] http://patches.dpdk.org/cover/62703/

Gavin Hu (2):
  spinlock: use wfe to reduce contention on aarch64
  ring: use wfe to wait for ring tail update on aarch64

 lib/librte_eal/include/generic/rte_spinlock.h | 4 ++--
 lib/librte_ring/rte_ring_c11_mem.h            | 4 ++--
 lib/librte_ring/rte_ring_generic.h            | 3 +--
 3 files changed, 5 insertions(+), 6 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [dpdk-dev] [PATCH v2 1/2] spinlock: use wfe to reduce contention on aarch64
  2020-04-24  7:07 [dpdk-dev] [PATCH v1 0/2] Use WFE for spinlock and ring Gavin Hu
                   ` (2 preceding siblings ...)
  2020-04-26  8:39 ` [dpdk-dev] [PATCH v2 0/2] Use WFE for spinlock and ring Gavin Hu
@ 2020-04-26  8:39 ` Gavin Hu
  2020-04-28 12:22   ` Jerin Jacob
  2020-04-26  8:39 ` [dpdk-dev] [PATCH v2 2/2] ring: use wfe to wait for ring tail update " Gavin Hu
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 60+ messages in thread
From: Gavin Hu @ 2020-04-26  8:39 UTC (permalink / raw)
  To: dev
  Cc: nd, david.marchand, thomas, jerinj, Honnappa.Nagarahalli,
	Ruifeng.Wang, Phil.Yang, Joyce.Kong

In acquiring a spinlock, cores repeatedly poll the lock variable.
This is replaced by rte_wait_until_equal API.

Running the micro benchmarking and the testpmd and l3fwd traffic tests
on ThunderX2, Ampere eMAG80 and Arm N1SDP, everything went well and no
notable performance gain nor degradation was measured.

Signed-off-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Tested-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
---
 lib/librte_eal/include/generic/rte_spinlock.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/include/generic/rte_spinlock.h b/lib/librte_eal/include/generic/rte_spinlock.h
index 87ae7a4f1..40fe49d5a 100644
--- a/lib/librte_eal/include/generic/rte_spinlock.h
+++ b/lib/librte_eal/include/generic/rte_spinlock.h
@@ -65,8 +65,8 @@ rte_spinlock_lock(rte_spinlock_t *sl)
 
 	while (!__atomic_compare_exchange_n(&sl->locked, &exp, 1, 0,
 				__ATOMIC_ACQUIRE, __ATOMIC_RELAXED)) {
-		while (__atomic_load_n(&sl->locked, __ATOMIC_RELAXED))
-			rte_pause();
+		rte_wait_until_equal_32((volatile uint32_t *)&sl->locked,
+			       0, __ATOMIC_RELAXED);
 		exp = 0;
 	}
 }
-- 
2.17.1


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [dpdk-dev] [PATCH v2 2/2] ring: use wfe to wait for ring tail update on aarch64
  2020-04-24  7:07 [dpdk-dev] [PATCH v1 0/2] Use WFE for spinlock and ring Gavin Hu
                   ` (3 preceding siblings ...)
  2020-04-26  8:39 ` [dpdk-dev] [PATCH v2 1/2] spinlock: use wfe to reduce contention on aarch64 Gavin Hu
@ 2020-04-26  8:39 ` Gavin Hu
  2021-04-25  5:56 ` [dpdk-dev] Use WFE for spinlock and ring Ruifeng Wang
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 60+ messages in thread
From: Gavin Hu @ 2020-04-26  8:39 UTC (permalink / raw)
  To: dev
  Cc: nd, david.marchand, thomas, jerinj, Honnappa.Nagarahalli,
	Ruifeng.Wang, Phil.Yang, Joyce.Kong

Instead of polling for tail to be updated, use wfe instruction.

Signed-off-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
---
 lib/librte_ring/rte_ring_c11_mem.h | 4 ++--
 lib/librte_ring/rte_ring_generic.h | 3 +--
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/lib/librte_ring/rte_ring_c11_mem.h b/lib/librte_ring/rte_ring_c11_mem.h
index 0fb73a337..b37ef5995 100644
--- a/lib/librte_ring/rte_ring_c11_mem.h
+++ b/lib/librte_ring/rte_ring_c11_mem.h
@@ -2,6 +2,7 @@
  *
  * Copyright (c) 2017,2018 HXT-semitech Corporation.
  * Copyright (c) 2007-2009 Kip Macy kmacy@freebsd.org
+ * Copyright (c) 2019 Arm Limited
  * All rights reserved.
  * Derived from FreeBSD's bufring.h
  * Used as BSD-3 Licensed with permission from Kip Macy.
@@ -21,8 +22,7 @@ update_tail(struct rte_ring_headtail *ht, uint32_t old_val, uint32_t new_val,
 	 * we need to wait for them to complete
 	 */
 	if (!single)
-		while (unlikely(ht->tail != old_val))
-			rte_pause();
+		rte_wait_until_equal_32(&ht->tail, old_val, __ATOMIC_RELAXED);
 
 	__atomic_store_n(&ht->tail, new_val, __ATOMIC_RELEASE);
 }
diff --git a/lib/librte_ring/rte_ring_generic.h b/lib/librte_ring/rte_ring_generic.h
index 953cdbbd5..afce89aec 100644
--- a/lib/librte_ring/rte_ring_generic.h
+++ b/lib/librte_ring/rte_ring_generic.h
@@ -23,8 +23,7 @@ update_tail(struct rte_ring_headtail *ht, uint32_t old_val, uint32_t new_val,
 	 * we need to wait for them to complete
 	 */
 	if (!single)
-		while (unlikely(ht->tail != old_val))
-			rte_pause();
+		rte_wait_until_equal_32(&ht->tail, old_val, __ATOMIC_RELAXED);
 
 	ht->tail = new_val;
 }
-- 
2.17.1


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [dpdk-dev] [PATCH v1 2/2] ring: use wfe to wait for ring tail update on aarch64
  2020-04-24 13:23   ` Ananyev, Konstantin
@ 2020-04-26  8:41     ` Gavin Hu
  0 siblings, 0 replies; 60+ messages in thread
From: Gavin Hu @ 2020-04-26  8:41 UTC (permalink / raw)
  To: Ananyev, Konstantin, dev
  Cc: nd, david.marchand, thomas, jerinj, Honnappa Nagarahalli,
	Ruifeng Wang, Phil Yang, Joyce Kong, nd

Hi Konstantin,

> -----Original Message-----
> From: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> Sent: Friday, April 24, 2020 9:24 PM
> To: Gavin Hu <Gavin.Hu@arm.com>; dev@dpdk.org
> Cc: nd <nd@arm.com>; david.marchand@redhat.com;
> thomas@monjalon.net; jerinj@marvell.com; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; Ruifeng Wang
> <Ruifeng.Wang@arm.com>; Phil Yang <Phil.Yang@arm.com>; Joyce Kong
> <Joyce.Kong@arm.com>
> Subject: RE: [dpdk-dev] [PATCH v1 2/2] ring: use wfe to wait for ring tail
> update on aarch64
> 
> 
> >
> > Instead of polling for tail to be updated, use wfe instruction.
> >
> > Signed-off-by: Gavin Hu <gavin.hu@arm.com>
> > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > Reviewed-by: Steve Capper <steve.capper@arm.com>
> > Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
> > Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> > ---
> >  lib/librte_ring/rte_ring_c11_mem.h | 4 ++--
> >  lib/librte_ring/rte_ring_generic.h | 3 +--
> >  2 files changed, 3 insertions(+), 4 deletions(-)
> >
> > diff --git a/lib/librte_ring/rte_ring_c11_mem.h
> b/lib/librte_ring/rte_ring_c11_mem.h
> > index 0fb73a337..764d8f186 100644
> > --- a/lib/librte_ring/rte_ring_c11_mem.h
> > +++ b/lib/librte_ring/rte_ring_c11_mem.h
> > @@ -2,6 +2,7 @@
> >   *
> >   * Copyright (c) 2017,2018 HXT-semitech Corporation.
> >   * Copyright (c) 2007-2009 Kip Macy kmacy@freebsd.org
> > + * Copyright (c) 2019 Arm Limited
> >   * All rights reserved.
> >   * Derived from FreeBSD's bufring.h
> >   * Used as BSD-3 Licensed with permission from Kip Macy.
> > @@ -21,8 +22,7 @@ update_tail(struct rte_ring_headtail *ht, uint32_t
> old_val, uint32_t new_val,
> >  	 * we need to wait for them to complete
> >  	 */
> >  	if (!single)
> > -		while (unlikely(ht->tail != old_val))
> > -			rte_pause();
> > +		rte_wait_until_equal_relaxed_32(&ht->tail, old_val);
> >
> >  	__atomic_store_n(&ht->tail, new_val, __ATOMIC_RELEASE);
> >  }
> > diff --git a/lib/librte_ring/rte_ring_generic.h
> b/lib/librte_ring/rte_ring_generic.h
> > index 953cdbbd5..682852783 100644
> > --- a/lib/librte_ring/rte_ring_generic.h
> > +++ b/lib/librte_ring/rte_ring_generic.h
> > @@ -23,8 +23,7 @@ update_tail(struct rte_ring_headtail *ht, uint32_t
> old_val, uint32_t new_val,
> >  	 * we need to wait for them to complete
> >  	 */
> >  	if (!single)
> > -		while (unlikely(ht->tail != old_val))
> > -			rte_pause();
> > +		rte_wait_until_equal_relaxed_32(&ht->tail, old_val);
> 
> ./lib/librte_ring/rte_ring_generic.h:26:3: error: implicit declaration of
> function 'rte_wait_until_equal_relaxed_32'; did you mean
> 'rte_wait_until_equal_32'? [-Werror=implicit-function-declaration]
>    rte_wait_until_equal_relaxed_32(&ht->tail, old_val);
>    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Sorry for the leakage, fixed in v2.
> 
> BTW, after patch #1 compilation fails also:
> ../lib/librte_eal/x86/include/rte_spinlock.h:125:24: error: pointer targets in
> passing argument 1 of 'rte_try_tm' differ in signedness [-Werror=pointer-
> sign]
>   if (likely(rte_try_tm(&sl->locked)))
Fixed in v2, thanks!
>                         ^
> 
> 
> 
> >
> >  	ht->tail = new_val;
> >  }
> > --
> > 2.17.1


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [dpdk-dev] [PATCH v2 1/2] spinlock: use wfe to reduce contention on aarch64
  2020-04-26  8:39 ` [dpdk-dev] [PATCH v2 1/2] spinlock: use wfe to reduce contention on aarch64 Gavin Hu
@ 2020-04-28 12:22   ` Jerin Jacob
  0 siblings, 0 replies; 60+ messages in thread
From: Jerin Jacob @ 2020-04-28 12:22 UTC (permalink / raw)
  To: Gavin Hu
  Cc: dpdk-dev, nd, David Marchand, Thomas Monjalon, Jerin Jacob,
	Honnappa Nagarahalli, Ruifeng Wang (Arm Technology China),
	Phil Yang, Joyce Kong

On Sun, Apr 26, 2020 at 2:09 PM Gavin Hu <gavin.hu@arm.com> wrote:
>
> In acquiring a spinlock, cores repeatedly poll the lock variable.
> This is replaced by rte_wait_until_equal API.
>
> Running the micro benchmarking and the testpmd and l3fwd traffic tests
> on ThunderX2, Ampere eMAG80 and Arm N1SDP, everything went well and no
> notable performance gain nor degradation was measured.
>
> Signed-off-by: Gavin Hu <gavin.hu@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> Reviewed-by: Phil Yang <phil.yang@arm.com>
> Reviewed-by: Steve Capper <steve.capper@arm.com>
> Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> Tested-by: Pavan Nikhilesh <pbhagavatula@marvell.com>

Acked-by: Jerin Jacob <jerinj@marvell.com>


> ---
>  lib/librte_eal/include/generic/rte_spinlock.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/lib/librte_eal/include/generic/rte_spinlock.h b/lib/librte_eal/include/generic/rte_spinlock.h
> index 87ae7a4f1..40fe49d5a 100644
> --- a/lib/librte_eal/include/generic/rte_spinlock.h
> +++ b/lib/librte_eal/include/generic/rte_spinlock.h
> @@ -65,8 +65,8 @@ rte_spinlock_lock(rte_spinlock_t *sl)
>
>         while (!__atomic_compare_exchange_n(&sl->locked, &exp, 1, 0,
>                                 __ATOMIC_ACQUIRE, __ATOMIC_RELAXED)) {
> -               while (__atomic_load_n(&sl->locked, __ATOMIC_RELAXED))
> -                       rte_pause();
> +               rte_wait_until_equal_32((volatile uint32_t *)&sl->locked,
> +                              0, __ATOMIC_RELAXED);
>                 exp = 0;
>         }
>  }
> --
> 2.17.1
>

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [dpdk-dev] [PATCH v2 0/2] Use WFE for spinlock and ring
  2020-04-26  8:39 ` [dpdk-dev] [PATCH v2 0/2] Use WFE for spinlock and ring Gavin Hu
@ 2020-05-01  9:47   ` Ananyev, Konstantin
  2020-05-03 14:54   ` David Marchand
  1 sibling, 0 replies; 60+ messages in thread
From: Ananyev, Konstantin @ 2020-05-01  9:47 UTC (permalink / raw)
  To: Gavin Hu, dev
  Cc: nd, david.marchand, thomas, jerinj, Honnappa.Nagarahalli,
	Ruifeng.Wang, Phil.Yang, Joyce.Kong

> 
> The rte_wait_until_equal_xxx APIs abstract the functionality of 'polling
> for a memory location to become equal to a given value'[1].
> 
> Use the API for the rte spinlock and ring implementations.
> 
> [1] http://patches.dpdk.org/cover/62703/
> 
> Gavin Hu (2):
>   spinlock: use wfe to reduce contention on aarch64
>   ring: use wfe to wait for ring tail update on aarch64
> 
>  lib/librte_eal/include/generic/rte_spinlock.h | 4 ++--
>  lib/librte_ring/rte_ring_c11_mem.h            | 4 ++--
>  lib/librte_ring/rte_ring_generic.h            | 3 +--
>  3 files changed, 5 insertions(+), 6 deletions(-)
> 

Series Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>

> --
> 2.17.1


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [dpdk-dev] [PATCH v2 0/2] Use WFE for spinlock and ring
  2020-04-26  8:39 ` [dpdk-dev] [PATCH v2 0/2] Use WFE for spinlock and ring Gavin Hu
  2020-05-01  9:47   ` Ananyev, Konstantin
@ 2020-05-03 14:54   ` David Marchand
  2020-05-03 21:31     ` Honnappa Nagarahalli
  1 sibling, 1 reply; 60+ messages in thread
From: David Marchand @ 2020-05-03 14:54 UTC (permalink / raw)
  To: Gavin Hu
  Cc: dev, nd, Thomas Monjalon, Jerin Jacob Kollanukkaran,
	Honnappa Nagarahalli, Ruifeng Wang (Arm Technology China),
	Phil Yang, Joyce Kong, Ananyev, Konstantin

On Sun, Apr 26, 2020 at 10:39 AM Gavin Hu <gavin.hu@arm.com> wrote:
>
> The rte_wait_until_equal_xxx APIs abstract the functionality of 'polling
> for a memory location to become equal to a given value'[1].
>
> Use the API for the rte spinlock and ring implementations.
>
> [1] http://patches.dpdk.org/cover/62703/
>
> Gavin Hu (2):
>   spinlock: use wfe to reduce contention on aarch64
>   ring: use wfe to wait for ring tail update on aarch64

This would result in rte_ring and rte_spinlock APIs becoming
experimental and this breaks compilation for external applications
using stable ring and spinlock APIs.
IIRC, it was the reason why these patches were dropped with the
introduction of the rte_wait_until_equal_* API.


-- 
David Marchand


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [dpdk-dev] [PATCH v2 0/2] Use WFE for spinlock and ring
  2020-05-03 14:54   ` David Marchand
@ 2020-05-03 21:31     ` Honnappa Nagarahalli
  2020-09-25 14:30       ` David Marchand
  0 siblings, 1 reply; 60+ messages in thread
From: Honnappa Nagarahalli @ 2020-05-03 21:31 UTC (permalink / raw)
  To: David Marchand
  Cc: dev, nd, thomas, jerinj, Ruifeng Wang, Phil Yang, Joyce Kong,
	Ananyev, Konstantin, Honnappa Nagarahalli, nd, nd

<snip>

> Subject: Re: [PATCH v2 0/2] Use WFE for spinlock and ring
> 
> On Sun, Apr 26, 2020 at 10:39 AM Gavin Hu <gavin.hu@arm.com> wrote:
> >
> > The rte_wait_until_equal_xxx APIs abstract the functionality of
> > 'polling for a memory location to become equal to a given value'[1].
> >
> > Use the API for the rte spinlock and ring implementations.
> >
> > [1] http://patches.dpdk.org/cover/62703/
> >
> > Gavin Hu (2):
> >   spinlock: use wfe to reduce contention on aarch64
> >   ring: use wfe to wait for ring tail update on aarch64
> 
> This would result in rte_ring and rte_spinlock APIs becoming experimental
> and this breaks compilation for external applications using stable ring and
> spinlock APIs.
> IIRC, it was the reason why these patches were dropped with the introduction
> of the rte_wait_until_equal_* API.
Agreed, the rte_ring new sync modes are resulting in different use cases for these APIs. We need to take a relook at the APIs.

> 
> 
> --
> David Marchand


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [dpdk-dev] [PATCH v2 0/2] Use WFE for spinlock and ring
  2020-05-03 21:31     ` Honnappa Nagarahalli
@ 2020-09-25 14:30       ` David Marchand
  2021-03-25 14:58         ` David Marchand
  0 siblings, 1 reply; 60+ messages in thread
From: David Marchand @ 2020-09-25 14:30 UTC (permalink / raw)
  To: Honnappa Nagarahalli
  Cc: dev, nd, thomas, jerinj, Ruifeng Wang, Phil Yang, Joyce Kong,
	Ananyev, Konstantin

Hello Honnappa,

On Sun, May 3, 2020 at 11:32 PM Honnappa Nagarahalli
<Honnappa.Nagarahalli@arm.com> wrote:
>
> <snip>
>
> > Subject: Re: [PATCH v2 0/2] Use WFE for spinlock and ring
> >
> > On Sun, Apr 26, 2020 at 10:39 AM Gavin Hu <gavin.hu@arm.com> wrote:
> > >
> > > The rte_wait_until_equal_xxx APIs abstract the functionality of
> > > 'polling for a memory location to become equal to a given value'[1].
> > >
> > > Use the API for the rte spinlock and ring implementations.
> > >
> > > [1] http://patches.dpdk.org/cover/62703/
> > >
> > > Gavin Hu (2):
> > >   spinlock: use wfe to reduce contention on aarch64
> > >   ring: use wfe to wait for ring tail update on aarch64
> >
> > This would result in rte_ring and rte_spinlock APIs becoming experimental
> > and this breaks compilation for external applications using stable ring and
> > spinlock APIs.
> > IIRC, it was the reason why these patches were dropped with the introduction
> > of the rte_wait_until_equal_* API.
> Agreed, the rte_ring new sync modes are resulting in different use cases for these APIs. We need to take a relook at the APIs.

Will we reconsider this series now that the wait until equal is going stable?


-- 
David Marchand


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [dpdk-dev] [PATCH v2 0/2] Use WFE for spinlock and ring
  2020-09-25 14:30       ` David Marchand
@ 2021-03-25 14:58         ` David Marchand
  2021-04-25  6:06           ` Ruifeng Wang
  0 siblings, 1 reply; 60+ messages in thread
From: David Marchand @ 2021-03-25 14:58 UTC (permalink / raw)
  To: Honnappa Nagarahalli
  Cc: dev, nd, thomas, jerinj, Ruifeng Wang, Phil Yang, Joyce Kong,
	Ananyev, Konstantin

Hello,

On Fri, Sep 25, 2020 at 4:30 PM David Marchand
<david.marchand@redhat.com> wrote:
>
> Hello Honnappa,
>
> On Sun, May 3, 2020 at 11:32 PM Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com> wrote:
> >
> > <snip>
> >
> > > Subject: Re: [PATCH v2 0/2] Use WFE for spinlock and ring
> > >
> > > On Sun, Apr 26, 2020 at 10:39 AM Gavin Hu <gavin.hu@arm.com> wrote:
> > > >
> > > > The rte_wait_until_equal_xxx APIs abstract the functionality of
> > > > 'polling for a memory location to become equal to a given value'[1].
> > > >
> > > > Use the API for the rte spinlock and ring implementations.
> > > >
> > > > [1] http://patches.dpdk.org/cover/62703/
> > > >
> > > > Gavin Hu (2):
> > > >   spinlock: use wfe to reduce contention on aarch64
> > > >   ring: use wfe to wait for ring tail update on aarch64
> > >
> > > This would result in rte_ring and rte_spinlock APIs becoming experimental
> > > and this breaks compilation for external applications using stable ring and
> > > spinlock APIs.
> > > IIRC, it was the reason why these patches were dropped with the introduction
> > > of the rte_wait_until_equal_* API.
> > Agreed, the rte_ring new sync modes are resulting in different use cases for these APIs. We need to take a relook at the APIs.
>
> Will we reconsider this series now that the wait until equal is going stable?

Any update?
I suppose this would need some rebasing after the ring library changes.

Thanks.


-- 
David Marchand


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [dpdk-dev] Use WFE for spinlock and ring
  2020-04-24  7:07 [dpdk-dev] [PATCH v1 0/2] Use WFE for spinlock and ring Gavin Hu
                   ` (4 preceding siblings ...)
  2020-04-26  8:39 ` [dpdk-dev] [PATCH v2 2/2] ring: use wfe to wait for ring tail update " Gavin Hu
@ 2021-04-25  5:56 ` Ruifeng Wang
  2021-04-25  5:56   ` [dpdk-dev] [PATCH v3 1/2] spinlock: use wfe to reduce contention on aarch64 Ruifeng Wang
                     ` (3 more replies)
  2021-07-07  5:43 ` [dpdk-dev] [PATCH v4 0/3] " Ruifeng Wang
  2021-07-07  5:48 ` [dpdk-dev] [PATCH v4 0/3] Use WFE for spinlock and ring Ruifeng Wang
  7 siblings, 4 replies; 60+ messages in thread
From: Ruifeng Wang @ 2021-04-25  5:56 UTC (permalink / raw)
  Cc: dev, david.marchand, thomas, jerinj, nd, honnappa.nagarahalli,
	ruifeng.wang

The rte_wait_until_equal_xxx APIs abstract the functionality of 'polling
for a memory location to become equal to a given value'[1].

Use the API for the rte spinlock and ring implementations.
With the wait until equal APIs being stable, changes will not impact ABI.

[1] http://patches.dpdk.org/cover/62703/

v3:
Series rebased. (David)

Gavin Hu (1):
  spinlock: use wfe to reduce contention on aarch64

Ruifeng Wang (1):
  ring: use wfe to wait for ring tail update on aarch64

 lib/eal/include/generic/rte_spinlock.h | 4 ++--
 lib/ring/rte_ring_c11_pvt.h            | 4 ++--
 lib/ring/rte_ring_generic_pvt.h        | 3 +--
 3 files changed, 5 insertions(+), 6 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [dpdk-dev] [PATCH v3 1/2] spinlock: use wfe to reduce contention on aarch64
  2021-04-25  5:56 ` [dpdk-dev] Use WFE for spinlock and ring Ruifeng Wang
@ 2021-04-25  5:56   ` Ruifeng Wang
  2021-04-25  5:56   ` [dpdk-dev] [PATCH v3 2/2] ring: use wfe to wait for ring tail update " Ruifeng Wang
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 60+ messages in thread
From: Ruifeng Wang @ 2021-04-25  5:56 UTC (permalink / raw)
  Cc: dev, david.marchand, thomas, jerinj, nd, honnappa.nagarahalli,
	ruifeng.wang, Gavin Hu, Phil Yang, Steve Capper, Ola Liljedahl,
	Pavan Nikhilesh, Konstantin Ananyev

From: Gavin Hu <gavin.hu@arm.com>

In acquiring a spinlock, cores repeatedly poll the lock variable.
This is replaced by rte_wait_until_equal API.

Running the micro benchmarking and the testpmd and l3fwd traffic tests
on ThunderX2, Ampere eMAG80 and Arm N1SDP, everything went well and no
notable performance gain nor degradation was measured.

Signed-off-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Tested-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
---
 lib/eal/include/generic/rte_spinlock.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/eal/include/generic/rte_spinlock.h b/lib/eal/include/generic/rte_spinlock.h
index 87ae7a4f18..40fe49d5ad 100644
--- a/lib/eal/include/generic/rte_spinlock.h
+++ b/lib/eal/include/generic/rte_spinlock.h
@@ -65,8 +65,8 @@ rte_spinlock_lock(rte_spinlock_t *sl)
 
 	while (!__atomic_compare_exchange_n(&sl->locked, &exp, 1, 0,
 				__ATOMIC_ACQUIRE, __ATOMIC_RELAXED)) {
-		while (__atomic_load_n(&sl->locked, __ATOMIC_RELAXED))
-			rte_pause();
+		rte_wait_until_equal_32((volatile uint32_t *)&sl->locked,
+			       0, __ATOMIC_RELAXED);
 		exp = 0;
 	}
 }
-- 
2.25.1


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [dpdk-dev] [PATCH v3 2/2] ring: use wfe to wait for ring tail update on aarch64
  2021-04-25  5:56 ` [dpdk-dev] Use WFE for spinlock and ring Ruifeng Wang
  2021-04-25  5:56   ` [dpdk-dev] [PATCH v3 1/2] spinlock: use wfe to reduce contention on aarch64 Ruifeng Wang
@ 2021-04-25  5:56   ` Ruifeng Wang
  2021-04-26  5:38     ` Jerin Jacob
  2021-04-28 17:17     ` Stephen Hemminger
  2021-04-28  7:42   ` [dpdk-dev] Use WFE for spinlock and ring David Marchand
  2021-07-07 14:47   ` Stephen Hemminger
  3 siblings, 2 replies; 60+ messages in thread
From: Ruifeng Wang @ 2021-04-25  5:56 UTC (permalink / raw)
  To: Honnappa Nagarahalli, Konstantin Ananyev
  Cc: dev, david.marchand, thomas, jerinj, nd, ruifeng.wang, Gavin Hu,
	Steve Capper, Ola Liljedahl

Instead of polling for tail to be updated, use wfe instruction.

Signed-off-by: Gavin Hu <gavin.hu@arm.com>
Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 lib/ring/rte_ring_c11_pvt.h     | 4 ++--
 lib/ring/rte_ring_generic_pvt.h | 3 +--
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/lib/ring/rte_ring_c11_pvt.h b/lib/ring/rte_ring_c11_pvt.h
index 759192f4c4..37e0b2afd6 100644
--- a/lib/ring/rte_ring_c11_pvt.h
+++ b/lib/ring/rte_ring_c11_pvt.h
@@ -2,6 +2,7 @@
  *
  * Copyright (c) 2017,2018 HXT-semitech Corporation.
  * Copyright (c) 2007-2009 Kip Macy kmacy@freebsd.org
+ * Copyright (c) 2021 Arm Limited
  * All rights reserved.
  * Derived from FreeBSD's bufring.h
  * Used as BSD-3 Licensed with permission from Kip Macy.
@@ -21,8 +22,7 @@ __rte_ring_update_tail(struct rte_ring_headtail *ht, uint32_t old_val,
 	 * we need to wait for them to complete
 	 */
 	if (!single)
-		while (unlikely(ht->tail != old_val))
-			rte_pause();
+		rte_wait_until_equal_32(&ht->tail, old_val, __ATOMIC_RELAXED);
 
 	__atomic_store_n(&ht->tail, new_val, __ATOMIC_RELEASE);
 }
diff --git a/lib/ring/rte_ring_generic_pvt.h b/lib/ring/rte_ring_generic_pvt.h
index 532deb5e7a..c95ad7e12c 100644
--- a/lib/ring/rte_ring_generic_pvt.h
+++ b/lib/ring/rte_ring_generic_pvt.h
@@ -23,8 +23,7 @@ __rte_ring_update_tail(struct rte_ring_headtail *ht, uint32_t old_val,
 	 * we need to wait for them to complete
 	 */
 	if (!single)
-		while (unlikely(ht->tail != old_val))
-			rte_pause();
+		rte_wait_until_equal_32(&ht->tail, old_val, __ATOMIC_RELAXED);
 
 	ht->tail = new_val;
 }
-- 
2.25.1


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [dpdk-dev] [PATCH v2 0/2] Use WFE for spinlock and ring
  2021-03-25 14:58         ` David Marchand
@ 2021-04-25  6:06           ` Ruifeng Wang
  2021-04-25 15:56             ` Thomas Monjalon
  0 siblings, 1 reply; 60+ messages in thread
From: Ruifeng Wang @ 2021-04-25  6:06 UTC (permalink / raw)
  To: David Marchand, Honnappa Nagarahalli
  Cc: dev, nd, thomas, jerinj, Joyce Kong, Ananyev, Konstantin, nd

> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Thursday, March 25, 2021 10:58 PM
> To: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> Cc: dev <dev@dpdk.org>; nd <nd@arm.com>; thomas@monjalon.net;
> jerinj@marvell.com; Ruifeng Wang <Ruifeng.Wang@arm.com>; Phil Yang
> <Phil.Yang@arm.com>; Joyce Kong <Joyce.Kong@arm.com>; Ananyev,
> Konstantin <konstantin.ananyev@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v2 0/2] Use WFE for spinlock and ring
> 
> Hello,
> 
> On Fri, Sep 25, 2020 at 4:30 PM David Marchand
> <david.marchand@redhat.com> wrote:
> >
> > Hello Honnappa,
> >
> > On Sun, May 3, 2020 at 11:32 PM Honnappa Nagarahalli
> > <Honnappa.Nagarahalli@arm.com> wrote:
> > >
> > > <snip>
> > >
> > > > Subject: Re: [PATCH v2 0/2] Use WFE for spinlock and ring
> > > >
> > > > On Sun, Apr 26, 2020 at 10:39 AM Gavin Hu <gavin.hu@arm.com> wrote:
> > > > >
> > > > > The rte_wait_until_equal_xxx APIs abstract the functionality of
> > > > > 'polling for a memory location to become equal to a given value'[1].
> > > > >
> > > > > Use the API for the rte spinlock and ring implementations.
> > > > >
> > > > > [1] http://patches.dpdk.org/cover/62703/
> > > > >
> > > > > Gavin Hu (2):
> > > > >   spinlock: use wfe to reduce contention on aarch64
> > > > >   ring: use wfe to wait for ring tail update on aarch64
> > > >
> > > > This would result in rte_ring and rte_spinlock APIs becoming
> > > > experimental and this breaks compilation for external applications
> > > > using stable ring and spinlock APIs.
> > > > IIRC, it was the reason why these patches were dropped with the
> > > > introduction of the rte_wait_until_equal_* API.
> > > Agreed, the rte_ring new sync modes are resulting in different use cases
> for these APIs. We need to take a relook at the APIs.
> >
> > Will we reconsider this series now that the wait until equal is going stable?
> 
> Any update?
> I suppose this would need some rebasing after the ring library changes.

Sorry for the late response.
This series has been rebased and v3 posted.
I have problem to mark v2 as superseded.

Thank you.
> 
> Thanks.
> 
> 
> --
> David Marchand


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [dpdk-dev] [PATCH v2 0/2] Use WFE for spinlock and ring
  2021-04-25  6:06           ` Ruifeng Wang
@ 2021-04-25 15:56             ` Thomas Monjalon
  0 siblings, 0 replies; 60+ messages in thread
From: Thomas Monjalon @ 2021-04-25 15:56 UTC (permalink / raw)
  To: Ruifeng Wang, David Marchand, Honnappa Nagarahalli
  Cc: dev, nd, Jerin Jacob Kollanukkaran, Joyce Kong, Konstantin Ananyev

Ruifeng Wang:
> From: David Marchand <david.marchand@redhat.com>
> > Any update?
> > I suppose this would need some rebasing after the ring library changes.
> 
> Sorry for the late response.
> This series has been rebased and v3 posted.
> I have problem to mark v2 as superseded.

I did change v2 as superseded.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [dpdk-dev] [PATCH v3 2/2] ring: use wfe to wait for ring tail update on aarch64
  2021-04-25  5:56   ` [dpdk-dev] [PATCH v3 2/2] ring: use wfe to wait for ring tail update " Ruifeng Wang
@ 2021-04-26  5:38     ` Jerin Jacob
  2021-04-28 17:17     ` Stephen Hemminger
  1 sibling, 0 replies; 60+ messages in thread
From: Jerin Jacob @ 2021-04-26  5:38 UTC (permalink / raw)
  To: Ruifeng Wang
  Cc: Honnappa Nagarahalli, Konstantin Ananyev, dpdk-dev,
	David Marchand, Thomas Monjalon, Jerin Jacob, nd, Gavin Hu,
	Steve Capper, Ola Liljedahl

On Sun, Apr 25, 2021 at 11:27 AM Ruifeng Wang <ruifeng.wang@arm.com> wrote:
>
> Instead of polling for tail to be updated, use wfe instruction.
>
> Signed-off-by: Gavin Hu <gavin.hu@arm.com>
> Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
> Reviewed-by: Steve Capper <steve.capper@arm.com>
> Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>

Acked-by: Jerin Jacob <jerinj@marvell.com>



> ---
>  lib/ring/rte_ring_c11_pvt.h     | 4 ++--
>  lib/ring/rte_ring_generic_pvt.h | 3 +--
>  2 files changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/lib/ring/rte_ring_c11_pvt.h b/lib/ring/rte_ring_c11_pvt.h
> index 759192f4c4..37e0b2afd6 100644
> --- a/lib/ring/rte_ring_c11_pvt.h
> +++ b/lib/ring/rte_ring_c11_pvt.h
> @@ -2,6 +2,7 @@
>   *
>   * Copyright (c) 2017,2018 HXT-semitech Corporation.
>   * Copyright (c) 2007-2009 Kip Macy kmacy@freebsd.org
> + * Copyright (c) 2021 Arm Limited
>   * All rights reserved.
>   * Derived from FreeBSD's bufring.h
>   * Used as BSD-3 Licensed with permission from Kip Macy.
> @@ -21,8 +22,7 @@ __rte_ring_update_tail(struct rte_ring_headtail *ht, uint32_t old_val,
>          * we need to wait for them to complete
>          */
>         if (!single)
> -               while (unlikely(ht->tail != old_val))
> -                       rte_pause();
> +               rte_wait_until_equal_32(&ht->tail, old_val, __ATOMIC_RELAXED);
>
>         __atomic_store_n(&ht->tail, new_val, __ATOMIC_RELEASE);
>  }
> diff --git a/lib/ring/rte_ring_generic_pvt.h b/lib/ring/rte_ring_generic_pvt.h
> index 532deb5e7a..c95ad7e12c 100644
> --- a/lib/ring/rte_ring_generic_pvt.h
> +++ b/lib/ring/rte_ring_generic_pvt.h
> @@ -23,8 +23,7 @@ __rte_ring_update_tail(struct rte_ring_headtail *ht, uint32_t old_val,
>          * we need to wait for them to complete
>          */
>         if (!single)
> -               while (unlikely(ht->tail != old_val))
> -                       rte_pause();
> +               rte_wait_until_equal_32(&ht->tail, old_val, __ATOMIC_RELAXED);
>
>         ht->tail = new_val;
>  }
> --
> 2.25.1
>

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [dpdk-dev] Use WFE for spinlock and ring
  2021-04-25  5:56 ` [dpdk-dev] Use WFE for spinlock and ring Ruifeng Wang
  2021-04-25  5:56   ` [dpdk-dev] [PATCH v3 1/2] spinlock: use wfe to reduce contention on aarch64 Ruifeng Wang
  2021-04-25  5:56   ` [dpdk-dev] [PATCH v3 2/2] ring: use wfe to wait for ring tail update " Ruifeng Wang
@ 2021-04-28  7:42   ` David Marchand
  2021-04-28  9:30     ` Ruifeng Wang
  2021-07-07 14:47   ` Stephen Hemminger
  3 siblings, 1 reply; 60+ messages in thread
From: David Marchand @ 2021-04-28  7:42 UTC (permalink / raw)
  To: Ruifeng Wang
  Cc: dev, Thomas Monjalon, Jerin Jacob Kollanukkaran, nd,
	Honnappa Nagarahalli

Hello Ruifeng,

On Sun, Apr 25, 2021 at 7:57 AM Ruifeng Wang <ruifeng.wang@arm.com> wrote:
>
> The rte_wait_until_equal_xxx APIs abstract the functionality of 'polling
> for a memory location to become equal to a given value'[1].
>
> Use the API for the rte spinlock and ring implementations.
> With the wait until equal APIs being stable, changes will not impact ABI.

Afaics, there is no ARM target with WFE enabled and we lost ability to
enable WFE support with removal of the make build system.

$ git grep RTE_ARM_USE_WFE
config/arm/meson.build:        ['RTE_ARM_USE_WFE', false],
lib/eal/arm/include/rte_pause_64.h:#ifdef RTE_ARM_USE_WFE

How did you enable WFE to test this series?


-- 
David Marchand


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [dpdk-dev] Use WFE for spinlock and ring
  2021-04-28  7:42   ` [dpdk-dev] Use WFE for spinlock and ring David Marchand
@ 2021-04-28  9:30     ` Ruifeng Wang
  2021-04-28 11:13       ` Thomas Monjalon
  0 siblings, 1 reply; 60+ messages in thread
From: Ruifeng Wang @ 2021-04-28  9:30 UTC (permalink / raw)
  To: David Marchand; +Cc: dev, thomas, jerinj, nd, Honnappa Nagarahalli, nd

> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Wednesday, April 28, 2021 3:42 PM
> To: Ruifeng Wang <Ruifeng.Wang@arm.com>
> Cc: dev <dev@dpdk.org>; thomas@monjalon.net; jerinj@marvell.com; nd
> <nd@arm.com>; Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> Subject: Re: Use WFE for spinlock and ring
> 
> Hello Ruifeng,

Hello David,
> 
> On Sun, Apr 25, 2021 at 7:57 AM Ruifeng Wang <ruifeng.wang@arm.com>
> wrote:
> >
> > The rte_wait_until_equal_xxx APIs abstract the functionality of
> > 'polling for a memory location to become equal to a given value'[1].
> >
> > Use the API for the rte spinlock and ring implementations.
> > With the wait until equal APIs being stable, changes will not impact ABI.
> 
> Afaics, there is no ARM target with WFE enabled and we lost ability to enable
> WFE support with removal of the make build system.

WFE can be enabled with direct meson file change.
WFE is not intended to be enabled by default. It can be enabled based on benchmarking
result on hardware.
> 
> $ git grep RTE_ARM_USE_WFE
> config/arm/meson.build:        ['RTE_ARM_USE_WFE', false],
> lib/eal/arm/include/rte_pause_64.h:#ifdef RTE_ARM_USE_WFE
> 
> How did you enable WFE to test this series?

I modified meson file to test.
Tests were also done with WFE disabled to make sure no degradation with generic implementation.
> 
> 
> --
> David Marchand


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [dpdk-dev] Use WFE for spinlock and ring
  2021-04-28  9:30     ` Ruifeng Wang
@ 2021-04-28 11:13       ` Thomas Monjalon
  2021-04-29 14:28         ` Ruifeng Wang
  0 siblings, 1 reply; 60+ messages in thread
From: Thomas Monjalon @ 2021-04-28 11:13 UTC (permalink / raw)
  To: Ruifeng Wang; +Cc: David Marchand, dev, jerinj, nd, Honnappa Nagarahalli

28/04/2021 11:30, Ruifeng Wang:
> From: David Marchand <david.marchand@redhat.com>
> > On Sun, Apr 25, 2021 at 7:57 AM Ruifeng Wang <ruifeng.wang@arm.com>
> > wrote:
> > >
> > > The rte_wait_until_equal_xxx APIs abstract the functionality of
> > > 'polling for a memory location to become equal to a given value'[1].
> > >
> > > Use the API for the rte spinlock and ring implementations.
> > > With the wait until equal APIs being stable, changes will not impact ABI.
> > 
> > Afaics, there is no ARM target with WFE enabled and we lost ability to enable
> > WFE support with removal of the make build system.
> 
> WFE can be enabled with direct meson file change.
> WFE is not intended to be enabled by default. It can be enabled based on benchmarking
> result on hardware.
> > 
> > $ git grep RTE_ARM_USE_WFE
> > config/arm/meson.build:        ['RTE_ARM_USE_WFE', false],
> > lib/eal/arm/include/rte_pause_64.h:#ifdef RTE_ARM_USE_WFE
> > 
> > How did you enable WFE to test this series?
> 
> I modified meson file to test.
> Tests were also done with WFE disabled to make sure no degradation with generic implementation.

I don't understand the usage.
Which platform should use it?
Should it be a compile-time option?



^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [dpdk-dev] [PATCH v3 2/2] ring: use wfe to wait for ring tail update on aarch64
  2021-04-25  5:56   ` [dpdk-dev] [PATCH v3 2/2] ring: use wfe to wait for ring tail update " Ruifeng Wang
  2021-04-26  5:38     ` Jerin Jacob
@ 2021-04-28 17:17     ` Stephen Hemminger
  2021-04-29 14:35       ` Ruifeng Wang
  1 sibling, 1 reply; 60+ messages in thread
From: Stephen Hemminger @ 2021-04-28 17:17 UTC (permalink / raw)
  To: Ruifeng Wang
  Cc: Honnappa Nagarahalli, Konstantin Ananyev, dev, david.marchand,
	thomas, jerinj, nd, Gavin Hu, Steve Capper, Ola Liljedahl

On Sun, 25 Apr 2021 05:56:53 +0000
Ruifeng Wang <ruifeng.wang@arm.com> wrote:

> Instead of polling for tail to be updated, use wfe instruction.
> 
> Signed-off-by: Gavin Hu <gavin.hu@arm.com>
> Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
> Reviewed-by: Steve Capper <steve.capper@arm.com>
> Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>

Looks ok to me, but it does raise an interesting question.
Shouldn't the original code have been using atomic load to look at ht->tail.

This another place where "volatile considered harmful" applies.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [dpdk-dev] Use WFE for spinlock and ring
  2021-04-28 11:13       ` Thomas Monjalon
@ 2021-04-29 14:28         ` Ruifeng Wang
  2021-04-29 15:20           ` Thomas Monjalon
  0 siblings, 1 reply; 60+ messages in thread
From: Ruifeng Wang @ 2021-04-29 14:28 UTC (permalink / raw)
  To: thomas; +Cc: David Marchand, dev, jerinj, nd, Honnappa Nagarahalli, nd

> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Wednesday, April 28, 2021 7:14 PM
> To: Ruifeng Wang <Ruifeng.Wang@arm.com>
> Cc: David Marchand <david.marchand@redhat.com>; dev <dev@dpdk.org>;
> jerinj@marvell.com; nd <nd@arm.com>; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>
> Subject: Re: Use WFE for spinlock and ring
> 
> 28/04/2021 11:30, Ruifeng Wang:
> > From: David Marchand <david.marchand@redhat.com>
> > > On Sun, Apr 25, 2021 at 7:57 AM Ruifeng Wang <ruifeng.wang@arm.com>
> > > wrote:
> > > >
> > > > The rte_wait_until_equal_xxx APIs abstract the functionality of
> > > > 'polling for a memory location to become equal to a given value'[1].
> > > >
> > > > Use the API for the rte spinlock and ring implementations.
> > > > With the wait until equal APIs being stable, changes will not impact ABI.
> > >
> > > Afaics, there is no ARM target with WFE enabled and we lost ability
> > > to enable WFE support with removal of the make build system.
> >
> > WFE can be enabled with direct meson file change.
> > WFE is not intended to be enabled by default. It can be enabled based
> > on benchmarking result on hardware.
> > >
> > > $ git grep RTE_ARM_USE_WFE
> > > config/arm/meson.build:        ['RTE_ARM_USE_WFE', false],
> > > lib/eal/arm/include/rte_pause_64.h:#ifdef RTE_ARM_USE_WFE
> > >
> > > How did you enable WFE to test this series?
> >
> > I modified meson file to test.
> > Tests were also done with WFE disabled to make sure no degradation with
> generic implementation.
> 
> I don't understand the usage.
> Which platform should use it?

Platforms that implement WFE semantic (e.g. N1) can use.
The user can enable this feature for power efficiency purpose. But there is something to
note as described in commit message 1be7855d77 when the API was introduced. 

> Should it be a compile-time option?

Yes, it should be a compile-time option.
It can be configured via c_args meson option?
> 


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [dpdk-dev] [PATCH v3 2/2] ring: use wfe to wait for ring tail update on aarch64
  2021-04-28 17:17     ` Stephen Hemminger
@ 2021-04-29 14:35       ` Ruifeng Wang
  2021-04-29 15:05         ` Stephen Hemminger
  0 siblings, 1 reply; 60+ messages in thread
From: Ruifeng Wang @ 2021-04-29 14:35 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Honnappa Nagarahalli, Konstantin Ananyev, dev, david.marchand,
	thomas, jerinj, nd, Gavin Hu, Steve Capper, Ola Liljedahl, nd

> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Thursday, April 29, 2021 1:17 AM
> To: Ruifeng Wang <Ruifeng.Wang@arm.com>
> Cc: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; Konstantin
> Ananyev <konstantin.ananyev@intel.com>; dev@dpdk.org;
> david.marchand@redhat.com; thomas@monjalon.net; jerinj@marvell.com;
> nd <nd@arm.com>; Gavin Hu <Gavin.Hu@arm.com>; Steve Capper
> <Steve.Capper@arm.com>; Ola Liljedahl <Ola.Liljedahl@arm.com>
> Subject: Re: [dpdk-dev] [PATCH v3 2/2] ring: use wfe to wait for ring tail
> update on aarch64
> 
> On Sun, 25 Apr 2021 05:56:53 +0000
> Ruifeng Wang <ruifeng.wang@arm.com> wrote:
> 
> > Instead of polling for tail to be updated, use wfe instruction.
> >
> > Signed-off-by: Gavin Hu <gavin.hu@arm.com>
> > Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > Reviewed-by: Steve Capper <steve.capper@arm.com>
> > Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
> > Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> > Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
> 
> Looks ok to me, but it does raise an interesting question.
> Shouldn't the original code have been using atomic load to look at ht->tail.
> 
> This another place where "volatile considered harmful" applies.

Do you mean 'volatile' should be removed from rte_wait_until_equal_xxx parameters?


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [dpdk-dev] [PATCH v3 2/2] ring: use wfe to wait for ring tail update on aarch64
  2021-04-29 14:35       ` Ruifeng Wang
@ 2021-04-29 15:05         ` Stephen Hemminger
  2021-05-07  8:25           ` Ruifeng Wang
  0 siblings, 1 reply; 60+ messages in thread
From: Stephen Hemminger @ 2021-04-29 15:05 UTC (permalink / raw)
  To: Ruifeng Wang
  Cc: Honnappa Nagarahalli, Konstantin Ananyev, dev, david.marchand,
	thomas, jerinj, nd, Gavin Hu, Steve Capper, Ola Liljedahl

On Thu, 29 Apr 2021 14:35:35 +0000
Ruifeng Wang <Ruifeng.Wang@arm.com> wrote:

> > -----Original Message-----
> > From: Stephen Hemminger <stephen@networkplumber.org>
> > Sent: Thursday, April 29, 2021 1:17 AM
> > To: Ruifeng Wang <Ruifeng.Wang@arm.com>
> > Cc: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; Konstantin
> > Ananyev <konstantin.ananyev@intel.com>; dev@dpdk.org;
> > david.marchand@redhat.com; thomas@monjalon.net; jerinj@marvell.com;
> > nd <nd@arm.com>; Gavin Hu <Gavin.Hu@arm.com>; Steve Capper
> > <Steve.Capper@arm.com>; Ola Liljedahl <Ola.Liljedahl@arm.com>
> > Subject: Re: [dpdk-dev] [PATCH v3 2/2] ring: use wfe to wait for ring tail
> > update on aarch64
> > 
> > On Sun, 25 Apr 2021 05:56:53 +0000
> > Ruifeng Wang <ruifeng.wang@arm.com> wrote:
> >   
> > > Instead of polling for tail to be updated, use wfe instruction.
> > >
> > > Signed-off-by: Gavin Hu <gavin.hu@arm.com>
> > > Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > > Reviewed-by: Steve Capper <steve.capper@arm.com>
> > > Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
> > > Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> > > Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>  
> > 
> > Looks ok to me, but it does raise an interesting question.
> > Shouldn't the original code have been using atomic load to look at ht->tail.
> > 
> > This another place where "volatile considered harmful" applies.  
> 
> Do you mean 'volatile' should be removed from rte_wait_until_equal_xxx parameters?
> 

I meant that all access to tail should be via C11 atomic builtin. At that point,
the volatile on the data structure elements does not matter.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [dpdk-dev] Use WFE for spinlock and ring
  2021-04-29 14:28         ` Ruifeng Wang
@ 2021-04-29 15:20           ` Thomas Monjalon
  2021-04-30  9:16             ` Bruce Richardson
  0 siblings, 1 reply; 60+ messages in thread
From: Thomas Monjalon @ 2021-04-29 15:20 UTC (permalink / raw)
  To: Ruifeng Wang
  Cc: David Marchand, dev, jerinj, nd, Honnappa Nagarahalli, bruce.richardson

29/04/2021 16:28, Ruifeng Wang:
> From: Thomas Monjalon <thomas@monjalon.net>
> > 28/04/2021 11:30, Ruifeng Wang:
> > > From: David Marchand <david.marchand@redhat.com>
> > > > On Sun, Apr 25, 2021 at 7:57 AM Ruifeng Wang <ruifeng.wang@arm.com>
> > > > wrote:
> > > > >
> > > > > The rte_wait_until_equal_xxx APIs abstract the functionality of
> > > > > 'polling for a memory location to become equal to a given value'[1].
> > > > >
> > > > > Use the API for the rte spinlock and ring implementations.
> > > > > With the wait until equal APIs being stable, changes will not impact ABI.
> > > >
> > > > Afaics, there is no ARM target with WFE enabled and we lost ability
> > > > to enable WFE support with removal of the make build system.
> > >
> > > WFE can be enabled with direct meson file change.
> > > WFE is not intended to be enabled by default. It can be enabled based
> > > on benchmarking result on hardware.
> > > >
> > > > $ git grep RTE_ARM_USE_WFE
> > > > config/arm/meson.build:        ['RTE_ARM_USE_WFE', false],
> > > > lib/eal/arm/include/rte_pause_64.h:#ifdef RTE_ARM_USE_WFE
> > > >
> > > > How did you enable WFE to test this series?
> > >
> > > I modified meson file to test.
> > > Tests were also done with WFE disabled to make sure no degradation with
> > generic implementation.
> > 
> > I don't understand the usage.
> > Which platform should use it?
> 
> Platforms that implement WFE semantic (e.g. N1) can use.
> The user can enable this feature for power efficiency purpose. But there is something to
> note as described in commit message 1be7855d77 when the API was introduced. 
> 
> > Should it be a compile-time option?
> 
> Yes, it should be a compile-time option.
> It can be configured via c_args meson option?

+Cc Bruce for discussing how to enable such feature.

The problem with c_args is that the application has no way to know.



^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [dpdk-dev] Use WFE for spinlock and ring
  2021-04-29 15:20           ` Thomas Monjalon
@ 2021-04-30  9:16             ` Bruce Richardson
  2021-04-30 13:41               ` Honnappa Nagarahalli
  0 siblings, 1 reply; 60+ messages in thread
From: Bruce Richardson @ 2021-04-30  9:16 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Ruifeng Wang, David Marchand, dev, jerinj, nd, Honnappa Nagarahalli

On Thu, Apr 29, 2021 at 05:20:05PM +0200, Thomas Monjalon wrote:
> 29/04/2021 16:28, Ruifeng Wang:
> > From: Thomas Monjalon <thomas@monjalon.net>
> > > 28/04/2021 11:30, Ruifeng Wang:
> > > > From: David Marchand <david.marchand@redhat.com>
> > > > > On Sun, Apr 25, 2021 at 7:57 AM Ruifeng Wang <ruifeng.wang@arm.com>
> > > > > wrote:
> > > > > >
> > > > > > The rte_wait_until_equal_xxx APIs abstract the functionality of
> > > > > > 'polling for a memory location to become equal to a given value'[1].
> > > > > >
> > > > > > Use the API for the rte spinlock and ring implementations.
> > > > > > With the wait until equal APIs being stable, changes will not impact ABI.
> > > > >
> > > > > Afaics, there is no ARM target with WFE enabled and we lost ability
> > > > > to enable WFE support with removal of the make build system.
> > > >
> > > > WFE can be enabled with direct meson file change.
> > > > WFE is not intended to be enabled by default. It can be enabled based
> > > > on benchmarking result on hardware.
> > > > >
> > > > > $ git grep RTE_ARM_USE_WFE
> > > > > config/arm/meson.build:        ['RTE_ARM_USE_WFE', false],
> > > > > lib/eal/arm/include/rte_pause_64.h:#ifdef RTE_ARM_USE_WFE
> > > > >
> > > > > How did you enable WFE to test this series?
> > > >
> > > > I modified meson file to test.
> > > > Tests were also done with WFE disabled to make sure no degradation with
> > > generic implementation.
> > > 
> > > I don't understand the usage.
> > > Which platform should use it?
> > 
> > Platforms that implement WFE semantic (e.g. N1) can use.
> > The user can enable this feature for power efficiency purpose. But there is something to
> > note as described in commit message 1be7855d77 when the API was introduced. 
> > 
> > > Should it be a compile-time option?
> > 
> > Yes, it should be a compile-time option.
> > It can be configured via c_args meson option?
> 
> +Cc Bruce for discussing how to enable such feature.
> 
> The problem with c_args is that the application has no way to know.
> 
Agree about c_args not being a great choice. Why does this need to be a
compile-time option? Can runtime support not be detected in some manner?

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [dpdk-dev] Use WFE for spinlock and ring
  2021-04-30  9:16             ` Bruce Richardson
@ 2021-04-30 13:41               ` Honnappa Nagarahalli
  2021-04-30 14:19                 ` Bruce Richardson
  0 siblings, 1 reply; 60+ messages in thread
From: Honnappa Nagarahalli @ 2021-04-30 13:41 UTC (permalink / raw)
  To: Bruce Richardson, thomas
  Cc: Ruifeng Wang, David Marchand, dev, jerinj, nd, Honnappa Nagarahalli, nd

<snip>

> > > > > > >
> > > > > > > The rte_wait_until_equal_xxx APIs abstract the functionality
> > > > > > > of 'polling for a memory location to become equal to a given
> value'[1].
> > > > > > >
> > > > > > > Use the API for the rte spinlock and ring implementations.
> > > > > > > With the wait until equal APIs being stable, changes will not impact
> ABI.
> > > > > >
> > > > > > Afaics, there is no ARM target with WFE enabled and we lost
> > > > > > ability to enable WFE support with removal of the make build
> system.
> > > > >
> > > > > WFE can be enabled with direct meson file change.
> > > > > WFE is not intended to be enabled by default. It can be enabled
> > > > > based on benchmarking result on hardware.
> > > > > >
> > > > > > $ git grep RTE_ARM_USE_WFE
> > > > > > config/arm/meson.build:        ['RTE_ARM_USE_WFE', false],
> > > > > > lib/eal/arm/include/rte_pause_64.h:#ifdef RTE_ARM_USE_WFE
> > > > > >
> > > > > > How did you enable WFE to test this series?
> > > > >
> > > > > I modified meson file to test.
> > > > > Tests were also done with WFE disabled to make sure no
> > > > > degradation with
> > > > generic implementation.
> > > >
> > > > I don't understand the usage.
> > > > Which platform should use it?
> > >
> > > Platforms that implement WFE semantic (e.g. N1) can use.
> > > The user can enable this feature for power efficiency purpose. But
> > > there is something to note as described in commit message 1be7855d77
> when the API was introduced.
> > >
> > > > Should it be a compile-time option?
> > >
> > > Yes, it should be a compile-time option.
> > > It can be configured via c_args meson option?
> >
> > +Cc Bruce for discussing how to enable such feature.
> >
> > The problem with c_args is that the application has no way to know.
> >
> Agree about c_args not being a great choice. Why does this need to be a
> compile-time option? Can runtime support not be detected in some
> manner?
The problem is inconsistency in performance on different Arm platforms. We had decided that each platform needs to enable it after some testing.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [dpdk-dev] Use WFE for spinlock and ring
  2021-04-30 13:41               ` Honnappa Nagarahalli
@ 2021-04-30 14:19                 ` Bruce Richardson
  2021-05-07 10:18                   ` Ruifeng Wang
  0 siblings, 1 reply; 60+ messages in thread
From: Bruce Richardson @ 2021-04-30 14:19 UTC (permalink / raw)
  To: Honnappa Nagarahalli
  Cc: thomas, Ruifeng Wang, David Marchand, dev, jerinj, nd

On Fri, Apr 30, 2021 at 01:41:22PM +0000, Honnappa Nagarahalli wrote:
> <snip>
> 
> > > > > > > >
> > > > > > > > The rte_wait_until_equal_xxx APIs abstract the functionality
> > > > > > > > of 'polling for a memory location to become equal to a given
> > value'[1].
> > > > > > > >
> > > > > > > > Use the API for the rte spinlock and ring implementations.
> > > > > > > > With the wait until equal APIs being stable, changes will not impact
> > ABI.
> > > > > > >
> > > > > > > Afaics, there is no ARM target with WFE enabled and we lost
> > > > > > > ability to enable WFE support with removal of the make build
> > system.
> > > > > >
> > > > > > WFE can be enabled with direct meson file change.
> > > > > > WFE is not intended to be enabled by default. It can be enabled
> > > > > > based on benchmarking result on hardware.
> > > > > > >
> > > > > > > $ git grep RTE_ARM_USE_WFE
> > > > > > > config/arm/meson.build:        ['RTE_ARM_USE_WFE', false],
> > > > > > > lib/eal/arm/include/rte_pause_64.h:#ifdef RTE_ARM_USE_WFE
> > > > > > >
> > > > > > > How did you enable WFE to test this series?
> > > > > >
> > > > > > I modified meson file to test.
> > > > > > Tests were also done with WFE disabled to make sure no
> > > > > > degradation with
> > > > > generic implementation.
> > > > >
> > > > > I don't understand the usage.
> > > > > Which platform should use it?
> > > >
> > > > Platforms that implement WFE semantic (e.g. N1) can use.
> > > > The user can enable this feature for power efficiency purpose. But
> > > > there is something to note as described in commit message 1be7855d77
> > when the API was introduced.
> > > >
> > > > > Should it be a compile-time option?
> > > >
> > > > Yes, it should be a compile-time option.
> > > > It can be configured via c_args meson option?
> > >
> > > +Cc Bruce for discussing how to enable such feature.
> > >
> > > The problem with c_args is that the application has no way to know.
> > >
> > Agree about c_args not being a great choice. Why does this need to be a
> > compile-time option? Can runtime support not be detected in some
> > manner?
> The problem is inconsistency in performance on different Arm platforms. We had decided that each platform needs to enable it after some testing.

Then it sounds like it does indeed need to be a build option. Does it need
to be added to the meson_options.txt, or can it just be specified in
cross-files and optionally via c_args?

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [dpdk-dev] [PATCH v3 2/2] ring: use wfe to wait for ring tail update on aarch64
  2021-04-29 15:05         ` Stephen Hemminger
@ 2021-05-07  8:25           ` Ruifeng Wang
  0 siblings, 0 replies; 60+ messages in thread
From: Ruifeng Wang @ 2021-05-07  8:25 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Honnappa Nagarahalli, Konstantin Ananyev, dev, david.marchand,
	thomas, jerinj, nd, Gavin Hu, Steve Capper, Ola Liljedahl, nd

> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Thursday, April 29, 2021 11:06 PM
> To: Ruifeng Wang <Ruifeng.Wang@arm.com>
> Cc: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; Konstantin
> Ananyev <konstantin.ananyev@intel.com>; dev@dpdk.org;
> david.marchand@redhat.com; thomas@monjalon.net; jerinj@marvell.com;
> nd <nd@arm.com>; Gavin Hu <Gavin.Hu@arm.com>; Steve Capper
> <Steve.Capper@arm.com>; Ola Liljedahl <Ola.Liljedahl@arm.com>
> Subject: Re: [dpdk-dev] [PATCH v3 2/2] ring: use wfe to wait for ring tail
> update on aarch64
> 
> On Thu, 29 Apr 2021 14:35:35 +0000
> Ruifeng Wang <Ruifeng.Wang@arm.com> wrote:
> 
> > > -----Original Message-----
> > > From: Stephen Hemminger <stephen@networkplumber.org>
> > > Sent: Thursday, April 29, 2021 1:17 AM
> > > To: Ruifeng Wang <Ruifeng.Wang@arm.com>
> > > Cc: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; Konstantin
> > > Ananyev <konstantin.ananyev@intel.com>; dev@dpdk.org;
> > > david.marchand@redhat.com; thomas@monjalon.net;
> jerinj@marvell.com;
> > > nd <nd@arm.com>; Gavin Hu <Gavin.Hu@arm.com>; Steve Capper
> > > <Steve.Capper@arm.com>; Ola Liljedahl <Ola.Liljedahl@arm.com>
> > > Subject: Re: [dpdk-dev] [PATCH v3 2/2] ring: use wfe to wait for
> > > ring tail update on aarch64
> > >
> > > On Sun, 25 Apr 2021 05:56:53 +0000
> > > Ruifeng Wang <ruifeng.wang@arm.com> wrote:
> > >
> > > > Instead of polling for tail to be updated, use wfe instruction.
> > > >
> > > > Signed-off-by: Gavin Hu <gavin.hu@arm.com>
> > > > Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > > > Reviewed-by: Steve Capper <steve.capper@arm.com>
> > > > Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
> > > > Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> > > > Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
> > >
> > > Looks ok to me, but it does raise an interesting question.
> > > Shouldn't the original code have been using atomic load to look at ht->tail.
> > >
> > > This another place where "volatile considered harmful" applies.
> >
> > Do you mean 'volatile' should be removed from rte_wait_until_equal_xxx
> parameters?
> >
> 
> I meant that all access to tail should be via C11 atomic builtin. At that point,
> the volatile on the data structure elements does not matter.

Agreed. If synchronization is ensured by using C11 atomic builtin, 'volatile' on elements can be removed.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [dpdk-dev] Use WFE for spinlock and ring
  2021-04-30 14:19                 ` Bruce Richardson
@ 2021-05-07 10:18                   ` Ruifeng Wang
  2021-05-07 10:24                     ` Bruce Richardson
  0 siblings, 1 reply; 60+ messages in thread
From: Ruifeng Wang @ 2021-05-07 10:18 UTC (permalink / raw)
  To: Bruce Richardson, Honnappa Nagarahalli
  Cc: thomas, David Marchand, dev, jerinj, nd, nd

> -----Original Message-----
> From: Bruce Richardson <bruce.richardson@intel.com>
> Sent: Friday, April 30, 2021 10:19 PM
> To: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> Cc: thomas@monjalon.net; Ruifeng Wang <Ruifeng.Wang@arm.com>; David
> Marchand <david.marchand@redhat.com>; dev <dev@dpdk.org>;
> jerinj@marvell.com; nd <nd@arm.com>
> Subject: Re: Use WFE for spinlock and ring
> 
> On Fri, Apr 30, 2021 at 01:41:22PM +0000, Honnappa Nagarahalli wrote:
> > <snip>
> >
> > > > > > > > >
> > > > > > > > > The rte_wait_until_equal_xxx APIs abstract the
> > > > > > > > > functionality of 'polling for a memory location to
> > > > > > > > > become equal to a given
> > > value'[1].
> > > > > > > > >
> > > > > > > > > Use the API for the rte spinlock and ring implementations.
> > > > > > > > > With the wait until equal APIs being stable, changes
> > > > > > > > > will not impact
> > > ABI.
> > > > > > > >
> > > > > > > > Afaics, there is no ARM target with WFE enabled and we
> > > > > > > > lost ability to enable WFE support with removal of the
> > > > > > > > make build
> > > system.
> > > > > > >
> > > > > > > WFE can be enabled with direct meson file change.
> > > > > > > WFE is not intended to be enabled by default. It can be
> > > > > > > enabled based on benchmarking result on hardware.
> > > > > > > >
> > > > > > > > $ git grep RTE_ARM_USE_WFE
> > > > > > > > config/arm/meson.build:        ['RTE_ARM_USE_WFE', false],
> > > > > > > > lib/eal/arm/include/rte_pause_64.h:#ifdef RTE_ARM_USE_WFE
> > > > > > > >
> > > > > > > > How did you enable WFE to test this series?
> > > > > > >
> > > > > > > I modified meson file to test.
> > > > > > > Tests were also done with WFE disabled to make sure no
> > > > > > > degradation with
> > > > > > generic implementation.
> > > > > >
> > > > > > I don't understand the usage.
> > > > > > Which platform should use it?
> > > > >
> > > > > Platforms that implement WFE semantic (e.g. N1) can use.
> > > > > The user can enable this feature for power efficiency purpose.
> > > > > But there is something to note as described in commit message
> > > > > 1be7855d77
> > > when the API was introduced.
> > > > >
> > > > > > Should it be a compile-time option?
> > > > >
> > > > > Yes, it should be a compile-time option.
> > > > > It can be configured via c_args meson option?
> > > >
> > > > +Cc Bruce for discussing how to enable such feature.
> > > >
> > > > The problem with c_args is that the application has no way to know.
> > > >
> > > Agree about c_args not being a great choice. Why does this need to
> > > be a compile-time option? Can runtime support not be detected in
> > > some manner?
> > The problem is inconsistency in performance on different Arm platforms.
> We had decided that each platform needs to enable it after some testing.
> 
> Then it sounds like it does indeed need to be a build option. Does it need to
> be added to the meson_options.txt, or can it just be specified in cross-files
> and optionally via c_args?

Add it to the meson_options.txt is good as the option will be clearly exposed.
My concern is more options (e.g. RTE_ARCH_ARM64_MEMCPY) need to be added in.
Will the options bloat meson_options.txt?

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [dpdk-dev] Use WFE for spinlock and ring
  2021-05-07 10:18                   ` Ruifeng Wang
@ 2021-05-07 10:24                     ` Bruce Richardson
  2021-07-05  8:51                       ` David Marchand
  0 siblings, 1 reply; 60+ messages in thread
From: Bruce Richardson @ 2021-05-07 10:24 UTC (permalink / raw)
  To: Ruifeng Wang
  Cc: Honnappa Nagarahalli, thomas, David Marchand, dev, jerinj, nd

On Fri, May 07, 2021 at 10:18:52AM +0000, Ruifeng Wang wrote:
> > -----Original Message-----
> > From: Bruce Richardson <bruce.richardson@intel.com>
> > Sent: Friday, April 30, 2021 10:19 PM
> > To: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> > Cc: thomas@monjalon.net; Ruifeng Wang <Ruifeng.Wang@arm.com>; David
> > Marchand <david.marchand@redhat.com>; dev <dev@dpdk.org>;
> > jerinj@marvell.com; nd <nd@arm.com>
> > Subject: Re: Use WFE for spinlock and ring
> > 
> > On Fri, Apr 30, 2021 at 01:41:22PM +0000, Honnappa Nagarahalli wrote:
> > > <snip>
> > >
> > > > > > > > > >
> > > > > > > > > > The rte_wait_until_equal_xxx APIs abstract the
> > > > > > > > > > functionality of 'polling for a memory location to
> > > > > > > > > > become equal to a given
> > > > value'[1].
> > > > > > > > > >
> > > > > > > > > > Use the API for the rte spinlock and ring implementations.
> > > > > > > > > > With the wait until equal APIs being stable, changes
> > > > > > > > > > will not impact
> > > > ABI.
> > > > > > > > >
> > > > > > > > > Afaics, there is no ARM target with WFE enabled and we
> > > > > > > > > lost ability to enable WFE support with removal of the
> > > > > > > > > make build
> > > > system.
> > > > > > > >
> > > > > > > > WFE can be enabled with direct meson file change.
> > > > > > > > WFE is not intended to be enabled by default. It can be
> > > > > > > > enabled based on benchmarking result on hardware.
> > > > > > > > >
> > > > > > > > > $ git grep RTE_ARM_USE_WFE
> > > > > > > > > config/arm/meson.build:        ['RTE_ARM_USE_WFE', false],
> > > > > > > > > lib/eal/arm/include/rte_pause_64.h:#ifdef RTE_ARM_USE_WFE
> > > > > > > > >
> > > > > > > > > How did you enable WFE to test this series?
> > > > > > > >
> > > > > > > > I modified meson file to test.
> > > > > > > > Tests were also done with WFE disabled to make sure no
> > > > > > > > degradation with
> > > > > > > generic implementation.
> > > > > > >
> > > > > > > I don't understand the usage.
> > > > > > > Which platform should use it?
> > > > > >
> > > > > > Platforms that implement WFE semantic (e.g. N1) can use.
> > > > > > The user can enable this feature for power efficiency purpose.
> > > > > > But there is something to note as described in commit message
> > > > > > 1be7855d77
> > > > when the API was introduced.
> > > > > >
> > > > > > > Should it be a compile-time option?
> > > > > >
> > > > > > Yes, it should be a compile-time option.
> > > > > > It can be configured via c_args meson option?
> > > > >
> > > > > +Cc Bruce for discussing how to enable such feature.
> > > > >
> > > > > The problem with c_args is that the application has no way to know.
> > > > >
> > > > Agree about c_args not being a great choice. Why does this need to
> > > > be a compile-time option? Can runtime support not be detected in
> > > > some manner?
> > > The problem is inconsistency in performance on different Arm platforms.
> > We had decided that each platform needs to enable it after some testing.
> > 
> > Then it sounds like it does indeed need to be a build option. Does it need to
> > be added to the meson_options.txt, or can it just be specified in cross-files
> > and optionally via c_args?
> 
> Add it to the meson_options.txt is good as the option will be clearly exposed.
> My concern is more options (e.g. RTE_ARCH_ARM64_MEMCPY) need to be added in.
> Will the options bloat meson_options.txt?

That bloat would indeed become a concern. We may need to look at more use
of auto-detection and cross files for such options.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [dpdk-dev] Use WFE for spinlock and ring
  2021-05-07 10:24                     ` Bruce Richardson
@ 2021-07-05  8:51                       ` David Marchand
  2021-07-05 10:21                         ` Ruifeng Wang
  0 siblings, 1 reply; 60+ messages in thread
From: David Marchand @ 2021-07-05  8:51 UTC (permalink / raw)
  To: Ruifeng Wang
  Cc: Bruce Richardson, Honnappa Nagarahalli, thomas, dev, jerinj, nd

On Fri, May 7, 2021 at 12:24 PM Bruce Richardson
<bruce.richardson@intel.com> wrote:
> > > > The problem is inconsistency in performance on different Arm platforms.
> > > We had decided that each platform needs to enable it after some testing.
> > >
> > > Then it sounds like it does indeed need to be a build option. Does it need to
> > > be added to the meson_options.txt, or can it just be specified in cross-files
> > > and optionally via c_args?
> >
> > Add it to the meson_options.txt is good as the option will be clearly exposed.
> > My concern is more options (e.g. RTE_ARCH_ARM64_MEMCPY) need to be added in.
> > Will the options bloat meson_options.txt?
>
> That bloat would indeed become a concern. We may need to look at more use
> of auto-detection and cross files for such options.

Ruifeng,

This series is blocked on this last topic.
Do you intend to send a n+1?


-- 
David Marchand


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [dpdk-dev] Use WFE for spinlock and ring
  2021-07-05  8:51                       ` David Marchand
@ 2021-07-05 10:21                         ` Ruifeng Wang
  0 siblings, 0 replies; 60+ messages in thread
From: Ruifeng Wang @ 2021-07-05 10:21 UTC (permalink / raw)
  To: David Marchand
  Cc: Bruce Richardson, Honnappa Nagarahalli, thomas, dev, jerinj, nd, nd

> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Monday, July 5, 2021 4:52 PM
> To: Ruifeng Wang <Ruifeng.Wang@arm.com>
> Cc: Bruce Richardson <bruce.richardson@intel.com>; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; thomas@monjalon.net; dev
> <dev@dpdk.org>; jerinj@marvell.com; nd <nd@arm.com>
> Subject: Re: Use WFE for spinlock and ring
> 
> On Fri, May 7, 2021 at 12:24 PM Bruce Richardson
> <bruce.richardson@intel.com> wrote:
> > > > > The problem is inconsistency in performance on different Arm
> platforms.
> > > > We had decided that each platform needs to enable it after some
> testing.
> > > >
> > > > Then it sounds like it does indeed need to be a build option. Does
> > > > it need to be added to the meson_options.txt, or can it just be
> > > > specified in cross-files and optionally via c_args?
> > >
> > > Add it to the meson_options.txt is good as the option will be clearly
> exposed.
> > > My concern is more options (e.g. RTE_ARCH_ARM64_MEMCPY) need to
> be added in.
> > > Will the options bloat meson_options.txt?
> >
> > That bloat would indeed become a concern. We may need to look at more
> > use of auto-detection and cross files for such options.
> 
> Ruifeng,
> 
> This series is blocked on this last topic.
> Do you intend to send a n+1?
I think the best way to expose this config is adding it to meson_options.
Will send v4 to update.

> 
> 
> --
> David Marchand


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [dpdk-dev] [PATCH v4 0/3] Use WFE for spinlock and ring
  2020-04-24  7:07 [dpdk-dev] [PATCH v1 0/2] Use WFE for spinlock and ring Gavin Hu
                   ` (5 preceding siblings ...)
  2021-04-25  5:56 ` [dpdk-dev] Use WFE for spinlock and ring Ruifeng Wang
@ 2021-07-07  5:43 ` Ruifeng Wang
  2021-07-07  5:43   ` [dpdk-dev] [PATCH v4 1/3] spinlock: use wfe to reduce contention on aarch64 Ruifeng Wang
                     ` (2 more replies)
  2021-07-07  5:48 ` [dpdk-dev] [PATCH v4 0/3] Use WFE for spinlock and ring Ruifeng Wang
  7 siblings, 3 replies; 60+ messages in thread
From: Ruifeng Wang @ 2021-07-07  5:43 UTC (permalink / raw)
  Cc: dev, david.marchand, thomas, bruce.richardson, jerinj, nd,
	honnappa.nagarahalli, ruifeng.wang

The rte_wait_until_equal_xxx APIs abstract the functionality of 'polling
for a memory location to become equal to a given value'[1].

Use the API for the rte spinlock and ring implementations.
With the wait until equal APIs being stable, changes will not impact ABI.

[1] http://patches.dpdk.org/cover/62703/

v4:
Added meson option to expose WFE. (David, Bruce)

v3:
Series rebased. (David)

Gavin Hu (1):
  spinlock: use wfe to reduce contention on aarch64

Ruifeng Wang (2):
  ring: use wfe to wait for ring tail update on aarch64
  build: add option to enable wait until equal

 config/arm/meson.build                 | 2 +-
 lib/eal/include/generic/rte_spinlock.h | 4 ++--
 lib/ring/rte_ring_c11_pvt.h            | 4 ++--
 lib/ring/rte_ring_generic_pvt.h        | 3 +--
 meson_options.txt                      | 2 ++
 5 files changed, 8 insertions(+), 7 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [dpdk-dev] [PATCH v4 1/3] spinlock: use wfe to reduce contention on aarch64
  2021-07-07  5:43 ` [dpdk-dev] [PATCH v4 0/3] " Ruifeng Wang
@ 2021-07-07  5:43   ` Ruifeng Wang
  2021-07-07  5:43   ` [dpdk-dev] [PATCH v4 2/3] ring: use wfe to wait for ring tail update " Ruifeng Wang
  2021-07-07  5:43   ` [dpdk-dev] [PATCH v4 3/3] build: add option to enable wait until equal Ruifeng Wang
  2 siblings, 0 replies; 60+ messages in thread
From: Ruifeng Wang @ 2021-07-07  5:43 UTC (permalink / raw)
  Cc: dev, david.marchand, thomas, bruce.richardson, jerinj, nd,
	honnappa.nagarahalli, ruifeng.wang, Gavin Hu, Phil Yang,
	Steve Capper, Ola Liljedahl, Pavan Nikhilesh

From: Gavin Hu <gavin.hu@arm.com>

In acquiring a spinlock, cores repeatedly poll the lock variable.
This is replaced by rte_wait_until_equal API.

Running the micro benchmarking and the testpmd and l3fwd traffic tests
on ThunderX2, Ampere eMAG80 and Arm N1SDP, everything went well and no
notable performance gain nor degradation was measured.

Signed-off-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Tested-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
---
 lib/eal/include/generic/rte_spinlock.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/eal/include/generic/rte_spinlock.h b/lib/eal/include/generic/rte_spinlock.h
index 87ae7a4f18..40fe49d5ad 100644
--- a/lib/eal/include/generic/rte_spinlock.h
+++ b/lib/eal/include/generic/rte_spinlock.h
@@ -65,8 +65,8 @@ rte_spinlock_lock(rte_spinlock_t *sl)
 
 	while (!__atomic_compare_exchange_n(&sl->locked, &exp, 1, 0,
 				__ATOMIC_ACQUIRE, __ATOMIC_RELAXED)) {
-		while (__atomic_load_n(&sl->locked, __ATOMIC_RELAXED))
-			rte_pause();
+		rte_wait_until_equal_32((volatile uint32_t *)&sl->locked,
+			       0, __ATOMIC_RELAXED);
 		exp = 0;
 	}
 }
-- 
2.25.1


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [dpdk-dev] [PATCH v4 2/3] ring: use wfe to wait for ring tail update on aarch64
  2021-07-07  5:43 ` [dpdk-dev] [PATCH v4 0/3] " Ruifeng Wang
  2021-07-07  5:43   ` [dpdk-dev] [PATCH v4 1/3] spinlock: use wfe to reduce contention on aarch64 Ruifeng Wang
@ 2021-07-07  5:43   ` Ruifeng Wang
  2021-07-07  5:43   ` [dpdk-dev] [PATCH v4 3/3] build: add option to enable wait until equal Ruifeng Wang
  2 siblings, 0 replies; 60+ messages in thread
From: Ruifeng Wang @ 2021-07-07  5:43 UTC (permalink / raw)
  To: Honnappa Nagarahalli, Konstantin Ananyev
  Cc: dev, david.marchand, thomas, bruce.richardson, jerinj, nd,
	ruifeng.wang, Gavin Hu, Steve Capper, Ola Liljedahl

Instead of polling for tail to be updated, use wfe instruction.

Signed-off-by: Gavin Hu <gavin.hu@arm.com>
Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
---
 lib/ring/rte_ring_c11_pvt.h     | 4 ++--
 lib/ring/rte_ring_generic_pvt.h | 3 +--
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/lib/ring/rte_ring_c11_pvt.h b/lib/ring/rte_ring_c11_pvt.h
index 759192f4c4..37e0b2afd6 100644
--- a/lib/ring/rte_ring_c11_pvt.h
+++ b/lib/ring/rte_ring_c11_pvt.h
@@ -2,6 +2,7 @@
  *
  * Copyright (c) 2017,2018 HXT-semitech Corporation.
  * Copyright (c) 2007-2009 Kip Macy kmacy@freebsd.org
+ * Copyright (c) 2021 Arm Limited
  * All rights reserved.
  * Derived from FreeBSD's bufring.h
  * Used as BSD-3 Licensed with permission from Kip Macy.
@@ -21,8 +22,7 @@ __rte_ring_update_tail(struct rte_ring_headtail *ht, uint32_t old_val,
 	 * we need to wait for them to complete
 	 */
 	if (!single)
-		while (unlikely(ht->tail != old_val))
-			rte_pause();
+		rte_wait_until_equal_32(&ht->tail, old_val, __ATOMIC_RELAXED);
 
 	__atomic_store_n(&ht->tail, new_val, __ATOMIC_RELEASE);
 }
diff --git a/lib/ring/rte_ring_generic_pvt.h b/lib/ring/rte_ring_generic_pvt.h
index 532deb5e7a..c95ad7e12c 100644
--- a/lib/ring/rte_ring_generic_pvt.h
+++ b/lib/ring/rte_ring_generic_pvt.h
@@ -23,8 +23,7 @@ __rte_ring_update_tail(struct rte_ring_headtail *ht, uint32_t old_val,
 	 * we need to wait for them to complete
 	 */
 	if (!single)
-		while (unlikely(ht->tail != old_val))
-			rte_pause();
+		rte_wait_until_equal_32(&ht->tail, old_val, __ATOMIC_RELAXED);
 
 	ht->tail = new_val;
 }
-- 
2.25.1


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [dpdk-dev] [PATCH v4 3/3] build: add option to enable wait until equal
  2021-07-07  5:43 ` [dpdk-dev] [PATCH v4 0/3] " Ruifeng Wang
  2021-07-07  5:43   ` [dpdk-dev] [PATCH v4 1/3] spinlock: use wfe to reduce contention on aarch64 Ruifeng Wang
  2021-07-07  5:43   ` [dpdk-dev] [PATCH v4 2/3] ring: use wfe to wait for ring tail update " Ruifeng Wang
@ 2021-07-07  5:43   ` Ruifeng Wang
  2021-07-07 12:15     ` David Marchand
  2 siblings, 1 reply; 60+ messages in thread
From: Ruifeng Wang @ 2021-07-07  5:43 UTC (permalink / raw)
  To: Jerin Jacob, Ruifeng Wang, Jan Viktorin, Bruce Richardson
  Cc: dev, david.marchand, thomas, nd, honnappa.nagarahalli

Introduce a meson option 'use_wfe' to select wait until equal method.
The default is disable. Traditional polling loop is used.
When enabled, architecture specific mechanism is relied on to do the
wait.

Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
 config/arm/meson.build | 2 +-
 meson_options.txt      | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/config/arm/meson.build b/config/arm/meson.build
index 9b147c0b93..9ea478fb77 100644
--- a/config/arm/meson.build
+++ b/config/arm/meson.build
@@ -18,7 +18,6 @@ flags_common = [
         #    ['RTE_ARM64_MEMCPY_STRICT_ALIGN', false],
 
         ['RTE_SCHED_VECTOR', false],
-        ['RTE_ARM_USE_WFE', false],
         ['RTE_ARCH_ARM64', true],
         ['RTE_CACHE_LINE_SIZE', 128]
 ]
@@ -371,6 +370,7 @@ socs = {
 
 dpdk_conf.set('RTE_ARCH_ARM', 1)
 dpdk_conf.set('RTE_FORCE_INTRINSICS', 1)
+dpdk_conf.set('RTE_ARM_USE_WFE', get_option('use_wfe'))
 
 if dpdk_conf.get('RTE_ARCH_32')
     # armv7 build
diff --git a/meson_options.txt b/meson_options.txt
index 56bdfd0f0a..5012803c77 100644
--- a/meson_options.txt
+++ b/meson_options.txt
@@ -44,3 +44,5 @@ option('tests', type: 'boolean', value: true, description:
        'build unit tests')
 option('use_hpet', type: 'boolean', value: false, description:
        'use HPET timer in EAL')
+option('use_wfe', type: 'boolean', value: false, description:
+       'use Wait Until Equal for polling loops')
-- 
2.25.1


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [dpdk-dev] [PATCH v4 0/3] Use WFE for spinlock and ring
  2020-04-24  7:07 [dpdk-dev] [PATCH v1 0/2] Use WFE for spinlock and ring Gavin Hu
                   ` (6 preceding siblings ...)
  2021-07-07  5:43 ` [dpdk-dev] [PATCH v4 0/3] " Ruifeng Wang
@ 2021-07-07  5:48 ` Ruifeng Wang
  2021-07-07  5:48   ` [dpdk-dev] [PATCH v4 1/3] spinlock: use wfe to reduce contention on aarch64 Ruifeng Wang
                     ` (3 more replies)
  7 siblings, 4 replies; 60+ messages in thread
From: Ruifeng Wang @ 2021-07-07  5:48 UTC (permalink / raw)
  Cc: dev, david.marchand, thomas, bruce.richardson, jerinj, nd,
	honnappa.nagarahalli, ruifeng.wang

The rte_wait_until_equal_xxx APIs abstract the functionality of 'polling
for a memory location to become equal to a given value'[1].

Use the API for the rte spinlock and ring implementations.
With the wait until equal APIs being stable, changes will not impact ABI.

[1] http://patches.dpdk.org/cover/62703/

v4:
Added meson option to expose WFE. (David, Bruce)

v3:
Series rebased. (David)

Gavin Hu (1):
  spinlock: use wfe to reduce contention on aarch64

Ruifeng Wang (2):
  ring: use wfe to wait for ring tail update on aarch64
  build: add option to enable wait until equal

 config/arm/meson.build                 | 2 +-
 lib/eal/include/generic/rte_spinlock.h | 4 ++--
 lib/ring/rte_ring_c11_pvt.h            | 4 ++--
 lib/ring/rte_ring_generic_pvt.h        | 3 +--
 meson_options.txt                      | 2 ++
 5 files changed, 8 insertions(+), 7 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [dpdk-dev] [PATCH v4 1/3] spinlock: use wfe to reduce contention on aarch64
  2021-07-07  5:48 ` [dpdk-dev] [PATCH v4 0/3] Use WFE for spinlock and ring Ruifeng Wang
@ 2021-07-07  5:48   ` Ruifeng Wang
  2021-07-07  5:48   ` [dpdk-dev] [PATCH v4 2/3] ring: use wfe to wait for ring tail update " Ruifeng Wang
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 60+ messages in thread
From: Ruifeng Wang @ 2021-07-07  5:48 UTC (permalink / raw)
  Cc: dev, david.marchand, thomas, bruce.richardson, jerinj, nd,
	honnappa.nagarahalli, ruifeng.wang, Gavin Hu, Phil Yang,
	Steve Capper, Ola Liljedahl, Pavan Nikhilesh, Konstantin Ananyev

From: Gavin Hu <gavin.hu@arm.com>

In acquiring a spinlock, cores repeatedly poll the lock variable.
This is replaced by rte_wait_until_equal API.

Running the micro benchmarking and the testpmd and l3fwd traffic tests
on ThunderX2, Ampere eMAG80 and Arm N1SDP, everything went well and no
notable performance gain nor degradation was measured.

Signed-off-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Tested-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
---
 lib/eal/include/generic/rte_spinlock.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/eal/include/generic/rte_spinlock.h b/lib/eal/include/generic/rte_spinlock.h
index 87ae7a4f18..40fe49d5ad 100644
--- a/lib/eal/include/generic/rte_spinlock.h
+++ b/lib/eal/include/generic/rte_spinlock.h
@@ -65,8 +65,8 @@ rte_spinlock_lock(rte_spinlock_t *sl)
 
 	while (!__atomic_compare_exchange_n(&sl->locked, &exp, 1, 0,
 				__ATOMIC_ACQUIRE, __ATOMIC_RELAXED)) {
-		while (__atomic_load_n(&sl->locked, __ATOMIC_RELAXED))
-			rte_pause();
+		rte_wait_until_equal_32((volatile uint32_t *)&sl->locked,
+			       0, __ATOMIC_RELAXED);
 		exp = 0;
 	}
 }
-- 
2.25.1


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [dpdk-dev] [PATCH v4 2/3] ring: use wfe to wait for ring tail update on aarch64
  2021-07-07  5:48 ` [dpdk-dev] [PATCH v4 0/3] Use WFE for spinlock and ring Ruifeng Wang
  2021-07-07  5:48   ` [dpdk-dev] [PATCH v4 1/3] spinlock: use wfe to reduce contention on aarch64 Ruifeng Wang
@ 2021-07-07  5:48   ` Ruifeng Wang
  2021-07-07  5:48   ` [dpdk-dev] [PATCH v4 3/3] build: add option to enable wait until equal Ruifeng Wang
  2021-07-09 18:39   ` [dpdk-dev] [PATCH v4 0/3] Use WFE for spinlock and ring Thomas Monjalon
  3 siblings, 0 replies; 60+ messages in thread
From: Ruifeng Wang @ 2021-07-07  5:48 UTC (permalink / raw)
  To: Honnappa Nagarahalli, Konstantin Ananyev
  Cc: dev, david.marchand, thomas, bruce.richardson, jerinj, nd,
	ruifeng.wang, Gavin Hu, Steve Capper, Ola Liljedahl

Instead of polling for tail to be updated, use wfe instruction.

Signed-off-by: Gavin Hu <gavin.hu@arm.com>
Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
---
 lib/ring/rte_ring_c11_pvt.h     | 4 ++--
 lib/ring/rte_ring_generic_pvt.h | 3 +--
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/lib/ring/rte_ring_c11_pvt.h b/lib/ring/rte_ring_c11_pvt.h
index 759192f4c4..37e0b2afd6 100644
--- a/lib/ring/rte_ring_c11_pvt.h
+++ b/lib/ring/rte_ring_c11_pvt.h
@@ -2,6 +2,7 @@
  *
  * Copyright (c) 2017,2018 HXT-semitech Corporation.
  * Copyright (c) 2007-2009 Kip Macy kmacy@freebsd.org
+ * Copyright (c) 2021 Arm Limited
  * All rights reserved.
  * Derived from FreeBSD's bufring.h
  * Used as BSD-3 Licensed with permission from Kip Macy.
@@ -21,8 +22,7 @@ __rte_ring_update_tail(struct rte_ring_headtail *ht, uint32_t old_val,
 	 * we need to wait for them to complete
 	 */
 	if (!single)
-		while (unlikely(ht->tail != old_val))
-			rte_pause();
+		rte_wait_until_equal_32(&ht->tail, old_val, __ATOMIC_RELAXED);
 
 	__atomic_store_n(&ht->tail, new_val, __ATOMIC_RELEASE);
 }
diff --git a/lib/ring/rte_ring_generic_pvt.h b/lib/ring/rte_ring_generic_pvt.h
index 532deb5e7a..c95ad7e12c 100644
--- a/lib/ring/rte_ring_generic_pvt.h
+++ b/lib/ring/rte_ring_generic_pvt.h
@@ -23,8 +23,7 @@ __rte_ring_update_tail(struct rte_ring_headtail *ht, uint32_t old_val,
 	 * we need to wait for them to complete
 	 */
 	if (!single)
-		while (unlikely(ht->tail != old_val))
-			rte_pause();
+		rte_wait_until_equal_32(&ht->tail, old_val, __ATOMIC_RELAXED);
 
 	ht->tail = new_val;
 }
-- 
2.25.1


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [dpdk-dev] [PATCH v4 3/3] build: add option to enable wait until equal
  2021-07-07  5:48 ` [dpdk-dev] [PATCH v4 0/3] Use WFE for spinlock and ring Ruifeng Wang
  2021-07-07  5:48   ` [dpdk-dev] [PATCH v4 1/3] spinlock: use wfe to reduce contention on aarch64 Ruifeng Wang
  2021-07-07  5:48   ` [dpdk-dev] [PATCH v4 2/3] ring: use wfe to wait for ring tail update " Ruifeng Wang
@ 2021-07-07  5:48   ` Ruifeng Wang
  2021-07-07  6:32     ` Thomas Monjalon
  2021-07-09 18:39   ` [dpdk-dev] [PATCH v4 0/3] Use WFE for spinlock and ring Thomas Monjalon
  3 siblings, 1 reply; 60+ messages in thread
From: Ruifeng Wang @ 2021-07-07  5:48 UTC (permalink / raw)
  To: Jan Viktorin, Ruifeng Wang, Jerin Jacob, Bruce Richardson
  Cc: dev, david.marchand, thomas, nd, honnappa.nagarahalli

Introduce a meson option 'use_wfe' to select wait until equal method.
The default is disable. Traditional polling loop is used.
When enabled, architecture specific mechanism is relied on to do the
wait.

Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
 config/arm/meson.build | 2 +-
 meson_options.txt      | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/config/arm/meson.build b/config/arm/meson.build
index 9b147c0b93..9ea478fb77 100644
--- a/config/arm/meson.build
+++ b/config/arm/meson.build
@@ -18,7 +18,6 @@ flags_common = [
         #    ['RTE_ARM64_MEMCPY_STRICT_ALIGN', false],
 
         ['RTE_SCHED_VECTOR', false],
-        ['RTE_ARM_USE_WFE', false],
         ['RTE_ARCH_ARM64', true],
         ['RTE_CACHE_LINE_SIZE', 128]
 ]
@@ -371,6 +370,7 @@ socs = {
 
 dpdk_conf.set('RTE_ARCH_ARM', 1)
 dpdk_conf.set('RTE_FORCE_INTRINSICS', 1)
+dpdk_conf.set('RTE_ARM_USE_WFE', get_option('use_wfe'))
 
 if dpdk_conf.get('RTE_ARCH_32')
     # armv7 build
diff --git a/meson_options.txt b/meson_options.txt
index 56bdfd0f0a..5012803c77 100644
--- a/meson_options.txt
+++ b/meson_options.txt
@@ -44,3 +44,5 @@ option('tests', type: 'boolean', value: true, description:
        'build unit tests')
 option('use_hpet', type: 'boolean', value: false, description:
        'use HPET timer in EAL')
+option('use_wfe', type: 'boolean', value: false, description:
+       'use Wait Until Equal for polling loops')
-- 
2.25.1


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [dpdk-dev] [PATCH v4 3/3] build: add option to enable wait until equal
  2021-07-07  5:48   ` [dpdk-dev] [PATCH v4 3/3] build: add option to enable wait until equal Ruifeng Wang
@ 2021-07-07  6:32     ` Thomas Monjalon
  2021-07-07  6:46       ` Ruifeng Wang
  0 siblings, 1 reply; 60+ messages in thread
From: Thomas Monjalon @ 2021-07-07  6:32 UTC (permalink / raw)
  To: Ruifeng Wang
  Cc: Jan Viktorin, Jerin Jacob, Bruce Richardson, dev, david.marchand,
	nd, honnappa.nagarahalli

07/07/2021 07:48, Ruifeng Wang:
> Introduce a meson option 'use_wfe' to select wait until equal method.
> The default is disable. Traditional polling loop is used.
> When enabled, architecture specific mechanism is relied on to do the
> wait.

Why do we need an option?
Can it be automatic to enable it when supported?



^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [dpdk-dev] [PATCH v4 3/3] build: add option to enable wait until equal
  2021-07-07  6:32     ` Thomas Monjalon
@ 2021-07-07  6:46       ` Ruifeng Wang
  2021-07-07 12:27         ` Bruce Richardson
  0 siblings, 1 reply; 60+ messages in thread
From: Ruifeng Wang @ 2021-07-07  6:46 UTC (permalink / raw)
  To: thomas
  Cc: Jan Viktorin, jerinj, Bruce Richardson, dev, david.marchand, nd,
	Honnappa Nagarahalli, nd

> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Wednesday, July 7, 2021 2:32 PM
> To: Ruifeng Wang <Ruifeng.Wang@arm.com>
> Cc: Jan Viktorin <viktorin@rehivetech.com>; jerinj@marvell.com; Bruce
> Richardson <bruce.richardson@intel.com>; dev@dpdk.org;
> david.marchand@redhat.com; nd <nd@arm.com>; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>
> Subject: Re: [PATCH v4 3/3] build: add option to enable wait until equal
> 
> 07/07/2021 07:48, Ruifeng Wang:
> > Introduce a meson option 'use_wfe' to select wait until equal method.
> > The default is disable. Traditional polling loop is used.
> > When enabled, architecture specific mechanism is relied on to do the
> > wait.
> 
> Why do we need an option?
> Can it be automatic to enable it when supported?
> 
The problem is inconsistency in performance on different Arm platforms. We had decided that each platform needs to enable it after some testing.


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [dpdk-dev] [PATCH v4 3/3] build: add option to enable wait until equal
  2021-07-07  5:43   ` [dpdk-dev] [PATCH v4 3/3] build: add option to enable wait until equal Ruifeng Wang
@ 2021-07-07 12:15     ` David Marchand
  0 siblings, 0 replies; 60+ messages in thread
From: David Marchand @ 2021-07-07 12:15 UTC (permalink / raw)
  To: Ruifeng Wang
  Cc: Jerin Jacob, Jan Viktorin, Bruce Richardson, dev,
	Thomas Monjalon, nd, Honnappa Nagarahalli

On Wed, Jul 7, 2021 at 7:44 AM Ruifeng Wang <ruifeng.wang@arm.com> wrote:
> diff --git a/meson_options.txt b/meson_options.txt
> index 56bdfd0f0a..5012803c77 100644
> --- a/meson_options.txt
> +++ b/meson_options.txt
> @@ -44,3 +44,5 @@ option('tests', type: 'boolean', value: true, description:
>         'build unit tests')
>  option('use_hpet', type: 'boolean', value: false, description:
>         'use HPET timer in EAL')
> +option('use_wfe', type: 'boolean', value: false, description:
> +       'use Wait Until Equal for polling loops')

This is badly described.
We have a wait until equal API.

What you want to enable with this option is the hw assisted implementation.

-- 
David Marchand


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [dpdk-dev] [PATCH v4 3/3] build: add option to enable wait until equal
  2021-07-07  6:46       ` Ruifeng Wang
@ 2021-07-07 12:27         ` Bruce Richardson
  2021-07-07 12:36           ` Jerin Jacob
  0 siblings, 1 reply; 60+ messages in thread
From: Bruce Richardson @ 2021-07-07 12:27 UTC (permalink / raw)
  To: Ruifeng Wang
  Cc: thomas, Jan Viktorin, jerinj, dev, david.marchand, nd,
	Honnappa Nagarahalli

On Wed, Jul 07, 2021 at 06:46:33AM +0000, Ruifeng Wang wrote:
> > -----Original Message-----
> > From: Thomas Monjalon <thomas@monjalon.net>
> > Sent: Wednesday, July 7, 2021 2:32 PM
> > To: Ruifeng Wang <Ruifeng.Wang@arm.com>
> > Cc: Jan Viktorin <viktorin@rehivetech.com>; jerinj@marvell.com; Bruce
> > Richardson <bruce.richardson@intel.com>; dev@dpdk.org;
> > david.marchand@redhat.com; nd <nd@arm.com>; Honnappa Nagarahalli
> > <Honnappa.Nagarahalli@arm.com>
> > Subject: Re: [PATCH v4 3/3] build: add option to enable wait until equal
> > 
> > 07/07/2021 07:48, Ruifeng Wang:
> > > Introduce a meson option 'use_wfe' to select wait until equal method.
> > > The default is disable. Traditional polling loop is used.
> > > When enabled, architecture specific mechanism is relied on to do the
> > > wait.
> > 
> > Why do we need an option?
> > Can it be automatic to enable it when supported?
> > 
> The problem is inconsistency in performance on different Arm platforms. We had decided that each platform needs to enable it after some testing.
> 
Can that not be done via variables in the cross-file for the builds, or via
automatic detection if it's a native build? Is it likely that individual
users of DPDK will be knowledgable enough to use this option correctly?

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [dpdk-dev] [PATCH v4 3/3] build: add option to enable wait until equal
  2021-07-07 12:27         ` Bruce Richardson
@ 2021-07-07 12:36           ` Jerin Jacob
  2021-07-08  6:25             ` Ruifeng Wang
  0 siblings, 1 reply; 60+ messages in thread
From: Jerin Jacob @ 2021-07-07 12:36 UTC (permalink / raw)
  To: Bruce Richardson
  Cc: Ruifeng Wang, thomas, Jan Viktorin, jerinj, dev, david.marchand,
	nd, Honnappa Nagarahalli

On Wed, Jul 7, 2021 at 5:57 PM Bruce Richardson
<bruce.richardson@intel.com> wrote:
>
> On Wed, Jul 07, 2021 at 06:46:33AM +0000, Ruifeng Wang wrote:
> > > -----Original Message-----
> > > From: Thomas Monjalon <thomas@monjalon.net>
> > > Sent: Wednesday, July 7, 2021 2:32 PM
> > > To: Ruifeng Wang <Ruifeng.Wang@arm.com>
> > > Cc: Jan Viktorin <viktorin@rehivetech.com>; jerinj@marvell.com; Bruce
> > > Richardson <bruce.richardson@intel.com>; dev@dpdk.org;
> > > david.marchand@redhat.com; nd <nd@arm.com>; Honnappa Nagarahalli
> > > <Honnappa.Nagarahalli@arm.com>
> > > Subject: Re: [PATCH v4 3/3] build: add option to enable wait until equal
> > >
> > > 07/07/2021 07:48, Ruifeng Wang:
> > > > Introduce a meson option 'use_wfe' to select wait until equal method.
> > > > The default is disable. Traditional polling loop is used.
> > > > When enabled, architecture specific mechanism is relied on to do the
> > > > wait.
> > >
> > > Why do we need an option?
> > > Can it be automatic to enable it when supported?
> > >
> > The problem is inconsistency in performance on different Arm platforms. We had decided that each platform needs to enable it after some testing.
> >
> Can that not be done via variables in the cross-file for the builds, or via
> automatic detection if it's a native build? Is it likely that individual
> users of DPDK will be knowledgable enough to use this option correctly?

+1 to add this in cross-file instead of the top of config option as
scope if is only for arm64 builds.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [dpdk-dev] Use WFE for spinlock and ring
  2021-04-25  5:56 ` [dpdk-dev] Use WFE for spinlock and ring Ruifeng Wang
                     ` (2 preceding siblings ...)
  2021-04-28  7:42   ` [dpdk-dev] Use WFE for spinlock and ring David Marchand
@ 2021-07-07 14:47   ` Stephen Hemminger
  2021-07-08  9:41     ` Ruifeng Wang
  3 siblings, 1 reply; 60+ messages in thread
From: Stephen Hemminger @ 2021-07-07 14:47 UTC (permalink / raw)
  To: Ruifeng Wang
  Cc: dev, david.marchand, thomas, jerinj, nd, honnappa.nagarahalli

On Sun, 25 Apr 2021 05:56:51 +0000
Ruifeng Wang <ruifeng.wang@arm.com> wrote:

> The rte_wait_until_equal_xxx APIs abstract the functionality of 'polling
> for a memory location to become equal to a given value'[1].
> 
> Use the API for the rte spinlock and ring implementations.
> With the wait until equal APIs being stable, changes will not impact ABI.
> 
> [1] http://patches.dpdk.org/cover/62703/
> 
> v3:
> Series rebased. (David)
> 
> Gavin Hu (1):
>   spinlock: use wfe to reduce contention on aarch64
> 
> Ruifeng Wang (1):
>   ring: use wfe to wait for ring tail update on aarch64
> 
>  lib/eal/include/generic/rte_spinlock.h | 4 ++--
>  lib/ring/rte_ring_c11_pvt.h            | 4 ++--
>  lib/ring/rte_ring_generic_pvt.h        | 3 +--
>  3 files changed, 5 insertions(+), 6 deletions(-)
> 

Other places that should use WFE:

rte_mcslock.h:rte_mcslock_lock()
rte_mcslock_unlock:rte_mcslock_unlock()

rte_pflock.h:rte_pflock_lock()
rte_rwlock.h:rte_rwlock_read_lock()
rte_rwlock.h:rte_rwlock_write_lock()


You should also introduce rte_wait_while_XXX variants to handle some
of these cases.




^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [dpdk-dev] [PATCH v4 3/3] build: add option to enable wait until equal
  2021-07-07 12:36           ` Jerin Jacob
@ 2021-07-08  6:25             ` Ruifeng Wang
  2021-07-08  6:32               ` Jerin Jacob
  0 siblings, 1 reply; 60+ messages in thread
From: Ruifeng Wang @ 2021-07-08  6:25 UTC (permalink / raw)
  To: Jerin Jacob, Bruce Richardson
  Cc: thomas, Jan Viktorin, jerinj, dev, david.marchand, nd,
	Honnappa Nagarahalli, nd

> -----Original Message-----
> From: Jerin Jacob <jerinjacobk@gmail.com>
> Sent: Wednesday, July 7, 2021 8:37 PM
> To: Bruce Richardson <bruce.richardson@intel.com>
> Cc: Ruifeng Wang <Ruifeng.Wang@arm.com>; thomas@monjalon.net; Jan
> Viktorin <viktorin@rehivetech.com>; jerinj@marvell.com; dev@dpdk.org;
> david.marchand@redhat.com; nd <nd@arm.com>; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>
> Subject: Re: [dpdk-dev] [PATCH v4 3/3] build: add option to enable wait until
> equal
> 
> On Wed, Jul 7, 2021 at 5:57 PM Bruce Richardson
> <bruce.richardson@intel.com> wrote:
> >
> > On Wed, Jul 07, 2021 at 06:46:33AM +0000, Ruifeng Wang wrote:
> > > > -----Original Message-----
> > > > From: Thomas Monjalon <thomas@monjalon.net>
> > > > Sent: Wednesday, July 7, 2021 2:32 PM
> > > > To: Ruifeng Wang <Ruifeng.Wang@arm.com>
> > > > Cc: Jan Viktorin <viktorin@rehivetech.com>; jerinj@marvell.com;
> > > > Bruce Richardson <bruce.richardson@intel.com>; dev@dpdk.org;
> > > > david.marchand@redhat.com; nd <nd@arm.com>; Honnappa
> Nagarahalli
> > > > <Honnappa.Nagarahalli@arm.com>
> > > > Subject: Re: [PATCH v4 3/3] build: add option to enable wait until
> > > > equal
> > > >
> > > > 07/07/2021 07:48, Ruifeng Wang:
> > > > > Introduce a meson option 'use_wfe' to select wait until equal method.
> > > > > The default is disable. Traditional polling loop is used.
> > > > > When enabled, architecture specific mechanism is relied on to do
> > > > > the wait.
> > > >
> > > > Why do we need an option?
> > > > Can it be automatic to enable it when supported?
> > > >
> > > The problem is inconsistency in performance on different Arm platforms.
> We had decided that each platform needs to enable it after some testing.
> > >
> > Can that not be done via variables in the cross-file for the builds,
> > or via automatic detection if it's a native build? Is it likely that
> > individual users of DPDK will be knowledgable enough to use this option
> correctly?
> 
> +1 to add this in cross-file instead of the top of config option as
> scope if is only for arm64 builds.

Currently this option is in config/arm/meson.build (flags_common). SoCs can build
with this option enabled/disabled. And the ability is available for both native build
and cross build as cross build also goes through meson.build. 
If a SoC needs to enable the option by default, an entry and be added to the SoC flags.

The key difference here is whether this option need to be exposed to the top level config.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [dpdk-dev] [PATCH v4 3/3] build: add option to enable wait until equal
  2021-07-08  6:25             ` Ruifeng Wang
@ 2021-07-08  6:32               ` Jerin Jacob
  2021-07-08  7:32                 ` Thomas Monjalon
  0 siblings, 1 reply; 60+ messages in thread
From: Jerin Jacob @ 2021-07-08  6:32 UTC (permalink / raw)
  To: Ruifeng Wang
  Cc: Bruce Richardson, thomas, Jan Viktorin, jerinj, dev,
	david.marchand, nd, Honnappa Nagarahalli

On Thu, Jul 8, 2021 at 11:55 AM Ruifeng Wang <Ruifeng.Wang@arm.com> wrote:
>
> > -----Original Message-----
> > From: Jerin Jacob <jerinjacobk@gmail.com>
> > Sent: Wednesday, July 7, 2021 8:37 PM
> > To: Bruce Richardson <bruce.richardson@intel.com>
> > Cc: Ruifeng Wang <Ruifeng.Wang@arm.com>; thomas@monjalon.net; Jan
> > Viktorin <viktorin@rehivetech.com>; jerinj@marvell.com; dev@dpdk.org;
> > david.marchand@redhat.com; nd <nd@arm.com>; Honnappa Nagarahalli
> > <Honnappa.Nagarahalli@arm.com>
> > Subject: Re: [dpdk-dev] [PATCH v4 3/3] build: add option to enable wait until
> > equal
> >
> > On Wed, Jul 7, 2021 at 5:57 PM Bruce Richardson
> > <bruce.richardson@intel.com> wrote:
> > >
> > > On Wed, Jul 07, 2021 at 06:46:33AM +0000, Ruifeng Wang wrote:
> > > > > -----Original Message-----
> > > > > From: Thomas Monjalon <thomas@monjalon.net>
> > > > > Sent: Wednesday, July 7, 2021 2:32 PM
> > > > > To: Ruifeng Wang <Ruifeng.Wang@arm.com>
> > > > > Cc: Jan Viktorin <viktorin@rehivetech.com>; jerinj@marvell.com;
> > > > > Bruce Richardson <bruce.richardson@intel.com>; dev@dpdk.org;
> > > > > david.marchand@redhat.com; nd <nd@arm.com>; Honnappa
> > Nagarahalli
> > > > > <Honnappa.Nagarahalli@arm.com>
> > > > > Subject: Re: [PATCH v4 3/3] build: add option to enable wait until
> > > > > equal
> > > > >
> > > > > 07/07/2021 07:48, Ruifeng Wang:
> > > > > > Introduce a meson option 'use_wfe' to select wait until equal method.
> > > > > > The default is disable. Traditional polling loop is used.
> > > > > > When enabled, architecture specific mechanism is relied on to do
> > > > > > the wait.
> > > > >
> > > > > Why do we need an option?
> > > > > Can it be automatic to enable it when supported?
> > > > >
> > > > The problem is inconsistency in performance on different Arm platforms.
> > We had decided that each platform needs to enable it after some testing.
> > > >
> > > Can that not be done via variables in the cross-file for the builds,
> > > or via automatic detection if it's a native build? Is it likely that
> > > individual users of DPDK will be knowledgable enough to use this option
> > correctly?
> >
> > +1 to add this in cross-file instead of the top of config option as
> > scope if is only for arm64 builds.
>
> Currently this option is in config/arm/meson.build (flags_common). SoCs can build
> with this option enabled/disabled. And the ability is available for both native build
> and cross build as cross build also goes through meson.build.
> If a SoC needs to enable the option by default, an entry and be added to the SoC flags.
>
> The key difference here is whether this option need to be exposed to the top level config.

In the view of limiting top-level config options and it is specific to
Arm, I think, it better to
be a cross file only option.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [dpdk-dev] [PATCH v4 3/3] build: add option to enable wait until equal
  2021-07-08  6:32               ` Jerin Jacob
@ 2021-07-08  7:32                 ` Thomas Monjalon
  2021-07-08  9:21                   ` Ruifeng Wang
  0 siblings, 1 reply; 60+ messages in thread
From: Thomas Monjalon @ 2021-07-08  7:32 UTC (permalink / raw)
  To: Ruifeng Wang, Jerin Jacob
  Cc: Bruce Richardson, Jan Viktorin, jerinj, dev, david.marchand, nd,
	Honnappa Nagarahalli

08/07/2021 08:32, Jerin Jacob:
> On Thu, Jul 8, 2021 at 11:55 AM Ruifeng Wang <Ruifeng.Wang@arm.com> wrote:
> > From: Jerin Jacob <jerinjacobk@gmail.com>
> > > On Wed, Jul 7, 2021 at 5:57 PM Bruce Richardson
> > > <bruce.richardson@intel.com> wrote:
> > > > On Wed, Jul 07, 2021 at 06:46:33AM +0000, Ruifeng Wang wrote:
> > > > > From: Thomas Monjalon <thomas@monjalon.net>
> > > > > > 07/07/2021 07:48, Ruifeng Wang:
> > > > > > > Introduce a meson option 'use_wfe' to select wait until equal method.
> > > > > > > The default is disable. Traditional polling loop is used.
> > > > > > > When enabled, architecture specific mechanism is relied on to do
> > > > > > > the wait.
> > > > > >
> > > > > > Why do we need an option?
> > > > > > Can it be automatic to enable it when supported?
> > > > > >
> > > > > The problem is inconsistency in performance on different Arm platforms.
> > > We had decided that each platform needs to enable it after some testing.
> > > > >
> > > > Can that not be done via variables in the cross-file for the builds,
> > > > or via automatic detection if it's a native build? Is it likely that
> > > > individual users of DPDK will be knowledgable enough to use this option
> > > correctly?
> > >
> > > +1 to add this in cross-file instead of the top of config option as
> > > scope if is only for arm64 builds.
> >
> > Currently this option is in config/arm/meson.build (flags_common). SoCs can build
> > with this option enabled/disabled. And the ability is available for both native build
> > and cross build as cross build also goes through meson.build.
> > If a SoC needs to enable the option by default, an entry and be added to the SoC flags.
> >
> > The key difference here is whether this option need to be exposed to the top level config.
> 
> In the view of limiting top-level config options and it is specific to
> Arm, I think, it better to
> be a cross file only option.

+1, sorry for the late notice.
I would advocate to take this patch in 21.08-rc2.




^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [dpdk-dev] [PATCH v4 3/3] build: add option to enable wait until equal
  2021-07-08  7:32                 ` Thomas Monjalon
@ 2021-07-08  9:21                   ` Ruifeng Wang
  2021-07-08 10:28                     ` Thomas Monjalon
  0 siblings, 1 reply; 60+ messages in thread
From: Ruifeng Wang @ 2021-07-08  9:21 UTC (permalink / raw)
  To: thomas, Jerin Jacob
  Cc: Bruce Richardson, Jan Viktorin, jerinj, dev, david.marchand, nd,
	Honnappa Nagarahalli, nd

> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Thursday, July 8, 2021 3:33 PM
> To: Ruifeng Wang <Ruifeng.Wang@arm.com>; Jerin Jacob
> <jerinjacobk@gmail.com>
> Cc: Bruce Richardson <bruce.richardson@intel.com>; Jan Viktorin
> <viktorin@rehivetech.com>; jerinj@marvell.com; dev@dpdk.org;
> david.marchand@redhat.com; nd <nd@arm.com>; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>
> Subject: Re: [dpdk-dev] [PATCH v4 3/3] build: add option to enable wait until
> equal
> 
> 08/07/2021 08:32, Jerin Jacob:
> > On Thu, Jul 8, 2021 at 11:55 AM Ruifeng Wang <Ruifeng.Wang@arm.com>
> wrote:
> > > From: Jerin Jacob <jerinjacobk@gmail.com>
> > > > On Wed, Jul 7, 2021 at 5:57 PM Bruce Richardson
> > > > <bruce.richardson@intel.com> wrote:
> > > > > On Wed, Jul 07, 2021 at 06:46:33AM +0000, Ruifeng Wang wrote:
> > > > > > From: Thomas Monjalon <thomas@monjalon.net>
> > > > > > > 07/07/2021 07:48, Ruifeng Wang:
> > > > > > > > Introduce a meson option 'use_wfe' to select wait until equal
> method.
> > > > > > > > The default is disable. Traditional polling loop is used.
> > > > > > > > When enabled, architecture specific mechanism is relied on
> > > > > > > > to do the wait.
> > > > > > >
> > > > > > > Why do we need an option?
> > > > > > > Can it be automatic to enable it when supported?
> > > > > > >
> > > > > > The problem is inconsistency in performance on different Arm
> platforms.
> > > > We had decided that each platform needs to enable it after some
> testing.
> > > > > >
> > > > > Can that not be done via variables in the cross-file for the
> > > > > builds, or via automatic detection if it's a native build? Is it
> > > > > likely that individual users of DPDK will be knowledgable enough
> > > > > to use this option
> > > > correctly?
> > > >
> > > > +1 to add this in cross-file instead of the top of config option
> > > > +as
> > > > scope if is only for arm64 builds.
> > >
> > > Currently this option is in config/arm/meson.build (flags_common).
> > > SoCs can build with this option enabled/disabled. And the ability is
> > > available for both native build and cross build as cross build also goes
> through meson.build.
> > > If a SoC needs to enable the option by default, an entry and be added to
> the SoC flags.
> > >
> > > The key difference here is whether this option need to be exposed to
> the top level config.
> >
> > In the view of limiting top-level config options and it is specific to
> > Arm, I think, it better to be a cross file only option.
> 
> +1, sorry for the late notice.
> I would advocate to take this patch in 21.08-rc2.
> 
If the decision is not to expose this option in top-level config, we can just drop 3/3 patch.
Do I need to send a new version?

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [dpdk-dev] Use WFE for spinlock and ring
  2021-07-07 14:47   ` Stephen Hemminger
@ 2021-07-08  9:41     ` Ruifeng Wang
  2021-07-08 16:58       ` Honnappa Nagarahalli
  0 siblings, 1 reply; 60+ messages in thread
From: Ruifeng Wang @ 2021-07-08  9:41 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: dev, david.marchand, thomas, jerinj, nd, Honnappa Nagarahalli, nd

> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Wednesday, July 7, 2021 10:48 PM
> To: Ruifeng Wang <Ruifeng.Wang@arm.com>
> Cc: dev@dpdk.org; david.marchand@redhat.com; thomas@monjalon.net;
> jerinj@marvell.com; nd <nd@arm.com>; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>
> Subject: Re: [dpdk-dev] Use WFE for spinlock and ring
> 
> On Sun, 25 Apr 2021 05:56:51 +0000
> Ruifeng Wang <ruifeng.wang@arm.com> wrote:
> 
> > The rte_wait_until_equal_xxx APIs abstract the functionality of
> > 'polling for a memory location to become equal to a given value'[1].
> >
> > Use the API for the rte spinlock and ring implementations.
> > With the wait until equal APIs being stable, changes will not impact ABI.
> >
> > [1] http://patches.dpdk.org/cover/62703/
> >
> > v3:
> > Series rebased. (David)
> >
> > Gavin Hu (1):
> >   spinlock: use wfe to reduce contention on aarch64
> >
> > Ruifeng Wang (1):
> >   ring: use wfe to wait for ring tail update on aarch64
> >
> >  lib/eal/include/generic/rte_spinlock.h | 4 ++--
> >  lib/ring/rte_ring_c11_pvt.h            | 4 ++--
> >  lib/ring/rte_ring_generic_pvt.h        | 3 +--
> >  3 files changed, 5 insertions(+), 6 deletions(-)
> >
> 
> Other places that should use WFE:
Thank you Stephen for looking into this.

> 
> rte_mcslock.h:rte_mcslock_lock()
Existing API can be used in this one.

> rte_mcslock_unlock:rte_mcslock_unlock()
This one needs rte_wait_while_xxx variant.

> 
> rte_pflock.h:rte_pflock_lock()
> rte_rwlock.h:rte_rwlock_read_lock()
> rte_rwlock.h:rte_rwlock_write_lock()
These occurrences have extra logic (AND, conditional branch, CAS) in the loop.
I'm not sure generic API can be abstracted from these use cases.

> 
> 
> You should also introduce rte_wait_while_XXX variants to handle some of
> these cases.
> 



^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [dpdk-dev] [PATCH v4 3/3] build: add option to enable wait until equal
  2021-07-08  9:21                   ` Ruifeng Wang
@ 2021-07-08 10:28                     ` Thomas Monjalon
  0 siblings, 0 replies; 60+ messages in thread
From: Thomas Monjalon @ 2021-07-08 10:28 UTC (permalink / raw)
  To: Jerin Jacob, Ruifeng Wang
  Cc: Bruce Richardson, Jan Viktorin, jerinj, dev, david.marchand, nd,
	Honnappa Nagarahalli, nd

08/07/2021 11:21, Ruifeng Wang:
> From: Thomas Monjalon <thomas@monjalon.net>
> > 08/07/2021 08:32, Jerin Jacob:
> > > On Thu, Jul 8, 2021 at 11:55 AM Ruifeng Wang <Ruifeng.Wang@arm.com>
> > wrote:
> > > > From: Jerin Jacob <jerinjacobk@gmail.com>
> > > > > On Wed, Jul 7, 2021 at 5:57 PM Bruce Richardson
> > > > > <bruce.richardson@intel.com> wrote:
> > > > > > On Wed, Jul 07, 2021 at 06:46:33AM +0000, Ruifeng Wang wrote:
> > > > > > > From: Thomas Monjalon <thomas@monjalon.net>
> > > > > > > > 07/07/2021 07:48, Ruifeng Wang:
> > > > > > > > > Introduce a meson option 'use_wfe' to select wait until equal
> > method.
> > > > > > > > > The default is disable. Traditional polling loop is used.
> > > > > > > > > When enabled, architecture specific mechanism is relied on
> > > > > > > > > to do the wait.
> > > > > > > >
> > > > > > > > Why do we need an option?
> > > > > > > > Can it be automatic to enable it when supported?
> > > > > > > >
> > > > > > > The problem is inconsistency in performance on different Arm
> > platforms.
> > > > > We had decided that each platform needs to enable it after some
> > testing.
> > > > > > >
> > > > > > Can that not be done via variables in the cross-file for the
> > > > > > builds, or via automatic detection if it's a native build? Is it
> > > > > > likely that individual users of DPDK will be knowledgable enough
> > > > > > to use this option
> > > > > correctly?
> > > > >
> > > > > +1 to add this in cross-file instead of the top of config option
> > > > > +as
> > > > > scope if is only for arm64 builds.
> > > >
> > > > Currently this option is in config/arm/meson.build (flags_common).
> > > > SoCs can build with this option enabled/disabled. And the ability is
> > > > available for both native build and cross build as cross build also goes
> > through meson.build.
> > > > If a SoC needs to enable the option by default, an entry and be added to
> > the SoC flags.
> > > >
> > > > The key difference here is whether this option need to be exposed to
> > the top level config.
> > >
> > > In the view of limiting top-level config options and it is specific to
> > > Arm, I think, it better to be a cross file only option.
> > 
> > +1, sorry for the late notice.
> > I would advocate to take this patch in 21.08-rc2.
> > 
> If the decision is not to expose this option in top-level config, we can just drop 3/3 patch.
> Do I need to send a new version?

No it looks not needed to send a new version, I can take patches 1 & 2.



^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [dpdk-dev] Use WFE for spinlock and ring
  2021-07-08  9:41     ` Ruifeng Wang
@ 2021-07-08 16:58       ` Honnappa Nagarahalli
  0 siblings, 0 replies; 60+ messages in thread
From: Honnappa Nagarahalli @ 2021-07-08 16:58 UTC (permalink / raw)
  To: Ruifeng Wang, Stephen Hemminger
  Cc: dev, david.marchand, thomas, jerinj, nd, Honnappa Nagarahalli, nd

<snip>

> >
> > On Sun, 25 Apr 2021 05:56:51 +0000
> > Ruifeng Wang <ruifeng.wang@arm.com> wrote:
> >
> > > The rte_wait_until_equal_xxx APIs abstract the functionality of
> > > 'polling for a memory location to become equal to a given value'[1].
> > >
> > > Use the API for the rte spinlock and ring implementations.
> > > With the wait until equal APIs being stable, changes will not impact ABI.
> > >
> > > [1] http://patches.dpdk.org/cover/62703/
> > >
> > > v3:
> > > Series rebased. (David)
> > >
> > > Gavin Hu (1):
> > >   spinlock: use wfe to reduce contention on aarch64
> > >
> > > Ruifeng Wang (1):
> > >   ring: use wfe to wait for ring tail update on aarch64
> > >
> > >  lib/eal/include/generic/rte_spinlock.h | 4 ++--
> > >  lib/ring/rte_ring_c11_pvt.h            | 4 ++--
> > >  lib/ring/rte_ring_generic_pvt.h        | 3 +--
> > >  3 files changed, 5 insertions(+), 6 deletions(-)
> > >
> >
> > Other places that should use WFE:
> Thank you Stephen for looking into this.
> 
> >
> > rte_mcslock.h:rte_mcslock_lock()
> Existing API can be used in this one.
> 
> > rte_mcslock_unlock:rte_mcslock_unlock()
> This one needs rte_wait_while_xxx variant.
> 
> >
> > rte_pflock.h:rte_pflock_lock()
> > rte_rwlock.h:rte_rwlock_read_lock()
> > rte_rwlock.h:rte_rwlock_write_lock()
> These occurrences have extra logic (AND, conditional branch, CAS) in the loop.
> I'm not sure generic API can be abstracted from these use cases.
I think it is possible to create additional abstractions to address these cases.

> 
> >
> >
> > You should also introduce rte_wait_while_XXX variants to handle some
> > of these cases.
> >
> 
> 


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [dpdk-dev] [PATCH v4 0/3] Use WFE for spinlock and ring
  2021-07-07  5:48 ` [dpdk-dev] [PATCH v4 0/3] Use WFE for spinlock and ring Ruifeng Wang
                     ` (2 preceding siblings ...)
  2021-07-07  5:48   ` [dpdk-dev] [PATCH v4 3/3] build: add option to enable wait until equal Ruifeng Wang
@ 2021-07-09 18:39   ` Thomas Monjalon
  3 siblings, 0 replies; 60+ messages in thread
From: Thomas Monjalon @ 2021-07-09 18:39 UTC (permalink / raw)
  To: Ruifeng Wang
  Cc: dev, david.marchand, bruce.richardson, jerinj, nd,
	honnappa.nagarahalli, ruifeng.wang

07/07/2021 07:48, Ruifeng Wang:
> The rte_wait_until_equal_xxx APIs abstract the functionality of 'polling
> for a memory location to become equal to a given value'[1].
> 
> Use the API for the rte spinlock and ring implementations.
> With the wait until equal APIs being stable, changes will not impact ABI.
> 
> Gavin Hu (1):
>   spinlock: use wfe to reduce contention on aarch64
> 
> Ruifeng Wang (2):
>   ring: use wfe to wait for ring tail update on aarch64
>   build: add option to enable wait until equal

As discussed in the thread, patches 1 & 2 are applied.
The patch 3 (meson option) is rejected.



^ permalink raw reply	[flat|nested] 60+ messages in thread

end of thread, other threads:[~2021-07-09 18:39 UTC | newest]

Thread overview: 60+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-24  7:07 [dpdk-dev] [PATCH v1 0/2] Use WFE for spinlock and ring Gavin Hu
2020-04-24  7:07 ` [dpdk-dev] [PATCH v1 1/2] spinlock: use wfe to reduce contention on aarch64 Gavin Hu
2020-04-24  7:07 ` [dpdk-dev] [PATCH v1 2/2] ring: use wfe to wait for ring tail update " Gavin Hu
2020-04-24 13:23   ` Ananyev, Konstantin
2020-04-26  8:41     ` Gavin Hu
2020-04-24 14:46   ` Honnappa Nagarahalli
2020-04-26  8:33     ` Gavin Hu
2020-04-26  8:39 ` [dpdk-dev] [PATCH v2 0/2] Use WFE for spinlock and ring Gavin Hu
2020-05-01  9:47   ` Ananyev, Konstantin
2020-05-03 14:54   ` David Marchand
2020-05-03 21:31     ` Honnappa Nagarahalli
2020-09-25 14:30       ` David Marchand
2021-03-25 14:58         ` David Marchand
2021-04-25  6:06           ` Ruifeng Wang
2021-04-25 15:56             ` Thomas Monjalon
2020-04-26  8:39 ` [dpdk-dev] [PATCH v2 1/2] spinlock: use wfe to reduce contention on aarch64 Gavin Hu
2020-04-28 12:22   ` Jerin Jacob
2020-04-26  8:39 ` [dpdk-dev] [PATCH v2 2/2] ring: use wfe to wait for ring tail update " Gavin Hu
2021-04-25  5:56 ` [dpdk-dev] Use WFE for spinlock and ring Ruifeng Wang
2021-04-25  5:56   ` [dpdk-dev] [PATCH v3 1/2] spinlock: use wfe to reduce contention on aarch64 Ruifeng Wang
2021-04-25  5:56   ` [dpdk-dev] [PATCH v3 2/2] ring: use wfe to wait for ring tail update " Ruifeng Wang
2021-04-26  5:38     ` Jerin Jacob
2021-04-28 17:17     ` Stephen Hemminger
2021-04-29 14:35       ` Ruifeng Wang
2021-04-29 15:05         ` Stephen Hemminger
2021-05-07  8:25           ` Ruifeng Wang
2021-04-28  7:42   ` [dpdk-dev] Use WFE for spinlock and ring David Marchand
2021-04-28  9:30     ` Ruifeng Wang
2021-04-28 11:13       ` Thomas Monjalon
2021-04-29 14:28         ` Ruifeng Wang
2021-04-29 15:20           ` Thomas Monjalon
2021-04-30  9:16             ` Bruce Richardson
2021-04-30 13:41               ` Honnappa Nagarahalli
2021-04-30 14:19                 ` Bruce Richardson
2021-05-07 10:18                   ` Ruifeng Wang
2021-05-07 10:24                     ` Bruce Richardson
2021-07-05  8:51                       ` David Marchand
2021-07-05 10:21                         ` Ruifeng Wang
2021-07-07 14:47   ` Stephen Hemminger
2021-07-08  9:41     ` Ruifeng Wang
2021-07-08 16:58       ` Honnappa Nagarahalli
2021-07-07  5:43 ` [dpdk-dev] [PATCH v4 0/3] " Ruifeng Wang
2021-07-07  5:43   ` [dpdk-dev] [PATCH v4 1/3] spinlock: use wfe to reduce contention on aarch64 Ruifeng Wang
2021-07-07  5:43   ` [dpdk-dev] [PATCH v4 2/3] ring: use wfe to wait for ring tail update " Ruifeng Wang
2021-07-07  5:43   ` [dpdk-dev] [PATCH v4 3/3] build: add option to enable wait until equal Ruifeng Wang
2021-07-07 12:15     ` David Marchand
2021-07-07  5:48 ` [dpdk-dev] [PATCH v4 0/3] Use WFE for spinlock and ring Ruifeng Wang
2021-07-07  5:48   ` [dpdk-dev] [PATCH v4 1/3] spinlock: use wfe to reduce contention on aarch64 Ruifeng Wang
2021-07-07  5:48   ` [dpdk-dev] [PATCH v4 2/3] ring: use wfe to wait for ring tail update " Ruifeng Wang
2021-07-07  5:48   ` [dpdk-dev] [PATCH v4 3/3] build: add option to enable wait until equal Ruifeng Wang
2021-07-07  6:32     ` Thomas Monjalon
2021-07-07  6:46       ` Ruifeng Wang
2021-07-07 12:27         ` Bruce Richardson
2021-07-07 12:36           ` Jerin Jacob
2021-07-08  6:25             ` Ruifeng Wang
2021-07-08  6:32               ` Jerin Jacob
2021-07-08  7:32                 ` Thomas Monjalon
2021-07-08  9:21                   ` Ruifeng Wang
2021-07-08 10:28                     ` Thomas Monjalon
2021-07-09 18:39   ` [dpdk-dev] [PATCH v4 0/3] Use WFE for spinlock and ring Thomas Monjalon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).