* Re: [dpdk-dev] [RFC v2 3/5] ring: use wfe to wait for ring tail update on aarch64
@ 2019-07-20 6:50 Pavan Nikhilesh Bhagavatula
0 siblings, 0 replies; 2+ messages in thread
From: Pavan Nikhilesh Bhagavatula @ 2019-07-20 6:50 UTC
To: Gavin Hu, dev; +Cc: nd
>-----Original Message-----
>From: dev <dev-bounces@dpdk.org> On Behalf Of Gavin Hu
>Sent: Wednesday, July 3, 2019 2:29 PM
>To: dev@dpdk.org
>Cc: nd@arm.com
>Subject: [dpdk-dev] [RFC v2 3/5] ring: use wfe to wait for ring tail
>update on aarch64
>Instead of polling for tail to be updated, use wfe instruction.
>
>Signed-off-by: Gavin Hu <gavin.hu@arm.com>
>Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
>Reviewed-by: Steve Capper <steve.capper@arm.com>
>Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
>Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Tested on an octeontx2 board and observed a 2-5% performance improvement.
Tested-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
>---
> lib/librte_ring/rte_ring_c11_mem.h | 4 ++--
> lib/librte_ring/rte_ring_generic.h | 3 +--
> 2 files changed, 3 insertions(+), 4 deletions(-)
>
>2.7.4
* [dpdk-dev] [RFC v2 0/5] use WFE for locks and ring on aarch64
2019-06-30 16:21 ` [dpdk-dev] [RFC " Gavin Hu
@ 2019-07-03 8:58 Gavin Hu
2019-06-30 16:21 ` [dpdk-dev] [RFC " Gavin Hu
0 siblings, 1 reply; 2+ messages in thread
From: Gavin Hu @ 2019-07-03 8:58 UTC
To: dev; +Cc: nd
DPDK has multiple use cases where the core repeatedly polls a location in
memory. This polling results in many cache and memory transactions.
The Arm architecture provides the WFE (Wait For Event) instruction, which
allows the CPU core to enter a low-power state until woken up by an update
to the memory location being polled, thus reducing the cache and memory
transactions.
x86 has the PAUSE hint instruction to reduce such overhead.
The rte_wait_until_equal_xxx APIs abstract the functionality of 'polling
for a memory location to become equal to a given value'.
For non-Arm platforms, these APIs are just wrappers around a do-while loop
with rte_pause, so there are no performance differences.
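A minimal sketch of what such a generic wrapper could look like, assuming
GCC/Clang __atomic builtins. The names sketch_pause and
sketch_wait_until_equal32 are hypothetical stand-ins, not the functions
added by this series; only the (address, expected value, memory order)
argument shape is taken from the call sites in patch 3/5.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for rte_pause(): a CPU hint inside a spin loop. */
static inline void
sketch_pause(void)
{
#if defined(__x86_64__) || defined(__i386__)
	__builtin_ia32_pause();
#elif defined(__aarch64__)
	__asm__ volatile("yield" ::: "memory");
#else
	__asm__ volatile("" ::: "memory"); /* compiler barrier only */
#endif
}

/*
 * Hypothetical generic fallback: poll *addr until it equals expected.
 * Load and compare first, so the already-equal case returns at once.
 */
static inline void
sketch_wait_until_equal32(volatile uint32_t *addr, uint32_t expected,
			  int memorder)
{
	assert(addr != NULL);

	while (__atomic_load_n(addr, memorder) != expected)
		sketch_pause();
}
```

Since this is the same polling loop the callers already open-code, no
performance change would be expected from it on non-Arm targets.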
For Arm platforms, use of WFE can be configured using the
CONFIG_RTE_USE_WFE option. It is disabled by default.
Currently, use of WFE is supported only on aarch64 platforms. armv7
platforms do support the WFE instruction, but they require explicit wake-up
events (SEV) and are less performant.
Testing shows that performance varies across platforms, with some showing
degradation.
CONFIG_RTE_USE_WFE should be enabled depending on the performance on the
target platforms.
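The WFE-based wait described above can be sketched roughly as follows. This
is an illustration under stated assumptions, not the code from patch 1/5:
the function name and the SKETCH_USE_WFE macro (standing in for
CONFIG_RTE_USE_WFE) are hypothetical, and only a relaxed 32-bit variant is
shown. On anything other than aarch64, or with the macro undefined, it
falls back to plain polling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static inline void
sketch_wait_until_equal32_wfe(volatile uint32_t *addr, uint32_t expected)
{
	uint32_t value;

	assert(addr != NULL);

	/* Load and compare first: the already-equal case never sleeps. */
	value = __atomic_load_n(addr, __ATOMIC_RELAXED);
	if (value == expected)
		return;

#if defined(__aarch64__) && defined(SKETCH_USE_WFE)
	/* SEVL sets the local event register so the first WFE falls through. */
	__asm__ volatile("sevl" ::: "memory");
	do {
		/*
		 * WFE waits for an event. LDXR reloads the value and arms the
		 * exclusive monitor, so a later store to *addr by another
		 * core clears the monitor and generates the wake-up event.
		 */
		__asm__ volatile("wfe\n\t"
				 "ldxr %w[val], [%[ptr]]"
				 : [val] "=&r"(value)
				 : [ptr] "r"(addr)
				 : "memory");
	} while (value != expected);
#else
	/* Fallback: ordinary polling, as on non-Arm platforms. */
	do {
		value = __atomic_load_n(addr, __ATOMIC_RELAXED);
	} while (value != expected);
#endif
}
```

Building with -DSKETCH_USE_WFE on an aarch64 target selects the WFE path;
whether that helps is platform dependent, as noted above.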
V2:
* Use inline functions instead of macros
* Add load and compare in the beginning of the APIs
* Fix some style errors in asm inline
V1:
* Add the new APIs and use it for ring and locks
Gavin Hu (5):
eal: add the APIs to wait until equal
ticketlock: use new API to reduce contention on aarch64
ring: use wfe to wait for ring tail update on aarch64
spinlock: use wfe to reduce contention on aarch64
config: add WFE config entry for aarch64
config/arm/meson.build | 1 +
config/common_armv8a_linux | 6 ++
.../common/include/arch/arm/rte_atomic_64.h | 4 +
.../common/include/arch/arm/rte_pause_64.h | 106 +++++++++++++++++++++
.../common/include/arch/arm/rte_spinlock.h | 25 +++++
lib/librte_eal/common/include/generic/rte_pause.h | 39 +++++++-
.../common/include/generic/rte_spinlock.h | 2 +-
.../common/include/generic/rte_ticketlock.h | 3 +-
lib/librte_ring/rte_ring_c11_mem.h | 4 +-
lib/librte_ring/rte_ring_generic.h | 3 +-
10 files changed, 185 insertions(+), 8 deletions(-)
--
2.7.4
* [dpdk-dev] [RFC 0/5] use WFE for locks and ring on aarch64
@ 2019-06-30 16:21 ` Gavin Hu
2019-07-03 8:58 ` [dpdk-dev] [RFC v2 3/5] ring: use wfe to wait for ring tail update " Gavin Hu
0 siblings, 1 reply; 2+ messages in thread
From: Gavin Hu @ 2019-06-30 16:21 UTC
To: dev
Cc: thomas, jerinj, hemant.agrawal, bruce.richardson, chaozhu,
Honnappa.Nagarahalli, nd, gavin.hu
DPDK has multiple use cases where the core repeatedly polls a location in
memory. This polling results in many cache and memory transactions.
The Arm architecture provides the WFE (Wait For Event) instruction, which
allows the CPU core to enter a low-power state until woken up by an update
to the memory location being polled, thus reducing the cache and memory
transactions.
x86 has the PAUSE hint instruction to reduce such overhead.
The rte_wait_until_equal_xxx APIs abstract the functionality of 'polling
for a memory location to become equal to a given value'.
For non-Arm platforms, these APIs are just wrappers around a do-while loop
with rte_pause, so there are no performance differences.
For Arm platforms, use of WFE can be configured using the
CONFIG_RTE_USE_WFE option. It is disabled by default.
Currently, use of WFE is supported only on aarch64 platforms. armv7
platforms do support the WFE instruction, but they require explicit wake-up
events (SEV) and are less performant.
Testing shows that performance varies across platforms, with some showing
degradation.
CONFIG_RTE_USE_WFE should be enabled depending on the performance on the
target platforms.
Gavin Hu (5):
eal: add the APIs to wait until equal
ticketlock: use new API to reduce contention on aarch64
ring: use wfe to wait for ring tail update on aarch64
spinlock: use wfe to reduce contention on aarch64
config: add WFE config entry for aarch64
config/arm/meson.build | 1 +
config/common_armv8a_linux | 6 +
.../common/include/arch/arm/rte_pause_64.h | 143 +++++++++++++++++++++
.../common/include/arch/arm/rte_spinlock.h | 25 ++++
lib/librte_eal/common/include/generic/rte_pause.h | 20 +++
.../common/include/generic/rte_spinlock.h | 2 +-
.../common/include/generic/rte_ticketlock.h | 4 +-
lib/librte_ring/rte_ring_c11_mem.h | 5 +-
lib/librte_ring/rte_ring_generic.h | 4 +-
9 files changed, 203 insertions(+), 7 deletions(-)
--
2.7.4
* [dpdk-dev] [RFC v2 3/5] ring: use wfe to wait for ring tail update on aarch64
2019-06-30 16:21 ` [dpdk-dev] [RFC " Gavin Hu
@ 2019-07-03 8:58 ` Gavin Hu
0 siblings, 0 replies; 2+ messages in thread
From: Gavin Hu @ 2019-07-03 8:58 UTC
To: dev; +Cc: nd
Instead of polling for tail to be updated, use wfe instruction.
Signed-off-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
---
lib/librte_ring/rte_ring_c11_mem.h | 4 ++--
lib/librte_ring/rte_ring_generic.h | 3 +--
2 files changed, 3 insertions(+), 4 deletions(-)
diff --git a/lib/librte_ring/rte_ring_c11_mem.h b/lib/librte_ring/rte_ring_c11_mem.h
index 0fb73a3..037811e 100644
--- a/lib/librte_ring/rte_ring_c11_mem.h
+++ b/lib/librte_ring/rte_ring_c11_mem.h
@@ -2,6 +2,7 @@
*
* Copyright (c) 2017,2018 HXT-semitech Corporation.
* Copyright (c) 2007-2009 Kip Macy kmacy@freebsd.org
+ * Copyright (c) 2019 Arm Limited
* All rights reserved.
* Derived from FreeBSD's bufring.h
* Used as BSD-3 Licensed with permission from Kip Macy.
@@ -21,8 +22,7 @@ update_tail(struct rte_ring_headtail *ht, uint32_t old_val, uint32_t new_val,
* we need to wait for them to complete
*/
if (!single)
- while (unlikely(ht->tail != old_val))
- rte_pause();
+ rte_wait_until_equal32(&ht->tail, old_val, __ATOMIC_RELAXED);
__atomic_store_n(&ht->tail, new_val, __ATOMIC_RELEASE);
}
diff --git a/lib/librte_ring/rte_ring_generic.h b/lib/librte_ring/rte_ring_generic.h
index 953cdbb..570765c 100644
--- a/lib/librte_ring/rte_ring_generic.h
+++ b/lib/librte_ring/rte_ring_generic.h
@@ -23,8 +23,7 @@ update_tail(struct rte_ring_headtail *ht, uint32_t old_val, uint32_t new_val,
* we need to wait for them to complete
*/
if (!single)
- while (unlikely(ht->tail != old_val))
- rte_pause();
+ rte_wait_until_equal32(&ht->tail, old_val, __ATOMIC_RELAXED);
ht->tail = new_val;
}
--
2.7.4
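To illustrate why update_tail() has to wait at all, here is a
self-contained sketch of the multi-producer case, assuming GCC/Clang
__atomic builtins. The struct and function names are hypothetical
simplifications of the ones in the diff, and the wait helper is a plain
polling loop rather than the WFE variant. Each producer may publish its
new tail only after the producers that reserved earlier slots have
published theirs, so the tail advances in order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct rte_ring_headtail. */
struct sketch_headtail {
	volatile uint32_t head;
	volatile uint32_t tail;
};

/* Hypothetical stand-in for rte_wait_until_equal32(): plain polling. */
static inline void
sketch_wait_until_equal32(volatile uint32_t *addr, uint32_t expected)
{
	while (__atomic_load_n(addr, __ATOMIC_RELAXED) != expected)
		;
}

/*
 * Publish the slots [old_val, new_val). With several producers or
 * consumers, wait until everyone before us has published first.
 */
static inline void
sketch_update_tail(struct sketch_headtail *ht, uint32_t old_val,
		   uint32_t new_val, int single)
{
	assert(ht != NULL);

	if (!single)
		sketch_wait_until_equal32(&ht->tail, old_val);

	/* Release store: slot contents become visible before the new tail. */
	__atomic_store_n(&ht->tail, new_val, __ATOMIC_RELEASE);
}
```

For example, if two producers reserved slots [0, 4) and [4, 8), the second
one spins in sketch_update_tail(&ht, 4, 8, 0) until the first has stored
tail = 4.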