From: Stanisław Kardach
Date: Sun, 12 May 2024 09:10:49 +0200
Subject: Re: [RFC PATCH] eal/riscv: add support for zawrs extension
To: Daniel Gregory
Cc: Bruce Richardson, dev@dpdk.org, Liang Ma, Punit Agrawal
In-Reply-To: <20240502144149.66446-1-daniel.gregory@bytedance.com>
List-Id: DPDK patches and discussions

On Thu, May 2, 2024 at 4:44 PM Daniel Gregory wrote:
>
> The zawrs extension adds a pair of instructions that stall a core until
> a memory location is written to. This patch uses one of them to
> implement RISC-V-specific versions of the rte_wait_until_equal_*
> functions. This is potentially more energy efficient than the default
> implementation, which uses rte_pause/Zihintpause.
>
> The technique works as follows:
>
> * Create a reservation set containing the address we want to wait on
>   using an atomic load (lr.w/lr.d)
> * Call wrs.nto - this blocks until the reservation set is invalidated by
>   someone else writing to that address
> * Execution can also resume arbitrarily, so we still need to check
>   whether a change occurred and loop if not
>
> Due to RISC-V atomics only supporting naturally aligned word (32 bit)
> and doubleword (64 bit) loads, I've used pointer rounding and bit
> shifting to implement waiting on 16-bit values.
>
> This new functionality is controlled by a Meson flag that is disabled by
> default.
>
> Signed-off-by: Daniel Gregory
> Suggested-by: Punit Agrawal
> ---
>
> Posting as an RFC to get early feedback and enable testing by others
> with Zawrs-enabled hardware. Whilst I have been able to test that it
> compiles and passes tests under QEMU, I am waiting on Zawrs-enabled
> hardware to become available before I carry out performance tests.
>
> Nonetheless, I would be glad to hear any feedback on the general
> approach. Thanks, Daniel
>
>  config/riscv/meson.build          |   5 ++
>  lib/eal/riscv/include/rte_pause.h | 139 ++++++++++++++++++++++++++++++
>  2 files changed, 144 insertions(+)
>
> diff --git a/config/riscv/meson.build b/config/riscv/meson.build
> index 07d7d9da23..4cfdc42ecb 100644
> --- a/config/riscv/meson.build
> +++ b/config/riscv/meson.build
> @@ -26,6 +26,11 @@ flags_common = [
>      # read from /proc/device-tree/cpus/timebase-frequency. This property is
>      # guaranteed on Linux, as riscv time_init() requires it.
>      ['RTE_RISCV_TIME_FREQ', 0],
> +
> +    # Enable use of RISC-V Wait-on-Reservation-Set extension (Zawrs)
> +    # Mitigates looping when polling on memory locations
> +    # Make sure to add '_zawrs' to your target's -march below
> +    ['RTE_RISCV_ZAWRS', false]

A bit orthogonal to this patch (or maybe not?):
Should we perhaps add a QEMU target in meson.build whose -march is
adjusted to what QEMU supports today? Or add machine-detection logic
based either on the "riscv,isa" cpu@0 property in the device tree or on
the RHCT ACPI table? Or perhaps some other way to specify the
extension-list suffix for -march?

>  ]
>
>  ## SoC-specific options.
> diff --git a/lib/eal/riscv/include/rte_pause.h b/lib/eal/riscv/include/rte_pause.h
> index cb8e9ca52d..e7b43dffa3 100644
> --- a/lib/eal/riscv/include/rte_pause.h
> +++ b/lib/eal/riscv/include/rte_pause.h
> @@ -11,6 +11,12 @@
>  extern "C" {
>  #endif
>
> +#ifdef RTE_RISCV_ZAWRS
> +#define RTE_WAIT_UNTIL_EQUAL_ARCH_DEFINED
> +#endif
> +
> +#include
> +
>  #include "rte_atomic.h"
>
>  #include "generic/rte_pause.h"
> @@ -24,6 +30,139 @@ static inline void rte_pause(void)
>  	asm volatile(".int 0x0100000F" : : : "memory");
>  }
>
> +#ifdef RTE_RISCV_ZAWRS
> +
> +/*
> + * Atomic load from an address. It returns either a sign-extended word or
> + * doubleword and creates a 'reservation set' containing the read memory
> + * location. When someone else writes to the reservation set, it is invalidated,
> + * causing any stalled WRS instructions to resume.
> + *
> + * The address needs to be naturally aligned.
> + */
> +#define __RTE_RISCV_LR_32(src, dst, memorder) do {		\
> +	if ((memorder) == rte_memory_order_relaxed) {		\
> +		asm volatile("lr.w %0, (%1)"			\
> +				: "=r" (dst)			\
> +				: "r" (src)			\
> +				: "memory");			\
> +	} else {						\
> +		asm volatile("lr.w.aq %0, (%1)"			\
> +				: "=r" (dst)			\
> +				: "r" (src)			\
> +				: "memory");			\
> +	} } while (0)
> +#define __RTE_RISCV_LR_64(src, dst, memorder) do {		\
> +	if ((memorder) == rte_memory_order_relaxed) {		\
> +		asm volatile("lr.d %0, (%1)"			\
> +				: "=r" (dst)			\
> +				: "r" (src)			\
> +				: "memory");			\
> +	} else {						\
> +		asm volatile("lr.d.aq %0, (%1)"			\
> +				: "=r" (dst)			\
> +				: "r" (src)			\
> +				: "memory");			\
> +	} } while (0)
> +
> +/*
> + * There's no RISC-V atomic load primitive for halfwords, so cast up to a
> + * _naturally aligned_ word and extract the halfword we want.
> + */
> +#define __RTE_RISCV_LR_16(src, dst, memorder) do {			\
> +	uint32_t word;							\
> +	__RTE_RISCV_LR_32(((uintptr_t)(src) & (~3)), word, (memorder));	\
> +	if ((size_t)(src) & 3)						\
> +		(dst) = (typeof(dst))(word >> 16);			\
> +	else								\
> +		(dst) = (typeof(dst))(word & 0xFFFF);			\
> +} while (0)
> +
> +#define __RTE_RISCV_LR(src, dst, memorder, size) {			\
> +	RTE_BUILD_BUG_ON(size != 16 && size != 32 && size != 64);	\
> +	if (size == 16)							\
> +		__RTE_RISCV_LR_16(src, dst, memorder);			\
> +	else if (size == 32)						\
> +		__RTE_RISCV_LR_32(src, dst, memorder);			\
> +	else if (size == 64)						\
> +		__RTE_RISCV_LR_64(src, dst, memorder);			\
> +}
> +
> +/*
> + * Wait-on-Reservation-Set extension instruction. It stalls execution until the
> + * reservation set is invalidated or an interrupt is observed.
> + * A loop is likely still needed, as it may stop stalling arbitrarily.
> + */
> +#define __RTE_RISCV_WRS_NTO() { asm volatile("wrs.nto" : : : "memory"); }
> +
> +static __rte_always_inline void
> +rte_wait_until_equal_16(volatile uint16_t *addr, uint16_t expected,
> +		int memorder)
> +{
> +	uint16_t value;
> +
> +	RTE_ASSERT(memorder == rte_memory_order_acquire ||
> +		memorder == rte_memory_order_relaxed);
> +	RTE_ASSERT(((size_t)addr & 1) == 0);
> +
> +	__RTE_RISCV_LR_16(addr, value, memorder);
> +	while (value != expected) {
> +		__RTE_RISCV_WRS_NTO();
> +		__RTE_RISCV_LR_16(addr, value, memorder);
> +	}
> +}
> +
> +static __rte_always_inline void
> +rte_wait_until_equal_32(volatile uint32_t *addr, uint32_t expected,
> +		int memorder)
> +{
> +	uint32_t value;
> +
> +	RTE_ASSERT(memorder == rte_memory_order_acquire ||
> +		memorder == rte_memory_order_relaxed);
> +	RTE_ASSERT(((size_t)addr & 3) == 0);
> +
> +	__RTE_RISCV_LR_32(addr, value, memorder);
> +	while (value != expected) {
> +		__RTE_RISCV_WRS_NTO();
> +		__RTE_RISCV_LR_32(addr, value, memorder);
> +	}
> +}
> +
> +static __rte_always_inline void
> +rte_wait_until_equal_64(volatile uint64_t *addr, uint64_t expected,
> +		int memorder)
> +{
> +	uint64_t value;
> +
> +	RTE_ASSERT(memorder == rte_memory_order_acquire ||
> +		memorder == rte_memory_order_relaxed);
> +	RTE_ASSERT(((size_t)addr & 7) == 0);
> +
> +	__RTE_RISCV_LR_64(addr, value, memorder);
> +	while (value != expected) {
> +		__RTE_RISCV_WRS_NTO();
> +		__RTE_RISCV_LR_64(addr, value, memorder);
> +	}
> +}
> +
> +#define RTE_WAIT_UNTIL_MASKED(addr, mask, cond, expected, memorder) do { \
> +	RTE_BUILD_BUG_ON(!__builtin_constant_p(memorder));		\
> +	RTE_BUILD_BUG_ON(memorder != rte_memory_order_acquire &&	\
> +		memorder != rte_memory_order_relaxed);			\
> +	RTE_ASSERT(((size_t)(addr) & (sizeof(*(addr)) - 1)) == 0);	\
> +	const uint32_t size = sizeof(*(addr)) << 3;			\
> +	typeof(*(addr)) expected_value = (expected);			\
> +	typeof(*(addr)) value;						\
> +	__RTE_RISCV_LR((addr), value, memorder, size);			\
> +	while (!((value & (mask)) cond expected_value)) {		\
> +		__RTE_RISCV_WRS_NTO();					\
> +		__RTE_RISCV_LR((addr), value, memorder, size);		\
> +	}								\
> +} while (0)
> +
> +#endif /* RTE_RISCV_ZAWRS */
> +
>  #ifdef __cplusplus
>  }
>  #endif

Reviewed-by: Stanislaw Kardach

> --
> 2.39.2
>