From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id D1E9645ACD; Mon, 7 Oct 2024 09:48:44 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 988B140E0B; Mon, 7 Oct 2024 09:48:44 +0200 (CEST) Received: from mail-ed1-f45.google.com (mail-ed1-f45.google.com [209.85.208.45]) by mails.dpdk.org (Postfix) with ESMTP id 0B3A640151 for ; Mon, 7 Oct 2024 09:48:43 +0200 (CEST) Received: by mail-ed1-f45.google.com with SMTP id 4fb4d7f45d1cf-5c883459b19so4905990a12.2 for ; Mon, 07 Oct 2024 00:48:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1728287322; x=1728892122; darn=dpdk.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=FUmmoR1c5+RnynB+T0PS9fiYo6o43decDYFdiT6GbN8=; b=lJVrH5ASX7/ieaUvGxpPiun7x62kk99pU4O1EfYJcDYCVK94Z9Likn9wzisI3vU3Ko WB3YpETqqN6ar0nM0kw7/ty7//ohA7LBaQ0GqOX8zX3C7HmWGK3yotJTS+JZjw8QhyBP ECEGgwg+KXqybLBEQP03QcJ/6629fG0Mm+F/9KO8HcdxICKSBh9jpaEmLD3p8t+/LAYE FjaU7NrhlpQLjiQViotyJdDfPxR8i568kOt6wjl6XDC1j060Q1UFBhMRfhORxJymsU1f piFhVkFRykdKgGaWYLhan/iHyOIhMcZRA0jQSDeLqqqQ+cpQj6e0s9yMfWde+XmyngFZ 1hKQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1728287322; x=1728892122; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=FUmmoR1c5+RnynB+T0PS9fiYo6o43decDYFdiT6GbN8=; b=EQY9rcSqJDR7E4hbjM/zRNm/QYGSq9VWc631xVL5necbcIBHuhcuqFHjG593d9bg6Z ATQniWFKP/pmNQ0C+O0lDwddIJ8L1vv0AjSrf9SVupekUOsHW1cuuIAxtKpEvqXZ/Ag7 ZHYCToHcA1k1lF5LGELnYJQCSuApcewIEO7ph4VxCkZ7XLLOJg306uY8bn66141MRVBY VUFujQzUzg6XjBZGkcDO/zsD3xcGbLQeb5fVaC9ZgdlABiy5+Y3QLYUr2wo9E1VHdCae r9h73T79uojzeL7UsxudGzIMzsxKh/Iz7kX8lbxBiK5GQ+xQXFkRAAprCqyfcy1FIKCQ IDzA== X-Forwarded-Encrypted: i=1; AJvYcCUsA8VF4CUfGJCal5QmU8jTYKExwSL0RdaivQgJOxElYyFYDnK6x9SE3W62LRdyinZayRw=@dpdk.org X-Gm-Message-State: AOJu0YwXyLLe748oEHVpvbthf1Hf6qwyJJ8it9HVo+6SvDXxPUfnQg+a TrBV1da1VXcGll6t/+fOFTBFUE5ASXxFAXxamJPT9wDz58F2LjonlrDkZFdHdd3FS2eiOpnvnRH XFsm74Mo43MLUsXoWU9vRAHGE1V4= X-Google-Smtp-Source: AGHT+IGk3gijzDmMk52RvmepHmmBiqaDjkAMfWqTeRGl5llmGIuBTDZujB9Fo3cn+a+O5nT+fbZZtQLyhLD8lWeh4IA= X-Received: by 2002:a05:6402:510c:b0:5c8:8416:cda7 with SMTP id 4fb4d7f45d1cf-5c8d2e28ba0mr9307761a12.15.1728287322427; Mon, 07 Oct 2024 00:48:42 -0700 (PDT) MIME-Version: 1.0 References: <20240530171948.19763-1-daniel.gregory@bytedance.com> <20240530171948.19763-3-daniel.gregory@bytedance.com> In-Reply-To: <20240530171948.19763-3-daniel.gregory@bytedance.com> From: =?UTF-8?Q?Stanis=C5=82aw_Kardach?= Date: Mon, 7 Oct 2024 09:48:31 +0200 Message-ID: Subject: Re: [PATCH 2/2] eal/riscv: add support for zicbop extension To: Daniel Gregory Cc: Bruce Richardson , Tyler Retzlaff , dev@dpdk.org, Liang Ma , Punit Agrawal Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org On Thu, May 30, 2024 at 7:20=E2=80=AFPM Daniel Gregory wrote: > > The zicbop extension adds instructions for prefetching data into cache. > Use them to implement RISCV-specific versions of the rte_prefetch* and > rte_prefetch*_write functions. > > - prefetch.r indicates to hardware that the cache block will be accessed > by a data read soon > - prefetch.w indicates to hardware that the cache block will be accessed > by a data write soon > > These instructions are emitted by __builtin_prefetch on modern versions > of Clang (17.0.1+) and GCC (13.1.0+). For earlier versions, we may not > have support for assembling Zicbop instructions, so emit the word > that encodes a 'prefetch.[rw] 0(a0)' instruction. Is there a benefit of adding this flag instead of relying on compiler implementation of __builtin_prefetch()? As in do you have some requirements for older compiler support? If just using __builtin_prefetch could simplify this code a lot and push the detection of zicbop to the compiler implementation. The runtime detection is another issue and would require runtime patching to keep the code fast. > > This new functionality is controlled by a Meson flag that is disabled by > default. Whilst it's a hint, like rte_pause(), and so has no effect if > the target doesn't support the extension, it requires the address > prefetched to be loaded into a0, which may be costly. > > Signed-off-by: Daniel Gregory > Suggested-by: Punit Agrawal > --- > config/riscv/meson.build | 6 +++ > lib/eal/riscv/include/rte_prefetch.h | 57 ++++++++++++++++++++++++++-- > 2 files changed, 59 insertions(+), 4 deletions(-) > > diff --git a/config/riscv/meson.build b/config/riscv/meson.build > index 07d7d9da23..ecf9da1c39 100644 > --- a/config/riscv/meson.build > +++ b/config/riscv/meson.build > @@ -26,6 +26,12 @@ flags_common =3D [ > # read from /proc/device-tree/cpus/timebase-frequency. This property= is > # guaranteed on Linux, as riscv time_init() requires it. > ['RTE_RISCV_TIME_FREQ', 0], > + > + # When true override the default implementation of the prefetching f= unctions > + # (rte_prefetch*) with a version that explicitly uses the Zicbop ext= ension. > + # Do not enable when using modern versions of GCC (13.1.0+) or Clang > + # (17.0.1+). They will emit these instructions in the default implem= entation > + ['RTE_RISCV_ZICBOP', false], > ] > > ## SoC-specific options. > diff --git a/lib/eal/riscv/include/rte_prefetch.h b/lib/eal/riscv/include= /rte_prefetch.h > index 748cf1b626..82cad526b3 100644 > --- a/lib/eal/riscv/include/rte_prefetch.h > +++ b/lib/eal/riscv/include/rte_prefetch.h > @@ -14,21 +14,42 @@ extern "C" { > > #include > #include > + > +#ifdef RTE_RISCV_ZICBOP > +#define RTE_PREFETCH_WRITE_ARCH_DEFINED > +#endif > + > #include "generic/rte_prefetch.h" > > +/* > + * Modern versions of GCC & Clang will emit prefetch instructions for > + * __builtin_prefetch when the Zicbop extension is present. > + * The RTE_RISCV_ZICBOP option controls whether we emit them manually fo= r older > + * compilers that may not have the support to assemble them. > + */ > static inline void rte_prefetch0(const volatile void *p) > { > - RTE_SET_USED(p); > +#ifndef RTE_RISCV_ZICBOP > + /* by default __builtin_prefetch prepares for a read */ > + __builtin_prefetch((const void *)p); > +#else > + /* prefetch.r 0(a0) */ > + register const volatile void *a0 asm("a0") =3D p; > + asm volatile (".int 0x00156013" : : "r" (a0)); > +#endif > } > > +/* > + * The RISC-V Zicbop extension doesn't have instructions to prefetch to = only a > + * subset of cache levels, so fallback to rte_prefetch0 > + */ > static inline void rte_prefetch1(const volatile void *p) > { > - RTE_SET_USED(p); > + rte_prefetch0(p); > } > - > static inline void rte_prefetch2(const volatile void *p) > { > - RTE_SET_USED(p); > + rte_prefetch0(p); > } > > static inline void rte_prefetch_non_temporal(const volatile void *p) > @@ -44,6 +65,34 @@ rte_cldemote(const volatile void *p) > RTE_SET_USED(p); > } > > +#ifdef RTE_RISCV_ZICBOP > +__rte_experimental > +static inline void > +rte_prefetch0_write(const void *p) > +{ > + /* prefetch.w 0(a0) */ > + register const void *a0 asm("a0") =3D p; > + asm volatile (".int 0x00356013" : : "r" (a0)); > +} > + > +/* > + * The RISC-V Zicbop extension doesn't have instructions to prefetch to = only a > + * subset of cache levels, so fallback to rte_prefetch0_write > + */ > +__rte_experimental > +static inline void > +rte_prefetch1_write(const void *p) > +{ > + rte_prefetch0_write(p); > +} > +__rte_experimental > +static inline void > +rte_prefetch2_write(const void *p) > +{ > + rte_prefetch0_write(p); > +} > +#endif /* RTE_RISCV_ZICBOP */ > + > #ifdef __cplusplus > } > #endif > -- > 2.39.2 >