This commit adds a new rte_prefetch0_write() variant, suggests to the compiler to use a prefetch instruction with intention to write. As a compiler builtin, the compiler can choose based on compilation target what the best implementation for this instruction is. Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com> --- The integer constants passed to the builtin are not available as a #define value, and doing #defines just for this write variant does not seems a nice solution to me... particularly for those using IDEs where any #define value is auto-hinted for code-completion. --- lib/librte_eal/include/generic/rte_prefetch.h | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/lib/librte_eal/include/generic/rte_prefetch.h b/lib/librte_eal/include/generic/rte_prefetch.h index 6e47bdfbad..44e2e9abfc 100644 --- a/lib/librte_eal/include/generic/rte_prefetch.h +++ b/lib/librte_eal/include/generic/rte_prefetch.h @@ -51,4 +51,20 @@ static inline void rte_prefetch2(const volatile void *p); */ static inline void rte_prefetch_non_temporal(const volatile void *p); +/** + * Prefetch a cache line into all cache levels, with intention to write. This + * prefetch variant hints to the CPU that the program is expecting to write to + * the cache line being prefetched. + * + * @param p Address to prefetch + */ +static inline void rte_prefetch0_write(const void *p) +{ + /* 1 indicates intention to write, 3 sets target cache level to L1. See + * GCC docs where these integer constants are described in more detail: + * https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html + */ + __builtin_prefetch(p, 1, 3); +} + #endif /* _RTE_PREFETCH_H_ */ -- 2.17.1
>This commit adds a new rte_prefetch0_write() variant, suggests to the >compiler to use a prefetch instruction with intention to write. As a >compiler builtin, the compiler can choose based on compilation target >what the best implementation for this instruction is. Why not have the other variants too i.e. l2/l3/temporal store prefetches too? > >Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com> > >--- > >The integer constants passed to the builtin are not available as >a #define value, and doing #defines just for this write variant >does not seems a nice solution to me... particularly for those using >IDEs where any #define value is auto-hinted for code-completion. > >--- > lib/librte_eal/include/generic/rte_prefetch.h | 16 ++++++++++++++++ > 1 file changed, 16 insertions(+) > >diff --git a/lib/librte_eal/include/generic/rte_prefetch.h >b/lib/librte_eal/include/generic/rte_prefetch.h >index 6e47bdfbad..44e2e9abfc 100644 >--- a/lib/librte_eal/include/generic/rte_prefetch.h >+++ b/lib/librte_eal/include/generic/rte_prefetch.h >@@ -51,4 +51,20 @@ static inline void rte_prefetch2(const volatile >void *p); > */ > static inline void rte_prefetch_non_temporal(const volatile void *p); > >+/** >+ * Prefetch a cache line into all cache levels, with intention to write. >This >+ * prefetch variant hints to the CPU that the program is expecting to >write to >+ * the cache line being prefetched. >+ * >+ * @param p Address to prefetch >+ */ >+static inline void rte_prefetch0_write(const void *p) >+{ >+ /* 1 indicates intention to write, 3 sets target cache level to L1. >See >+ * GCC docs where these integer constants are described in >more detail: >+ * https://urldefense.proofpoint.com/v2/url?u=https- >3A__gcc.gnu.org_onlinedocs_gcc_Other- >2DBuiltins.html&d=DwIBAg&c=nKjWec2b6R0mOyPaz7xtfQ&r=1cjuAHrG >h745jHNmj2fD85sUMIJ2IPIDsIJzo6FN6Z0&m=OMPc_t21vsWlzVl0UjdGB >hTTv1Gngqfk8vth9UQtUSA&s=51jgfiV2UC5B3xxBE- >4gRAte3lCZP3v370U7Fpn6LaE&e= >+ */ >+ __builtin_prefetch(p, 1, 3); >+} >+ > #endif /* _RTE_PREFETCH_H_ */ >-- >2.17.1
> -----Original Message----- > From: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com> > Sent: Sunday, September 13, 2020 9:11 PM > To: Van Haaren, Harry <harry.van.haaren@intel.com>; dev@dpdk.org > Subject: RE: [dpdk-dev] [PATCH] eal: add new prefetch0_write variant > > >This commit adds a new rte_prefetch0_write() variant, suggests to the > >compiler to use a prefetch instruction with intention to write. As a > >compiler builtin, the compiler can choose based on compilation target > >what the best implementation for this instruction is. > > Why not have the other variants too i.e. l2/l3/temporal store prefetches too? Hi Pavan, Are there architectures that actually implement those? Usually for a WB mem store to complete, the data must be present in L1 cache (on x86 at least), and that's what the patch below with write0 achieves. I'm against adding all the variants "just in case", it leads to API bloat, and increases cognitive load on the programmer. My expectation is that in 99% of usage the prefetch write instruction should target L1. Cheers, -Harry > >Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com> > > > >--- > > > >The integer constants passed to the builtin are not available as > >a #define value, and doing #defines just for this write variant > >does not seems a nice solution to me... particularly for those using > >IDEs where any #define value is auto-hinted for code-completion. > > > >--- > > lib/librte_eal/include/generic/rte_prefetch.h | 16 ++++++++++++++++ > > 1 file changed, 16 insertions(+) <snip patch contents>
>> >This commit adds a new rte_prefetch0_write() variant, suggests to >the >> >compiler to use a prefetch instruction with intention to write. As a >> >compiler builtin, the compiler can choose based on compilation >target >> >what the best implementation for this instruction is. >> >> Why not have the other variants too i.e. l2/l3/temporal store >prefetches too? > >Hi Pavan, > Hi Harry, (LTNS) >Are there architectures that actually implement those? Usually for a WB >mem store to complete, >the data must be present in L1 cache (on x86 at least), and that's what >the patch below with write0 achieves. ARM64 does supports all modes of store prefetch " <type> is one of: PLD Prefetch for load, encoded in the "Rt<4:3>" field as 0b00. PLI Preload instructions, encoded in the "Rt<4:3>" field as 0b01. PST Prefetch for store, encoded in the "Rt<4:3>" field as 0b10. <target> is one of: L1 Level 1 cache, encoded in the "Rt<2:1>" field as 0b00. L2 Level 2 cache, encoded in the "Rt<2:1>" field as 0b01. L3 Level 3 cache, encoded in the "Rt<2:1>" field as 0b10. <policy> is one of: KEEP Retained or temporal prefetch, allocated in the cache normally. Encoded in the "Rt<0>" field as 0. STRM Streaming or non-temporal prefetch, for data that is used only once. Encoded in the "Rt<0>" field as 1. For more information on these prefetch " > >I'm against adding all the variants "just in case", it leads to API bloat, >and increases >cognitive load on the programmer. My expectation is that in 99% of >usage the prefetch >write instruction should target L1. > There is a use case when cache mode is write through and application is pipelining work across cores sharing same L2 cluster. >Cheers, -Harry Regards, Pavan. > >> >Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com> >> > >> >--- >> > >> >The integer constants passed to the builtin are not available as >> >a #define value, and doing #defines just for this write variant >> >does not seems a nice solution to me... particularly for those using >> >IDEs where any #define value is auto-hinted for code-completion. >> > >> >--- >> > lib/librte_eal/include/generic/rte_prefetch.h | 16 >++++++++++++++++ >> > 1 file changed, 16 insertions(+) > ><snip patch contents>
> -----Original Message----- > From: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com> > Sent: Monday, September 14, 2020 11:39 AM > To: Van Haaren, Harry <harry.van.haaren@intel.com>; dev@dpdk.org > Subject: RE: [dpdk-dev] [PATCH] eal: add new prefetch0_write variant > > >> >This commit adds a new rte_prefetch0_write() variant, suggests to > >the > >> >compiler to use a prefetch instruction with intention to write. As a > >> >compiler builtin, the compiler can choose based on compilation > >target > >> >what the best implementation for this instruction is. > >> > >> Why not have the other variants too i.e. l2/l3/temporal store > >prefetches too? > > > >Hi Pavan, > > > Hi Harry, > (LTNS) > > >Are there architectures that actually implement those? Usually for a WB > >mem store to complete, > >the data must be present in L1 cache (on x86 at least), and that's what > >the patch below with write0 achieves. > > ARM64 does supports all modes of store prefetch > " > <type> is one of: > PLD Prefetch for load, encoded in the "Rt<4:3>" field as 0b00. > PLI Preload instructions, encoded in the "Rt<4:3>" field as 0b01. > PST Prefetch for store, encoded in the "Rt<4:3>" field as 0b10. > <target> is one of: > L1 Level 1 cache, encoded in the "Rt<2:1>" field as 0b00. > L2 Level 2 cache, encoded in the "Rt<2:1>" field as 0b01. > L3 Level 3 cache, encoded in the "Rt<2:1>" field as 0b10. > <policy> is one of: > KEEP Retained or temporal prefetch, allocated in the cache normally. Encoded in > the "Rt<0>" > field as 0. > STRM Streaming or non-temporal prefetch, for data that is used only once. Encoded > in the > "Rt<0>" field as 1. > For more information on these prefetch > " > > > > >I'm against adding all the variants "just in case", it leads to API bloat, > >and increases > >cognitive load on the programmer. My expectation is that in 99% of > >usage the prefetch > >write instruction should target L1. > > > > There is a use case when cache mode is write through and application is > pipelining work across cores sharing same L2 cluster. OK - v2 sent: http://patches.dpdk.org/patch/77632/ APIs matching the existing prefetch APIs: rte_prefetch0_write() L1 and all below rte_prefetch1_write() L2 and all below rte_prefetch2_write() L3 Cheers, -Harry
This commit adds a new rte_prefetch0_write() variants, suggesting to the compiler to use a prefetch instruction with intention to write. As a compiler builtin, the compiler can choose based on compilation target what the best implementation for this instruction is. Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com> --- v2: - Add L1, L2, and L3 variants as ARM64 uarch supports them (Pavan) The integer constants passed to the builtin are not available as a #define value, and doing #defines just for this write variant does not seems a nice solution to me... particularly for those using IDEs where any #define value is auto-hinted for code-completion. --- lib/librte_eal/include/generic/rte_prefetch.h | 49 +++++++++++++++++++ 1 file changed, 49 insertions(+) diff --git a/lib/librte_eal/include/generic/rte_prefetch.h b/lib/librte_eal/include/generic/rte_prefetch.h index 6e47bdfbad..3dfca77a74 100644 --- a/lib/librte_eal/include/generic/rte_prefetch.h +++ b/lib/librte_eal/include/generic/rte_prefetch.h @@ -51,4 +51,53 @@ static inline void rte_prefetch2(const volatile void *p); */ static inline void rte_prefetch_non_temporal(const volatile void *p); +/** + * Prefetch a cache line into all cache levels, with intention to write. This + * prefetch variant hints to the CPU that the program is expecting to write to + * the cache line being prefetched. + * + * @param p Address to prefetch + */ +static inline void rte_prefetch0_write(const void *p) +{ + /* 1 indicates intention to write, 3 sets target cache level to L1. See + * GCC docs where these integer constants are described in more detail: + * https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html + */ + __builtin_prefetch(p, 1, 3); +} + +/** + * Prefetch a cache line into all cache levels, except the 0th, with intention + * to write. This prefetch variant hints to the CPU that the program is + * expecting to write to the cache line being prefetched. + * + * @param p Address to prefetch + */ +static inline void rte_prefetch1_write(const void *p) +{ + /* 1 indicates intention to write, 2 sets target cache level to L2. See + * GCC docs where these integer constants are described in more detail: + * https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html + */ + __builtin_prefetch(p, 1, 2); +} + +/** + * Prefetch a cache line into all cache levels, except the 0th and 1st, with + * intention to write. This prefetch variant hints to the CPU that the program + * is expecting to write to the cache line being prefetched. + * + * @param p Address to prefetch + */ +static inline void rte_prefetch2_write(const void *p) +{ + /* 1 indicates intention to write, 1 sets target cache level to L3. See + * GCC docs where these integer constants are described in more detail: + * https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html + */ + __builtin_prefetch(p, 1, 1); +} + + #endif /* _RTE_PREFETCH_H_ */ -- 2.17.1
On Mon, Sep 14, 2020 at 5:09 PM Harry van Haaren
<harry.van.haaren@intel.com> wrote:
>
> This commit adds a new rte_prefetch0_write() variants, suggesting to the
> compiler to use a prefetch instruction with intention to write. As a
> compiler builtin, the compiler can choose based on compilation target
> what the best implementation for this instruction is.
>
> Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
No review, copying arch maintainers.
--
David Marchand
> -----Original Message----- > From: David Marchand <david.marchand@redhat.com> > Sent: Thursday, October 8, 2020 8:43 AM > To: Van Haaren, Harry <harry.van.haaren@intel.com>; Richardson, Bruce > <bruce.richardson@intel.com>; Ananyev, Konstantin > <konstantin.ananyev@intel.com>; Jerin Jacob Kollanukkaran <jerinj@marvell.com>; > Ruifeng Wang (Arm Technology China) <ruifeng.wang@arm.com>; Jan Viktorin > <viktorin@rehivetech.com>; David Christensen <drc@linux.vnet.ibm.com>; > Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com> > Cc: dev <dev@dpdk.org>; Pavan Nikhilesh <pbhagavatula@marvell.com> > Subject: Re: [dpdk-dev] [PATCH v2] eal: add new prefetch write variants > > On Mon, Sep 14, 2020 at 5:09 PM Harry van Haaren > <harry.van.haaren@intel.com> wrote: > > > > This commit adds a new rte_prefetch0_write() variants, suggesting to the > > compiler to use a prefetch instruction with intention to write. As a > > compiler builtin, the compiler can choose based on compilation target > > what the best implementation for this instruction is. > > > > Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com> > > No review, copying arch maintainers. > > -- > David Marchand Hey All, Actually Pavan provided some feedback to add prefetch1 and prefetch2 variants: https://mails.dpdk.org/archives/dev/2020-September/thread.html#180501 I can push a v3 with those changes later today, will CC folks here. Thanks, -Harry
> -----Original Message----- > From: Van Haaren, Harry > Sent: Thursday, October 8, 2020 9:34 AM > To: David Marchand <david.marchand@redhat.com>; Richardson, Bruce > <bruce.richardson@intel.com>; Ananyev, Konstantin > <konstantin.ananyev@intel.com>; Jerin Jacob Kollanukkaran <jerinj@marvell.com>; > Ruifeng Wang (Arm Technology China) <ruifeng.wang@arm.com>; Jan Viktorin > <viktorin@rehivetech.com>; David Christensen <drc@linux.vnet.ibm.com>; > Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com> > Cc: dev <dev@dpdk.org>; Pavan Nikhilesh <pbhagavatula@marvell.com> > Subject: RE: [dpdk-dev] [PATCH v2] eal: add new prefetch write variants > > > -----Original Message----- > > From: David Marchand <david.marchand@redhat.com> > > Sent: Thursday, October 8, 2020 8:43 AM > > To: Van Haaren, Harry <harry.van.haaren@intel.com>; Richardson, Bruce > > <bruce.richardson@intel.com>; Ananyev, Konstantin > > <konstantin.ananyev@intel.com>; Jerin Jacob Kollanukkaran > <jerinj@marvell.com>; > > Ruifeng Wang (Arm Technology China) <ruifeng.wang@arm.com>; Jan Viktorin > > <viktorin@rehivetech.com>; David Christensen <drc@linux.vnet.ibm.com>; > > Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com> > > Cc: dev <dev@dpdk.org>; Pavan Nikhilesh <pbhagavatula@marvell.com> > > Subject: Re: [dpdk-dev] [PATCH v2] eal: add new prefetch write variants > > > > On Mon, Sep 14, 2020 at 5:09 PM Harry van Haaren > > <harry.van.haaren@intel.com> wrote: > > > > > > This commit adds a new rte_prefetch0_write() variants, suggesting to the > > > compiler to use a prefetch instruction with intention to write. As a > > > compiler builtin, the compiler can choose based on compilation target > > > what the best implementation for this instruction is. > > > > > > Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com> > > > > No review, copying arch maintainers. > > > > -- > > David Marchand > > Hey All, > > Actually Pavan provided some feedback to add prefetch1 and prefetch2 variants: > https://mails.dpdk.org/archives/dev/2020-September/thread.html#180501 > > I can push a v3 with those changes later today, will CC folks here. > > Thanks, -Harry ... apologies - David you're right - the v2 on list already has those changes: http://patches.dpdk.org/patch/77632/ reviews welcome!
On Mon, Sep 14, 2020 at 8:39 PM Harry van Haaren <harry.van.haaren@intel.com> wrote: > > This commit adds a new rte_prefetch0_write() variants, suggesting to the > compiler to use a prefetch instruction with intention to write. As a > compiler builtin, the compiler can choose based on compilation target > what the best implementation for this instruction is. > > Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com> > > --- > > v2: > - Add L1, L2, and L3 variants as ARM64 uarch supports them (Pavan) > > The integer constants passed to the builtin are not available as > a #define value, and doing #defines just for this write variant > does not seems a nice solution to me... particularly for those using > IDEs where any #define value is auto-hinted for code-completion. > --- > lib/librte_eal/include/generic/rte_prefetch.h | 49 +++++++++++++++++++ > 1 file changed, 49 insertions(+) > > diff --git a/lib/librte_eal/include/generic/rte_prefetch.h b/lib/librte_eal/include/generic/rte_prefetch.h > index 6e47bdfbad..3dfca77a74 100644 > --- a/lib/librte_eal/include/generic/rte_prefetch.h > +++ b/lib/librte_eal/include/generic/rte_prefetch.h > @@ -51,4 +51,53 @@ static inline void rte_prefetch2(const volatile void *p); > */ > static inline void rte_prefetch_non_temporal(const volatile void *p); > > +/** > + * Prefetch a cache line into all cache levels, with intention to write. This > + * prefetch variant hints to the CPU that the program is expecting to write to > + * the cache line being prefetched. > + * > + * @param p Address to prefetch > + */ > +static inline void rte_prefetch0_write(const void *p) Typically DPDK coding standards is to have static inline void rte_prefetch0_write(const void *p) vs static inline void rte_prefetch0_write(const void *p) Either way: Reviewed-by: Jerin Jacob <jerinj@marvell.com> > +{ > + /* 1 indicates intention to write, 3 sets target cache level to L1. See > + * GCC docs where these integer constants are described in more detail: > + * https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html > + */ > + __builtin_prefetch(p, 1, 3); > +} > + > +/** > + * Prefetch a cache line into all cache levels, except the 0th, with intention > + * to write. This prefetch variant hints to the CPU that the program is > + * expecting to write to the cache line being prefetched. > + * > + * @param p Address to prefetch > + */ > +static inline void rte_prefetch1_write(const void *p) > +{ > + /* 1 indicates intention to write, 2 sets target cache level to L2. See > + * GCC docs where these integer constants are described in more detail: > + * https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html > + */ > + __builtin_prefetch(p, 1, 2); > +} > + > +/** > + * Prefetch a cache line into all cache levels, except the 0th and 1st, with > + * intention to write. This prefetch variant hints to the CPU that the program > + * is expecting to write to the cache line being prefetched. > + * > + * @param p Address to prefetch > + */ > +static inline void rte_prefetch2_write(const void *p) > +{ > + /* 1 indicates intention to write, 1 sets target cache level to L3. See > + * GCC docs where these integer constants are described in more detail: > + * https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html > + */ > + __builtin_prefetch(p, 1, 1); > +} > + > + > #endif /* _RTE_PREFETCH_H_ */ > -- > 2.17.1 >
> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Harry van Haaren
> Sent: Monday, September 14, 2020 11:10 PM
> To: dev@dpdk.org
> Cc: pbhagavatula@marvell.com; Harry van Haaren
> <harry.van.haaren@intel.com>
> Subject: [dpdk-dev] [PATCH v2] eal: add new prefetch write variants
>
> This commit adds a new rte_prefetch0_write() variants, suggesting to the
> compiler to use a prefetch instruction with intention to write. As a compiler
> builtin, the compiler can choose based on compilation target what the best
> implementation for this instruction is.
>
> Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
>
> ---
>
> v2:
> - Add L1, L2, and L3 variants as ARM64 uarch supports them (Pavan)
>
> The integer constants passed to the builtin are not available as a #define
> value, and doing #defines just for this write variant does not seems a nice
> solution to me... particularly for those using IDEs where any #define value is
> auto-hinted for code-completion.
> ---
> lib/librte_eal/include/generic/rte_prefetch.h | 49 +++++++++++++++++++
> 1 file changed, 49 insertions(+)
>
> diff --git a/lib/librte_eal/include/generic/rte_prefetch.h
> b/lib/librte_eal/include/generic/rte_prefetch.h
> index 6e47bdfbad..3dfca77a74 100644
> --- a/lib/librte_eal/include/generic/rte_prefetch.h
> +++ b/lib/librte_eal/include/generic/rte_prefetch.h
> @@ -51,4 +51,53 @@ static inline void rte_prefetch2(const volatile void *p);
> */
> static inline void rte_prefetch_non_temporal(const volatile void *p);
>
> +/**
> + * Prefetch a cache line into all cache levels, with intention to
> +write. This
> + * prefetch variant hints to the CPU that the program is expecting to
> +write to
> + * the cache line being prefetched.
> + *
> + * @param p Address to prefetch
> + */
> +static inline void rte_prefetch0_write(const void *p) {
> + /* 1 indicates intention to write, 3 sets target cache level to L1. See
> + * GCC docs where these integer constants are described in more
> detail:
> + * https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
> + */
> + __builtin_prefetch(p, 1, 3);
> +}
> +
> +/**
> + * Prefetch a cache line into all cache levels, except the 0th, with
> +intention
> + * to write. This prefetch variant hints to the CPU that the program is
> + * expecting to write to the cache line being prefetched.
> + *
> + * @param p Address to prefetch
> + */
> +static inline void rte_prefetch1_write(const void *p) {
> + /* 1 indicates intention to write, 2 sets target cache level to L2. See
> + * GCC docs where these integer constants are described in more
> detail:
> + * https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
> + */
> + __builtin_prefetch(p, 1, 2);
> +}
> +
> +/**
> + * Prefetch a cache line into all cache levels, except the 0th and 1st,
> +with
> + * intention to write. This prefetch variant hints to the CPU that the
> +program
> + * is expecting to write to the cache line being prefetched.
> + *
> + * @param p Address to prefetch
> + */
> +static inline void rte_prefetch2_write(const void *p) {
> + /* 1 indicates intention to write, 1 sets target cache level to L3. See
> + * GCC docs where these integer constants are described in more
> detail:
> + * https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
> + */
> + __builtin_prefetch(p, 1, 1);
> +}
> +
> +
> #endif /* _RTE_PREFETCH_H_ */
> --
> 2.17.1
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Hello Harry,
On Mon, Sep 14, 2020 at 5:09 PM Harry van Haaren
<harry.van.haaren@intel.com> wrote:
>
> This commit adds a new rte_prefetch0_write() variants, suggesting to the
> compiler to use a prefetch instruction with intention to write. As a
> compiler builtin, the compiler can choose based on compilation target
> what the best implementation for this instruction is.
>
> Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
There was a small comment from Jerin.
Those are new functions and should be marked as experimental.
Please add a call to those in test_prefetch.c.
Please update the release notes.
Thanks.
--
David Marchand
> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Thursday, October 15, 2020 9:18 AM
> To: Van Haaren, Harry <harry.van.haaren@intel.com>
> Cc: dev <dev@dpdk.org>; Pavan Nikhilesh <pbhagavatula@marvell.com>; Jerin
> Jacob Kollanukkaran <jerinj@marvell.com>; Ruifeng Wang (Arm Technology
> China) <ruifeng.wang@arm.com>
> Subject: Re: [dpdk-dev] [PATCH v2] eal: add new prefetch write variants
>
> Hello Harry,
>
> On Mon, Sep 14, 2020 at 5:09 PM Harry van Haaren
> <harry.van.haaren@intel.com> wrote:
> >
> > This commit adds a new rte_prefetch0_write() variants, suggesting to the
> > compiler to use a prefetch instruction with intention to write. As a
> > compiler builtin, the compiler can choose based on compilation target
> > what the best implementation for this instruction is.
> >
> > Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
>
> There was a small comment from Jerin.
>
> Those are new functions and should be marked as experimental.
> Please add a call to those in test_prefetch.c.
> Please update the release notes.
Will do - thanks!
This commit adds new rte_prefetchX_write() variants, that suggest to the compiler to use a prefetch instruction with intention to write. As a compiler builtin, the compiler can choose based on compilation target what the best implementation for this instruction is. Three versions are provided, targeting the different levels of cache. Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com> Reviewed-by: Jerin Jacob <jerinj@marvell.com> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com> --- v3: - Add reviewed by tags from Jerin and Ruifeng, thanks! - Add __rte_experimental as they are new functions (David) This required adding the rte_compat.h include. - Rework return type to new line (Jerin) - Add calls in test_prefetch.c to new functions (David) - Add item to release notes (David) v2: - Add L1, L2, and L3 variants as ARM64 uarch supports them (Pavan) The integer constants passed to the builtin are not available as a #define value, and doing #defines just for this write variant does not seems a nice solution to me... particularly for those using IDEs where any #define value is auto-hinted for code-completion. --- app/test/test_prefetch.c | 4 ++ doc/guides/rel_notes/release_20_11.rst | 6 ++ lib/librte_eal/include/generic/rte_prefetch.h | 57 +++++++++++++++++++ 3 files changed, 67 insertions(+) diff --git a/app/test/test_prefetch.c b/app/test/test_prefetch.c index 41f219af78..32e08f8afe 100644 --- a/app/test/test_prefetch.c +++ b/app/test/test_prefetch.c @@ -26,6 +26,10 @@ test_prefetch(void) rte_prefetch1(&a); rte_prefetch2(&a); + rte_prefetch0_write(&a); + rte_prefetch1_write(&a); + rte_prefetch2_write(&a); + return 0; } diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst index 0925123e9c..8b51ef0dbc 100644 --- a/doc/guides/rel_notes/release_20_11.rst +++ b/doc/guides/rel_notes/release_20_11.rst @@ -62,6 +62,12 @@ New Features The functions are provided as a generic stubs and x86 specific implementation. +* **Added prefetch with intention to write APIs.** + + Added new prefetch function variants e.g. ``rte_prefetch0_write``, + which allow the programmer to prefetch a cache line and also indicate + the intention to write. + * **Updated CRC modules of the net library.** * Added runtime selection of the optimal architecture-specific CRC path. diff --git a/lib/librte_eal/include/generic/rte_prefetch.h b/lib/librte_eal/include/generic/rte_prefetch.h index 6e47bdfbad..53d68c40f1 100644 --- a/lib/librte_eal/include/generic/rte_prefetch.h +++ b/lib/librte_eal/include/generic/rte_prefetch.h @@ -5,6 +5,8 @@ #ifndef _RTE_PREFETCH_H_ #define _RTE_PREFETCH_H_ +#include "rte_compat.h" + /** * @file * @@ -51,4 +53,59 @@ static inline void rte_prefetch2(const volatile void *p); */ static inline void rte_prefetch_non_temporal(const volatile void *p); +/** + * Prefetch a cache line into all cache levels, with intention to write. This + * prefetch variant hints to the CPU that the program is expecting to write to + * the cache line being prefetched. + * + * @param p Address to prefetch + */ +__rte_experimental +static inline void +rte_prefetch0_write(const void *p) +{ + /* 1 indicates intention to write, 3 sets target cache level to L1. See + * GCC docs where these integer constants are described in more detail: + * https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html + */ + __builtin_prefetch(p, 1, 3); +} + +/** + * Prefetch a cache line into all cache levels, except the 0th, with intention + * to write. This prefetch variant hints to the CPU that the program is + * expecting to write to the cache line being prefetched. + * + * @param p Address to prefetch + */ +__rte_experimental +static inline void +rte_prefetch1_write(const void *p) +{ + /* 1 indicates intention to write, 2 sets target cache level to L2. See + * GCC docs where these integer constants are described in more detail: + * https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html + */ + __builtin_prefetch(p, 1, 2); +} + +/** + * Prefetch a cache line into all cache levels, except the 0th and 1st, with + * intention to write. This prefetch variant hints to the CPU that the program + * is expecting to write to the cache line being prefetched. + * + * @param p Address to prefetch + */ +__rte_experimental +static inline void +rte_prefetch2_write(const void *p) +{ + /* 1 indicates intention to write, 1 sets target cache level to L3. See + * GCC docs where these integer constants are described in more detail: + * https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html + */ + __builtin_prefetch(p, 1, 1); +} + + #endif /* _RTE_PREFETCH_H_ */ -- 2.17.1
On Thu, Oct 15, 2020 at 12:31 PM Harry van Haaren
<harry.van.haaren@intel.com> wrote:
>
> This commit adds new rte_prefetchX_write() variants, that suggest to the
> compiler to use a prefetch instruction with intention to write. As a
> compiler builtin, the compiler can choose based on compilation target
> what the best implementation for this instruction is.
>
> Three versions are provided, targeting the different levels of cache.
>
> Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
> Reviewed-by: Jerin Jacob <jerinj@marvell.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Added experimental banners in doxygen.
Applied, thanks.
--
David Marchand