* [PATCH] eal: add cache guard to per-lcore PRNG state
@ 2023-09-04  9:26 Morten Brørup
  2023-09-04 11:57 ` Mattias Rönnblom
  0 siblings, 1 reply; 8+ messages in thread

From: Morten Brørup @ 2023-09-04  9:26 UTC (permalink / raw)
To: thomas, david.marchand, mattias.ronnblom, bruce.richardson
Cc: olivier.matz, andrew.rybchenko, honnappa.nagarahalli,
    konstantin.v.ananyev, dev, Morten Brørup

The per-lcore random states are frequently updated by their individual
lcores, so add a cache guard to prevent CPU cache thrashing.

Depends-on: series-29415 ("clarify purpose of empty cache lines")

Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
---
 lib/eal/common/rte_random.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/eal/common/rte_random.c b/lib/eal/common/rte_random.c
index 565f2401ce..3df0c7004a 100644
--- a/lib/eal/common/rte_random.c
+++ b/lib/eal/common/rte_random.c
@@ -18,6 +18,7 @@ struct rte_rand_state {
 	uint64_t z3;
 	uint64_t z4;
 	uint64_t z5;
+	RTE_CACHE_GUARD;
 } __rte_cache_aligned;
 
 /* One instance each for every lcore id-equipped thread, and one
-- 
2.17.1
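For context on what the one-line change buys: each rte_rand_state is
already cache-line aligned, so two lcores never write to the same line.
A hardware prefetcher that speculatively pulls in the next N lines can,
however, still drag a neighboring lcore's state into the local cache and
create false sharing between adjacent array elements. The guard pads
each element by one or more whole cache lines so neighbors stay out of
prefetch reach. A simplified sketch of the idea follows; it is
illustrative only, not the actual DPDK macro (the real RTE_CACHE_GUARD
in rte_common.h generates uniquely named padding members, and the
constants below are assumptions standing in for the DPDK ones named in
the comments):

#include <stdint.h>

#define CACHE_LINE_SIZE   64   /* stand-in for RTE_CACHE_LINE_SIZE */
#define CACHE_GUARD_LINES 1    /* stand-in for RTE_CACHE_GUARD_LINES */
#define MAX_LCORE         128  /* stand-in for RTE_MAX_LCORE */

struct rand_state_sketch {
	uint64_t z1, z2, z3, z4, z5;
	/* Trailing guard: pushes the next array element at least one
	 * extra cache line away, out of reach of a next-N-lines
	 * hardware prefetcher running on another core.
	 */
	unsigned char cache_guard[CACHE_LINE_SIZE * CACHE_GUARD_LINES];
} __attribute__((aligned(CACHE_LINE_SIZE)));

/* One guarded instance per lcore id, plus one shared by all
 * unregistered non-EAL threads, mirroring the original layout.
 */
static struct rand_state_sketch states[MAX_LCORE + 1];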
* Re: [PATCH] eal: add cache guard to per-lcore PRNG state
  2023-09-04  9:26 [PATCH] eal: add cache guard to per-lcore PRNG state Morten Brørup
@ 2023-09-04 11:57 ` Mattias Rönnblom
  2023-09-06 16:25   ` Stephen Hemminger
  2023-09-29 18:55   ` Morten Brørup
  0 siblings, 2 replies; 8+ messages in thread

From: Mattias Rönnblom @ 2023-09-04 11:57 UTC (permalink / raw)
To: Morten Brørup, thomas, david.marchand, mattias.ronnblom,
    bruce.richardson
Cc: olivier.matz, andrew.rybchenko, honnappa.nagarahalli,
    konstantin.v.ananyev, dev

On 2023-09-04 11:26, Morten Brørup wrote:
> The per-lcore random states are frequently updated by their individual
> lcores, so add a cache guard to prevent CPU cache thrashing.
>

"to prevent false sharing in case the CPU employs a next-N-lines (or
similar) hardware prefetcher"

In my world, cache thrashing and cache line contention are two
different things.

Other than that,
Acked-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>

> Depends-on: series-29415 ("clarify purpose of empty cache lines")
>
> Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
> ---
>  lib/eal/common/rte_random.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/lib/eal/common/rte_random.c b/lib/eal/common/rte_random.c
> index 565f2401ce..3df0c7004a 100644
> --- a/lib/eal/common/rte_random.c
> +++ b/lib/eal/common/rte_random.c
> @@ -18,6 +18,7 @@ struct rte_rand_state {
>  	uint64_t z3;
>  	uint64_t z4;
>  	uint64_t z5;
> +	RTE_CACHE_GUARD;
>  } __rte_cache_aligned;
>  
>  /* One instance each for every lcore id-equipped thread, and one
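The terminology point is worth unpacking: thrashing is about capacity
and conflict evictions, while the problem the guard addresses is false
sharing, i.e. two cores bouncing ownership of a line (or of
prefetcher-coupled adjacent lines) that each core only partly uses.
Plain same-line false sharing is easy to reproduce with a standalone
program. The sketch below is an illustration, not DPDK code; it assumes
a 64-byte cache line and that the scheduler places the two threads on
different cores, and the magnitude of the effect varies by CPU:

/* Build: cc -O2 -pthread false_sharing.c; each run takes a few seconds. */
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

#define NITER 200000000UL

struct pair_shared {		/* a and b share one cache line */
	volatile uint64_t a, b;
} __attribute__((aligned(64)));

struct pair_padded {		/* a and b land on separate cache lines */
	volatile uint64_t a;
	char pad[64 - sizeof(uint64_t)];
	volatile uint64_t b;
} __attribute__((aligned(64)));

static void *bump(void *arg)
{
	volatile uint64_t *c = arg;

	for (uint64_t i = 0; i < NITER; i++)
		(*c)++;
	return NULL;
}

/* Run two threads, each hammering its own counter; return elapsed seconds. */
static double run(volatile uint64_t *x, volatile uint64_t *y)
{
	struct timespec t0, t1;
	pthread_t ta, tb;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	pthread_create(&ta, NULL, bump, (void *)x);
	pthread_create(&tb, NULL, bump, (void *)y);
	pthread_join(ta, NULL);
	pthread_join(tb, NULL);
	clock_gettime(CLOCK_MONOTONIC, &t1);
	return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}

int main(void)
{
	static struct pair_shared s;
	static struct pair_padded p;

	printf("same cache line:      %.2f s\n", run(&s.a, &s.b));
	printf("separate cache lines: %.2f s\n", run(&p.a, &p.b));
	return 0;
}

On typical hardware the same-line run is several times slower than the
padded run. The next-N-lines prefetcher variant discussed in this
thread is the same effect spread across adjacent lines, which is what a
guard of one or more full cache lines defeats.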
* Re: [PATCH] eal: add cache guard to per-lcore PRNG state
  2023-09-04 11:57 ` Mattias Rönnblom
@ 2023-09-06 16:25   ` Stephen Hemminger
  2023-10-11 16:07     ` Thomas Monjalon
  1 sibling, 1 reply; 8+ messages in thread

From: Stephen Hemminger @ 2023-09-06 16:25 UTC (permalink / raw)
To: Mattias Rönnblom
Cc: Morten Brørup, thomas, david.marchand, mattias.ronnblom,
    bruce.richardson, olivier.matz, andrew.rybchenko,
    honnappa.nagarahalli, konstantin.v.ananyev, dev

On Mon, 4 Sep 2023 13:57:19 +0200
Mattias Rönnblom <hofors@lysator.liu.se> wrote:

> On 2023-09-04 11:26, Morten Brørup wrote:
> > The per-lcore random states are frequently updated by their individual
> > lcores, so add a cache guard to prevent CPU cache thrashing.
> >
>
> "to prevent false sharing in case the CPU employs a next-N-lines (or
> similar) hardware prefetcher"
>
> In my world, cache thrashing and cache line contention are two
> different things.
>
> Other than that,
> Acked-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>

Could the per-lcore state be thread local?

Something like this:

From 3df5e28a7e5589d05e1eade62a0979e84697853d Mon Sep 17 00:00:00 2001
From: Stephen Hemminger <stephen@networkplumber.org>
Date: Wed, 6 Sep 2023 09:22:42 -0700
Subject: [PATCH] random: use per lcore state

Move the random number state into thread local storage.
This has several benefits.
- no false cache sharing from cpu prefetching
- fixes initialization of random state for non-DPDK threads
- fixes unsafe usage of random state by non-DPDK threads

The initialization of random number state is done by the
lcore (lazy initialization).

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 lib/eal/common/rte_random.c | 35 +++++++++++++++++------------------
 1 file changed, 17 insertions(+), 18 deletions(-)

diff --git a/lib/eal/common/rte_random.c b/lib/eal/common/rte_random.c
index 53636331a27b..62f36038ac52 100644
--- a/lib/eal/common/rte_random.c
+++ b/lib/eal/common/rte_random.c
@@ -19,13 +19,14 @@ struct rte_rand_state {
 	uint64_t z3;
 	uint64_t z4;
 	uint64_t z5;
-} __rte_cache_aligned;
+	uint64_t seed;
+};
 
-/* One instance each for every lcore id-equipped thread, and one
- * additional instance to be shared by all others threads (i.e., all
- * unregistered non-EAL threads).
- */
-static struct rte_rand_state rand_states[RTE_MAX_LCORE + 1];
+/* Global random seed */
+static uint64_t rte_rand_seed;
+
+/* Per lcore random state. */
+static RTE_DEFINE_PER_LCORE(struct rte_rand_state, rte_rand_state);
 
 static uint32_t
 __rte_rand_lcg32(uint32_t *seed)
@@ -76,16 +77,14 @@ __rte_srand_lfsr258(uint64_t seed, struct rte_rand_state *state)
 	state->z3 = __rte_rand_lfsr258_gen_seed(&lcg_seed, 4096UL);
 	state->z4 = __rte_rand_lfsr258_gen_seed(&lcg_seed, 131072UL);
 	state->z5 = __rte_rand_lfsr258_gen_seed(&lcg_seed, 8388608UL);
+
+	state->seed = seed;
 }
 
 void
 rte_srand(uint64_t seed)
 {
-	unsigned int lcore_id;
-
-	/* add lcore_id to seed to avoid having the same sequence */
-	for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++)
-		__rte_srand_lfsr258(seed + lcore_id, &rand_states[lcore_id]);
+	__atomic_store_n(&rte_rand_seed, seed, __ATOMIC_RELAXED);
 }
 
 static __rte_always_inline uint64_t
@@ -119,15 +118,15 @@ __rte_rand_lfsr258(struct rte_rand_state *state)
 static __rte_always_inline
 struct rte_rand_state *__rte_rand_get_state(void)
 {
-	unsigned int idx;
-
-	idx = rte_lcore_id();
+	struct rte_rand_state *rand_state = &RTE_PER_LCORE(rte_rand_state);
+	uint64_t seed;
 
-	/* last instance reserved for unregistered non-EAL threads */
-	if (unlikely(idx == LCORE_ID_ANY))
-		idx = RTE_MAX_LCORE;
+	/* did seed change */
+	seed = __atomic_load_n(&rte_rand_seed, __ATOMIC_RELAXED);
+	if (unlikely(seed != rand_state->seed))
+		__rte_srand_lfsr258(seed, rand_state);
 
-	return &rand_states[idx];
+	return rand_state;
 }
 
 uint64_t
-- 
2.39.2
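The load-bearing trick in this proposal is seed versioning: rte_srand()
only publishes the new seed, and each thread lazily reseeds its own TLS
state the next time it draws a number, by comparing the published seed
against the seed its state was derived from. A minimal standalone
rendering of that pattern in C11 (names and the mixing function are
illustrative stand-ins, not the DPDK implementation):

#include <stdatomic.h>
#include <stdint.h>

struct prng_state {
	uint64_t z[5];
	uint64_t seed;	/* the seed this state was derived from */
};

/* Written by the srand-style setter, read by every thread; relaxed
 * ordering suffices because only the value matters, not its ordering
 * against other data.
 */
static _Atomic uint64_t global_seed = 1;

static _Thread_local struct prng_state tls_state;	/* zero-initialized */

static void set_seed(uint64_t seed)
{
	atomic_store_explicit(&global_seed, seed, memory_order_relaxed);
}

static void reseed(struct prng_state *s, uint64_t seed)
{
	/* Stand-in for __rte_srand_lfsr258(); not the real recurrence.
	 * NB: mixes only the seed, so all threads would share one
	 * sequence; real code should also mix in a per-thread id.
	 */
	for (int i = 0; i < 5; i++)
		s->z[i] = seed ^ (0x9e3779b97f4a7c15ULL * (i + 1));
	s->seed = seed;
}

static struct prng_state *get_state(void)
{
	uint64_t seed = atomic_load_explicit(&global_seed,
					     memory_order_relaxed);

	/* Lazy (re)initialization: fires on a thread's first call
	 * (tls_state.seed is 0 while global_seed starts at 1) and
	 * whenever a new seed has been published since this thread
	 * last drew a number. The real code must likewise guarantee
	 * the initial global seed differs from the zero-initialized
	 * TLS seed.
	 */
	if (seed != tls_state.seed)
		reseed(&tls_state, seed);

	return &tls_state;
}

Note one behavioral difference from the array-based code this replaces:
the original seeded instance i with seed + lcore_id precisely so that
lcores do not all emit the same sequence. A lazy reseed along the lines
sketched above has to mix a per-thread value into the state to keep
that property.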
* Re: [PATCH] eal: add cache guard to per-lcore PRNG state
  2023-09-06 16:25   ` Stephen Hemminger
@ 2023-10-11 16:07     ` Thomas Monjalon
  2023-10-11 16:55       ` Morten Brørup
  2023-10-11 20:41       ` Mattias Rönnblom
  0 siblings, 2 replies; 8+ messages in thread

From: Thomas Monjalon @ 2023-10-11 16:07 UTC (permalink / raw)
To: Mattias Rönnblom, Morten Brørup, Stephen Hemminger
Cc: dev, david.marchand, mattias.ronnblom, bruce.richardson,
    olivier.matz, andrew.rybchenko, honnappa.nagarahalli,
    konstantin.v.ananyev

TLS is an alternative solution proposed by Stephen.
What do you think?

06/09/2023 18:25, Stephen Hemminger:
> On Mon, 4 Sep 2023 13:57:19 +0200
> Mattias Rönnblom <hofors@lysator.liu.se> wrote:
> 
> > On 2023-09-04 11:26, Morten Brørup wrote:
> > > The per-lcore random states are frequently updated by their individual
> > > lcores, so add a cache guard to prevent CPU cache thrashing.
> > >
> > 
> > "to prevent false sharing in case the CPU employs a next-N-lines (or
> > similar) hardware prefetcher"
> > 
> > In my world, cache thrashing and cache line contention are two
> > different things.
> > 
> > Other than that,
> > Acked-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> 
> Could the per-lcore state be thread local?
> 
> Something like this:
> 
> From 3df5e28a7e5589d05e1eade62a0979e84697853d Mon Sep 17 00:00:00 2001
> From: Stephen Hemminger <stephen@networkplumber.org>
> Date: Wed, 6 Sep 2023 09:22:42 -0700
> Subject: [PATCH] random: use per lcore state
> 
> Move the random number state into thread local storage.
> This has several benefits.
> - no false cache sharing from cpu prefetching
> - fixes initialization of random state for non-DPDK threads
> - fixes unsafe usage of random state by non-DPDK threads
> 
> The initialization of random number state is done by the
> lcore (lazy initialization).
> 
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---
>  lib/eal/common/rte_random.c | 35 +++++++++++++++++------------------
>  1 file changed, 17 insertions(+), 18 deletions(-)
> 
> diff --git a/lib/eal/common/rte_random.c b/lib/eal/common/rte_random.c
> index 53636331a27b..62f36038ac52 100644
> --- a/lib/eal/common/rte_random.c
> +++ b/lib/eal/common/rte_random.c
> @@ -19,13 +19,14 @@ struct rte_rand_state {
>  	uint64_t z3;
>  	uint64_t z4;
>  	uint64_t z5;
> -} __rte_cache_aligned;
> +	uint64_t seed;
> +};
>  
> -/* One instance each for every lcore id-equipped thread, and one
> - * additional instance to be shared by all others threads (i.e., all
> - * unregistered non-EAL threads).
> - */
> -static struct rte_rand_state rand_states[RTE_MAX_LCORE + 1];
> +/* Global random seed */
> +static uint64_t rte_rand_seed;
> +
> +/* Per lcore random state. */
> +static RTE_DEFINE_PER_LCORE(struct rte_rand_state, rte_rand_state);
>  
>  static uint32_t
>  __rte_rand_lcg32(uint32_t *seed)
> @@ -76,16 +77,14 @@ __rte_srand_lfsr258(uint64_t seed, struct rte_rand_state *state)
>  	state->z3 = __rte_rand_lfsr258_gen_seed(&lcg_seed, 4096UL);
>  	state->z4 = __rte_rand_lfsr258_gen_seed(&lcg_seed, 131072UL);
>  	state->z5 = __rte_rand_lfsr258_gen_seed(&lcg_seed, 8388608UL);
> +
> +	state->seed = seed;
>  }
>  
>  void
>  rte_srand(uint64_t seed)
>  {
> -	unsigned int lcore_id;
> -
> -	/* add lcore_id to seed to avoid having the same sequence */
> -	for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++)
> -		__rte_srand_lfsr258(seed + lcore_id, &rand_states[lcore_id]);
> +	__atomic_store_n(&rte_rand_seed, seed, __ATOMIC_RELAXED);
>  }
>  
>  static __rte_always_inline uint64_t
> @@ -119,15 +118,15 @@ __rte_rand_lfsr258(struct rte_rand_state *state)
>  static __rte_always_inline
>  struct rte_rand_state *__rte_rand_get_state(void)
>  {
> -	unsigned int idx;
> -
> -	idx = rte_lcore_id();
> +	struct rte_rand_state *rand_state = &RTE_PER_LCORE(rte_rand_state);
> +	uint64_t seed;
> 
> -	/* last instance reserved for unregistered non-EAL threads */
> -	if (unlikely(idx == LCORE_ID_ANY))
> -		idx = RTE_MAX_LCORE;
> +	/* did seed change */
> +	seed = __atomic_load_n(&rte_rand_seed, __ATOMIC_RELAXED);
> +	if (unlikely(seed != rand_state->seed))
> +		__rte_srand_lfsr258(seed, rand_state);
> 
> -	return &rand_states[idx];
> +	return rand_state;
> }
> 
> uint64_t
* RE: [PATCH] eal: add cache guard to per-lcore PRNG state
  2023-10-11 16:07     ` Thomas Monjalon
@ 2023-10-11 16:55       ` Morten Brørup
  2023-10-11 22:49         ` Thomas Monjalon
  1 sibling, 1 reply; 8+ messages in thread

From: Morten Brørup @ 2023-10-11 16:55 UTC (permalink / raw)
To: Thomas Monjalon, Mattias Rönnblom, Stephen Hemminger
Cc: dev, david.marchand, mattias.ronnblom, bruce.richardson,
    olivier.matz, andrew.rybchenko, honnappa.nagarahalli,
    konstantin.v.ananyev

> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> Sent: Wednesday, 11 October 2023 18.08
> 
> TLS is an alternative solution proposed by Stephen.
> What do you think?

I think we went down a rabbit hole - which I admit to enjoy. :-)

My simple patch should be applied, with the description improved by
Mattias:

    The per-lcore random states are frequently updated by their
    individual lcores, so add a cache guard to prevent false sharing in
    case the CPU employs a next-N-lines (or similar) hardware
    prefetcher.

> 
> 06/09/2023 18:25, Stephen Hemminger:
> > On Mon, 4 Sep 2023 13:57:19 +0200
> > Mattias Rönnblom <hofors@lysator.liu.se> wrote:
> >
> > > On 2023-09-04 11:26, Morten Brørup wrote:
> > > > The per-lcore random states are frequently updated by their
> > > > individual lcores, so add a cache guard to prevent CPU cache
> > > > thrashing.
> > > >
> > >
> > > "to prevent false sharing in case the CPU employs a next-N-lines (or
> > > similar) hardware prefetcher"
> > >
> > > In my world, cache thrashing and cache line contention are two
> > > different things.
> > >
> > > Other than that,
> > > Acked-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> >
> > Could the per-lcore state be thread local?
> >
> > Something like this:
> >
> > From 3df5e28a7e5589d05e1eade62a0979e84697853d Mon Sep 17 00:00:00 2001
> > From: Stephen Hemminger <stephen@networkplumber.org>
> > Date: Wed, 6 Sep 2023 09:22:42 -0700
> > Subject: [PATCH] random: use per lcore state
> >
> > Move the random number state into thread local storage.
> > This has several benefits.
> > - no false cache sharing from cpu prefetching
> > - fixes initialization of random state for non-DPDK threads
> > - fixes unsafe usage of random state by non-DPDK threads
> >
> > The initialization of random number state is done by the
> > lcore (lazy initialization).
> >
> > Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> > ---
> >  lib/eal/common/rte_random.c | 35 +++++++++++++++++------------------
> >  1 file changed, 17 insertions(+), 18 deletions(-)
> >
> > diff --git a/lib/eal/common/rte_random.c b/lib/eal/common/rte_random.c
> > index 53636331a27b..62f36038ac52 100644
> > --- a/lib/eal/common/rte_random.c
> > +++ b/lib/eal/common/rte_random.c
> > @@ -19,13 +19,14 @@ struct rte_rand_state {
> >  	uint64_t z3;
> >  	uint64_t z4;
> >  	uint64_t z5;
> > -} __rte_cache_aligned;
> > +	uint64_t seed;
> > +};
> >
> > -/* One instance each for every lcore id-equipped thread, and one
> > - * additional instance to be shared by all others threads (i.e., all
> > - * unregistered non-EAL threads).
> > - */
> > -static struct rte_rand_state rand_states[RTE_MAX_LCORE + 1];
> > +/* Global random seed */
> > +static uint64_t rte_rand_seed;
> > +
> > +/* Per lcore random state. */
> > +static RTE_DEFINE_PER_LCORE(struct rte_rand_state, rte_rand_state);
> >
> >  static uint32_t
> >  __rte_rand_lcg32(uint32_t *seed)
> > @@ -76,16 +77,14 @@ __rte_srand_lfsr258(uint64_t seed, struct rte_rand_state *state)
> >  	state->z3 = __rte_rand_lfsr258_gen_seed(&lcg_seed, 4096UL);
> >  	state->z4 = __rte_rand_lfsr258_gen_seed(&lcg_seed, 131072UL);
> >  	state->z5 = __rte_rand_lfsr258_gen_seed(&lcg_seed, 8388608UL);
> > +
> > +	state->seed = seed;
> >  }
> >
> >  void
> >  rte_srand(uint64_t seed)
> >  {
> > -	unsigned int lcore_id;
> > -
> > -	/* add lcore_id to seed to avoid having the same sequence */
> > -	for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++)
> > -		__rte_srand_lfsr258(seed + lcore_id, &rand_states[lcore_id]);
> > +	__atomic_store_n(&rte_rand_seed, seed, __ATOMIC_RELAXED);
> >  }
> >
> >  static __rte_always_inline uint64_t
> > @@ -119,15 +118,15 @@ __rte_rand_lfsr258(struct rte_rand_state *state)
> >  static __rte_always_inline
> >  struct rte_rand_state *__rte_rand_get_state(void)
> >  {
> > -	unsigned int idx;
> > -
> > -	idx = rte_lcore_id();
> > +	struct rte_rand_state *rand_state = &RTE_PER_LCORE(rte_rand_state);
> > +	uint64_t seed;
> >
> > -	/* last instance reserved for unregistered non-EAL threads */
> > -	if (unlikely(idx == LCORE_ID_ANY))
> > -		idx = RTE_MAX_LCORE;
> > +	/* did seed change */
> > +	seed = __atomic_load_n(&rte_rand_seed, __ATOMIC_RELAXED);
> > +	if (unlikely(seed != rand_state->seed))
> > +		__rte_srand_lfsr258(seed, rand_state);
> >
> > -	return &rand_states[idx];
> > +	return rand_state;
> >  }
> >
> >  uint64_t
* Re: [PATCH] eal: add cache guard to per-lcore PRNG state
  2023-10-11 16:55       ` Morten Brørup
@ 2023-10-11 22:49         ` Thomas Monjalon
  0 siblings, 0 replies; 8+ messages in thread

From: Thomas Monjalon @ 2023-10-11 22:49 UTC (permalink / raw)
To: Morten Brørup
Cc: Mattias Rönnblom, Stephen Hemminger, dev, david.marchand,
    mattias.ronnblom, bruce.richardson, olivier.matz,
    andrew.rybchenko, honnappa.nagarahalli, konstantin.v.ananyev

11/10/2023 18:55, Morten Brørup:
> > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > Sent: Wednesday, 11 October 2023 18.08
> > 
> > TLS is an alternative solution proposed by Stephen.
> > What do you think?
> 
> I think we went down a rabbit hole - which I admit to enjoy. :-)

There is no reply/explanation in this thread.

> My simple patch should be applied, with the description improved by
> Mattias:
> 
>     The per-lcore random states are frequently updated by their
>     individual lcores, so add a cache guard to prevent false sharing
>     in case the CPU employs a next-N-lines (or similar) hardware
>     prefetcher.

OK applied
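For readers arriving from the API side: the state being guarded is the
backing store for the public rte_rand()/rte_srand()/rte_rand_max()
calls, so the applied change is invisible to callers. A minimal usage
sketch follows (it assumes a working DPDK build and EAL setup; error
handling is omitted for brevity):

#include <stdio.h>
#include <inttypes.h>

#include <rte_eal.h>
#include <rte_launch.h>
#include <rte_lcore.h>
#include <rte_random.h>

static int
worker(void *arg)
{
	(void)arg;
	/* Each registered lcore draws from its own rte_rand_state
	 * instance, which is what the cache guard isolates.
	 */
	printf("lcore %u: %" PRIu64 ", bounded %" PRIu64 "\n",
	       rte_lcore_id(), rte_rand(), rte_rand_max(100));
	return 0;
}

int
main(int argc, char **argv)
{
	if (rte_eal_init(argc, argv) < 0)
		return 1;

	rte_srand(42);	/* optional: fixed seed for reproducible runs */

	rte_eal_mp_remote_launch(worker, NULL, CALL_MAIN);
	rte_eal_mp_wait_lcore();

	rte_eal_cleanup();
	return 0;
}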
* Re: [PATCH] eal: add cache guard to per-lcore PRNG state
  2023-10-11 16:07     ` Thomas Monjalon
  2023-10-11 16:55       ` Morten Brørup
@ 2023-10-11 20:41       ` Mattias Rönnblom
  1 sibling, 0 replies; 8+ messages in thread

From: Mattias Rönnblom @ 2023-10-11 20:41 UTC (permalink / raw)
To: Thomas Monjalon, Morten Brørup, Stephen Hemminger
Cc: dev, david.marchand, mattias.ronnblom, bruce.richardson,
    olivier.matz, andrew.rybchenko, honnappa.nagarahalli,
    konstantin.v.ananyev

On 2023-10-11 18:07, Thomas Monjalon wrote:
> TLS is an alternative solution proposed by Stephen.
> What do you think?
> 

I've expressed my views on this topic in two threads already. I'm happy
to continue that discussion, but I would suggest it be under the banner
of "what should the standard pattern for maintaining per-lcore (and
maybe also per-unregistered-thread) state be in DPDK".

A related issue is the ambition level for having unregistered threads
calling into DPDK APIs in general: MT safety issues, performance
issues, and concerns around preemption safety.

> 
> 06/09/2023 18:25, Stephen Hemminger:
>> On Mon, 4 Sep 2023 13:57:19 +0200
>> Mattias Rönnblom <hofors@lysator.liu.se> wrote:
>>
>>> On 2023-09-04 11:26, Morten Brørup wrote:
>>>> The per-lcore random states are frequently updated by their individual
>>>> lcores, so add a cache guard to prevent CPU cache thrashing.
>>>>
>>>
>>> "to prevent false sharing in case the CPU employs a next-N-lines (or
>>> similar) hardware prefetcher"
>>>
>>> In my world, cache thrashing and cache line contention are two
>>> different things.
>>>
>>> Other than that,
>>> Acked-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
>>
>> Could the per-lcore state be thread local?
>>
>> Something like this:
>>
>> From 3df5e28a7e5589d05e1eade62a0979e84697853d Mon Sep 17 00:00:00 2001
>> From: Stephen Hemminger <stephen@networkplumber.org>
>> Date: Wed, 6 Sep 2023 09:22:42 -0700
>> Subject: [PATCH] random: use per lcore state
>>
>> Move the random number state into thread local storage.
>> This has several benefits.
>> - no false cache sharing from cpu prefetching
>> - fixes initialization of random state for non-DPDK threads
>> - fixes unsafe usage of random state by non-DPDK threads
>>
>> The initialization of random number state is done by the
>> lcore (lazy initialization).
>>
>> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
>> ---
>>  lib/eal/common/rte_random.c | 35 +++++++++++++++++------------------
>>  1 file changed, 17 insertions(+), 18 deletions(-)
>>
>> diff --git a/lib/eal/common/rte_random.c b/lib/eal/common/rte_random.c
>> index 53636331a27b..62f36038ac52 100644
>> --- a/lib/eal/common/rte_random.c
>> +++ b/lib/eal/common/rte_random.c
>> @@ -19,13 +19,14 @@ struct rte_rand_state {
>>  	uint64_t z3;
>>  	uint64_t z4;
>>  	uint64_t z5;
>> -} __rte_cache_aligned;
>> +	uint64_t seed;
>> +};
>>  
>> -/* One instance each for every lcore id-equipped thread, and one
>> - * additional instance to be shared by all others threads (i.e., all
>> - * unregistered non-EAL threads).
>> - */
>> -static struct rte_rand_state rand_states[RTE_MAX_LCORE + 1];
>> +/* Global random seed */
>> +static uint64_t rte_rand_seed;
>> +
>> +/* Per lcore random state. */
>> +static RTE_DEFINE_PER_LCORE(struct rte_rand_state, rte_rand_state);
>>  
>>  static uint32_t
>>  __rte_rand_lcg32(uint32_t *seed)
>> @@ -76,16 +77,14 @@ __rte_srand_lfsr258(uint64_t seed, struct rte_rand_state *state)
>>  	state->z3 = __rte_rand_lfsr258_gen_seed(&lcg_seed, 4096UL);
>>  	state->z4 = __rte_rand_lfsr258_gen_seed(&lcg_seed, 131072UL);
>>  	state->z5 = __rte_rand_lfsr258_gen_seed(&lcg_seed, 8388608UL);
>> +
>> +	state->seed = seed;
>>  }
>>  
>>  void
>>  rte_srand(uint64_t seed)
>>  {
>> -	unsigned int lcore_id;
>> -
>> -	/* add lcore_id to seed to avoid having the same sequence */
>> -	for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++)
>> -		__rte_srand_lfsr258(seed + lcore_id, &rand_states[lcore_id]);
>> +	__atomic_store_n(&rte_rand_seed, seed, __ATOMIC_RELAXED);
>>  }
>>  
>>  static __rte_always_inline uint64_t
>> @@ -119,15 +118,15 @@ __rte_rand_lfsr258(struct rte_rand_state *state)
>>  static __rte_always_inline
>>  struct rte_rand_state *__rte_rand_get_state(void)
>>  {
>> -	unsigned int idx;
>> -
>> -	idx = rte_lcore_id();
>> +	struct rte_rand_state *rand_state = &RTE_PER_LCORE(rte_rand_state);
>> +	uint64_t seed;
>>  
>> -	/* last instance reserved for unregistered non-EAL threads */
>> -	if (unlikely(idx == LCORE_ID_ANY))
>> -		idx = RTE_MAX_LCORE;
>> +	/* did seed change */
>> +	seed = __atomic_load_n(&rte_rand_seed, __ATOMIC_RELAXED);
>> +	if (unlikely(seed != rand_state->seed))
>> +		__rte_srand_lfsr258(seed, rand_state);
>>  
>> -	return &rand_states[idx];
>> +	return rand_state;
>>  }
>>  
>>  uint64_t
* RE: [PATCH] eal: add cache guard to per-lcore PRNG state
  2023-09-04 11:57 ` Mattias Rönnblom
  2023-09-06 16:25   ` Stephen Hemminger
@ 2023-09-29 18:55   ` Morten Brørup
  1 sibling, 0 replies; 8+ messages in thread

From: Morten Brørup @ 2023-09-29 18:55 UTC (permalink / raw)
To: Stephen Hemminger
Cc: olivier.matz, andrew.rybchenko, honnappa.nagarahalli,
    konstantin.v.ananyev, dev, Mattias Rönnblom, thomas,
    david.marchand, mattias.ronnblom, bruce.richardson

PING for review.

Stephen, the discussion took quite a few turns, but didn't seem to
reach a better solution. If you don't object to this simple patch,
could you please also ack/review it, so it can be applied.

> From: Mattias Rönnblom [mailto:hofors@lysator.liu.se]
> Sent: Monday, 4 September 2023 13.57
> 
> On 2023-09-04 11:26, Morten Brørup wrote:
> > The per-lcore random states are frequently updated by their individual
> > lcores, so add a cache guard to prevent CPU cache thrashing.
> >
> 
> "to prevent false sharing in case the CPU employs a next-N-lines (or
> similar) hardware prefetcher"
> 
> In my world, cache thrashing and cache line contention are two
> different things.

You are right, Mattias. I didn't give the description much thought, and
simply used "cache thrashing" in a broad, general sense. I think most
readers will get the point anyway. Or they could take a look at the
description provided for the RTE_CACHE_GUARD itself. :-)

> 
> Other than that,
> Acked-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> 
> > Depends-on: series-29415 ("clarify purpose of empty cache lines")
> >
> > Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
> > ---
> >  lib/eal/common/rte_random.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/lib/eal/common/rte_random.c b/lib/eal/common/rte_random.c
> > index 565f2401ce..3df0c7004a 100644
> > --- a/lib/eal/common/rte_random.c
> > +++ b/lib/eal/common/rte_random.c
> > @@ -18,6 +18,7 @@ struct rte_rand_state {
> >  	uint64_t z3;
> >  	uint64_t z4;
> >  	uint64_t z5;
> > +	RTE_CACHE_GUARD;
> >  } __rte_cache_aligned;
> >
> >  /* One instance each for every lcore id-equipped thread, and one
end of thread, other threads: [~2023-10-11 22:49 UTC | newest]

Thread overview: 8+ messages
2023-09-04  9:26 [PATCH] eal: add cache guard to per-lcore PRNG state Morten Brørup
2023-09-04 11:57 ` Mattias Rönnblom
2023-09-06 16:25   ` Stephen Hemminger
2023-10-11 16:07     ` Thomas Monjalon
2023-10-11 16:55       ` Morten Brørup
2023-10-11 22:49         ` Thomas Monjalon
2023-10-11 20:41       ` Mattias Rönnblom
2023-09-29 18:55   ` Morten Brørup