* [PATCH] eal: add cache guard to per-lcore PRNG state
@ 2023-09-04 9:26 Morten Brørup
2023-09-04 11:57 ` Mattias Rönnblom
0 siblings, 1 reply; 8+ messages in thread
From: Morten Brørup @ 2023-09-04 9:26 UTC (permalink / raw)
To: thomas, david.marchand, mattias.ronnblom, bruce.richardson
Cc: olivier.matz, andrew.rybchenko, honnappa.nagarahalli,
konstantin.v.ananyev, dev, Morten Brørup
The per-lcore random state is frequently updated by their individual
lcores, so add a cache guard to prevent CPU cache thrashing.
Depends-on: series-29415 ("clarify purpose of empty cache lines")
Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
---
lib/eal/common/rte_random.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/lib/eal/common/rte_random.c b/lib/eal/common/rte_random.c
index 565f2401ce..3df0c7004a 100644
--- a/lib/eal/common/rte_random.c
+++ b/lib/eal/common/rte_random.c
@@ -18,6 +18,7 @@ struct rte_rand_state {
uint64_t z3;
uint64_t z4;
uint64_t z5;
+ RTE_CACHE_GUARD;
} __rte_cache_aligned;
/* One instance each for every lcore id-equipped thread, and one
--
2.17.1
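For readers unfamiliar with the macro, a minimal, hedged sketch of what a cache guard buys follows: the structure name, line size and guard depth below are illustrative assumptions, not the actual RTE_CACHE_GUARD definition from the series this patch depends on. The idea is that padding at the tail of each per-lcore slot means a next-N-lines hardware prefetcher running on one lcore only pulls in guard bytes that nobody writes, rather than the state another lcore is busy updating.

/*
 * Illustrative sketch only: not the DPDK definition of RTE_CACHE_GUARD.
 * Assumed: 64-byte cache lines and a guard depth of one line.
 */
#include <stdint.h>

#define CACHE_LINE_SKETCH  64
#define GUARD_LINES_SKETCH 1

struct rand_state_sketch {
	uint64_t z1, z2, z3, z4, z5;
	/* Never accessed; keeps the line(s) after the state unused, so a
	 * prefetch of "the next line" does not touch a neighbouring slot. */
	unsigned char guard[CACHE_LINE_SKETCH * GUARD_LINES_SKETCH];
} __attribute__((aligned(CACHE_LINE_SKETCH)));

static struct rand_state_sketch rand_states_sketch[8]; /* one slot per lcore */

Without the guard, rand_states_sketch[i] and rand_states_sketch[i + 1] sit in adjacent cache lines, and a prefetcher that speculatively fetches the following line can create false sharing even though no data is actually shared.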
* Re: [PATCH] eal: add cache guard to per-lcore PRNG state
2023-09-04 9:26 [PATCH] eal: add cache guard to per-lcore PRNG state Morten Brørup
@ 2023-09-04 11:57 ` Mattias Rönnblom
2023-09-06 16:25 ` Stephen Hemminger
2023-09-29 18:55 ` Morten Brørup
0 siblings, 2 replies; 8+ messages in thread
From: Mattias Rönnblom @ 2023-09-04 11:57 UTC (permalink / raw)
To: Morten Brørup, thomas, david.marchand, mattias.ronnblom,
bruce.richardson
Cc: olivier.matz, andrew.rybchenko, honnappa.nagarahalli,
konstantin.v.ananyev, dev
On 2023-09-04 11:26, Morten Brørup wrote:
> The per-lcore random state is frequently updated by their individual
> lcores, so add a cache guard to prevent CPU cache thrashing.
>
"to prevent false sharing in case the CPU employs a next-N-lines (or
similar) hardware prefetcher"
In my world, cache thrashing and cache line contention are two different
things.
Other than that,
Acked-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> Depends-on: series-29415 ("clarify purpose of empty cache lines")
>
> Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
> ---
> lib/eal/common/rte_random.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/lib/eal/common/rte_random.c b/lib/eal/common/rte_random.c
> index 565f2401ce..3df0c7004a 100644
> --- a/lib/eal/common/rte_random.c
> +++ b/lib/eal/common/rte_random.c
> @@ -18,6 +18,7 @@ struct rte_rand_state {
> uint64_t z3;
> uint64_t z4;
> uint64_t z5;
> + RTE_CACHE_GUARD;
> } __rte_cache_aligned;
>
> /* One instance each for every lcore id-equipped thread, and one
* Re: [PATCH] eal: add cache guard to per-lcore PRNG state
2023-09-04 11:57 ` Mattias Rönnblom
@ 2023-09-06 16:25 ` Stephen Hemminger
2023-10-11 16:07 ` Thomas Monjalon
2023-09-29 18:55 ` Morten Brørup
1 sibling, 1 reply; 8+ messages in thread
From: Stephen Hemminger @ 2023-09-06 16:25 UTC (permalink / raw)
To: Mattias Rönnblom
Cc: Morten Brørup, thomas, david.marchand, mattias.ronnblom,
bruce.richardson, olivier.matz, andrew.rybchenko,
honnappa.nagarahalli, konstantin.v.ananyev, dev
On Mon, 4 Sep 2023 13:57:19 +0200
Mattias Rönnblom <hofors@lysator.liu.se> wrote:
> On 2023-09-04 11:26, Morten Brørup wrote:
> > The per-lcore random state is frequently updated by their individual
> > lcores, so add a cache guard to prevent CPU cache thrashing.
> >
>
> "to prevent false sharing in case the CPU employs a next-N-lines (or
> similar) hardware prefetcher"
>
> In my world, cache thrashing and cache line contention are two different
> things.
>
> Other than that,
> Acked-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Could the per-lcore state be thread local?
Something like this:
From 3df5e28a7e5589d05e1eade62a0979e84697853d Mon Sep 17 00:00:00 2001
From: Stephen Hemminger <stephen@networkplumber.org>
Date: Wed, 6 Sep 2023 09:22:42 -0700
Subject: [PATCH] random: use per lcore state
Move the random number state into thread local storage.
This has several benefits.
- no false cache sharing from cpu prefetching
- fixes initialization of random state for non-DPDK threads
- fixes unsafe usage of random state by non-DPDK threads
The initialization of random number state is done by the
lcore (lazy initialization).
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
lib/eal/common/rte_random.c | 35 +++++++++++++++++------------------
1 file changed, 17 insertions(+), 18 deletions(-)
diff --git a/lib/eal/common/rte_random.c b/lib/eal/common/rte_random.c
index 53636331a27b..62f36038ac52 100644
--- a/lib/eal/common/rte_random.c
+++ b/lib/eal/common/rte_random.c
@@ -19,13 +19,14 @@ struct rte_rand_state {
uint64_t z3;
uint64_t z4;
uint64_t z5;
-} __rte_cache_aligned;
+ uint64_t seed;
+};
-/* One instance each for every lcore id-equipped thread, and one
- * additional instance to be shared by all others threads (i.e., all
- * unregistered non-EAL threads).
- */
-static struct rte_rand_state rand_states[RTE_MAX_LCORE + 1];
+/* Global random seed */
+static uint64_t rte_rand_seed;
+
+/* Per lcore random state. */
+static RTE_DEFINE_PER_LCORE(struct rte_rand_state, rte_rand_state);
static uint32_t
__rte_rand_lcg32(uint32_t *seed)
@@ -76,16 +77,14 @@ __rte_srand_lfsr258(uint64_t seed, struct rte_rand_state *state)
state->z3 = __rte_rand_lfsr258_gen_seed(&lcg_seed, 4096UL);
state->z4 = __rte_rand_lfsr258_gen_seed(&lcg_seed, 131072UL);
state->z5 = __rte_rand_lfsr258_gen_seed(&lcg_seed, 8388608UL);
+
+ state->seed = seed;
}
void
rte_srand(uint64_t seed)
{
- unsigned int lcore_id;
-
- /* add lcore_id to seed to avoid having the same sequence */
- for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++)
- __rte_srand_lfsr258(seed + lcore_id, &rand_states[lcore_id]);
+ __atomic_store_n(&rte_rand_seed, seed, __ATOMIC_RELAXED);
}
static __rte_always_inline uint64_t
@@ -119,15 +118,15 @@ __rte_rand_lfsr258(struct rte_rand_state *state)
static __rte_always_inline
struct rte_rand_state *__rte_rand_get_state(void)
{
- unsigned int idx;
-
- idx = rte_lcore_id();
+ struct rte_rand_state *rand_state = &RTE_PER_LCORE(rte_rand_state);
+ uint64_t seed;
- /* last instance reserved for unregistered non-EAL threads */
- if (unlikely(idx == LCORE_ID_ANY))
- idx = RTE_MAX_LCORE;
+ /* did seed change */
+ seed = __atomic_load_n(&rte_rand_seed, __ATOMIC_RELAXED);
+ if (unlikely(seed != rand_state->seed))
+ __rte_srand_lfsr258(seed, rand_state);
- return &rand_states[idx];
+ return rand_state;
}
uint64_t
--
2.39.2
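To show the pattern in isolation, here is a hedged, standalone sketch of the lazy-reseed TLS idea from the patch above, written with plain C11 thread-local storage and atomics instead of the RTE_* per-lcore macros; every name is illustrative and the seeding routine is a placeholder for the LFSR258 code in rte_random.c, not real DPDK API.

#include <stdatomic.h>
#include <stdint.h>

struct prng_state {
	uint64_t z1, z2, z3, z4, z5;
	uint64_t seed;	/* the seed this state was last derived from */
};

static _Atomic uint64_t global_seed = 1;	/* written by the srand() analogue */
static _Thread_local struct prng_state tls_state;

/* Placeholder for __rte_srand_lfsr258(); only the seed bookkeeping matters here. */
static void
reseed(uint64_t seed, struct prng_state *s)
{
	s->z1 = s->z2 = s->z3 = s->z4 = s->z5 = seed | 1;
	s->seed = seed;
}

static struct prng_state *
get_state(void)
{
	uint64_t seed = atomic_load_explicit(&global_seed, memory_order_relaxed);

	/* Re-seed lazily the first time this thread observes a new seed. */
	if (seed != tls_state.seed)
		reseed(seed, &tls_state);

	return &tls_state;
}

Note that, as in the patch itself, every thread is reseeded from the same global value; the original code's "seed + lcore_id" offset, which kept the per-lcore sequences distinct, is gone.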
* RE: [PATCH] eal: add cache guard to per-lcore PRNG state
2023-09-04 11:57 ` Mattias Rönnblom
2023-09-06 16:25 ` Stephen Hemminger
@ 2023-09-29 18:55 ` Morten Brørup
1 sibling, 0 replies; 8+ messages in thread
From: Morten Brørup @ 2023-09-29 18:55 UTC (permalink / raw)
To: Stephen Hemminger
Cc: olivier.matz, andrew.rybchenko, honnappa.nagarahalli,
konstantin.v.ananyev, dev, Mattias Rönnblom, thomas,
david.marchand, mattias.ronnblom, bruce.richardson
PING for review.
Stephen, the discussion took quite a few turns, but didn't seem to reach a better solution. If you don't object to this simple patch, could you please also ack/review it so it can be applied?
> From: Mattias Rönnblom [mailto:hofors@lysator.liu.se]
> Sent: Monday, 4 September 2023 13.57
>
> On 2023-09-04 11:26, Morten Brørup wrote:
> > The per-lcore random state is frequently updated by their individual
> > lcores, so add a cache guard to prevent CPU cache thrashing.
> >
>
> "to prevent false sharing in case the CPU employs a next-N-lines (or
> similar) hardware prefetcher"
>
> In my world, cache thrashing and cache line contention are two different
> things.
You are right, Mattias.
I didn't give the description much thought, and simply used "cache thrashing" in a broad, general sense. I think most readers will get the point anyway. Or they could take a look at the description provided for the RTE_CACHE_GUARD macro itself. :-)
>
> Other than that,
> Acked-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
>
> > Depends-on: series-29415 ("clarify purpose of empty cache lines")
> >
> > Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
> > ---
> > lib/eal/common/rte_random.c | 1 +
> > 1 file changed, 1 insertion(+)
> >
> > diff --git a/lib/eal/common/rte_random.c b/lib/eal/common/rte_random.c
> > index 565f2401ce..3df0c7004a 100644
> > --- a/lib/eal/common/rte_random.c
> > +++ b/lib/eal/common/rte_random.c
> > @@ -18,6 +18,7 @@ struct rte_rand_state {
> > uint64_t z3;
> > uint64_t z4;
> > uint64_t z5;
> > + RTE_CACHE_GUARD;
> > } __rte_cache_aligned;
> >
> > /* One instance each for every lcore id-equipped thread, and one
* Re: [PATCH] eal: add cache guard to per-lcore PRNG state
2023-09-06 16:25 ` Stephen Hemminger
@ 2023-10-11 16:07 ` Thomas Monjalon
2023-10-11 16:55 ` Morten Brørup
2023-10-11 20:41 ` Mattias Rönnblom
0 siblings, 2 replies; 8+ messages in thread
From: Thomas Monjalon @ 2023-10-11 16:07 UTC (permalink / raw)
To: Mattias Rönnblom, Morten Brørup, Stephen Hemminger
Cc: dev, david.marchand, mattias.ronnblom, bruce.richardson,
olivier.matz, andrew.rybchenko, honnappa.nagarahalli,
konstantin.v.ananyev
TLS is an alternative solution proposed by Stephen.
What do you think?
06/09/2023 18:25, Stephen Hemminger:
> On Mon, 4 Sep 2023 13:57:19 +0200
> Mattias Rönnblom <hofors@lysator.liu.se> wrote:
>
> > On 2023-09-04 11:26, Morten Brørup wrote:
> > > The per-lcore random state is frequently updated by their individual
> > > lcores, so add a cache guard to prevent CPU cache thrashing.
> > >
> >
> > "to prevent false sharing in case the CPU employs a next-N-lines (or
> > similar) hardware prefetcher"
> >
> > In my world, cache thrashing and cache line contention are two different
> > things.
> >
> > Other than that,
> > Acked-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
>
> Could the per-lcore state be thread local?
>
> Something like this:
>
> From 3df5e28a7e5589d05e1eade62a0979e84697853d Mon Sep 17 00:00:00 2001
> From: Stephen Hemminger <stephen@networkplumber.org>
> Date: Wed, 6 Sep 2023 09:22:42 -0700
> Subject: [PATCH] random: use per lcore state
>
> Move the random number state into thread local storage.
> This has several benefits.
> - no false cache sharing from cpu prefetching
> - fixes initialization of random state for non-DPDK threads
> - fixes unsafe usage of random state by non-DPDK threads
>
> The initialization of random number state is done by the
> lcore (lazy initialization).
>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---
> lib/eal/common/rte_random.c | 35 +++++++++++++++++------------------
> 1 file changed, 17 insertions(+), 18 deletions(-)
>
> diff --git a/lib/eal/common/rte_random.c b/lib/eal/common/rte_random.c
> index 53636331a27b..62f36038ac52 100644
> --- a/lib/eal/common/rte_random.c
> +++ b/lib/eal/common/rte_random.c
> @@ -19,13 +19,14 @@ struct rte_rand_state {
> uint64_t z3;
> uint64_t z4;
> uint64_t z5;
> -} __rte_cache_aligned;
> + uint64_t seed;
> +};
>
> -/* One instance each for every lcore id-equipped thread, and one
> - * additional instance to be shared by all others threads (i.e., all
> - * unregistered non-EAL threads).
> - */
> -static struct rte_rand_state rand_states[RTE_MAX_LCORE + 1];
> +/* Global random seed */
> +static uint64_t rte_rand_seed;
> +
> +/* Per lcore random state. */
> +static RTE_DEFINE_PER_LCORE(struct rte_rand_state, rte_rand_state);
>
> static uint32_t
> __rte_rand_lcg32(uint32_t *seed)
> @@ -76,16 +77,14 @@ __rte_srand_lfsr258(uint64_t seed, struct rte_rand_state *state)
> state->z3 = __rte_rand_lfsr258_gen_seed(&lcg_seed, 4096UL);
> state->z4 = __rte_rand_lfsr258_gen_seed(&lcg_seed, 131072UL);
> state->z5 = __rte_rand_lfsr258_gen_seed(&lcg_seed, 8388608UL);
> +
> + state->seed = seed;
> }
>
> void
> rte_srand(uint64_t seed)
> {
> - unsigned int lcore_id;
> -
> - /* add lcore_id to seed to avoid having the same sequence */
> - for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++)
> - __rte_srand_lfsr258(seed + lcore_id, &rand_states[lcore_id]);
> + __atomic_store_n(&rte_rand_seed, seed, __ATOMIC_RELAXED);
> }
>
> static __rte_always_inline uint64_t
> @@ -119,15 +118,15 @@ __rte_rand_lfsr258(struct rte_rand_state *state)
> static __rte_always_inline
> struct rte_rand_state *__rte_rand_get_state(void)
> {
> - unsigned int idx;
> -
> - idx = rte_lcore_id();
> + struct rte_rand_state *rand_state = &RTE_PER_LCORE(rte_rand_state);
> + uint64_t seed;
>
> - /* last instance reserved for unregistered non-EAL threads */
> - if (unlikely(idx == LCORE_ID_ANY))
> - idx = RTE_MAX_LCORE;
> + /* did seed change */
> + seed = __atomic_load_n(&rte_rand_seed, __ATOMIC_RELAXED);
> + if (unlikely(seed != rand_state->seed))
> + __rte_srand_lfsr258(seed, rand_state);
>
> - return &rand_states[idx];
> + return rand_state;
> }
>
> uint64_t
>
* RE: [PATCH] eal: add cache guard to per-lcore PRNG state
2023-10-11 16:07 ` Thomas Monjalon
@ 2023-10-11 16:55 ` Morten Brørup
2023-10-11 22:49 ` Thomas Monjalon
2023-10-11 20:41 ` Mattias Rönnblom
1 sibling, 1 reply; 8+ messages in thread
From: Morten Brørup @ 2023-10-11 16:55 UTC (permalink / raw)
To: Thomas Monjalon, Mattias Rönnblom, Stephen Hemminger
Cc: dev, david.marchand, mattias.ronnblom, bruce.richardson,
olivier.matz, andrew.rybchenko, honnappa.nagarahalli,
konstantin.v.ananyev
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> Sent: Wednesday, 11 October 2023 18.08
>
> TLS is an alternative solution proposed by Stephen.
> What do you think?
I think we went down a rabbit hole - which I admit to enjoying. :-)
My simple patch should be applied, with the description improved by Mattias:
The per-lcore random state is frequently updated by their individual
lcores, so add a cache guard to prevent false sharing in case the
CPU employs a next-N-lines (or similar) hardware prefetcher.
>
>
> 06/09/2023 18:25, Stephen Hemminger:
> > On Mon, 4 Sep 2023 13:57:19 +0200
> > Mattias Rönnblom <hofors@lysator.liu.se> wrote:
> >
> > > On 2023-09-04 11:26, Morten Brørup wrote:
> > > > The per-lcore random state is frequently updated by their
> individual
> > > > lcores, so add a cache guard to prevent CPU cache thrashing.
> > > >
> > >
> > > "to prevent false sharing in case the CPU employs a next-N-lines (or
> > > similar) hardware prefetcher"
> > >
> > > In my world, cache thrashing and cache line contention are two
> different
> > > things.
> > >
> > > Other than that,
> > > Acked-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> >
> > Could the per-lcore state be thread local?
> >
> > Something like this:
> >
> > From 3df5e28a7e5589d05e1eade62a0979e84697853d Mon Sep 17 00:00:00 2001
> > From: Stephen Hemminger <stephen@networkplumber.org>
> > Date: Wed, 6 Sep 2023 09:22:42 -0700
> > Subject: [PATCH] random: use per lcore state
> >
> > Move the random number state into thread local storage.
> > This has several benefits.
> > - no false cache sharing from cpu prefetching
> > - fixes initialization of random state for non-DPDK threads
> > - fixes unsafe usage of random state by non-DPDK threads
> >
> > The initialization of random number state is done by the
> > lcore (lazy initialization).
> >
> > Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> > ---
> > lib/eal/common/rte_random.c | 35 +++++++++++++++++------------------
> > 1 file changed, 17 insertions(+), 18 deletions(-)
> >
> > diff --git a/lib/eal/common/rte_random.c b/lib/eal/common/rte_random.c
> > index 53636331a27b..62f36038ac52 100644
> > --- a/lib/eal/common/rte_random.c
> > +++ b/lib/eal/common/rte_random.c
> > @@ -19,13 +19,14 @@ struct rte_rand_state {
> > uint64_t z3;
> > uint64_t z4;
> > uint64_t z5;
> > -} __rte_cache_aligned;
> > + uint64_t seed;
> > +};
> >
> > -/* One instance each for every lcore id-equipped thread, and one
> > - * additional instance to be shared by all others threads (i.e., all
> > - * unregistered non-EAL threads).
> > - */
> > -static struct rte_rand_state rand_states[RTE_MAX_LCORE + 1];
> > +/* Global random seed */
> > +static uint64_t rte_rand_seed;
> > +
> > +/* Per lcore random state. */
> > +static RTE_DEFINE_PER_LCORE(struct rte_rand_state, rte_rand_state);
> >
> > static uint32_t
> > __rte_rand_lcg32(uint32_t *seed)
> > @@ -76,16 +77,14 @@ __rte_srand_lfsr258(uint64_t seed, struct
> rte_rand_state *state)
> > state->z3 = __rte_rand_lfsr258_gen_seed(&lcg_seed, 4096UL);
> > state->z4 = __rte_rand_lfsr258_gen_seed(&lcg_seed, 131072UL);
> > state->z5 = __rte_rand_lfsr258_gen_seed(&lcg_seed, 8388608UL);
> > +
> > + state->seed = seed;
> > }
> >
> > void
> > rte_srand(uint64_t seed)
> > {
> > - unsigned int lcore_id;
> > -
> > - /* add lcore_id to seed to avoid having the same sequence */
> > - for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++)
> > - __rte_srand_lfsr258(seed + lcore_id,
> &rand_states[lcore_id]);
> > + __atomic_store_n(&rte_rand_seed, seed, __ATOMIC_RELAXED);
> > }
> >
> > static __rte_always_inline uint64_t
> > @@ -119,15 +118,15 @@ __rte_rand_lfsr258(struct rte_rand_state *state)
> > static __rte_always_inline
> > struct rte_rand_state *__rte_rand_get_state(void)
> > {
> > - unsigned int idx;
> > -
> > - idx = rte_lcore_id();
> > + struct rte_rand_state *rand_state =
> &RTE_PER_LCORE(rte_rand_state);
> > + uint64_t seed;
> >
> > - /* last instance reserved for unregistered non-EAL threads */
> > - if (unlikely(idx == LCORE_ID_ANY))
> > - idx = RTE_MAX_LCORE;
> > + /* did seed change */
> > + seed = __atomic_load_n(&rte_rand_seed, __ATOMIC_RELAXED);
> > + if (unlikely(seed != rand_state->seed))
> > + __rte_srand_lfsr258(seed, rand_state);
> >
> > - return &rand_states[idx];
> > + return rand_state;
> > }
> >
> > uint64_t
> >
>
>
>
>
* Re: [PATCH] eal: add cache guard to per-lcore PRNG state
2023-10-11 16:07 ` Thomas Monjalon
2023-10-11 16:55 ` Morten Brørup
@ 2023-10-11 20:41 ` Mattias Rönnblom
1 sibling, 0 replies; 8+ messages in thread
From: Mattias Rönnblom @ 2023-10-11 20:41 UTC (permalink / raw)
To: Thomas Monjalon, Morten Brørup, Stephen Hemminger
Cc: dev, david.marchand, mattias.ronnblom, bruce.richardson,
olivier.matz, andrew.rybchenko, honnappa.nagarahalli,
konstantin.v.ananyev
On 2023-10-11 18:07, Thomas Monjalon wrote:
> TLS is an alternative solution proposed by Stephen.
> What do you think?
>
I've expressed my views on this topic in two threads already.
I'm happy to continue that discussion, but I would suggest it happen
under the banner of "what should the standard pattern for maintaining
per-lcore (and maybe also per-unregistered-thread) state be in DPDK".
A related issue is the ambition level for having unregistered threads
call into DPDK APIs in general: MT safety issues, performance issues,
and concerns around preemption safety.
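To make the comparison concrete, here is a rough sketch of the per-lcore array pattern that the cache-guard patch keeps, simplified from the rte_random.c context shown earlier in the thread; the constants and helper names are stand-ins, not DPDK API. Registered lcores index their own padded slot, while every unregistered thread falls back to a single shared slot, which is where the MT-safety concern for such threads comes from.

#include <stdint.h>

#define MAX_LCORE_SKETCH    128		/* stand-in for RTE_MAX_LCORE */
#define LCORE_ID_ANY_SKETCH UINT32_MAX	/* stand-in for LCORE_ID_ANY */

/* Stub; rte_lcore_id() in DPDK, which returns LCORE_ID_ANY for unregistered threads. */
static unsigned int
current_lcore_id(void)
{
	return LCORE_ID_ANY_SKETCH;
}

static struct slot {
	uint64_t z1, z2, z3, z4, z5;
	unsigned char guard[64];	/* cache guard padding, as in the patch */
} __attribute__((aligned(64))) slots[MAX_LCORE_SKETCH + 1];

static struct slot *
get_slot(void)
{
	unsigned int idx = current_lcore_id();

	/* The last slot is shared by all unregistered (non-EAL) threads, so
	 * concurrent calls from several such threads race on the same state. */
	if (idx == LCORE_ID_ANY_SKETCH)
		idx = MAX_LCORE_SKETCH;

	return &slots[idx];
}

The TLS variant trades this shared slot (and the guard padding) for one private state per thread, at the cost of a relaxed atomic load and seed check on every call.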
>
> 06/09/2023 18:25, Stephen Hemminger:
>> On Mon, 4 Sep 2023 13:57:19 +0200
>> Mattias Rönnblom <hofors@lysator.liu.se> wrote:
>>
>>> On 2023-09-04 11:26, Morten Brørup wrote:
>>>> The per-lcore random state is frequently updated by their individual
>>>> lcores, so add a cache guard to prevent CPU cache thrashing.
>>>>
>>>
>>> "to prevent false sharing in case the CPU employs a next-N-lines (or
>>> similar) hardware prefetcher"
>>>
>>> In my world, cache thrashing and cache line contention are two different
>>> things.
>>>
>>> Other than that,
>>> Acked-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
>>
>> Could the per-lcore state be thread local?
>>
>> Something like this:
>>
>> From 3df5e28a7e5589d05e1eade62a0979e84697853d Mon Sep 17 00:00:00 2001
>> From: Stephen Hemminger <stephen@networkplumber.org>
>> Date: Wed, 6 Sep 2023 09:22:42 -0700
>> Subject: [PATCH] random: use per lcore state
>>
>> Move the random number state into thread local storage.
>> This has several benefits.
>> - no false cache sharing from cpu prefetching
>> - fixes initialization of random state for non-DPDK threads
>> - fixes unsafe usage of random state by non-DPDK threads
>>
>> The initialization of random number state is done by the
>> lcore (lazy initialization).
>>
>> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
>> ---
>> lib/eal/common/rte_random.c | 35 +++++++++++++++++------------------
>> 1 file changed, 17 insertions(+), 18 deletions(-)
>>
>> diff --git a/lib/eal/common/rte_random.c b/lib/eal/common/rte_random.c
>> index 53636331a27b..62f36038ac52 100644
>> --- a/lib/eal/common/rte_random.c
>> +++ b/lib/eal/common/rte_random.c
>> @@ -19,13 +19,14 @@ struct rte_rand_state {
>> uint64_t z3;
>> uint64_t z4;
>> uint64_t z5;
>> -} __rte_cache_aligned;
>> + uint64_t seed;
>> +};
>>
>> -/* One instance each for every lcore id-equipped thread, and one
>> - * additional instance to be shared by all others threads (i.e., all
>> - * unregistered non-EAL threads).
>> - */
>> -static struct rte_rand_state rand_states[RTE_MAX_LCORE + 1];
>> +/* Global random seed */
>> +static uint64_t rte_rand_seed;
>> +
>> +/* Per lcore random state. */
>> +static RTE_DEFINE_PER_LCORE(struct rte_rand_state, rte_rand_state);
>>
>> static uint32_t
>> __rte_rand_lcg32(uint32_t *seed)
>> @@ -76,16 +77,14 @@ __rte_srand_lfsr258(uint64_t seed, struct rte_rand_state *state)
>> state->z3 = __rte_rand_lfsr258_gen_seed(&lcg_seed, 4096UL);
>> state->z4 = __rte_rand_lfsr258_gen_seed(&lcg_seed, 131072UL);
>> state->z5 = __rte_rand_lfsr258_gen_seed(&lcg_seed, 8388608UL);
>> +
>> + state->seed = seed;
>> }
>>
>> void
>> rte_srand(uint64_t seed)
>> {
>> - unsigned int lcore_id;
>> -
>> - /* add lcore_id to seed to avoid having the same sequence */
>> - for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++)
>> - __rte_srand_lfsr258(seed + lcore_id, &rand_states[lcore_id]);
>> + __atomic_store_n(&rte_rand_seed, seed, __ATOMIC_RELAXED);
>> }
>>
>> static __rte_always_inline uint64_t
>> @@ -119,15 +118,15 @@ __rte_rand_lfsr258(struct rte_rand_state *state)
>> static __rte_always_inline
>> struct rte_rand_state *__rte_rand_get_state(void)
>> {
>> - unsigned int idx;
>> -
>> - idx = rte_lcore_id();
>> + struct rte_rand_state *rand_state = &RTE_PER_LCORE(rte_rand_state);
>> + uint64_t seed;
>>
>> - /* last instance reserved for unregistered non-EAL threads */
>> - if (unlikely(idx == LCORE_ID_ANY))
>> - idx = RTE_MAX_LCORE;
>> + /* did seed change */
>> + seed = __atomic_load_n(&rte_rand_seed, __ATOMIC_RELAXED);
>> + if (unlikely(seed != rand_state->seed))
>> + __rte_srand_lfsr258(seed, rand_state);
>>
>> - return &rand_states[idx];
>> + return rand_state;
>> }
>>
>> uint64_t
>>
>
>
>
>
>
* Re: [PATCH] eal: add cache guard to per-lcore PRNG state
2023-10-11 16:55 ` Morten Brørup
@ 2023-10-11 22:49 ` Thomas Monjalon
0 siblings, 0 replies; 8+ messages in thread
From: Thomas Monjalon @ 2023-10-11 22:49 UTC (permalink / raw)
To: Morten Brørup
Cc: Mattias Rönnblom, Stephen Hemminger, dev, david.marchand,
mattias.ronnblom, bruce.richardson, olivier.matz,
andrew.rybchenko, honnappa.nagarahalli, konstantin.v.ananyev
11/10/2023 18:55, Morten Brørup:
> > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > Sent: Wednesday, 11 October 2023 18.08
> >
> > TLS is an alternative solution proposed by Stephen.
> > What do you think?
>
> I think we went down a rabbit hole - which I admit to enjoy. :-)
There is no reply/explanation in this thread.
> My simple patch should be applied, with the description improved by Mattias:
>
> The per-lcore random state is frequently updated by their individual
> lcores, so add a cache guard to prevent false sharing in case the
> CPU employs a next-N-lines (or similar) hardware prefetcher.
OK applied
Thread overview: 8+ messages
2023-09-04 9:26 [PATCH] eal: add cache guard to per-lcore PRNG state Morten Brørup
2023-09-04 11:57 ` Mattias Rönnblom
2023-09-06 16:25 ` Stephen Hemminger
2023-10-11 16:07 ` Thomas Monjalon
2023-10-11 16:55 ` Morten Brørup
2023-10-11 22:49 ` Thomas Monjalon
2023-10-11 20:41 ` Mattias Rönnblom
2023-09-29 18:55 ` Morten Brørup