* Re: [RFC v3 1/6] eal: add static per-lcore memory allocation facility
2024-02-20 8:49 ` [RFC v3 1/6] eal: add static per-lcore memory allocation facility Mattias Rönnblom
@ 2024-02-20 9:11 ` Bruce Richardson
2024-02-20 10:47 ` Mattias Rönnblom
2024-02-21 9:43 ` Jerin Jacob
` (2 subsequent siblings)
3 siblings, 1 reply; 323+ messages in thread
From: Bruce Richardson @ 2024-02-20 9:11 UTC
To: Mattias Rönnblom; +Cc: dev, hofors, Morten Brørup, Stephen Hemminger
On Tue, Feb 20, 2024 at 09:49:03AM +0100, Mattias Rönnblom wrote:
> Introduce DPDK per-lcore id variables, or lcore variables for short.
>
> An lcore variable has one value for every current and future lcore
> id-equipped thread.
>
> The primary <rte_lcore_var.h> use case is for statically allocating
> small chunks of often-used data, which is related logically, but where
> there are performance benefits to reap from having updates being local
> to an lcore.
>
> Lcore variables are similar to thread-local storage (TLS, e.g., C11
> _Thread_local), but decoupling the values' life time with that of the
> threads.
>
> Lcore variables are also similar in terms of functionality provided by
> FreeBSD kernel's DPCPU_*() family of macros and the associated
> build-time machinery. DPCPU uses linker scripts, which effectively
> prevents the reuse of its, otherwise seemingly viable, approach.
>
> The currently-prevailing way to solve the same problem as lcore
> variables is to keep a module's per-lcore data as RTE_MAX_LCORE-sized
> array of cache-aligned, RTE_CACHE_GUARDed structs. The benefit of
> lcore variables over this approach is that data related to the same
> lcore now is close (spatially, in memory), rather than data used by
> the same module, which in turn avoid excessive use of padding,
> polluting caches with unused data.
>
> RFC v3:
> * Replace use of GCC-specific alignof(<expression>) with alignof(<type>).
> * Update example to reflect FOREACH macro name change (in RFC v2).
>
> RFC v2:
> * Use alignof to derive alignment requirements. (Morten Brørup)
> * Change name of FOREACH to make it distinct from <rte_lcore.h>'s
> *per-EAL-thread* RTE_LCORE_FOREACH(). (Morten Brørup)
> * Allow user-specified alignment, but limit max to cache line size.
>
> Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> ---
> config/rte_config.h | 1 +
> doc/api/doxy-api-index.md | 1 +
> lib/eal/common/eal_common_lcore_var.c | 82 ++++++
> lib/eal/common/meson.build | 1 +
> lib/eal/include/meson.build | 1 +
> lib/eal/include/rte_lcore_var.h | 375 ++++++++++++++++++++++++++
> lib/eal/version.map | 4 +
> 7 files changed, 465 insertions(+)
> create mode 100644 lib/eal/common/eal_common_lcore_var.c
> create mode 100644 lib/eal/include/rte_lcore_var.h
>
> diff --git a/config/rte_config.h b/config/rte_config.h
> index da265d7dd2..884482e473 100644
> --- a/config/rte_config.h
> +++ b/config/rte_config.h
> @@ -30,6 +30,7 @@
> /* EAL defines */
> #define RTE_CACHE_GUARD_LINES 1
> #define RTE_MAX_HEAPS 32
> +#define RTE_MAX_LCORE_VAR 1048576
> #define RTE_MAX_MEMSEG_LISTS 128
> #define RTE_MAX_MEMSEG_PER_LIST 8192
> #define RTE_MAX_MEM_MB_PER_LIST 32768
> diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
> index a6a768bd7c..bb06bb7ca1 100644
> --- a/doc/api/doxy-api-index.md
> +++ b/doc/api/doxy-api-index.md
> @@ -98,6 +98,7 @@ The public API headers are grouped by topics:
> [interrupts](@ref rte_interrupts.h),
> [launch](@ref rte_launch.h),
> [lcore](@ref rte_lcore.h),
> + [lcore-varible](@ref rte_lcore_var.h),
> [per-lcore](@ref rte_per_lcore.h),
> [service cores](@ref rte_service.h),
> [keepalive](@ref rte_keepalive.h),
> diff --git a/lib/eal/common/eal_common_lcore_var.c b/lib/eal/common/eal_common_lcore_var.c
> new file mode 100644
> index 0000000000..dfd11cbd0b
> --- /dev/null
> +++ b/lib/eal/common/eal_common_lcore_var.c
> @@ -0,0 +1,82 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2024 Ericsson AB
> + */
> +
> +#include <inttypes.h>
> +
> +#include <rte_common.h>
> +#include <rte_debug.h>
> +#include <rte_log.h>
> +
> +#include <rte_lcore_var.h>
> +
> +#include "eal_private.h"
> +
> +#define WARN_THRESHOLD 75
> +
> +/*
> + * Avoid using offset zero, since it would result in a NULL-value
> + * "handle" (offset) pointer, which in principle and per the API
> + * definition shouldn't be an issue, but may confuse some tools and
> + * users.
> + */
> +#define INITIAL_OFFSET 1
> +
> +char rte_lcore_var[RTE_MAX_LCORE][RTE_MAX_LCORE_VAR] __rte_cache_aligned;
> +
While I like the idea of improved handling for per-core variables, my main
concern with this set is this definition here, which adds yet another
dependency on the compile-time defined RTE_MAX_LCORE value.
I believe we already have an issue with this #define where it's impossible
to come up with a single value that works for all, or nearly all cases. The
current default is still 128, yet DPDK needs to support systems where the
number of cores is well into the hundreds, requiring workarounds of core
mappings or customized builds of DPDK. Upping the value fixes those issues
at the cost of a memory footprint explosion for smaller systems.
I'm therefore nervous about putting more dependencies on this value, when I
feel we should be moving away from its use, to allow more runtime
configurability of cores.
For this set/feature, would it be possible to have a run-time allocated
(and sized) array rather than using the RTE_MAX_LCORE value?
Thanks, (and apologies for the mini-rant!)
/Bruce
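To put a number on the footprint concern, here is a minimal arithmetic sketch. It assumes only the two figures given in this thread: RTE_MAX_LCORE_VAR = 1048576 from the quoted rte_config.h hunk, and the default RTE_MAX_LCORE of 128 mentioned above. The helper name is hypothetical, and the result is the size of the static array as declared, not the amount of memory actually touched.

```c
#include <assert.h>
#include <stdint.h>

/* Value taken from the quoted patch to config/rte_config.h. */
#define LCORE_VAR_BYTES_PER_LCORE UINT64_C(1048576)

/*
 * Hypothetical helper: size in bytes of an array declared as
 * char buf[max_lcore][LCORE_VAR_BYTES_PER_LCORE].
 */
static uint64_t
lcore_var_array_bytes(unsigned int max_lcore)
{
	assert(max_lcore > 0);
	return (uint64_t)max_lcore * LCORE_VAR_BYTES_PER_LCORE;
}
```

With the default of 128 lcores this gives 128 MiB; a build with 1024 lcores would declare 1 GiB.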
* Re: [RFC v3 1/6] eal: add static per-lcore memory allocation facility
2024-02-20 9:11 ` Bruce Richardson
@ 2024-02-20 10:47 ` Mattias Rönnblom
2024-02-20 11:39 ` Bruce Richardson
0 siblings, 1 reply; 323+ messages in thread
From: Mattias Rönnblom @ 2024-02-20 10:47 UTC
To: Bruce Richardson, Mattias Rönnblom
Cc: dev, Morten Brørup, Stephen Hemminger
On 2024-02-20 10:11, Bruce Richardson wrote:
> On Tue, Feb 20, 2024 at 09:49:03AM +0100, Mattias Rönnblom wrote:
>> Introduce DPDK per-lcore id variables, or lcore variables for short.
>>
>> An lcore variable has one value for every current and future lcore
>> id-equipped thread.
>>
>> The primary <rte_lcore_var.h> use case is for statically allocating
>> small chunks of often-used data, which is related logically, but where
>> there are performance benefits to reap from having updates being local
>> to an lcore.
>>
>> Lcore variables are similar to thread-local storage (TLS, e.g., C11
>> _Thread_local), but decoupling the values' life time with that of the
>> threads.
>>
>> Lcore variables are also similar in terms of functionality provided by
>> FreeBSD kernel's DPCPU_*() family of macros and the associated
>> build-time machinery. DPCPU uses linker scripts, which effectively
>> prevents the reuse of its, otherwise seemingly viable, approach.
>>
>> The currently-prevailing way to solve the same problem as lcore
>> variables is to keep a module's per-lcore data as RTE_MAX_LCORE-sized
>> array of cache-aligned, RTE_CACHE_GUARDed structs. The benefit of
>> lcore variables over this approach is that data related to the same
>> lcore now is close (spatially, in memory), rather than data used by
>> the same module, which in turn avoid excessive use of padding,
>> polluting caches with unused data.
>>
>> RFC v3:
>> * Replace use of GCC-specific alignof(<expression>) with alignof(<type>).
>> * Update example to reflect FOREACH macro name change (in RFC v2).
>>
>> RFC v2:
>> * Use alignof to derive alignment requirements. (Morten Brørup)
>> * Change name of FOREACH to make it distinct from <rte_lcore.h>'s
>> *per-EAL-thread* RTE_LCORE_FOREACH(). (Morten Brørup)
>> * Allow user-specified alignment, but limit max to cache line size.
>>
>> Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
>> ---
>> config/rte_config.h | 1 +
>> doc/api/doxy-api-index.md | 1 +
>> lib/eal/common/eal_common_lcore_var.c | 82 ++++++
>> lib/eal/common/meson.build | 1 +
>> lib/eal/include/meson.build | 1 +
>> lib/eal/include/rte_lcore_var.h | 375 ++++++++++++++++++++++++++
>> lib/eal/version.map | 4 +
>> 7 files changed, 465 insertions(+)
>> create mode 100644 lib/eal/common/eal_common_lcore_var.c
>> create mode 100644 lib/eal/include/rte_lcore_var.h
>>
>> diff --git a/config/rte_config.h b/config/rte_config.h
>> index da265d7dd2..884482e473 100644
>> --- a/config/rte_config.h
>> +++ b/config/rte_config.h
>> @@ -30,6 +30,7 @@
>> /* EAL defines */
>> #define RTE_CACHE_GUARD_LINES 1
>> #define RTE_MAX_HEAPS 32
>> +#define RTE_MAX_LCORE_VAR 1048576
>> #define RTE_MAX_MEMSEG_LISTS 128
>> #define RTE_MAX_MEMSEG_PER_LIST 8192
>> #define RTE_MAX_MEM_MB_PER_LIST 32768
>> diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
>> index a6a768bd7c..bb06bb7ca1 100644
>> --- a/doc/api/doxy-api-index.md
>> +++ b/doc/api/doxy-api-index.md
>> @@ -98,6 +98,7 @@ The public API headers are grouped by topics:
>> [interrupts](@ref rte_interrupts.h),
>> [launch](@ref rte_launch.h),
>> [lcore](@ref rte_lcore.h),
>> + [lcore-varible](@ref rte_lcore_var.h),
>> [per-lcore](@ref rte_per_lcore.h),
>> [service cores](@ref rte_service.h),
>> [keepalive](@ref rte_keepalive.h),
>> diff --git a/lib/eal/common/eal_common_lcore_var.c b/lib/eal/common/eal_common_lcore_var.c
>> new file mode 100644
>> index 0000000000..dfd11cbd0b
>> --- /dev/null
>> +++ b/lib/eal/common/eal_common_lcore_var.c
>> @@ -0,0 +1,82 @@
>> +/* SPDX-License-Identifier: BSD-3-Clause
>> + * Copyright(c) 2024 Ericsson AB
>> + */
>> +
>> +#include <inttypes.h>
>> +
>> +#include <rte_common.h>
>> +#include <rte_debug.h>
>> +#include <rte_log.h>
>> +
>> +#include <rte_lcore_var.h>
>> +
>> +#include "eal_private.h"
>> +
>> +#define WARN_THRESHOLD 75
>> +
>> +/*
>> + * Avoid using offset zero, since it would result in a NULL-value
>> + * "handle" (offset) pointer, which in principle and per the API
>> + * definition shouldn't be an issue, but may confuse some tools and
>> + * users.
>> + */
>> +#define INITIAL_OFFSET 1
>> +
>> +char rte_lcore_var[RTE_MAX_LCORE][RTE_MAX_LCORE_VAR] __rte_cache_aligned;
>> +
>
> While I like the idea of improved handling for per-core variables, my main
> concern with this set is this definition here, which adds yet another
> dependency on the compile-time defined RTE_MAX_LCORE value.
>
Lcore variables replace one RTE_MAX_LCORE-dependent pattern with another.
You could even argue the dependency on RTE_MAX_LCORE is reduced with
lcore variables, if you look at where/in how many places in the code
base this macro is being used. Centralizing per-lcore data management
may also provide some opportunity in the future for extending the API to
cope with some more dynamic RTE_MAX_LCORE variant. Not without ABI
breakage of course, but we are not ever going to change anything related
to RTE_MAX_LCORE without breaking the ABI, since this constant is
everywhere, including compiled into the application itself.
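The centralized pattern can be illustrated with a small sketch. This is not the code from the patch: the allocator and accessor names are hypothetical, and the sizes are shrunk for illustration. It only assumes what the quoted hunk shows, namely one buffer per lcore id and an initial offset of 1 so that no handle is ever zero.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative sizes only; the real values come from rte_config.h. */
#define MAX_LCORE 4
#define MAX_LCORE_VAR 256
#define CACHE_LINE 64

/* One shared buffer per lcore id; modules get offsets into it. */
static _Alignas(CACHE_LINE) char lcore_buf[MAX_LCORE][MAX_LCORE_VAR];
static size_t next_offset = 1;	/* offset 0 is skipped, as in the patch */

/* Hypothetical allocator: returns the offset ("handle"), or 0 when full. */
static size_t
lcore_var_alloc(size_t size, size_t align)
{
	size_t off;

	/* Alignment must be a power of two, at most one cache line. */
	assert(align != 0 && (align & (align - 1)) == 0 && align <= CACHE_LINE);
	off = (next_offset + align - 1) & ~(align - 1);
	if (size > MAX_LCORE_VAR || off > MAX_LCORE_VAR - size)
		return 0;
	next_offset = off + size;
	return off;
}

/* Resolve a handle to the value belonging to a given lcore id. */
static void *
lcore_var_ptr(size_t handle, unsigned int lcore_id)
{
	assert(lcore_id < MAX_LCORE);
	return &lcore_buf[lcore_id][handle];
}
```

The only place the lcore limit appears is the buffer declaration and the accessor, which is the sense in which the dependency is concentrated rather than spread over every module.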
> I believe we already have an issue with this #define where it's impossible
> to come up with a single value that works for all, or nearly all cases. The
> current default is still 128, yet DPDK needs to support systems where the
> number of cores is well into the hundreds, requiring workarounds of core
> mappings or customized builds of DPDK. Upping the value fixes those issues
> at the cost to memory footprint explosion for smaller systems.
>
I agree this is an issue.
RTE_MAX_LCORE also needs to be sized to accommodate not only all cores
used, but the sum of all EAL threads and registered non-EAL threads.
So, there is no reliable way to determine what RTE_MAX_LCORE should be
for a particular piece of hardware, since the actual number of lcore ids
needed is up to the application.
Why is the default set so low? Linux has MAX_CPUS, which serves the same
purpose, which is set to 4096 by default, if I recall correctly.
Shouldn't we at least be able to increase it to 256?
> I'm therefore nervous about putting more dependencies on this value, when I
> feel we should be moving away from its use, to allow more runtime
> configurability of cores.
>
What more specifically do you have in mind?
Maybe I'm overly pessimistic, but supporting lcores without any upper
bound and also allowing them to be added and removed at any point during
run time seems far-fetched, given where DPDK is today.
Including an actual upper bound, set during DPDK run-time
initialization and lower than RTE_MAX_LCORE, seems easier. I think there
is some equivalent in the Linux kernel. Again, you would need to
accommodate future rte_thread_register() calls.
<rte_lcore_var.h> could be extended with user-specified lcore variable
init/free callbacks, to allow lazy or late initialization.
If one could have a way to retrieve the max possible lcore ids *for a
particular DPDK process* (as opposed to a particular build) it would be
possible to avoid touching the per-lcore buffers for lcore ids that
would never be used. With data in BSS, it would never be mapped/allocated.
An issue with BSS data is that very RT-sensitive applications may
decide to lock all memory into RAM, to avoid latency jitter caused by
paging, and those would suffer from a large rte_lcore_var (or from all
the current static arrays). Lcore variables make this worse, since
rte_lcore_var is larger than the sum of today's static arrays, and must
be so, with some margin, since there is no way to figure out ahead of
time how much memory is actually going to be needed.
> For this set/feature, would it be possible to have a run-time allocated
> (and sized) array rather than using the RTE_MAX_LCORE value?
>
What I explored was having the per-lcore buffers dynamically allocated.
I saw no apparent benefit, and dynamic allocation brought new problems
to solve. One was to ensure lcore variable buffers were allocated before
they were used. In particular, if you want to use huge page memory,
lcore variables may be available only when that machinery is ready to
accept requests.
Also, with huge page memory, you won't get the benefit you get from
demand paging and BSS (i.e., only used memory is actually allocated).
With malloc(), I believe you generally do get that same benefit, if your
allocation is sufficiently large.
I also considered just allocating chunks, fitting (say) 64 kB worth of
lcore variables in each. That turned out more complex, and to no
benefit other than reducing the footprint for mlockall()-type apps,
which seemed like a corner case.
I never considered a dynamic RTE_MAX_LCORE with no upper bound.
> Thanks, (and apologies for the mini-rant!)
>
> /Bruce
Thanks for the comments. This was nowhere near a rant.
* Re: [RFC v3 1/6] eal: add static per-lcore memory allocation facility
2024-02-20 10:47 ` Mattias Rönnblom
@ 2024-02-20 11:39 ` Bruce Richardson
2024-02-20 13:37 ` Morten Brørup
2024-02-20 16:26 ` Mattias Rönnblom
0 siblings, 2 replies; 323+ messages in thread
From: Bruce Richardson @ 2024-02-20 11:39 UTC
To: Mattias Rönnblom
Cc: Mattias Rönnblom, dev, Morten Brørup, Stephen Hemminger
On Tue, Feb 20, 2024 at 11:47:14AM +0100, Mattias Rönnblom wrote:
> On 2024-02-20 10:11, Bruce Richardson wrote:
> > On Tue, Feb 20, 2024 at 09:49:03AM +0100, Mattias Rönnblom wrote:
> > > Introduce DPDK per-lcore id variables, or lcore variables for short.
> > >
> > > An lcore variable has one value for every current and future lcore
> > > id-equipped thread.
> > >
> > > The primary <rte_lcore_var.h> use case is for statically allocating
> > > small chunks of often-used data, which is related logically, but where
> > > there are performance benefits to reap from having updates being local
> > > to an lcore.
> > >
> > > Lcore variables are similar to thread-local storage (TLS, e.g., C11
> > > _Thread_local), but decoupling the values' life time with that of the
> > > threads.
<snip>
> > > +/*
> > > + * Avoid using offset zero, since it would result in a NULL-value
> > > + * "handle" (offset) pointer, which in principle and per the API
> > > + * definition shouldn't be an issue, but may confuse some tools and
> > > + * users.
> > > + */
> > > +#define INITIAL_OFFSET 1
> > > +
> > > +char rte_lcore_var[RTE_MAX_LCORE][RTE_MAX_LCORE_VAR] __rte_cache_aligned;
> > > +
> >
> > While I like the idea of improved handling for per-core variables, my main
> > concern with this set is this definition here, which adds yet another
> > dependency on the compile-time defined RTE_MAX_LCORE value.
> >
>
> lcore variables replaces one RTE_MAX_LCORE-dependent pattern with another.
>
> You could even argue the dependency on RTE_MAX_LCORE is reduced with lcore
> variables, if you look at where/in how many places in the code base this
> macro is being used. Centralizing per-lcore data management may also provide
> some opportunity in the future for extending the API to cope with some more
> dynamic RTE_MAX_LCORE variant. Not without ABI breakage of course, but we
> are not ever going to change anything related to RTE_MAX_LCORE without
> breaking the ABI, since this constant is everywhere, including compiled into
> the application itself.
>
Yep, that is true if it's widely used.
> > I believe we already have an issue with this #define where it's impossible
> > to come up with a single value that works for all, or nearly all cases. The
> > current default is still 128, yet DPDK needs to support systems where the
> > number of cores is well into the hundreds, requiring workarounds of core
> > mappings or customized builds of DPDK. Upping the value fixes those issues
> > at the cost to memory footprint explosion for smaller systems.
> >
>
> I agree this is an issue.
>
> RTE_MAX_LCORE also need to be sized to accommodate not only all cores used,
> but the sum of all EAL threads and registered non-EAL threads.
>
> So, there is no reliable way to discover what RTE_MAX_LCORE is on a
> particular piece of hardware, since the actual number of lcore ids needed is
> up to the application.
>
> Why is the default set so low? Linux has MAX_CPUS, which serves the same
> purpose, which is set to 4096 by default, if I recall correctly. Shouldn't
> we at least be able to increase it to 256?
The default is so low because of the mempool caches. These are an array of
buffer pointers with 512 (IIRC) entries per core up to RTE_MAX_LCORE.
>
> > I'm therefore nervous about putting more dependencies on this value, when I
> > feel we should be moving away from its use, to allow more runtime
> > configurability of cores.
> >
>
> What more specifically do you have in mind?
>
I don't think having a dynamically scaling RTE_MAX_LCORE is feasible, but
what I would like to see is a runtime specified value. For example, you
could run DPDK with EAL parameter "--max-lcores=1024" for large systems or
"--max-lcores=32" for small ones. That would then be used at init-time to
scale all internal data structures appropriately.
/Bruce
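A run-time specified limit of the kind described above could look roughly like the sketch below. Note that "--max-lcores" is only the example option named in this message, not an existing EAL parameter, and every function name here, as well as the sanity cap of 65536, is an assumption made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define MAX_LCORES_OPT "--max-lcores="
#define MAX_LCORES_SANITY_CAP 65536UL	/* arbitrary cap for this sketch */

/*
 * Hypothetical parser for the proposed option. Returns 0 and stores the
 * value on success, -1 if the argument is not a valid "--max-lcores=N".
 */
static int
parse_max_lcores(const char *arg, unsigned int *max_lcores)
{
	const size_t prefix_len = strlen(MAX_LCORES_OPT);
	char *end;
	unsigned long v;

	assert(arg != NULL && max_lcores != NULL);
	if (strncmp(arg, MAX_LCORES_OPT, prefix_len) != 0)
		return -1;
	errno = 0;
	v = strtoul(arg + prefix_len, &end, 10);
	if (errno != 0 || end == arg + prefix_len || *end != '\0' ||
	    v == 0 || v > MAX_LCORES_SANITY_CAP)
		return -1;
	*max_lcores = (unsigned int)v;
	return 0;
}

/* Size a per-lcore table at init time instead of at build time. */
static void *
per_lcore_table_alloc(unsigned int max_lcores, size_t elem_size)
{
	return calloc(max_lcores, elem_size);
}
```

A library would then call something like per_lcore_table_alloc() once during initialization, so a 32-lcore deployment pays for 32 entries rather than for the build-time maximum.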
<snip for brevity>
* RE: [RFC v3 1/6] eal: add static per-lcore memory allocation facility
2024-02-20 11:39 ` Bruce Richardson
@ 2024-02-20 13:37 ` Morten Brørup
2024-02-20 16:26 ` Mattias Rönnblom
1 sibling, 0 replies; 323+ messages in thread
From: Morten Brørup @ 2024-02-20 13:37 UTC
To: Bruce Richardson, Mattias Rönnblom
Cc: Mattias Rönnblom, dev, Stephen Hemminger
> From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> Sent: Tuesday, 20 February 2024 12.39
>
> On Tue, Feb 20, 2024 at 11:47:14AM +0100, Mattias Rönnblom wrote:
> > On 2024-02-20 10:11, Bruce Richardson wrote:
> > > On Tue, Feb 20, 2024 at 09:49:03AM +0100, Mattias Rönnblom wrote:
> > > > Introduce DPDK per-lcore id variables, or lcore variables for
> short.
> > > >
> > > > An lcore variable has one value for every current and future
> lcore
> > > > id-equipped thread.
> > > >
> > > > The primary <rte_lcore_var.h> use case is for statically
> allocating
> > > > small chunks of often-used data, which is related logically, but
> where
> > > > there are performance benefits to reap from having updates being
> local
> > > > to an lcore.
> > > >
> > > > Lcore variables are similar to thread-local storage (TLS, e.g.,
> C11
> > > > _Thread_local), but decoupling the values' life time with that of
> the
> > > > threads.
>
> <snip>
>
> > > > +/*
> > > > + * Avoid using offset zero, since it would result in a NULL-
> value
> > > > + * "handle" (offset) pointer, which in principle and per the API
> > > > + * definition shouldn't be an issue, but may confuse some tools
> and
> > > > + * users.
> > > > + */
> > > > +#define INITIAL_OFFSET 1
> > > > +
> > > > +char rte_lcore_var[RTE_MAX_LCORE][RTE_MAX_LCORE_VAR]
> __rte_cache_aligned;
> > > > +
> > >
> > > While I like the idea of improved handling for per-core variables,
> my main
> > > concern with this set is this definition here, which adds yet
> another
> > > dependency on the compile-time defined RTE_MAX_LCORE value.
> > >
> >
> > lcore variables replaces one RTE_MAX_LCORE-dependent pattern with
> another.
> >
> > You could even argue the dependency on RTE_MAX_LCORE is reduced with
> lcore
> > variables, if you look at where/in how many places in the code base
> this
> > macro is being used. Centralizing per-lcore data management may also
> provide
> > some opportunity in the future for extending the API to cope with
> some more
> > dynamic RTE_MAX_LCORE variant. Not without ABI breakage of course,
> but we
> > are not ever going to change anything related to RTE_MAX_LCORE
> without
> > breaking the ABI, since this constant is everywhere, including
> compiled into
> > the application itself.
> >
>
> Yep, that is true if it's widely used.
>
> > > I believe we already have an issue with this #define where it's
> impossible
> > > to come up with a single value that works for all, or nearly all
> cases. The
> > > current default is still 128, yet DPDK needs to support systems
> where the
> > > number of cores is well into the hundreds, requiring workarounds of
> core
> > > mappings or customized builds of DPDK. Upping the value fixes those
> issues
> > > at the cost to memory footprint explosion for smaller systems.
> > >
> >
> > I agree this is an issue.
> >
> > RTE_MAX_LCORE also need to be sized to accommodate not only all cores
> used,
> > but the sum of all EAL threads and registered non-EAL threads.
> >
> > So, there is no reliable way to discover what RTE_MAX_LCORE is on a
> > particular piece of hardware, since the actual number of lcore ids
> needed is
> > up to the application.
> >
> > Why is the default set so low? Linux has MAX_CPUS, which serves the
> same
> > purpose, which is set to 4096 by default, if I recall correctly.
> Shouldn't
> > we at least be able to increase it to 256?
I recall a recent techboard meeting where the default was discussed. The default was kept this low because it suffices for the vast majority of hardware out there, and applications for bigger platforms can be expected to build DPDK with a different configuration themselves. And as Bruce also mentions, it's a tradeoff against memory consumption.
>
> The default is so low because of the mempool caches. These are an array
> of
> buffer pointers with 512 (IIRC) entries per core up to RTE_MAX_LCORE.
The decision had to be made quickly, so we used narrow guesstimates, not a broader memory consumption analysis.
If we really cared about default memory consumption, we should reduce the default RTE_MAX_QUEUES_PER_PORT from 1024 too. It has quite an effect.
Having hard data about which build-time configuration parameters have the biggest effect on memory consumption would be extremely useful for tweaking the parameters for resource-limited hardware.
It's a mix of static and dynamic allocation, so it's not obvious which scalable data structures consume the most memory.
>
> >
> > > I'm therefore nervous about putting more dependencies on this
> value, when I
> > > feel we should be moving away from its use, to allow more runtime
> > > configurability of cores.
> > >
> >
> > What more specifically do you have in mind?
> >
>
> I don't think having a dynamically scaling RTE_MAX_LCORE is feasible,
> but
> what I would like to see is a runtime specified value. For example, you
> could run DPDK with EAL parameter "--max-lcores=1024" for large systems
> or
> "--max-lcores=32" for small ones. That would then be used at init-time
> to
> scale all internal datastructures appropriately.
>
I agree 100 % that a better long-term solution should be on the general road map.
Memory is a precious resource, but few seem to care about it.
A mix could provide an easy migration path:
Having RTE_MAX_LCORE as the hard upper limit (and default value) for a runtime-specified max number ("rte_max_lcores").
With this, the goal would be for modules with very small data sets to continue using RTE_MAX_LCORE fixed-size arrays, and for modules with larger data sets to migrate to rte_max_lcores dynamically sized arrays.
I am opposed to blocking a new patch series only because it adds another RTE_MAX_LCORE-sized array. We already have plenty of those.
It can be migrated to a dynamically sized array at a later time, just like the other modules with RTE_MAX_LCORE-sized arrays.
Perhaps "fixing" an existing module would free up more memory than fixing this module. Let's spend development resources where they have the biggest impact.
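The migration path described above can be sketched as follows. The names are hypothetical stand-ins: MAX_LCORE_BUILD plays the role of RTE_MAX_LCORE and max_lcores_rt the role of the suggested "rte_max_lcores". The 4096-byte struct is just an example of a "larger data set".

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_LCORE_BUILD 128	/* stands in for RTE_MAX_LCORE */

/* Run-time limit; defaults to the build-time hard upper limit. */
static unsigned int max_lcores_rt = MAX_LCORE_BUILD;

/* Hypothetical setter, called once during initialization. */
static int
max_lcores_set(unsigned int n)
{
	if (n == 0 || n > MAX_LCORE_BUILD)
		return -1;
	max_lcores_rt = n;
	return 0;
}

/* Small per-lcore data: a fixed-size array stays acceptable. */
static uint8_t small_flags[MAX_LCORE_BUILD];

static void
small_flag_set(unsigned int lcore_id)
{
	assert(lcore_id < max_lcores_rt);
	small_flags[lcore_id] = 1;
}

/* Larger per-lcore data: sized by the run-time limit instead. */
struct big_state {
	char scratch[4096];
};

static struct big_state *
big_state_alloc(void)
{
	return calloc(max_lcores_rt, sizeof(struct big_state));
}
```

With the limit set to 32, big_state_alloc() reserves 128 KiB instead of the 512 KiB a fixed array of 128 entries would declare, while small_flags stays a 128-byte static array.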
* Re: [RFC v3 1/6] eal: add static per-lcore memory allocation facility
2024-02-20 11:39 ` Bruce Richardson
2024-02-20 13:37 ` Morten Brørup
@ 2024-02-20 16:26 ` Mattias Rönnblom
1 sibling, 0 replies; 323+ messages in thread
From: Mattias Rönnblom @ 2024-02-20 16:26 UTC
To: Bruce Richardson
Cc: Mattias Rönnblom, dev, Morten Brørup, Stephen Hemminger
On 2024-02-20 12:39, Bruce Richardson wrote:
> On Tue, Feb 20, 2024 at 11:47:14AM +0100, Mattias Rönnblom wrote:
>> On 2024-02-20 10:11, Bruce Richardson wrote:
>>> On Tue, Feb 20, 2024 at 09:49:03AM +0100, Mattias Rönnblom wrote:
>>>> Introduce DPDK per-lcore id variables, or lcore variables for short.
>>>>
>>>> An lcore variable has one value for every current and future lcore
>>>> id-equipped thread.
>>>>
>>>> The primary <rte_lcore_var.h> use case is for statically allocating
>>>> small chunks of often-used data, which is related logically, but where
>>>> there are performance benefits to reap from having updates being local
>>>> to an lcore.
>>>>
>>>> Lcore variables are similar to thread-local storage (TLS, e.g., C11
>>>> _Thread_local), but decoupling the values' life time with that of the
>>>> threads.
>
> <snip>
>
>>>> +/*
>>>> + * Avoid using offset zero, since it would result in a NULL-value
>>>> + * "handle" (offset) pointer, which in principle and per the API
>>>> + * definition shouldn't be an issue, but may confuse some tools and
>>>> + * users.
>>>> + */
>>>> +#define INITIAL_OFFSET 1
>>>> +
>>>> +char rte_lcore_var[RTE_MAX_LCORE][RTE_MAX_LCORE_VAR] __rte_cache_aligned;
>>>> +
>>>
>>> While I like the idea of improved handling for per-core variables, my main
>>> concern with this set is this definition here, which adds yet another
>>> dependency on the compile-time defined RTE_MAX_LCORE value.
>>>
>>
>> lcore variables replaces one RTE_MAX_LCORE-dependent pattern with another.
>>
>> You could even argue the dependency on RTE_MAX_LCORE is reduced with lcore
>> variables, if you look at where/in how many places in the code base this
>> macro is being used. Centralizing per-lcore data management may also provide
>> some opportunity in the future for extending the API to cope with some more
>> dynamic RTE_MAX_LCORE variant. Not without ABI breakage of course, but we
>> are not ever going to change anything related to RTE_MAX_LCORE without
>> breaking the ABI, since this constant is everywhere, including compiled into
>> the application itself.
>>
>
> Yep, that is true if it's widely used.
>
>>> I believe we already have an issue with this #define where it's impossible
>>> to come up with a single value that works for all, or nearly all cases. The
>>> current default is still 128, yet DPDK needs to support systems where the
>>> number of cores is well into the hundreds, requiring workarounds of core
>>> mappings or customized builds of DPDK. Upping the value fixes those issues
>>> at the cost to memory footprint explosion for smaller systems.
>>>
>>
>> I agree this is an issue.
>>
>> RTE_MAX_LCORE also need to be sized to accommodate not only all cores used,
>> but the sum of all EAL threads and registered non-EAL threads.
>>
>> So, there is no reliable way to discover what RTE_MAX_LCORE is on a
>> particular piece of hardware, since the actual number of lcore ids needed is
>> up to the application.
>>
>> Why is the default set so low? Linux has MAX_CPUS, which serves the same
>> purpose, which is set to 4096 by default, if I recall correctly. Shouldn't
>> we at least be able to increase it to 256?
>
> The default is so low because of the mempool caches. These are an array of
> buffer pointers with 512 (IIRC) entries per core up to RTE_MAX_LCORE.
>
>>
>>> I'm therefore nervous about putting more dependencies on this value, when I
>>> feel we should be moving away from its use, to allow more runtime
>>> configurability of cores.
>>>
>>
>> What more specifically do you have in mind?
>>
>
> I don't think having a dynamically scaling RTE_MAX_LCORE is feasible, but
> what I would like to see is a runtime specified value. For example, you
> could run DPDK with EAL parameter "--max-lcores=1024" for large systems or
> "--max-lcores=32" for small ones. That would then be used at init-time to
> scale all internal datastructures appropriately.
>
Sounds reasonable to me, especially if you take a gradual approach.
By gradual I mean something like adding a function
rte_lcore_max_possible(), or something like that, returning the EAL
init-specified value. DPDK libraries/PMDs could then gradually be made
aware of, and take advantage of, the fact that lcore ids will always be
below a certain threshold, usually significantly lower than RTE_MAX_LCORE.
The only change required for lcore variables would be that the FOREACH
macro would use the run-time-max value, rather than RTE_MAX_LCORE, which
in turn would leave all the higher-numbered lcore id buffers
untouched/unmapped.
The set of possible lcore ids could also be expressed as a bitset, if
you have a machine with a huge number of cores, running many small DPDK
instances.
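A run-time-bounded FOREACH of the kind described above might look like the sketch below. None of these names exist in DPDK: lcore_max_possible() stands in for the suggested rte_lcore_max_possible(), the macro is not the one from the patch, and the buffer sizes are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_LCORE 128		/* stands in for RTE_MAX_LCORE */
#define MAX_LCORE_VAR 4096	/* illustrative per-lcore buffer size */

static char lcore_buf[MAX_LCORE][MAX_LCORE_VAR];
static unsigned int max_possible = MAX_LCORE;

/* Hypothetical: record the limit given at initialization. */
static void
lcore_max_possible_set(unsigned int n)
{
	assert(n > 0 && n <= MAX_LCORE);
	max_possible = n;
}

static unsigned int
lcore_max_possible(void)
{
	return max_possible;
}

/* Visit the value at 'handle' for every lcore id that can ever exist. */
#define LCORE_VAR_FOREACH(id, ptr, handle)				\
	for ((id) = 0;							\
	     (id) < lcore_max_possible() &&				\
	     ((ptr) = (void *)&lcore_buf[(id)][(handle)], 1);		\
	     (id)++)

/* Zero a 64-bit counter on all possible lcore ids; returns ids visited. */
static unsigned int
counter_reset_all(size_t handle)
{
	unsigned int id, visited = 0;
	void *ctr;

	LCORE_VAR_FOREACH(id, ctr, handle) {
		memset(ctr, 0, sizeof(uint64_t));
		visited++;
	}
	return visited;
}
```

Because the loop stops at the run-time limit, the buffers of higher-numbered lcore ids are never written, so pages backing them in BSS need not be mapped.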
> /Bruce
>
> <snip for brevity>
* Re: [RFC v3 1/6] eal: add static per-lcore memory allocation facility
2024-02-20 8:49 ` [RFC v3 1/6] eal: add static per-lcore memory allocation facility Mattias Rönnblom
2024-02-20 9:11 ` Bruce Richardson
@ 2024-02-21 9:43 ` Jerin Jacob
2024-02-21 10:31 ` Morten Brørup
2024-02-21 14:26 ` Mattias Rönnblom
2024-02-22 9:22 ` Morten Brørup
2024-02-25 15:03 ` [RFC v4 0/6] Lcore variables Mattias Rönnblom
3 siblings, 2 replies; 323+ messages in thread
From: Jerin Jacob @ 2024-02-21 9:43 UTC
To: Mattias Rönnblom
Cc: dev, hofors, Morten Brørup, Stephen Hemminger, Tomasz Duszynski
On Tue, Feb 20, 2024 at 2:35 PM Mattias Rönnblom
<mattias.ronnblom@ericsson.com> wrote:
>
> Introduce DPDK per-lcore id variables, or lcore variables for short.
>
> An lcore variable has one value for every current and future lcore
> id-equipped thread.
>
> The primary <rte_lcore_var.h> use case is for statically allocating
> small chunks of often-used data, which is related logically, but where
> there are performance benefits to reap from having updates being local
> to an lcore.
I think, in order to quantify the gain, we must add a performance test
case to measure the access cycles with lcore variables scheme vs this
scheme.
Other PMU counters (cache misses) may be interesting, but we don't have
the means in DPDK to do self-monitoring now, like
https://patches.dpdk.org/project/dpdk/patch/20221213104350.3218167-1-tduszynski@marvell.com/
>
> Lcore variables are similar to thread-local storage (TLS, e.g., C11
> _Thread_local), but decoupling the values' life time with that of the
> threads.
>
> Lcore variables are also similar in terms of functionality provided by
> FreeBSD kernel's DPCPU_*() family of macros and the associated
> build-time machinery. DPCPU uses linker scripts, which effectively
> prevents the reuse of its, otherwise seemingly viable, approach.
>
> The currently-prevailing way to solve the same problem as lcore
> variables is to keep a module's per-lcore data as RTE_MAX_LCORE-sized
> array of cache-aligned, RTE_CACHE_GUARDed structs. The benefit of
> lcore variables over this approach is that data related to the same
> lcore now is close (spatially, in memory), rather than data used by
> the same module, which in turn avoid excessive use of padding,
> polluting caches with unused data.
>
> RFC v3:
> * Replace use of GCC-specific alignof(<expression>) with alignof(<type>).
> * Update example to reflect FOREACH macro name change (in RFC v2).
>
> RFC v2:
> * Use alignof to derive alignment requirements. (Morten Brørup)
> * Change name of FOREACH to make it distinct from <rte_lcore.h>'s
> *per-EAL-thread* RTE_LCORE_FOREACH(). (Morten Brørup)
> * Allow user-specified alignment, but limit max to cache line size.
>
> Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> ---
> config/rte_config.h | 1 +
> doc/api/doxy-api-index.md | 1 +
> lib/eal/common/eal_common_lcore_var.c | 82 ++++++
> lib/eal/common/meson.build | 1 +
> lib/eal/include/meson.build | 1 +
> lib/eal/include/rte_lcore_var.h | 375 ++++++++++++++++++++++++++
> lib/eal/version.map | 4 +
> 7 files changed, 465 insertions(+)
> create mode 100644 lib/eal/common/eal_common_lcore_var.c
> create mode 100644 lib/eal/include/rte_lcore_var.h
>
> diff --git a/config/rte_config.h b/config/rte_config.h
> index da265d7dd2..884482e473 100644
> --- a/config/rte_config.h
> +++ b/config/rte_config.h
> @@ -30,6 +30,7 @@
> /* EAL defines */
> #define RTE_CACHE_GUARD_LINES 1
> #define RTE_MAX_HEAPS 32
> +#define RTE_MAX_LCORE_VAR 1048576
> #define RTE_MAX_MEMSEG_LISTS 128
> #define RTE_MAX_MEMSEG_PER_LIST 8192
> #define RTE_MAX_MEM_MB_PER_LIST 32768
> diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
> index a6a768bd7c..bb06bb7ca1 100644
> --- a/doc/api/doxy-api-index.md
> +++ b/doc/api/doxy-api-index.md
> @@ -98,6 +98,7 @@ The public API headers are grouped by topics:
> [interrupts](@ref rte_interrupts.h),
> [launch](@ref rte_launch.h),
> [lcore](@ref rte_lcore.h),
> + [lcore-varible](@ref rte_lcore_var.h),
> [per-lcore](@ref rte_per_lcore.h),
> [service cores](@ref rte_service.h),
> [keepalive](@ref rte_keepalive.h),
> diff --git a/lib/eal/common/eal_common_lcore_var.c b/lib/eal/common/eal_common_lcore_var.c
> new file mode 100644
> index 0000000000..dfd11cbd0b
> --- /dev/null
> +++ b/lib/eal/common/eal_common_lcore_var.c
> @@ -0,0 +1,82 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2024 Ericsson AB
> + */
> +
> +#include <inttypes.h>
> +
> +#include <rte_common.h>
> +#include <rte_debug.h>
> +#include <rte_log.h>
> +
> +#include <rte_lcore_var.h>
> +
> +#include "eal_private.h"
> +
> +#define WARN_THRESHOLD 75
> +
> +/*
> + * Avoid using offset zero, since it would result in a NULL-value
> + * "handle" (offset) pointer, which in principle and per the API
> + * definition shouldn't be an issue, but may confuse some tools and
> + * users.
> + */
> +#define INITIAL_OFFSET 1
> +
> +char rte_lcore_var[RTE_MAX_LCORE][RTE_MAX_LCORE_VAR] __rte_cache_aligned;
> +
> +static uintptr_t allocated = INITIAL_OFFSET;
> +
> +static void
> +verify_allocation(uintptr_t new_allocated)
> +{
> + static bool has_warned;
> +
> + RTE_VERIFY(new_allocated < RTE_MAX_LCORE_VAR);
> +
> + if (new_allocated > (WARN_THRESHOLD * RTE_MAX_LCORE_VAR) / 100 &&
> + !has_warned) {
> + EAL_LOG(WARNING, "Per-lcore data usage has exceeded %d%% "
> + "of the maximum capacity (%d bytes)", WARN_THRESHOLD,
> + RTE_MAX_LCORE_VAR);
> + has_warned = true;
> + }
> +}
> +
> +static void *
> +lcore_var_alloc(size_t size, size_t align)
> +{
> + uintptr_t new_allocated = RTE_ALIGN_CEIL(allocated, align);
> +
> + void *offset = (void *)new_allocated;
> +
> + new_allocated += size;
> +
> + verify_allocation(new_allocated);
> +
> + allocated = new_allocated;
> +
> + EAL_LOG(DEBUG, "Allocated %"PRIuPTR" bytes of per-lcore data with a "
> + "%"PRIuPTR"-byte alignment", size, align);
> +
> + return offset;
> +}
> +
> +void *
> +rte_lcore_var_alloc(size_t size, size_t align)
> +{
> + /* Having the per-lcore buffer size aligned on cache lines
> + * assures as well as having the base pointer aligned on cache
> + * size assures that aligned offsets also translate to aligned
> + * pointers across all values.
> + */
> + RTE_BUILD_BUG_ON(RTE_MAX_LCORE_VAR % RTE_CACHE_LINE_SIZE != 0);
> + RTE_ASSERT(align <= RTE_CACHE_LINE_SIZE);
> +
> + /* '0' means asking for worst-case alignment requirements */
> + if (align == 0)
> + align = alignof(max_align_t);
> +
> + RTE_ASSERT(rte_is_power_of_2(align));
> +
> + return lcore_var_alloc(size, align);
> +}
> diff --git a/lib/eal/common/meson.build b/lib/eal/common/meson.build
> index 22a626ba6f..d41403680b 100644
> --- a/lib/eal/common/meson.build
> +++ b/lib/eal/common/meson.build
> @@ -18,6 +18,7 @@ sources += files(
> 'eal_common_interrupts.c',
> 'eal_common_launch.c',
> 'eal_common_lcore.c',
> + 'eal_common_lcore_var.c',
> 'eal_common_mcfg.c',
> 'eal_common_memalloc.c',
> 'eal_common_memory.c',
> diff --git a/lib/eal/include/meson.build b/lib/eal/include/meson.build
> index e94b056d46..9449253e23 100644
> --- a/lib/eal/include/meson.build
> +++ b/lib/eal/include/meson.build
> @@ -27,6 +27,7 @@ headers += files(
> 'rte_keepalive.h',
> 'rte_launch.h',
> 'rte_lcore.h',
> + 'rte_lcore_var.h',
> 'rte_lock_annotations.h',
> 'rte_malloc.h',
> 'rte_mcslock.h',
> diff --git a/lib/eal/include/rte_lcore_var.h b/lib/eal/include/rte_lcore_var.h
> new file mode 100644
> index 0000000000..da49d48d7c
> --- /dev/null
> +++ b/lib/eal/include/rte_lcore_var.h
> @@ -0,0 +1,375 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2024 Ericsson AB
> + */
> +
> +#ifndef _RTE_LCORE_VAR_H_
> +#define _RTE_LCORE_VAR_H_
> +
> +/**
> + * @file
> + *
> + * RTE Per-lcore id variables
> + *
> + * This API provides a mechanism to create and access per-lcore id
> + * variables in a space- and cycle-efficient manner.
> + *
> + * A per-lcore id variable (or lcore variable for short) has one value
> + * for each EAL thread and registered non-EAL thread. In other words,
> + * there's one copy of its value for each and every current and future
> + * lcore id-equipped thread, with the total number of copies amounting
> + * to \c RTE_MAX_LCORE.
> + *
> + * In order to access the values of an lcore variable, a handle is
> + * used. The type of the handle is a pointer to the value's type
> + * (e.g., for \c uint32_t lcore variable, the handle is a
> + * <code>uint32_t *</code>. A handle may be passed between modules and
> + * threads just like any pointer, but its value is not the address of
> + * any particular object, but rather just an opaque identifier, stored
> + * in a typed pointer (to inform the access macro the type of values).
> + *
> + * @b Creation
> + *
> + * An lcore variable is created in two steps:
> + * 1. Define a lcore variable handle by using \ref RTE_LCORE_VAR_HANDLE.
> + * 2. Allocate lcore variable storage and initialize the handle with
> + * a unique identifier by \ref RTE_LCORE_VAR_ALLOC or
> + * \ref RTE_LCORE_VAR_INIT. Allocation generally occurs the time of
> + * module initialization, but may be done at any time.
> + *
> + * An lcore variable is not tied to the owning thread's lifetime. It's
> + * available for use by any thread immediately after having been
> + * allocated, and continues to be available throughout the lifetime of
> + * the EAL.
> + *
> + * Lcore variables cannot and need not be freed.
> + *
> + * @b Access
> + *
> + * The value of any lcore variable for any lcore id may be accessed
> + * from any thread (including unregistered threads), but is should
> + * generally only *frequently* read from or written to by the owner.
> + *
> + * Values of the same lcore variable but owned by to different lcore
> + * ids *may* be frequently read or written by the owners without the
> + * risk of false sharing.
> + *
> + * An appropriate synchronization mechanism (e.g., atomics) should
> + * employed to assure there are no data races between the owning
> + * thread and any non-owner threads accessing the same lcore variable
> + * instance.
> + *
> + * The value of the lcore variable for a particular lcore id may be
> + * retrieved with \ref RTE_LCORE_VAR_LCORE_GET. To get a pointer to the
> + * same object, use \ref RTE_LCORE_VAR_LCORE_PTR.
> + *
> + * To modify the value of an lcore variable for a particular lcore id,
> + * either access the object through the pointer retrieved by \ref
> + * RTE_LCORE_VAR_LCORE_PTR or, for primitive types, use \ref
> + * RTE_LCORE_VAR_LCORE_SET.
> + *
> + * The access macros each has a short-hand which may be used by an EAL
> + * thread or registered non-EAL thread to access the lcore variable
> + * instance of its own lcore id. Those are \ref RTE_LCORE_VAR_GET,
> + * \ref RTE_LCORE_VAR_PTR, and \ref RTE_LCORE_VAR_SET.
> + *
> + * Although the handle (as defined by \ref RTE_LCORE_VAR_HANDLE) is a
> + * pointer with the same type as the value, it may not be directly
> + * dereferenced and must be treated as an opaque identifier. The
> + * *identifier* value is common across all lcore ids.
> + *
> + * @b Storage
> + *
> + * An lcore variable's values may by of a primitive type like \c int,
> + * but would more typically be a \c struct. An application may choose
> + * to define an lcore variable, which it then it goes on to never
> + * allocate.
> + *
> + * The lcore variable handle introduces a per-variable (not
> + * per-value/per-lcore id) overhead of \c sizeof(void *) bytes, so
> + * there are some memory footprint gains to be made by organizing all
> + * per-lcore id data for a particular module as one lcore variable
> + * (e.g., as a struct).
> + *
> + * The sum of all lcore variables, plus any padding required, must be
> + * less than the DPDK build-time constant \c RTE_MAX_LCORE_VAR. A
> + * violation of this maximum results in the process being terminated.
> + *
> + * It's reasonable to expected that \c RTE_MAX_LCORE_VAR is on the
> + * same order of magnitude in size as a thread stack.
> + *
> + * The lcore variable storage buffers are kept in the BSS section in
> + * the resulting binary, where data generally isn't mapped in until
> + * it's accessed. This means that unused portions of the lcore
> + * variable storage area will not occupy any physical memory (with a
> + * granularity of the memory page size [usually 4 kB]).
> + *
> + * Lcore variables should generally *not* be \ref __rte_cache_aligned
> + * and need *not* include a \ref RTE_CACHE_GUARD field, since the use
> + * of these constructs are designed to avoid false sharing. In the
> + * case of an lcore variable instance, all nearby data structures
> + * should almost-always be written to by a single thread (the lcore
> + * variable owner). Adding padding will increase the effective memory
> + * working set size, and potentially reducing performance.
> + *
> + * @b Example
> + *
> + * Below is an example of the use of an lcore variable:
> + *
> + * \code{.c}
> + * struct foo_lcore_state {
> + * int a;
> + * long b;
> + * };
> + *
> + * static RTE_LCORE_VAR_HANDLE(struct foo_lcore_state, lcore_states);
> + *
> + * long foo_get_a_plus_b(void)
> + * {
> + * struct foo_lcore_state *state = RTE_LCORE_VAR_PTR(lcore_states);
> + *
> + * return state->a + state->b;
> + * }
> + *
> + * RTE_INIT(rte_foo_init)
> + * {
> + * unsigned int lcore_id;
> + *
> + * RTE_LCORE_VAR_ALLOC(foo_state);
> + *
> + * struct foo_lcore_state *state;
> + * RTE_LCORE_VAR_FOREACH_VALUE(lcore_states) {
> + * (initialize 'state')
> + * }
> + *
> + * (other initialization)
> + * }
> + * \endcode
> + *
> + *
> + * @b Alternatives
> + *
> + * Lcore variables are designed to replace a pattern exemplified below:
> + * \code{.c}
> + * struct foo_lcore_state {
> + * int a;
> + * long b;
> + * RTE_CACHE_GUARD;
> + * } __rte_cache_aligned;
> + *
> + * static struct foo_lcore_state lcore_states[RTE_MAX_LCORE];
> + * \endcode
> + *
> + * This scheme is simple and effective, but has one drawback: the data
> + * is organized so that objects related to all lcores for a particular
> + * module is kept close in memory. At a bare minimum, this forces the
> + * use of cache-line alignment to avoid false sharing. With CPU
> + * hardware prefetching and memory loads resulting from speculative
> + * execution (functions which seemingly are getting more eager faster
> + * than they are getting more intelligent), one or more "guard" cache
> + * lines may be required to separate one lcore's data from another's.
> + *
> + * Lcore variables has the upside of working with, not against, the
> + * CPU's assumptions and for example next-line prefetchers may well
> + * work the way its designers intended (i.e., to the benefit, not
> + * detriment, of system performance).
> + *
> + * Another alternative to \ref rte_lcore_var.h is the \ref
> + * rte_per_lcore.h API, which make use of thread-local storage (TLS,
> + * e.g., GCC __thread or C11 _Thread_local). The main differences
> + * between by using the various forms of TLS (e.g., \ref
> + * RTE_DEFINE_PER_LCORE or _Thread_local) and the use of lcore
> + * variables are:
> + *
> + * * The existence and non-existence of a thread-local variable
> + * instance follow that of particular thread's. The data cannot be
> + * accessed before the thread has been created, nor after it has
> + * exited. One effect of this is thread-local variables must
> + * initialized in a "lazy" manner (e.g., at the point of thread
> + * creation). Lcore variables may be accessed immediately after
> + * having been allocated (which is usually prior any thread beyond
> + * the main thread is running).
> + * * A thread-local variable is duplicated across all threads in the
> + * process, including unregistered non-EAL threads (i.e.,
> + * "regular" threads). For DPDK applications heavily relying on
> + * multi-threading (in conjunction to DPDK's "one thread per core"
> + * pattern), either by having many concurrent threads or
> + * creating/destroying threads at a high rate, an excessive use of
> + * thread-local variables may cause inefficiencies (e.g.,
> + * increased thread creation overhead due to thread-local storage
> + * initialization or increased total RAM footprint usage). Lcore
> + * variables *only* exist for threads with an lcore id, and thus
> + * not for such "regular" threads.
> + * * If data in thread-local storage may be shared between threads
> + * (i.e., can a pointer to a thread-local variable be passed to
> + * and successfully dereferenced by non-owning thread) depends on
> + * the details of the TLS implementation. With GCC __thread and
> + * GCC _Thread_local, such data sharing is supported. In the C11
> + * standard, the result of accessing another thread's
> + * _Thread_local object is implementation-defined. Lcore variable
> + * instances may be accessed reliably by any thread.
> + */
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include <stddef.h>
> +#include <stdalign.h>
> +
> +#include <rte_common.h>
> +#include <rte_config.h>
> +#include <rte_lcore.h>
> +
> +/**
> + * Given the lcore variable type, produces the type of the lcore
> + * variable handle.
> + */
> +#define RTE_LCORE_VAR_HANDLE_TYPE(type) \
> + type *
> +
> +/**
> + * Define a lcore variable handle.
> + *
> + * This macro defines a variable which is used as a handle to access
> + * the various per-lcore id instances of a per-lcore id variable.
> + *
> + * The aim with this macro is to make clear at the point of
> + * declaration that this is an lcore handler, rather than a regular
> + * pointer.
> + *
> + * Add @b static as a prefix in case the lcore variable are only to be
> + * accessed from a particular translation unit.
> + */
> +#define RTE_LCORE_VAR_HANDLE(type, name) \
> + RTE_LCORE_VAR_HANDLE_TYPE(type) name
> +
> +/**
> + * Allocate space for an lcore variable, and initialize its handle.
> + */
> +#define RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(name, size, align) \
> + name = rte_lcore_var_alloc(size, align)
> +
> +/**
> + * Allocate space for an lcore variable, and initialize its handle,
> + * with values aligned for any type of object.
> + */
> +#define RTE_LCORE_VAR_ALLOC_SIZE(name, size) \
> + name = rte_lcore_var_alloc(size, 0)
> +
> +/**
> + * Allocate space for an lcore variable of the size and alignment requirements
> + * suggested by the handler pointer type, and initialize its handle.
> + */
> +#define RTE_LCORE_VAR_ALLOC(name) \
> + RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(name, sizeof(*(name)), \
> + alignof(typeof(*(name))))
> +
> +/**
> + * Allocate an explicitly-sized, explicitly-aligned lcore variable by
> + * means of a \ref RTE_INIT constructor.
> + */
> +#define RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, align) \
> + RTE_INIT(rte_lcore_var_init_ ## name) \
> + { \
> + RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(name, size, align); \
> + }
> +
> +/**
> + * Allocate an explicitly-sized lcore variable by means of a \ref
> + * RTE_INIT constructor.
> + */
> +#define RTE_LCORE_VAR_INIT_SIZE(name, size) \
> + RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, 0)
> +
> +/**
> + * Allocate an lcore variable by means of a \ref RTE_INIT constructor.
> + */
> +#define RTE_LCORE_VAR_INIT(name) \
> + RTE_INIT(rte_lcore_var_init_ ## name) \
> + { \
> + RTE_LCORE_VAR_ALLOC(name); \
> + }
> +
> +#define __RTE_LCORE_VAR_LCORE_PTR(lcore_id, name) \
> + ((void *)(&rte_lcore_var[lcore_id][(uintptr_t)(name)]))
> +
> +/**
> + * Get pointer to lcore variable instance with the specified lcore id.
> + */
> +#define RTE_LCORE_VAR_LCORE_PTR(lcore_id, name) \
> + ((typeof(name))__RTE_LCORE_VAR_LCORE_PTR(lcore_id, name))
> +
> +/**
> + * Get value of a lcore variable instance of the specified lcore id.
> + */
> +#define RTE_LCORE_VAR_LCORE_GET(lcore_id, name) \
> + (*(RTE_LCORE_VAR_LCORE_PTR(lcore_id, name)))
> +
> +/**
> + * Set the value of a lcore variable instance of the specified lcore id.
> + */
> +#define RTE_LCORE_VAR_LCORE_SET(lcore_id, name, value) \
> + (*(RTE_LCORE_VAR_LCORE_PTR(lcore_id, name)) = (value))
> +
> +/**
> + * Get pointer to lcore variable instance of the current thread.
> + *
> + * May only be used by EAL threads and registered non-EAL threads.
> + */
> +#define RTE_LCORE_VAR_PTR(name) RTE_LCORE_VAR_LCORE_PTR(rte_lcore_id(), name)
> +
> +/**
> + * Get value of lcore variable instance of the current thread.
> + *
> + * May only be used by EAL threads and registered non-EAL threads.
> + */
> +#define RTE_LCORE_VAR_GET(name) RTE_LCORE_VAR_LCORE_GET(rte_lcore_id(), name)
> +
> +/**
> + * Set value of lcore variable instance of the current thread.
> + *
> + * May only be used by EAL threads and registered non-EAL threads.
> + */
> +#define RTE_LCORE_VAR_SET(name, value) \
> + RTE_LCORE_VAR_LCORE_SET(rte_lcore_id(), name, value)
> +
> +/**
> + * Iterate over each lcore id's value for a lcore variable.
> + */
> +#define RTE_LCORE_VAR_FOREACH_VALUE(var, name) \
> + for (unsigned int lcore_id = \
> + (((var) = RTE_LCORE_VAR_LCORE_PTR(0, name)), 0); \
> + lcore_id < RTE_MAX_LCORE; \
> + lcore_id++, (var) = RTE_LCORE_VAR_LCORE_PTR(lcore_id, name))
> +
> +extern char rte_lcore_var[RTE_MAX_LCORE][RTE_MAX_LCORE_VAR];
> +
> +/**
> + * Allocate space in the per-lcore id buffers for a lcore variable.
> + *
> + * The pointer returned is only an opaque identifer of the variable. To
> + * get an actual pointer to a particular instance of the variable use
> + * \ref RTE_LCORE_VAR_PTR or \ref RTE_LCORE_VAR_LCORE_PTR.
> + *
> + * The allocation is always successful, barring a fatal exhaustion of
> + * the per-lcore id buffer space.
> + *
> + * @param size
> + * The size (in bytes) of the variable's per-lcore id value.
> + * @param align
> + * If 0, the values will be suitably aligned for any kind of type
> + * (i.e., alignof(max_align_t)). Otherwise, the values will be aligned
> + * on a multiple of *align*, which must be a power of 2 and equal or
> + * less than \c RTE_CACHE_LINE_SIZE.
> + * @return
> + * The id of the variable, stored in a void pointer value.
> + */
> +__rte_experimental
> +void *
> +rte_lcore_var_alloc(size_t size, size_t align);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_LCORE_VAR_H_ */
> diff --git a/lib/eal/version.map b/lib/eal/version.map
> index 5e0cd47c82..e90b86115a 100644
> --- a/lib/eal/version.map
> +++ b/lib/eal/version.map
> @@ -393,6 +393,10 @@ EXPERIMENTAL {
> # added in 23.07
> rte_memzone_max_get;
> rte_memzone_max_set;
> +
> + # added in 24.03
> + rte_lcore_var_alloc;
> + rte_lcore_var;
> };
>
> INTERNAL {
> --
> 2.34.1
>
* RE: [RFC v3 1/6] eal: add static per-lcore memory allocation facility
2024-02-21 9:43 ` Jerin Jacob
@ 2024-02-21 10:31 ` Morten Brørup
2024-02-21 14:26 ` Mattias Rönnblom
1 sibling, 0 replies; 323+ messages in thread
From: Morten Brørup @ 2024-02-21 10:31 UTC (permalink / raw)
To: Jerin Jacob, Mattias Rönnblom
Cc: dev, hofors, Stephen Hemminger, Tomasz Duszynski
> From: Jerin Jacob [mailto:jerinjacobk@gmail.com]
> Sent: Wednesday, 21 February 2024 10.44
>
> On Tue, Feb 20, 2024 at 2:35 PM Mattias Rönnblom
> <mattias.ronnblom@ericsson.com> wrote:
> >
> > Introduce DPDK per-lcore id variables, or lcore variables for short.
> >
> > An lcore variable has one value for every current and future lcore
> > id-equipped thread.
> >
> > The primary <rte_lcore_var.h> use case is for statically allocating
> > small chunks of often-used data, which is related logically, but where
> > there are performance benefits to reap from having updates being local
> > to an lcore.
>
> I think, in order to quantify the gain, we must add a performance test
> case to measure the access cycles with the lcore variables scheme vs. this
> scheme.
> Other PMU counters (cache misses) may be interesting, but we don't
> currently have the means in DPDK to do self-monitoring, like
> https://patches.dpdk.org/project/dpdk/patch/20221213104350.3218167-1-tduszynski@marvell.com/
>
> >
> > Lcore variables are similar to thread-local storage (TLS, e.g., C11
> > _Thread_local), but decoupling the values' life time with that of the
> > threads.
Lcore variables can be accessed by other threads, unlike TLS variables.
If a TLS variable needs to be accessed by other threads, there must also be an RTE_MAX_LCORE-sized array of pointers to the TLS variable, where each worker thread must initialize the entry pointing to its TLS variable.
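A minimal stand-alone sketch of that TLS-plus-pointer-array pattern could look as follows. It is an illustration under stated assumptions, not DPDK code: MAX_LCORE stands in for RTE_MAX_LCORE, and pkt_count with its helper functions are hypothetical names. Synchronization is deliberately left out; real code reading another thread's counter would need atomics or similar.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for RTE_MAX_LCORE. */
#define MAX_LCORE 128

/* One instance per thread; other threads cannot name it directly. */
static _Thread_local uint64_t pkt_count;

/* Published addresses of each lcore's TLS instance. An entry stays
 * NULL until the owning thread has registered itself.
 */
static uint64_t *pkt_count_ptrs[MAX_LCORE];

/* Must be called by each worker thread itself, since only the owner
 * can take the address of its own TLS instance.
 */
static void
pkt_count_register(unsigned int lcore_id)
{
	assert(lcore_id < MAX_LCORE);
	pkt_count_ptrs[lcore_id] = &pkt_count;
}

/* Owner-side update of the calling thread's instance. */
static void
pkt_count_inc(void)
{
	pkt_count++;
}

/* Sum over all registered instances; callable from any thread.
 * No synchronization is shown here (an assumption of this sketch).
 */
static uint64_t
pkt_count_total(void)
{
	uint64_t sum = 0;
	unsigned int i;

	for (i = 0; i < MAX_LCORE; i++)
		if (pkt_count_ptrs[i] != NULL)
			sum += *pkt_count_ptrs[i];
	return sum;
}
```

The extra registration step, and the NULL check for threads that have not yet registered, is the bookkeeping that lcore variables avoid, since their values exist for every lcore id from the moment of allocation.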
> >
> > Lcore variables are also similar in terms of functionality provided by
> > FreeBSD kernel's DPCPU_*() family of macros and the associated
> > build-time machinery. DPCPU uses linker scripts, which effectively
> > prevents the reuse of its, otherwise seemingly viable, approach.
> >
> > The currently-prevailing way to solve the same problem as lcore
> > variables is to keep a module's per-lcore data as RTE_MAX_LCORE-sized
> > array of cache-aligned, RTE_CACHE_GUARDed structs. The benefit of
> > lcore variables over this approach is that data related to the same
> > lcore now is close (spatially, in memory), rather than data used by
> > the same module, which in turn avoid excessive use of padding,
> > polluting caches with unused data.
> >
There are 3 ways to implement per-lcore variables:
1. Thread-local storage, available via RTE_DEFINE_PER_LCORE(type, name).
2. RTE_MAX_LCORE-sized arrays.
3. Lcore variables, as provided by this patch series.
Perhaps an overview of differences and performance numbers would help understand the benefits of this patch series.
The advantages of packing more variables into the same cache line may be hard to measure without PMU counters, and could perhaps be described or estimated instead.
* Re: [RFC v3 1/6] eal: add static per-lcore memory allocation facility
2024-02-21 9:43 ` Jerin Jacob
2024-02-21 10:31 ` Morten Brørup
@ 2024-02-21 14:26 ` Mattias Rönnblom
1 sibling, 0 replies; 323+ messages in thread
From: Mattias Rönnblom @ 2024-02-21 14:26 UTC (permalink / raw)
To: Jerin Jacob, Mattias Rönnblom
Cc: dev, Morten Brørup, Stephen Hemminger, Tomasz Duszynski
On 2024-02-21 10:43, Jerin Jacob wrote:
> On Tue, Feb 20, 2024 at 2:35 PM Mattias Rönnblom
> <mattias.ronnblom@ericsson.com> wrote:
>>
>> Introduce DPDK per-lcore id variables, or lcore variables for short.
>>
>> An lcore variable has one value for every current and future lcore
>> id-equipped thread.
>>
>> The primary <rte_lcore_var.h> use case is for statically allocating
>> small chunks of often-used data, which is related logically, but where
>> there are performance benefits to reap from having updates being local
>> to an lcore.
>
> I think, in order to quantify the gain, we must add a performance test
> case to measure the access cycles with the lcore variables scheme vs. this
> scheme.
As I might have mentioned elsewhere in the thread, the micro benchmarks
are already there, in the form of the service and random perf tests.
The service perf tests don't show any difference, and the rand perf
tests seem to indicate lcore variables add one (1) core clock cycle per
rte_rand() call (measured on Raptor Lake E- and P-cores).
The effects on a real-world app would be highly dependent on which DPDK
services it's using that themselves use static per-lcore data, and
to what extent the app itself uses per-lcore data.
Provided lcore variables perform as well as the cache-aligned static
array pattern in micro benchmarks, lcore variables should always
be as good or better in a real-world app, because the cache working set
size will always be smaller (no padding).
That said, I don't think lcore variables will result in a noticeable
performance gain for the typical app. If you do see large gains, I
suspect it will be on systems with next-N-lines prefetchers where the
lcore data wasn't RTE_CACHE_GUARDed.
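The working-set argument can be put into rough numbers with a small stand-alone sketch. It is only an estimate under stated assumptions: a 64-byte cache line, a 16-byte per-module state, and lcore variable values laid out back to back starting on a cache line boundary (the real allocator's offsets may differ). The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64 /* assumed cache line size */

/* Per-lcore state as it would be kept in an lcore variable: no
 * padding beyond natural alignment.
 */
struct foo_state {
	int32_t a;
	int64_t b;
};

/* The same state in the static-array pattern: cache-line aligned,
 * followed by one guard line.
 */
struct foo_state_padded {
	alignas(CACHE_LINE) int32_t a;
	int64_t b;
	alignas(CACHE_LINE) char guard[CACHE_LINE];
};

static_assert(sizeof(struct foo_state_padded) == 2 * CACHE_LINE,
	      "one data line plus one guard line");

/* Memory occupied per lcore by n modules using the padded pattern. */
static size_t
bytes_static_array(size_t n_modules)
{
	return n_modules * sizeof(struct foo_state_padded);
}

/* Cache lines one lcore touches for n modules' state when each
 * module's state starts on its own cache line.
 */
static size_t
lines_static_array(size_t n_modules)
{
	size_t per_module =
		(sizeof(struct foo_state) + CACHE_LINE - 1) / CACHE_LINE;

	return n_modules * per_module;
}

/* Cache lines touched when the n values are packed back to back. */
static size_t
lines_lcore_var(size_t n_modules)
{
	size_t bytes = n_modules * sizeof(struct foo_state);

	return (bytes + CACHE_LINE - 1) / CACHE_LINE;
}
```

For eight such modules, the padded pattern occupies 1024 bytes per lcore and spreads the data over eight cache lines, while the packed layout fits in two. How much of that shows up as a measurable gain depends on the application, as noted above.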
> Other PMU counters (cache misses) may be interesting, but we don't
> currently have the means in DPDK to do self-monitoring, like
> https://patches.dpdk.org/project/dpdk/patch/20221213104350.3218167-1-tduszynski@marvell.com/
>
>>
>> Lcore variables are similar to thread-local storage (TLS, e.g., C11
>> _Thread_local), but decoupling the values' life time with that of the
>> threads.
>>
>> Lcore variables are also similar in terms of functionality provided by
>> FreeBSD kernel's DPCPU_*() family of macros and the associated
>> build-time machinery. DPCPU uses linker scripts, which effectively
>> prevents the reuse of its, otherwise seemingly viable, approach.
>>
>> The currently-prevailing way to solve the same problem as lcore
>> variables is to keep a module's per-lcore data as RTE_MAX_LCORE-sized
>> array of cache-aligned, RTE_CACHE_GUARDed structs. The benefit of
>> lcore variables over this approach is that data related to the same
>> lcore now is close (spatially, in memory), rather than data used by
>> the same module, which in turn avoid excessive use of padding,
>> polluting caches with unused data.
>>
>> RFC v3:
>> * Replace use of GCC-specific alignof(<expression>) with alignof(<type>).
>> * Update example to reflect FOREACH macro name change (in RFC v2).
>>
>> RFC v2:
>> * Use alignof to derive alignment requirements. (Morten Brørup)
>> * Change name of FOREACH to make it distinct from <rte_lcore.h>'s
>> *per-EAL-thread* RTE_LCORE_FOREACH(). (Morten Brørup)
>> * Allow user-specified alignment, but limit max to cache line size.
>>
>> Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
>> ---
>> config/rte_config.h | 1 +
>> doc/api/doxy-api-index.md | 1 +
>> lib/eal/common/eal_common_lcore_var.c | 82 ++++++
>> lib/eal/common/meson.build | 1 +
>> lib/eal/include/meson.build | 1 +
>> lib/eal/include/rte_lcore_var.h | 375 ++++++++++++++++++++++++++
>> lib/eal/version.map | 4 +
>> 7 files changed, 465 insertions(+)
>> create mode 100644 lib/eal/common/eal_common_lcore_var.c
>> create mode 100644 lib/eal/include/rte_lcore_var.h
>>
>> diff --git a/config/rte_config.h b/config/rte_config.h
>> index da265d7dd2..884482e473 100644
>> --- a/config/rte_config.h
>> +++ b/config/rte_config.h
>> @@ -30,6 +30,7 @@
>> /* EAL defines */
>> #define RTE_CACHE_GUARD_LINES 1
>> #define RTE_MAX_HEAPS 32
>> +#define RTE_MAX_LCORE_VAR 1048576
>> #define RTE_MAX_MEMSEG_LISTS 128
>> #define RTE_MAX_MEMSEG_PER_LIST 8192
>> #define RTE_MAX_MEM_MB_PER_LIST 32768
>> diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
>> index a6a768bd7c..bb06bb7ca1 100644
>> --- a/doc/api/doxy-api-index.md
>> +++ b/doc/api/doxy-api-index.md
>> @@ -98,6 +98,7 @@ The public API headers are grouped by topics:
>> [interrupts](@ref rte_interrupts.h),
>> [launch](@ref rte_launch.h),
>> [lcore](@ref rte_lcore.h),
>> + [lcore-varible](@ref rte_lcore_var.h),
>> [per-lcore](@ref rte_per_lcore.h),
>> [service cores](@ref rte_service.h),
>> [keepalive](@ref rte_keepalive.h),
>> diff --git a/lib/eal/common/eal_common_lcore_var.c b/lib/eal/common/eal_common_lcore_var.c
>> new file mode 100644
>> index 0000000000..dfd11cbd0b
>> --- /dev/null
>> +++ b/lib/eal/common/eal_common_lcore_var.c
>> @@ -0,0 +1,82 @@
>> +/* SPDX-License-Identifier: BSD-3-Clause
>> + * Copyright(c) 2024 Ericsson AB
>> + */
>> +
>> +#include <inttypes.h>
>> +
>> +#include <rte_common.h>
>> +#include <rte_debug.h>
>> +#include <rte_log.h>
>> +
>> +#include <rte_lcore_var.h>
>> +
>> +#include "eal_private.h"
>> +
>> +#define WARN_THRESHOLD 75
>> +
>> +/*
>> + * Avoid using offset zero, since it would result in a NULL-value
>> + * "handle" (offset) pointer, which in principle and per the API
>> + * definition shouldn't be an issue, but may confuse some tools and
>> + * users.
>> + */
>> +#define INITIAL_OFFSET 1
>> +
>> +char rte_lcore_var[RTE_MAX_LCORE][RTE_MAX_LCORE_VAR] __rte_cache_aligned;
>> +
>> +static uintptr_t allocated = INITIAL_OFFSET;
>> +
>> +static void
>> +verify_allocation(uintptr_t new_allocated)
>> +{
>> + static bool has_warned;
>> +
>> + RTE_VERIFY(new_allocated < RTE_MAX_LCORE_VAR);
>> +
>> + if (new_allocated > (WARN_THRESHOLD * RTE_MAX_LCORE_VAR) / 100 &&
>> + !has_warned) {
>> + EAL_LOG(WARNING, "Per-lcore data usage has exceeded %d%% "
>> + "of the maximum capacity (%d bytes)", WARN_THRESHOLD,
>> + RTE_MAX_LCORE_VAR);
>> + has_warned = true;
>> + }
>> +}
>> +
>> +static void *
>> +lcore_var_alloc(size_t size, size_t align)
>> +{
>> + uintptr_t new_allocated = RTE_ALIGN_CEIL(allocated, align);
>> +
>> + void *offset = (void *)new_allocated;
>> +
>> + new_allocated += size;
>> +
>> + verify_allocation(new_allocated);
>> +
>> + allocated = new_allocated;
>> +
>> + EAL_LOG(DEBUG, "Allocated %"PRIuPTR" bytes of per-lcore data with a "
>> + "%"PRIuPTR"-byte alignment", size, align);
>> +
>> + return offset;
>> +}
>> +
>> +void *
>> +rte_lcore_var_alloc(size_t size, size_t align)
>> +{
>> + /* Having the per-lcore buffer size aligned on cache lines
>> + * assures as well as having the base pointer aligned on cache
>> + * size assures that aligned offsets also translate to aligned
>> + * pointers across all values.
>> + */
>> + RTE_BUILD_BUG_ON(RTE_MAX_LCORE_VAR % RTE_CACHE_LINE_SIZE != 0);
>> + RTE_ASSERT(align <= RTE_CACHE_LINE_SIZE);
>> +
>> + /* '0' means asking for worst-case alignment requirements */
>> + if (align == 0)
>> + align = alignof(max_align_t);
>> +
>> + RTE_ASSERT(rte_is_power_of_2(align));
>> +
>> + return lcore_var_alloc(size, align);
>> +}
>> diff --git a/lib/eal/common/meson.build b/lib/eal/common/meson.build
>> index 22a626ba6f..d41403680b 100644
>> --- a/lib/eal/common/meson.build
>> +++ b/lib/eal/common/meson.build
>> @@ -18,6 +18,7 @@ sources += files(
>> 'eal_common_interrupts.c',
>> 'eal_common_launch.c',
>> 'eal_common_lcore.c',
>> + 'eal_common_lcore_var.c',
>> 'eal_common_mcfg.c',
>> 'eal_common_memalloc.c',
>> 'eal_common_memory.c',
>> diff --git a/lib/eal/include/meson.build b/lib/eal/include/meson.build
>> index e94b056d46..9449253e23 100644
>> --- a/lib/eal/include/meson.build
>> +++ b/lib/eal/include/meson.build
>> @@ -27,6 +27,7 @@ headers += files(
>> 'rte_keepalive.h',
>> 'rte_launch.h',
>> 'rte_lcore.h',
>> + 'rte_lcore_var.h',
>> 'rte_lock_annotations.h',
>> 'rte_malloc.h',
>> 'rte_mcslock.h',
>> diff --git a/lib/eal/include/rte_lcore_var.h b/lib/eal/include/rte_lcore_var.h
>> new file mode 100644
>> index 0000000000..da49d48d7c
>> --- /dev/null
>> +++ b/lib/eal/include/rte_lcore_var.h
>> @@ -0,0 +1,375 @@
>> +/* SPDX-License-Identifier: BSD-3-Clause
>> + * Copyright(c) 2024 Ericsson AB
>> + */
>> +
>> +#ifndef _RTE_LCORE_VAR_H_
>> +#define _RTE_LCORE_VAR_H_
>> +
>> +/**
>> + * @file
>> + *
>> + * RTE Per-lcore id variables
>> + *
>> + * This API provides a mechanism to create and access per-lcore id
>> + * variables in a space- and cycle-efficient manner.
>> + *
>> + * A per-lcore id variable (or lcore variable for short) has one value
>> + * for each EAL thread and registered non-EAL thread. In other words,
>> + * there's one copy of its value for each and every current and future
>> + * lcore id-equipped thread, with the total number of copies amounting
>> + * to \c RTE_MAX_LCORE.
>> + *
>> + * In order to access the values of an lcore variable, a handle is
>> + * used. The type of the handle is a pointer to the value's type
>> + * (e.g., for \c uint32_t lcore variable, the handle is a
>> + * <code>uint32_t *</code>. A handle may be passed between modules and
>> + * threads just like any pointer, but its value is not the address of
>> + * any particular object, but rather just an opaque identifier, stored
>> + * in a typed pointer (to inform the access macro the type of values).
>> + *
>> + * @b Creation
>> + *
>> + * An lcore variable is created in two steps:
>> + * 1. Define a lcore variable handle by using \ref RTE_LCORE_VAR_HANDLE.
>> + * 2. Allocate lcore variable storage and initialize the handle with
>> + * a unique identifier by \ref RTE_LCORE_VAR_ALLOC or
>> + * \ref RTE_LCORE_VAR_INIT. Allocation generally occurs the time of
>> + * module initialization, but may be done at any time.
>> + *
>> + * An lcore variable is not tied to the owning thread's lifetime. It's
>> + * available for use by any thread immediately after having been
>> + * allocated, and continues to be available throughout the lifetime of
>> + * the EAL.
>> + *
>> + * Lcore variables cannot and need not be freed.
>> + *
>> + * @b Access
>> + *
>> + * The value of any lcore variable for any lcore id may be accessed
>> + * from any thread (including unregistered threads), but is should
>> + * generally only *frequently* read from or written to by the owner.
>> + *
>> + * Values of the same lcore variable but owned by to different lcore
>> + * ids *may* be frequently read or written by the owners without the
>> + * risk of false sharing.
>> + *
>> + * An appropriate synchronization mechanism (e.g., atomics) should
>> + * employed to assure there are no data races between the owning
>> + * thread and any non-owner threads accessing the same lcore variable
>> + * instance.
>> + *
>> + * The value of the lcore variable for a particular lcore id may be
>> + * retrieved with \ref RTE_LCORE_VAR_LCORE_GET. To get a pointer to the
>> + * same object, use \ref RTE_LCORE_VAR_LCORE_PTR.
>> + *
>> + * To modify the value of an lcore variable for a particular lcore id,
>> + * either access the object through the pointer retrieved by \ref
>> + * RTE_LCORE_VAR_LCORE_PTR or, for primitive types, use \ref
>> + * RTE_LCORE_VAR_LCORE_SET.
>> + *
>> + * The access macros each has a short-hand which may be used by an EAL
>> + * thread or registered non-EAL thread to access the lcore variable
>> + * instance of its own lcore id. Those are \ref RTE_LCORE_VAR_GET,
>> + * \ref RTE_LCORE_VAR_PTR, and \ref RTE_LCORE_VAR_SET.
>> + *
>> + * Although the handle (as defined by \ref RTE_LCORE_VAR_HANDLE) is a
>> + * pointer with the same type as the value, it may not be directly
>> + * dereferenced and must be treated as an opaque identifier. The
>> + * *identifier* value is common across all lcore ids.
>> + *
>> + * @b Storage
>> + *
>> + * An lcore variable's values may by of a primitive type like \c int,
>> + * but would more typically be a \c struct. An application may choose
>> + * to define an lcore variable, which it then it goes on to never
>> + * allocate.
>> + *
>> + * The lcore variable handle introduces a per-variable (not
>> + * per-value/per-lcore id) overhead of \c sizeof(void *) bytes, so
>> + * there are some memory footprint gains to be made by organizing all
>> + * per-lcore id data for a particular module as one lcore variable
>> + * (e.g., as a struct).
>> + *
>> + * The sum of all lcore variables, plus any padding required, must be
>> + * less than the DPDK build-time constant \c RTE_MAX_LCORE_VAR. A
>> + * violation of this maximum results in the process being terminated.
>> + *
>> + * It's reasonable to expected that \c RTE_MAX_LCORE_VAR is on the
>> + * same order of magnitude in size as a thread stack.
>> + *
>> + * The lcore variable storage buffers are kept in the BSS section in
>> + * the resulting binary, where data generally isn't mapped in until
>> + * it's accessed. This means that unused portions of the lcore
>> + * variable storage area will not occupy any physical memory (with a
>> + * granularity of the memory page size [usually 4 kB]).
>> + *
>> + * Lcore variables should generally *not* be \ref __rte_cache_aligned
>> + * and need *not* include a \ref RTE_CACHE_GUARD field, since the use
>> + * of these constructs are designed to avoid false sharing. In the
>> + * case of an lcore variable instance, all nearby data structures
>> + * should almost-always be written to by a single thread (the lcore
>> + * variable owner). Adding padding will increase the effective memory
>> + * working set size, and potentially reducing performance.
>> + *
>> + * @b Example
>> + *
>> + * Below is an example of the use of an lcore variable:
>> + *
>> + * \code{.c}
>> + * struct foo_lcore_state {
>> + * int a;
>> + * long b;
>> + * };
>> + *
>> + * static RTE_LCORE_VAR_HANDLE(struct foo_lcore_state, lcore_states);
>> + *
>> + * long foo_get_a_plus_b(void)
>> + * {
>> + * struct foo_lcore_state *state = RTE_LCORE_VAR_PTR(lcore_states);
>> + *
>> + * return state->a + state->b;
>> + * }
>> + *
>> + * RTE_INIT(rte_foo_init)
>> + * {
>> + * unsigned int lcore_id;
>> + *
>> + * RTE_LCORE_VAR_ALLOC(foo_state);
>> + *
>> + * struct foo_lcore_state *state;
>> + * RTE_LCORE_VAR_FOREACH_VALUE(lcore_states) {
>> + * (initialize 'state')
>> + * }
>> + *
>> + * (other initialization)
>> + * }
>> + * \endcode
>> + *
>> + *
>> + * @b Alternatives
>> + *
>> + * Lcore variables are designed to replace a pattern exemplified below:
>> + * \code{.c}
>> + * struct foo_lcore_state {
>> + * int a;
>> + * long b;
>> + * RTE_CACHE_GUARD;
>> + * } __rte_cache_aligned;
>> + *
>> + * static struct foo_lcore_state lcore_states[RTE_MAX_LCORE];
>> + * \endcode
>> + *
>> + * This scheme is simple and effective, but has one drawback: the data
>> + * is organized so that objects related to all lcores for a particular
>> + * module is kept close in memory. At a bare minimum, this forces the
>> + * use of cache-line alignment to avoid false sharing. With CPU
>> + * hardware prefetching and memory loads resulting from speculative
>> + * execution (functions which seemingly are getting more eager faster
>> + * than they are getting more intelligent), one or more "guard" cache
>> + * lines may be required to separate one lcore's data from another's.
>> + *
>> + * Lcore variables has the upside of working with, not against, the
>> + * CPU's assumptions and for example next-line prefetchers may well
>> + * work the way its designers intended (i.e., to the benefit, not
>> + * detriment, of system performance).
>> + *
>> + * Another alternative to \ref rte_lcore_var.h is the \ref
>> + * rte_per_lcore.h API, which make use of thread-local storage (TLS,
>> + * e.g., GCC __thread or C11 _Thread_local). The main differences
>> + * between by using the various forms of TLS (e.g., \ref
>> + * RTE_DEFINE_PER_LCORE or _Thread_local) and the use of lcore
>> + * variables are:
>> + *
>> + * * The existence and non-existence of a thread-local variable
>> + * instance follow that of particular thread's. The data cannot be
>> + * accessed before the thread has been created, nor after it has
>> + * exited. One effect of this is thread-local variables must
>> + * initialized in a "lazy" manner (e.g., at the point of thread
>> + * creation). Lcore variables may be accessed immediately after
>> + * having been allocated (which is usually prior any thread beyond
>> + * the main thread is running).
>> + * * A thread-local variable is duplicated across all threads in the
>> + * process, including unregistered non-EAL threads (i.e.,
>> + * "regular" threads). For DPDK applications heavily relying on
>> + * multi-threading (in conjunction to DPDK's "one thread per core"
>> + * pattern), either by having many concurrent threads or
>> + * creating/destroying threads at a high rate, an excessive use of
>> + * thread-local variables may cause inefficiencies (e.g.,
>> + * increased thread creation overhead due to thread-local storage
>> + * initialization or increased total RAM footprint usage). Lcore
>> + * variables *only* exist for threads with an lcore id, and thus
>> + * not for such "regular" threads.
>> + * * If data in thread-local storage may be shared between threads
>> + * (i.e., can a pointer to a thread-local variable be passed to
>> + * and successfully dereferenced by non-owning thread) depends on
>> + * the details of the TLS implementation. With GCC __thread and
>> + * GCC _Thread_local, such data sharing is supported. In the C11
>> + * standard, the result of accessing another thread's
>> + * _Thread_local object is implementation-defined. Lcore variable
>> + * instances may be accessed reliably by any thread.
>> + */
>> +
>> +#ifdef __cplusplus
>> +extern "C" {
>> +#endif
>> +
>> +#include <stddef.h>
>> +#include <stdalign.h>
>> +
>> +#include <rte_common.h>
>> +#include <rte_config.h>
>> +#include <rte_lcore.h>
>> +
>> +/**
>> + * Given the lcore variable type, produces the type of the lcore
>> + * variable handle.
>> + */
>> +#define RTE_LCORE_VAR_HANDLE_TYPE(type) \
>> + type *
>> +
>> +/**
>> + * Define a lcore variable handle.
>> + *
>> + * This macro defines a variable which is used as a handle to access
>> + * the various per-lcore id instances of a per-lcore id variable.
>> + *
>> + * The aim with this macro is to make clear at the point of
>> + * declaration that this is an lcore handler, rather than a regular
>> + * pointer.
>> + *
>> + * Add @b static as a prefix in case the lcore variable are only to be
>> + * accessed from a particular translation unit.
>> + */
>> +#define RTE_LCORE_VAR_HANDLE(type, name) \
>> + RTE_LCORE_VAR_HANDLE_TYPE(type) name
>> +
>> +/**
>> + * Allocate space for an lcore variable, and initialize its handle.
>> + */
>> +#define RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(name, size, align) \
>> + name = rte_lcore_var_alloc(size, align)
>> +
>> +/**
>> + * Allocate space for an lcore variable, and initialize its handle,
>> + * with values aligned for any type of object.
>> + */
>> +#define RTE_LCORE_VAR_ALLOC_SIZE(name, size) \
>> + name = rte_lcore_var_alloc(size, 0)
>> +
>> +/**
>> + * Allocate space for an lcore variable of the size and alignment requirements
>> + * suggested by the handler pointer type, and initialize its handle.
>> + */
>> +#define RTE_LCORE_VAR_ALLOC(name) \
>> + RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(name, sizeof(*(name)), \
>> + alignof(typeof(*(name))))
>> +
>> +/**
>> + * Allocate an explicitly-sized, explicitly-aligned lcore variable by
>> + * means of a \ref RTE_INIT constructor.
>> + */
>> +#define RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, align) \
>> + RTE_INIT(rte_lcore_var_init_ ## name) \
>> + { \
>> + RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(name, size, align); \
>> + }
>> +
>> +/**
>> + * Allocate an explicitly-sized lcore variable by means of a \ref
>> + * RTE_INIT constructor.
>> + */
>> +#define RTE_LCORE_VAR_INIT_SIZE(name, size) \
>> + RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, 0)
>> +
>> +/**
>> + * Allocate an lcore variable by means of a \ref RTE_INIT constructor.
>> + */
>> +#define RTE_LCORE_VAR_INIT(name) \
>> + RTE_INIT(rte_lcore_var_init_ ## name) \
>> + { \
>> + RTE_LCORE_VAR_ALLOC(name); \
>> + }
>> +
>> +#define __RTE_LCORE_VAR_LCORE_PTR(lcore_id, name) \
>> + ((void *)(&rte_lcore_var[lcore_id][(uintptr_t)(name)]))
>> +
>> +/**
>> + * Get pointer to lcore variable instance with the specified lcore id.
>> + */
>> +#define RTE_LCORE_VAR_LCORE_PTR(lcore_id, name) \
>> + ((typeof(name))__RTE_LCORE_VAR_LCORE_PTR(lcore_id, name))
>> +
>> +/**
>> + * Get value of a lcore variable instance of the specified lcore id.
>> + */
>> +#define RTE_LCORE_VAR_LCORE_GET(lcore_id, name) \
>> + (*(RTE_LCORE_VAR_LCORE_PTR(lcore_id, name)))
>> +
>> +/**
>> + * Set the value of a lcore variable instance of the specified lcore id.
>> + */
>> +#define RTE_LCORE_VAR_LCORE_SET(lcore_id, name, value) \
>> + (*(RTE_LCORE_VAR_LCORE_PTR(lcore_id, name)) = (value))
>> +
>> +/**
>> + * Get pointer to lcore variable instance of the current thread.
>> + *
>> + * May only be used by EAL threads and registered non-EAL threads.
>> + */
>> +#define RTE_LCORE_VAR_PTR(name) RTE_LCORE_VAR_LCORE_PTR(rte_lcore_id(), name)
>> +
>> +/**
>> + * Get value of lcore variable instance of the current thread.
>> + *
>> + * May only be used by EAL threads and registered non-EAL threads.
>> + */
>> +#define RTE_LCORE_VAR_GET(name) RTE_LCORE_VAR_LCORE_GET(rte_lcore_id(), name)
>> +
>> +/**
>> + * Set value of lcore variable instance of the current thread.
>> + *
>> + * May only be used by EAL threads and registered non-EAL threads.
>> + */
>> +#define RTE_LCORE_VAR_SET(name, value) \
>> + RTE_LCORE_VAR_LCORE_SET(rte_lcore_id(), name, value)
>> +
>> +/**
>> + * Iterate over each lcore id's value for a lcore variable.
>> + */
>> +#define RTE_LCORE_VAR_FOREACH_VALUE(var, name) \
>> + for (unsigned int lcore_id = \
>> + (((var) = RTE_LCORE_VAR_LCORE_PTR(0, name)), 0); \
>> + lcore_id < RTE_MAX_LCORE; \
>> + lcore_id++, (var) = RTE_LCORE_VAR_LCORE_PTR(lcore_id, name))
>> +
>> +extern char rte_lcore_var[RTE_MAX_LCORE][RTE_MAX_LCORE_VAR];
>> +
>> +/**
>> + * Allocate space in the per-lcore id buffers for a lcore variable.
>> + *
>> + * The pointer returned is only an opaque identifer of the variable. To
>> + * get an actual pointer to a particular instance of the variable use
>> + * \ref RTE_LCORE_VAR_PTR or \ref RTE_LCORE_VAR_LCORE_PTR.
>> + *
>> + * The allocation is always successful, barring a fatal exhaustion of
>> + * the per-lcore id buffer space.
>> + *
>> + * @param size
>> + * The size (in bytes) of the variable's per-lcore id value.
>> + * @param align
>> + * If 0, the values will be suitably aligned for any kind of type
>> + * (i.e., alignof(max_align_t)). Otherwise, the values will be aligned
>> + * on a multiple of *align*, which must be a power of 2 and equal or
>> + * less than \c RTE_CACHE_LINE_SIZE.
>> + * @return
>> + * The id of the variable, stored in a void pointer value.
>> + */
>> +__rte_experimental
>> +void *
>> +rte_lcore_var_alloc(size_t size, size_t align);
>> +
>> +#ifdef __cplusplus
>> +}
>> +#endif
>> +
>> +#endif /* _RTE_LCORE_VAR_H_ */
>> diff --git a/lib/eal/version.map b/lib/eal/version.map
>> index 5e0cd47c82..e90b86115a 100644
>> --- a/lib/eal/version.map
>> +++ b/lib/eal/version.map
>> @@ -393,6 +393,10 @@ EXPERIMENTAL {
>> # added in 23.07
>> rte_memzone_max_get;
>> rte_memzone_max_set;
>> +
>> + # added in 24.03
>> + rte_lcore_var_alloc;
>> + rte_lcore_var;
>> };
>>
>> INTERNAL {
>> --
>> 2.34.1
>>
* RE: [RFC v3 1/6] eal: add static per-lcore memory allocation facility
2024-02-20 8:49 ` [RFC v3 1/6] eal: add static per-lcore memory allocation facility Mattias Rönnblom
2024-02-20 9:11 ` Bruce Richardson
2024-02-21 9:43 ` Jerin Jacob
@ 2024-02-22 9:22 ` Morten Brørup
2024-02-23 10:12 ` Mattias Rönnblom
2024-02-25 15:03 ` [RFC v4 0/6] Lcore variables Mattias Rönnblom
3 siblings, 1 reply; 323+ messages in thread
From: Morten Brørup @ 2024-02-22 9:22 UTC (permalink / raw)
To: Mattias Rönnblom, dev; +Cc: hofors, Stephen Hemminger
> From: Mattias Rönnblom [mailto:mattias.ronnblom@ericsson.com]
> Sent: Tuesday, 20 February 2024 09.49
>
> Introduce DPDK per-lcore id variables, or lcore variables for short.
>
> An lcore variable has one value for every current and future lcore
> id-equipped thread.
>
> The primary <rte_lcore_var.h> use case is for statically allocating
> small chunks of often-used data, which is related logically, but where
> there are performance benefits to reap from having updates being local
> to an lcore.
>
> Lcore variables are similar to thread-local storage (TLS, e.g., C11
> _Thread_local), but decoupling the values' life time with that of the
> threads.
>
> Lcore variables are also similar in terms of functionality provided by
> FreeBSD kernel's DPCPU_*() family of macros and the associated
> build-time machinery. DPCPU uses linker scripts, which effectively
> prevents the reuse of its, otherwise seemingly viable, approach.
>
> The currently-prevailing way to solve the same problem as lcore
> variables is to keep a module's per-lcore data as RTE_MAX_LCORE-sized
> array of cache-aligned, RTE_CACHE_GUARDed structs. The benefit of
> lcore variables over this approach is that data related to the same
> lcore now is close (spatially, in memory), rather than data used by
> the same module, which in turn avoid excessive use of padding,
> polluting caches with unused data.
>
> RFC v3:
> * Replace use of GCC-specific alignof(<expression>) with alignof(<type>).
> * Update example to reflect FOREACH macro name change (in RFC v2).
>
> RFC v2:
> * Use alignof to derive alignment requirements. (Morten Brørup)
> * Change name of FOREACH to make it distinct from <rte_lcore.h>'s
> *per-EAL-thread* RTE_LCORE_FOREACH(). (Morten Brørup)
> * Allow user-specified alignment, but limit max to cache line size.
>
> Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> ---
> config/rte_config.h | 1 +
> doc/api/doxy-api-index.md | 1 +
> lib/eal/common/eal_common_lcore_var.c | 82 ++++++
> lib/eal/common/meson.build | 1 +
> lib/eal/include/meson.build | 1 +
> lib/eal/include/rte_lcore_var.h | 375 ++++++++++++++++++++++++++
> lib/eal/version.map | 4 +
> 7 files changed, 465 insertions(+)
> create mode 100644 lib/eal/common/eal_common_lcore_var.c
> create mode 100644 lib/eal/include/rte_lcore_var.h
>
> diff --git a/config/rte_config.h b/config/rte_config.h
> index da265d7dd2..884482e473 100644
> --- a/config/rte_config.h
> +++ b/config/rte_config.h
> @@ -30,6 +30,7 @@
> /* EAL defines */
> #define RTE_CACHE_GUARD_LINES 1
> #define RTE_MAX_HEAPS 32
> +#define RTE_MAX_LCORE_VAR 1048576
> #define RTE_MAX_MEMSEG_LISTS 128
> #define RTE_MAX_MEMSEG_PER_LIST 8192
> #define RTE_MAX_MEM_MB_PER_LIST 32768
> diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
> index a6a768bd7c..bb06bb7ca1 100644
> --- a/doc/api/doxy-api-index.md
> +++ b/doc/api/doxy-api-index.md
> @@ -98,6 +98,7 @@ The public API headers are grouped by topics:
> [interrupts](@ref rte_interrupts.h),
> [launch](@ref rte_launch.h),
> [lcore](@ref rte_lcore.h),
> + [lcore-varible](@ref rte_lcore_var.h),
> [per-lcore](@ref rte_per_lcore.h),
> [service cores](@ref rte_service.h),
> [keepalive](@ref rte_keepalive.h),
> diff --git a/lib/eal/common/eal_common_lcore_var.c
> b/lib/eal/common/eal_common_lcore_var.c
> new file mode 100644
> index 0000000000..dfd11cbd0b
> --- /dev/null
> +++ b/lib/eal/common/eal_common_lcore_var.c
> @@ -0,0 +1,82 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2024 Ericsson AB
> + */
> +
> +#include <inttypes.h>
> +
> +#include <rte_common.h>
> +#include <rte_debug.h>
> +#include <rte_log.h>
> +
> +#include <rte_lcore_var.h>
> +
> +#include "eal_private.h"
> +
> +#define WARN_THRESHOLD 75
It's not an error condition, so 75 % seems like a low threshold for WARNING.
Consider increasing it to 95 %, or change the level to NOTICE.
Or both.
> +
> +/*
> + * Avoid using offset zero, since it would result in a NULL-value
> + * "handle" (offset) pointer, which in principle and per the API
> + * definition shouldn't be an issue, but may confuse some tools and
> + * users.
> + */
> +#define INITIAL_OFFSET 1
> +
> +char rte_lcore_var[RTE_MAX_LCORE][RTE_MAX_LCORE_VAR] __rte_cache_aligned;
> +
> +static uintptr_t allocated = INITIAL_OFFSET;
Please add an API to get the amount of allocated lcore variable memory.
The easy option is to make the above variable public (with a proper name, e.g. rte_lcore_var_allocated).
The total amount of lcore variable memory is already public: RTE_MAX_LCORE_VAR.
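A minimal sketch of what such an accessor could look like, written as a standalone fragment rather than a patch against the EAL. The name rte_lcore_var_allocated_get() is hypothetical, MAX_LCORE_VAR stands in for RTE_MAX_LCORE_VAR, and the allocator is reduced to the offset arithmetic quoted above (no logging, no warning threshold):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for RTE_MAX_LCORE_VAR (assumption: same value as in the patch). */
#define MAX_LCORE_VAR 1048576
#define INITIAL_OFFSET 1

static uintptr_t allocated = INITIAL_OFFSET;

/* Round 'v' up to the next multiple of 'align' (must be a power of 2). */
static uintptr_t
align_ceil(uintptr_t v, uintptr_t align)
{
	return (v + align - 1) & ~(align - 1);
}

/* Reduced version of lcore_var_alloc(): returns the offset "handle". */
static void *
lcore_var_alloc(size_t size, size_t align)
{
	uintptr_t new_allocated = align_ceil(allocated, align);
	void *offset = (void *)new_allocated;

	new_allocated += size;
	assert(new_allocated < MAX_LCORE_VAR);
	allocated = new_allocated;

	return offset;
}

/*
 * Hypothetical accessor: bytes consumed in each per-lcore id buffer,
 * including the initial offset and any alignment padding.
 */
static size_t
rte_lcore_var_allocated_get(void)
{
	return allocated;
}
```

An application could then compare the returned value against RTE_MAX_LCORE_VAR to see how much headroom remains, instead of relying on the one-shot log message.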
> +
> +static void
> +verify_allocation(uintptr_t new_allocated)
> +{
> + static bool has_warned;
> +
> + RTE_VERIFY(new_allocated < RTE_MAX_LCORE_VAR);
> +
> + if (new_allocated > (WARN_THRESHOLD * RTE_MAX_LCORE_VAR) / 100 &&
> + !has_warned) {
> + EAL_LOG(WARNING, "Per-lcore data usage has exceeded %d%% "
> + "of the maximum capacity (%d bytes)", WARN_THRESHOLD,
> + RTE_MAX_LCORE_VAR);
> + has_warned = true;
> + }
> +}
> +
> +static void *
> +lcore_var_alloc(size_t size, size_t align)
> +{
> + uintptr_t new_allocated = RTE_ALIGN_CEIL(allocated, align);
> +
> + void *offset = (void *)new_allocated;
> +
> + new_allocated += size;
> +
> + verify_allocation(new_allocated);
> +
> + allocated = new_allocated;
> +
> + EAL_LOG(DEBUG, "Allocated %"PRIuPTR" bytes of per-lcore data with a "
> + "%"PRIuPTR"-byte alignment", size, align);
> +
> + return offset;
> +}
> +
> +void *
> +rte_lcore_var_alloc(size_t size, size_t align)
> +{
> + /* Having the per-lcore buffer size aligned on cache lines
> + * assures as well as having the base pointer aligned on cache
> + * size assures that aligned offsets also translate to aligned
> + * pointers across all values.
> + */
> + RTE_BUILD_BUG_ON(RTE_MAX_LCORE_VAR % RTE_CACHE_LINE_SIZE != 0);
> + RTE_ASSERT(align <= RTE_CACHE_LINE_SIZE);
> +
> + /* '0' means asking for worst-case alignment requirements */
> + if (align == 0)
> + align = alignof(max_align_t);
> +
> + RTE_ASSERT(rte_is_power_of_2(align));
> +
> + return lcore_var_alloc(size, align);
> +}
> diff --git a/lib/eal/common/meson.build b/lib/eal/common/meson.build
> index 22a626ba6f..d41403680b 100644
> --- a/lib/eal/common/meson.build
> +++ b/lib/eal/common/meson.build
> @@ -18,6 +18,7 @@ sources += files(
> 'eal_common_interrupts.c',
> 'eal_common_launch.c',
> 'eal_common_lcore.c',
> + 'eal_common_lcore_var.c',
> 'eal_common_mcfg.c',
> 'eal_common_memalloc.c',
> 'eal_common_memory.c',
> diff --git a/lib/eal/include/meson.build b/lib/eal/include/meson.build
> index e94b056d46..9449253e23 100644
> --- a/lib/eal/include/meson.build
> +++ b/lib/eal/include/meson.build
> @@ -27,6 +27,7 @@ headers += files(
> 'rte_keepalive.h',
> 'rte_launch.h',
> 'rte_lcore.h',
> + 'rte_lcore_var.h',
> 'rte_lock_annotations.h',
> 'rte_malloc.h',
> 'rte_mcslock.h',
> diff --git a/lib/eal/include/rte_lcore_var.h b/lib/eal/include/rte_lcore_var.h
> new file mode 100644
> index 0000000000..da49d48d7c
> --- /dev/null
> +++ b/lib/eal/include/rte_lcore_var.h
> @@ -0,0 +1,375 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2024 Ericsson AB
> + */
> +
> +#ifndef _RTE_LCORE_VAR_H_
> +#define _RTE_LCORE_VAR_H_
> +
> +/**
> + * @file
> + *
> + * RTE Per-lcore id variables
> + *
> + * This API provides a mechanism to create and access per-lcore id
> + * variables in a space- and cycle-efficient manner.
> + *
> + * A per-lcore id variable (or lcore variable for short) has one value
> + * for each EAL thread and registered non-EAL thread. In other words,
> + * there's one copy of its value for each and every current and future
> + * lcore id-equipped thread, with the total number of copies amounting
> + * to \c RTE_MAX_LCORE.
> + *
> + * In order to access the values of an lcore variable, a handle is
> + * used. The type of the handle is a pointer to the value's type
> + * (e.g., for \c uint32_t lcore variable, the handle is a
> + * <code>uint32_t *</code>. A handle may be passed between modules and
> + * threads just like any pointer, but its value is not the address of
> + * any particular object, but rather just an opaque identifier, stored
> + * in a typed pointer (to inform the access macro the type of values).
> + *
> + * @b Creation
> + *
> + * An lcore variable is created in two steps:
> + * 1. Define a lcore variable handle by using \ref RTE_LCORE_VAR_HANDLE.
> + * 2. Allocate lcore variable storage and initialize the handle with
> + * a unique identifier by \ref RTE_LCORE_VAR_ALLOC or
> + * \ref RTE_LCORE_VAR_INIT. Allocation generally occurs the time of
> + * module initialization, but may be done at any time.
> + *
> + * An lcore variable is not tied to the owning thread's lifetime. It's
> + * available for use by any thread immediately after having been
> + * allocated, and continues to be available throughout the lifetime of
> + * the EAL.
> + *
> + * Lcore variables cannot and need not be freed.
> + *
> + * @b Access
> + *
> + * The value of any lcore variable for any lcore id may be accessed
> + * from any thread (including unregistered threads), but is should
> + * generally only *frequently* read from or written to by the owner.
> + *
> + * Values of the same lcore variable but owned by to different lcore
> + * ids *may* be frequently read or written by the owners without the
> + * risk of false sharing.
> + *
> + * An appropriate synchronization mechanism (e.g., atomics) should
> + * employed to assure there are no data races between the owning
> + * thread and any non-owner threads accessing the same lcore variable
> + * instance.
> + *
> + * The value of the lcore variable for a particular lcore id may be
> + * retrieved with \ref RTE_LCORE_VAR_LCORE_GET. To get a pointer to the
> + * same object, use \ref RTE_LCORE_VAR_LCORE_PTR.
> + *
> + * To modify the value of an lcore variable for a particular lcore id,
> + * either access the object through the pointer retrieved by \ref
> + * RTE_LCORE_VAR_LCORE_PTR or, for primitive types, use \ref
> + * RTE_LCORE_VAR_LCORE_SET.
> + *
> + * The access macros each has a short-hand which may be used by an EAL
> + * thread or registered non-EAL thread to access the lcore variable
> + * instance of its own lcore id. Those are \ref RTE_LCORE_VAR_GET,
> + * \ref RTE_LCORE_VAR_PTR, and \ref RTE_LCORE_VAR_SET.
> + *
> + * Although the handle (as defined by \ref RTE_LCORE_VAR_HANDLE) is a
> + * pointer with the same type as the value, it may not be directly
> + * dereferenced and must be treated as an opaque identifier. The
> + * *identifier* value is common across all lcore ids.
> + *
> + * @b Storage
> + *
> + * An lcore variable's values may by of a primitive type like \c int,
> + * but would more typically be a \c struct. An application may choose
> + * to define an lcore variable, which it then it goes on to never
> + * allocate.
> + *
> + * The lcore variable handle introduces a per-variable (not
> + * per-value/per-lcore id) overhead of \c sizeof(void *) bytes, so
> + * there are some memory footprint gains to be made by organizing all
> + * per-lcore id data for a particular module as one lcore variable
> + * (e.g., as a struct).
> + *
> + * The sum of all lcore variables, plus any padding required, must be
> + * less than the DPDK build-time constant \c RTE_MAX_LCORE_VAR. A
> + * violation of this maximum results in the process being terminated.
> + *
> + * It's reasonable to expected that \c RTE_MAX_LCORE_VAR is on the
> + * same order of magnitude in size as a thread stack.
> + *
> + * The lcore variable storage buffers are kept in the BSS section in
> + * the resulting binary, where data generally isn't mapped in until
> + * it's accessed. This means that unused portions of the lcore
> + * variable storage area will not occupy any physical memory (with a
> + * granularity of the memory page size [usually 4 kB]).
> + *
> + * Lcore variables should generally *not* be \ref __rte_cache_aligned
> + * and need *not* include a \ref RTE_CACHE_GUARD field, since the use
> + * of these constructs are designed to avoid false sharing. In the
> + * case of an lcore variable instance, all nearby data structures
> + * should almost-always be written to by a single thread (the lcore
> + * variable owner). Adding padding will increase the effective memory
> + * working set size, and potentially reducing performance.
> + *
> + * @b Example
> + *
> + * Below is an example of the use of an lcore variable:
> + *
> + * \code{.c}
> + * struct foo_lcore_state {
> + * int a;
> + * long b;
> + * };
> + *
> + * static RTE_LCORE_VAR_HANDLE(struct foo_lcore_state, lcore_states);
> + *
> + * long foo_get_a_plus_b(void)
> + * {
> + * struct foo_lcore_state *state = RTE_LCORE_VAR_PTR(lcore_states);
> + *
> + * return state->a + state->b;
> + * }
> + *
> + * RTE_INIT(rte_foo_init)
> + * {
> + * unsigned int lcore_id;
This variable is part of RTE_LCORE_VAR_FOREACH_VALUE(), and can be removed from here.
> + *
> + * RTE_LCORE_VAR_ALLOC(foo_state);
Typo: foo_state -> lcore_states
> + *
> + * struct foo_lcore_state *state;
> + * RTE_LCORE_VAR_FOREACH_VALUE(lcore_states) {
Typo:
RTE_LCORE_VAR_FOREACH_VALUE(lcore_states)
->
RTE_LCORE_VAR_FOREACH_VALUE(state, lcore_states)
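For reference, the init part of the example with the three fixes applied, as a sketch that builds on its own. The LCORE_VAR_* macros, MAX_LCORE, MAX_LCORE_VAR and HANDLE_OFFSET are reduced stand-ins for the definitions in the patch (assumptions, not the real DPDK macros), and the handle offset is picked by hand instead of going through rte_lcore_var_alloc():

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for RTE_MAX_LCORE and RTE_MAX_LCORE_VAR, kept small. */
#define MAX_LCORE 4
#define MAX_LCORE_VAR 64
/* Hand-picked offset, standing in for what the allocator would return. */
#define HANDLE_OFFSET 16

static _Alignas(64) char lcore_var[MAX_LCORE][MAX_LCORE_VAR];

/* Translate (lcore id, handle) into a typed pointer to that lcore's value. */
#define LCORE_VAR_LCORE_PTR(lcore_id, name) \
	((__typeof__(name))((uintptr_t)lcore_var + \
		(uintptr_t)(lcore_id) * MAX_LCORE_VAR + (uintptr_t)(name)))

/* Same shape as RTE_LCORE_VAR_FOREACH_VALUE(var, name) in the patch. */
#define LCORE_VAR_FOREACH_VALUE(var, name) \
	for (unsigned int lcore_id = \
		     (((var) = LCORE_VAR_LCORE_PTR(0, name)), 0); \
	     lcore_id < MAX_LCORE; \
	     lcore_id++, (var) = LCORE_VAR_LCORE_PTR(lcore_id, name))

struct foo_lcore_state {
	int a;
	long b;
};

/* Handle: an offset stored in a typed pointer, never dereferenced directly. */
static struct foo_lcore_state *lcore_states;

static void
foo_init(void)
{
	struct foo_lcore_state *state;

	/* Stand-in for RTE_LCORE_VAR_ALLOC(lcore_states). */
	assert(HANDLE_OFFSET + sizeof(*state) <= MAX_LCORE_VAR);
	lcore_states = (struct foo_lcore_state *)(uintptr_t)HANDLE_OFFSET;

	/* No separate 'lcore_id' declaration: the macro declares its own. */
	LCORE_VAR_FOREACH_VALUE(state, lcore_states) {
		state->a = 1;
		state->b = 2;
	}
}
```

The loop takes two arguments, the iteration pointer and the handle, and the handle passed to the allocation step is the same lcore_states that the loop iterates over.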
> + * (initialize 'state')
> + * }
> + *
> + * (other initialization)
> + * }
> + * \endcode
> + *
> + *
> + * @b Alternatives
> + *
> + * Lcore variables are designed to replace a pattern exemplified below:
> + * \code{.c}
> + * struct foo_lcore_state {
> + * int a;
> + * long b;
> + * RTE_CACHE_GUARD;
> + * } __rte_cache_aligned;
> + *
> + * static struct foo_lcore_state lcore_states[RTE_MAX_LCORE];
> + * \endcode
> + *
> + * This scheme is simple and effective, but has one drawback: the data
> + * is organized so that objects related to all lcores for a particular
> + * module is kept close in memory. At a bare minimum, this forces the
> + * use of cache-line alignment to avoid false sharing. With CPU
> + * hardware prefetching and memory loads resulting from speculative
> + * execution (functions which seemingly are getting more eager faster
> + * than they are getting more intelligent), one or more "guard" cache
> + * lines may be required to separate one lcore's data from another's.
> + *
> + * Lcore variables has the upside of working with, not against, the
> + * CPU's assumptions and for example next-line prefetchers may well
> + * work the way its designers intended (i.e., to the benefit, not
> + * detriment, of system performance).
> + *
> + * Another alternative to \ref rte_lcore_var.h is the \ref
> + * rte_per_lcore.h API, which make use of thread-local storage (TLS,
> + * e.g., GCC __thread or C11 _Thread_local). The main differences
> + * between by using the various forms of TLS (e.g., \ref
> + * RTE_DEFINE_PER_LCORE or _Thread_local) and the use of lcore
> + * variables are:
> + *
> + * * The existence and non-existence of a thread-local variable
> + * instance follow that of a particular thread. The data cannot be
> + * accessed before the thread has been created, nor after it has
> + * exited. One effect of this is that thread-local variables must be
> + * initialized in a "lazy" manner (e.g., at the point of thread
> + * creation). Lcore variables may be accessed immediately after
> + * having been allocated (which is usually before any thread beyond
> + * the main thread is running).
> + * * A thread-local variable is duplicated across all threads in the
> + * process, including unregistered non-EAL threads (i.e.,
> + * "regular" threads). For DPDK applications heavily relying on
> + * multi-threading (in conjunction with DPDK's "one thread per core"
> + * pattern), either by having many concurrent threads or
> + * creating/destroying threads at a high rate, an excessive use of
> + * thread-local variables may cause inefficiencies (e.g.,
> + * increased thread creation overhead due to thread-local storage
> + * initialization or increased total RAM footprint). Lcore
> + * variables *only* exist for threads with an lcore id, and thus
> + * not for such "regular" threads.
> + * * Whether data in thread-local storage may be shared between threads
> + * (i.e., whether a pointer to a thread-local variable can be passed to
> + * and successfully dereferenced by a non-owning thread) depends on
> + * the details of the TLS implementation. With GCC __thread and
> + * GCC _Thread_local, such data sharing is supported. In the C11
> + * standard, the result of accessing another thread's
> + * _Thread_local object is implementation-defined. Lcore variable
> + * instances may be accessed reliably by any thread.
> + */
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include <stddef.h>
> +#include <stdalign.h>
> +
> +#include <rte_common.h>
> +#include <rte_config.h>
> +#include <rte_lcore.h>
> +
> +/**
> + * Given the lcore variable type, produces the type of the lcore
> + * variable handle.
> + */
> +#define RTE_LCORE_VAR_HANDLE_TYPE(type) \
> + type *
This macro seems superfluous.
In RTE_LCORE_VAR_HANDLE(type, name) just use:
type * name
Are there other use cases for it?
> +
> +/**
> + * Define a lcore variable handle.
> + *
> + * This macro defines a variable which is used as a handle to access
> + * the various per-lcore id instances of a per-lcore id variable.
> + *
> + * The aim with this macro is to make clear at the point of
> + * declaration that this is an lcore handle, rather than a regular
> + * pointer.
> + *
> + * Add @b static as a prefix in case the lcore variable is only to be
> + * accessed from a particular translation unit.
> + */
> +#define RTE_LCORE_VAR_HANDLE(type, name) \
> + RTE_LCORE_VAR_HANDLE_TYPE(type) name
Thinking out loud here...
Consider if this name should be more similar to RTE_DEFINE_PER_LCORE(type, name), e.g. RTE_DEFINE_LCORE_VAR(type, name) or RTE_LCORE_VAR_DEFINE(type, name).
Using the common prefix RTE_LCORE_VAR is preferable.
Using the term "handle" indicates that it is opaque and needs to be allocated by an allocation function.
On the other hand, the "handle" is not unique per thread, so it's not really a "handle".
> +
> +/**
> + * Allocate space for an lcore variable, and initialize its handle.
> + */
> +#define RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(name, size, align) \
> + name = rte_lcore_var_alloc(size, align)
> +
> +/**
> + * Allocate space for an lcore variable, and initialize its handle,
> + * with values aligned for any type of object.
> + */
> +#define RTE_LCORE_VAR_ALLOC_SIZE(name, size) \
> + name = rte_lcore_var_alloc(size, 0)
> +
> +/**
> + * Allocate space for an lcore variable of the size and alignment
> + * requirements suggested by the handle pointer type, and initialize
> + * its handle.
> + */
> +#define RTE_LCORE_VAR_ALLOC(name) \
> + RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(name, sizeof(*(name)), \
> + alignof(typeof(*(name))))
> +
> +/**
> + * Allocate an explicitly-sized, explicitly-aligned lcore variable by
> + * means of a \ref RTE_INIT constructor.
> + */
> +#define RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, align) \
> + RTE_INIT(rte_lcore_var_init_ ## name) \
> + { \
> + RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(name, size, align); \
> + }
> +
> +/**
> + * Allocate an explicitly-sized lcore variable by means of a \ref
> + * RTE_INIT constructor.
> + */
> +#define RTE_LCORE_VAR_INIT_SIZE(name, size) \
> + RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, 0)
> +
> +/**
> + * Allocate an lcore variable by means of a \ref RTE_INIT constructor.
> + */
> +#define RTE_LCORE_VAR_INIT(name) \
> + RTE_INIT(rte_lcore_var_init_ ## name) \
> + { \
> + RTE_LCORE_VAR_ALLOC(name); \
> + }
> +
> +#define __RTE_LCORE_VAR_LCORE_PTR(lcore_id, name) \
> + ((void *)(&rte_lcore_var[lcore_id][(uintptr_t)(name)]))
This macro also seems superfluous.
Doesn't RTE_LCORE_VAR_LCORE_PTR() suffice?
> +
> +/**
> + * Get pointer to lcore variable instance with the specified lcore id.
> + */
> +#define RTE_LCORE_VAR_LCORE_PTR(lcore_id, name) \
> + ((typeof(name))__RTE_LCORE_VAR_LCORE_PTR(lcore_id, name))
This uses type casting.
I wonder if additional build-time type checking would be possible...
Nice to have: The compiler should fail if name is not a pointer, but a struct or an uint64_t, or even an uintptr_t.
> +
> +/**
> + * Get value of a lcore variable instance of the specified lcore id.
> + */
> +#define RTE_LCORE_VAR_LCORE_GET(lcore_id, name) \
> + (*(RTE_LCORE_VAR_LCORE_PTR(lcore_id, name)))
The four accessor functions, RTE_LCORE_VAR[_LCORE]_GET/SET(), seem superfluous.
They make the API seem more complex than just using RTE_LCORE_VAR[_LCORE]_PTR() for access.
> +
> +/**
> + * Set the value of a lcore variable instance of the specified lcore id.
> + */
> +#define RTE_LCORE_VAR_LCORE_SET(lcore_id, name, value) \
> + (*(RTE_LCORE_VAR_LCORE_PTR(lcore_id, name)) = (value))
> +
> +/**
> + * Get pointer to lcore variable instance of the current thread.
> + *
> + * May only be used by EAL threads and registered non-EAL threads.
> + */
> +#define RTE_LCORE_VAR_PTR(name) RTE_LCORE_VAR_LCORE_PTR(rte_lcore_id(), name)
> +
> +/**
> + * Get value of lcore variable instance of the current thread.
> + *
> + * May only be used by EAL threads and registered non-EAL threads.
> + */
> +#define RTE_LCORE_VAR_GET(name) RTE_LCORE_VAR_LCORE_GET(rte_lcore_id(), name)
> +
> +/**
> + * Set value of lcore variable instance of the current thread.
> + *
> + * May only be used by EAL threads and registered non-EAL threads.
> + */
> +#define RTE_LCORE_VAR_SET(name, value) \
> + RTE_LCORE_VAR_LCORE_SET(rte_lcore_id(), name, value)
> +
> +/**
> + * Iterate over each lcore id's value for a lcore variable.
> + */
> +#define RTE_LCORE_VAR_FOREACH_VALUE(var, name) \
> + for (unsigned int lcore_id = \
> + (((var) = RTE_LCORE_VAR_LCORE_PTR(0, name)), 0); \
> + lcore_id < RTE_MAX_LCORE; \
> + lcore_id++, (var) = RTE_LCORE_VAR_LCORE_PTR(lcore_id, name))
RTE_LCORE_VAR_FOREACH_PTR(ptr, name) would be an even better name; considering that "var" is really a pointer.
I also wonder about build-time type checking here...
Nice to have: The compiler should fail if "ptr" is not a pointer.
> +
> +extern char rte_lcore_var[RTE_MAX_LCORE][RTE_MAX_LCORE_VAR];
> +
> +/**
> + * Allocate space in the per-lcore id buffers for a lcore variable.
> + *
> + * The pointer returned is only an opaque identifier of the variable. To
> + * get an actual pointer to a particular instance of the variable use
> + * \ref RTE_LCORE_VAR_PTR or \ref RTE_LCORE_VAR_LCORE_PTR.
> + *
> + * The allocation is always successful, barring a fatal exhaustion of
> + * the per-lcore id buffer space.
> + *
> + * @param size
> + * The size (in bytes) of the variable's per-lcore id value.
> + * @param align
> + * If 0, the values will be suitably aligned for any kind of type
> + * (i.e., alignof(max_align_t)). Otherwise, the values will be aligned
> + * on a multiple of *align*, which must be a power of 2 and equal to or
> + * less than \c RTE_CACHE_LINE_SIZE.
> + * @return
> + * The id of the variable, stored in a void pointer value.
> + */
> +__rte_experimental
> +void *
> +rte_lcore_var_alloc(size_t size, size_t align);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_LCORE_VAR_H_ */
> diff --git a/lib/eal/version.map b/lib/eal/version.map
> index 5e0cd47c82..e90b86115a 100644
> --- a/lib/eal/version.map
> +++ b/lib/eal/version.map
> @@ -393,6 +393,10 @@ EXPERIMENTAL {
> # added in 23.07
> rte_memzone_max_get;
> rte_memzone_max_set;
> +
> + # added in 24.03
> + rte_lcore_var_alloc;
> + rte_lcore_var;
> };
>
> INTERNAL {
> --
> 2.34.1
Acked-by: Morten Brørup <mb@smartsharesystems.com>
* Re: [RFC v3 1/6] eal: add static per-lcore memory allocation facility
2024-02-22 9:22 ` Morten Brørup
@ 2024-02-23 10:12 ` Mattias Rönnblom
0 siblings, 0 replies; 323+ messages in thread
From: Mattias Rönnblom @ 2024-02-23 10:12 UTC (permalink / raw)
To: Morten Brørup, Mattias Rönnblom, dev; +Cc: Stephen Hemminger
On 2024-02-22 10:22, Morten Brørup wrote:
>> From: Mattias Rönnblom [mailto:mattias.ronnblom@ericsson.com]
>> Sent: Tuesday, 20 February 2024 09.49
>>
>> Introduce DPDK per-lcore id variables, or lcore variables for short.
>>
>> An lcore variable has one value for every current and future lcore
>> id-equipped thread.
>>
>> The primary <rte_lcore_var.h> use case is for statically allocating
>> small chunks of often-used data, which is related logically, but where
>> there are performance benefits to reap from having updates being local
>> to an lcore.
>>
>> Lcore variables are similar to thread-local storage (TLS, e.g., C11
>> _Thread_local), but decoupling the values' life time with that of the
>> threads.
>>
>> Lcore variables are also similar in terms of functionality provided by
>> FreeBSD kernel's DPCPU_*() family of macros and the associated
>> build-time machinery. DPCPU uses linker scripts, which effectively
>> prevents the reuse of its, otherwise seemingly viable, approach.
>>
>> The currently-prevailing way to solve the same problem as lcore
>> variables is to keep a module's per-lcore data as RTE_MAX_LCORE-sized
>> array of cache-aligned, RTE_CACHE_GUARDed structs. The benefit of
>> lcore variables over this approach is that data related to the same
>> lcore now is close (spatially, in memory), rather than data used by
>> the same module, which in turn avoid excessive use of padding,
>> polluting caches with unused data.
>>
>> RFC v3:
>> * Replace use of GCC-specific alignof(<expression>) with alignof(<type>).
>> * Update example to reflect FOREACH macro name change (in RFC v2).
>>
>> RFC v2:
>> * Use alignof to derive alignment requirements. (Morten Brørup)
>> * Change name of FOREACH to make it distinct from <rte_lcore.h>'s
>> *per-EAL-thread* RTE_LCORE_FOREACH(). (Morten Brørup)
>> * Allow user-specified alignment, but limit max to cache line size.
>>
>> Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
>> ---
>> config/rte_config.h | 1 +
>> doc/api/doxy-api-index.md | 1 +
>> lib/eal/common/eal_common_lcore_var.c | 82 ++++++
>> lib/eal/common/meson.build | 1 +
>> lib/eal/include/meson.build | 1 +
>> lib/eal/include/rte_lcore_var.h | 375 ++++++++++++++++++++++++++
>> lib/eal/version.map | 4 +
>> 7 files changed, 465 insertions(+)
>> create mode 100644 lib/eal/common/eal_common_lcore_var.c
>> create mode 100644 lib/eal/include/rte_lcore_var.h
>>
>> diff --git a/config/rte_config.h b/config/rte_config.h
>> index da265d7dd2..884482e473 100644
>> --- a/config/rte_config.h
>> +++ b/config/rte_config.h
>> @@ -30,6 +30,7 @@
>> /* EAL defines */
>> #define RTE_CACHE_GUARD_LINES 1
>> #define RTE_MAX_HEAPS 32
>> +#define RTE_MAX_LCORE_VAR 1048576
>> #define RTE_MAX_MEMSEG_LISTS 128
>> #define RTE_MAX_MEMSEG_PER_LIST 8192
>> #define RTE_MAX_MEM_MB_PER_LIST 32768
>> diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
>> index a6a768bd7c..bb06bb7ca1 100644
>> --- a/doc/api/doxy-api-index.md
>> +++ b/doc/api/doxy-api-index.md
>> @@ -98,6 +98,7 @@ The public API headers are grouped by topics:
>> [interrupts](@ref rte_interrupts.h),
>> [launch](@ref rte_launch.h),
>> [lcore](@ref rte_lcore.h),
>> + [lcore-variable](@ref rte_lcore_var.h),
>> [per-lcore](@ref rte_per_lcore.h),
>> [service cores](@ref rte_service.h),
>> [keepalive](@ref rte_keepalive.h),
>> diff --git a/lib/eal/common/eal_common_lcore_var.c b/lib/eal/common/eal_common_lcore_var.c
>> new file mode 100644
>> index 0000000000..dfd11cbd0b
>> --- /dev/null
>> +++ b/lib/eal/common/eal_common_lcore_var.c
>> @@ -0,0 +1,82 @@
>> +/* SPDX-License-Identifier: BSD-3-Clause
>> + * Copyright(c) 2024 Ericsson AB
>> + */
>> +
>> +#include <inttypes.h>
>> +
>> +#include <rte_common.h>
>> +#include <rte_debug.h>
>> +#include <rte_log.h>
>> +
>> +#include <rte_lcore_var.h>
>> +
>> +#include "eal_private.h"
>> +
>> +#define WARN_THRESHOLD 75
>
> It's not an error condition, so 75 % seems like a low threshold for WARNING.
> Consider increasing it to 95 %, or change the level to NOTICE.
> Or both.
>
I'll make an attempt at a variant which uses the libc heap instead of
BSS, and does so dynamically. Then one need not worry about a fixed-size
upper bound, barring heap allocation failures (which you are best off
making fatal in the lcore variables case).
The glibc heap is available early (as early as the earliest RTE_INIT()).
You also avoid the headache of thinking about what happens if indeed all
of the rte_lcore_var array is backed by actual memory. That could be due
to mlockall(), huge page use for BSS, or systems where BSS is not
on-demand mapped. I have no idea how paging works on Windows NT, for
example.
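To make the idea concrete, here is a minimal sketch of what a heap-backed
variant might look like. It is only an illustration under stated
assumptions, not the code of any later RFC version: the names
(heap_lcore_var_alloc, heap_lcore_var_ptr, MAX_LCORE, CHUNK_SIZE,
CACHE_LINE) are made up for the example, the handle is assumed to be the
address of lcore 0's value, and C11 aligned_alloc() stands in for the libc
heap interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_LCORE 4     /* hypothetical stand-in for RTE_MAX_LCORE */
#define CHUNK_SIZE 4096 /* hypothetical per-lcore bytes in one chunk */
#define CACHE_LINE 64   /* hypothetical stand-in for RTE_CACHE_LINE_SIZE */

static char *cur_chunk;   /* current chunk, NULL before the first allocation */
static size_t cur_offset; /* per-lcore bytes already used in cur_chunk */

/* Returns a handle, here assumed to be the address of lcore 0's value. */
static void *
heap_lcore_var_alloc(size_t size, size_t align)
{
	size_t offset;

	assert(align != 0 && (align & (align - 1)) == 0);
	assert(align <= CACHE_LINE && size <= CHUNK_SIZE);

	offset = (cur_offset + align - 1) & ~(align - 1);

	if (cur_chunk == NULL || offset + size > CHUNK_SIZE) {
		/* Start a new chunk; old chunks are never freed. */
		size_t chunk_bytes = (size_t)MAX_LCORE * CHUNK_SIZE;

		cur_chunk = aligned_alloc(CACHE_LINE, chunk_bytes);
		if (cur_chunk == NULL)
			abort(); /* heap exhaustion is treated as fatal */
		memset(cur_chunk, 0, chunk_bytes); /* mimic zeroed BSS */
		offset = 0;
	}

	cur_offset = offset + size;

	return cur_chunk + offset;
}

/* Translate a handle into the value belonging to 'lcore_id'. */
static void *
heap_lcore_var_ptr(unsigned int lcore_id, void *handle)
{
	return (char *)handle + (size_t)lcore_id * CHUNK_SIZE;
}
```

With this layout there is no global upper bound to exhaust; only a single
variable larger than CHUNK_SIZE would be rejected.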
>> +
>> +/*
>> + * Avoid using offset zero, since it would result in a NULL-value
>> + * "handle" (offset) pointer, which in principle and per the API
>> + * definition shouldn't be an issue, but may confuse some tools and
>> + * users.
>> + */
>> +#define INITIAL_OFFSET 1
>> +
>> +char rte_lcore_var[RTE_MAX_LCORE][RTE_MAX_LCORE_VAR] __rte_cache_aligned;
>> +
>> +static uintptr_t allocated = INITIAL_OFFSET;
>
> Please add an API to get the amount of allocated lcore variable memory.
> The easy option is to make the above variable public (with a proper name, e.g. rte_lcore_var_allocated).
>
> The total amount of lcore variable memory is already public: RTE_MAX_LCORE_VAR.
>
Makes sense with the RFC v3 design.
If you eliminate the fixed upper bound and use the heap, there shouldn't
be any particular need to track lcore variable memory use separately
from other heap users.
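For the RFC v3 design, such an accessor could be very small. The sketch
below is a stand-alone approximation of the offset allocator in this patch
with a getter added. The getter name lcore_var_allocated() and the helper
align_ceil() are hypothetical, and assert() stands in for RTE_VERIFY().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_LCORE_VAR 1048576 /* RTE_MAX_LCORE_VAR as set by this patch */
#define INITIAL_OFFSET 1      /* offset zero is avoided, as in the patch */

static uintptr_t allocated = INITIAL_OFFSET;

/* Stand-in for RTE_ALIGN_CEIL(); 'align' must be a power of 2. */
static uintptr_t
align_ceil(uintptr_t value, size_t align)
{
	return (value + align - 1) & ~((uintptr_t)align - 1);
}

/* Bump allocator: the returned "pointer" is an offset, not an address. */
static void *
lcore_var_alloc(size_t size, size_t align)
{
	uintptr_t offset = align_ceil(allocated, align);

	assert(offset + size < MAX_LCORE_VAR);
	allocated = offset + size;

	return (void *)offset;
}

/* Hypothetical accessor: bytes consumed per lcore id, padding included. */
static size_t
lcore_var_allocated(void)
{
	return allocated;
}
```

Note that the figure includes the initial offset and any alignment
padding, so it reflects consumed buffer space rather than the sum of the
variables' sizes.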
>> +
>> +static void
>> +verify_allocation(uintptr_t new_allocated)
>> +{
>> + static bool has_warned;
>> +
>> + RTE_VERIFY(new_allocated < RTE_MAX_LCORE_VAR);
>> +
>> + if (new_allocated > (WARN_THRESHOLD * RTE_MAX_LCORE_VAR) / 100 &&
>> + !has_warned) {
>> + EAL_LOG(WARNING, "Per-lcore data usage has exceeded %d%% "
>> + "of the maximum capacity (%d bytes)", WARN_THRESHOLD,
>> + RTE_MAX_LCORE_VAR);
>> + has_warned = true;
>> + }
>> +}
>> +
>> +static void *
>> +lcore_var_alloc(size_t size, size_t align)
>> +{
>> + uintptr_t new_allocated = RTE_ALIGN_CEIL(allocated, align);
>> +
>> + void *offset = (void *)new_allocated;
>> +
>> + new_allocated += size;
>> +
>> + verify_allocation(new_allocated);
>> +
>> + allocated = new_allocated;
>> +
>> + EAL_LOG(DEBUG, "Allocated %"PRIuPTR" bytes of per-lcore data with a "
>> + "%"PRIuPTR"-byte alignment", size, align);
>> +
>> + return offset;
>> +}
>> +
>> +void *
>> +rte_lcore_var_alloc(size_t size, size_t align)
>> +{
>> + /* Having the per-lcore buffer size aligned on cache lines,
>> + * as well as having the base pointer aligned on cache line
>> + * size, assures that aligned offsets also translate to aligned
>> + * pointers across all values.
>> + */
>> + RTE_BUILD_BUG_ON(RTE_MAX_LCORE_VAR % RTE_CACHE_LINE_SIZE != 0);
>> + RTE_ASSERT(align <= RTE_CACHE_LINE_SIZE);
>> +
>> + /* '0' means asking for worst-case alignment requirements */
>> + if (align == 0)
>> + align = alignof(max_align_t);
>> +
>> + RTE_ASSERT(rte_is_power_of_2(align));
>> +
>> + return lcore_var_alloc(size, align);
>> +}
>> diff --git a/lib/eal/common/meson.build b/lib/eal/common/meson.build
>> index 22a626ba6f..d41403680b 100644
>> --- a/lib/eal/common/meson.build
>> +++ b/lib/eal/common/meson.build
>> @@ -18,6 +18,7 @@ sources += files(
>> 'eal_common_interrupts.c',
>> 'eal_common_launch.c',
>> 'eal_common_lcore.c',
>> + 'eal_common_lcore_var.c',
>> 'eal_common_mcfg.c',
>> 'eal_common_memalloc.c',
>> 'eal_common_memory.c',
>> diff --git a/lib/eal/include/meson.build b/lib/eal/include/meson.build
>> index e94b056d46..9449253e23 100644
>> --- a/lib/eal/include/meson.build
>> +++ b/lib/eal/include/meson.build
>> @@ -27,6 +27,7 @@ headers += files(
>> 'rte_keepalive.h',
>> 'rte_launch.h',
>> 'rte_lcore.h',
>> + 'rte_lcore_var.h',
>> 'rte_lock_annotations.h',
>> 'rte_malloc.h',
>> 'rte_mcslock.h',
>> diff --git a/lib/eal/include/rte_lcore_var.h b/lib/eal/include/rte_lcore_var.h
>> new file mode 100644
>> index 0000000000..da49d48d7c
>> --- /dev/null
>> +++ b/lib/eal/include/rte_lcore_var.h
>> @@ -0,0 +1,375 @@
>> +/* SPDX-License-Identifier: BSD-3-Clause
>> + * Copyright(c) 2024 Ericsson AB
>> + */
>> +
>> +#ifndef _RTE_LCORE_VAR_H_
>> +#define _RTE_LCORE_VAR_H_
>> +
>> +/**
>> + * @file
>> + *
>> + * RTE Per-lcore id variables
>> + *
>> + * This API provides a mechanism to create and access per-lcore id
>> + * variables in a space- and cycle-efficient manner.
>> + *
>> + * A per-lcore id variable (or lcore variable for short) has one value
>> + * for each EAL thread and registered non-EAL thread. In other words,
>> + * there's one copy of its value for each and every current and future
>> + * lcore id-equipped thread, with the total number of copies amounting
>> + * to \c RTE_MAX_LCORE.
>> + *
>> + * In order to access the values of an lcore variable, a handle is
>> + * used. The type of the handle is a pointer to the value's type
>> + * (e.g., for a \c uint32_t lcore variable, the handle is a
>> + * <code>uint32_t *</code>). A handle may be passed between modules and
>> + * threads just like any pointer, but its value is not the address of
>> + * any particular object, but rather just an opaque identifier, stored
>> + * in a typed pointer (to inform the access macro the type of values).
>> + *
>> + * @b Creation
>> + *
>> + * An lcore variable is created in two steps:
>> + * 1. Define a lcore variable handle by using \ref RTE_LCORE_VAR_HANDLE.
>> + * 2. Allocate lcore variable storage and initialize the handle with
>> + * a unique identifier by \ref RTE_LCORE_VAR_ALLOC or
>> + * \ref RTE_LCORE_VAR_INIT. Allocation generally occurs at the time of
>> + * module initialization, but may be done at any time.
>> + *
>> + * An lcore variable is not tied to the owning thread's lifetime. It's
>> + * available for use by any thread immediately after having been
>> + * allocated, and continues to be available throughout the lifetime of
>> + * the EAL.
>> + *
>> + * Lcore variables cannot and need not be freed.
>> + *
>> + * @b Access
>> + *
>> + * The value of any lcore variable for any lcore id may be accessed
>> + * from any thread (including unregistered threads), but it should
>> + * generally only be *frequently* read from or written to by the owner.
>> + *
>> + * Values of the same lcore variable but owned by different lcore
>> + * ids *may* be frequently read or written by the owners without the
>> + * risk of false sharing.
>> + *
>> + * An appropriate synchronization mechanism (e.g., atomics) should be
>> + * employed to assure there are no data races between the owning
>> + * thread and any non-owner threads accessing the same lcore variable
>> + * instance.
>> + *
>> + * The value of the lcore variable for a particular lcore id may be
>> + * retrieved with \ref RTE_LCORE_VAR_LCORE_GET. To get a pointer to the
>> + * same object, use \ref RTE_LCORE_VAR_LCORE_PTR.
>> + *
>> + * To modify the value of an lcore variable for a particular lcore id,
>> + * either access the object through the pointer retrieved by \ref
>> + * RTE_LCORE_VAR_LCORE_PTR or, for primitive types, use \ref
>> + * RTE_LCORE_VAR_LCORE_SET.
>> + *
>> + * The access macros each have a short-hand which may be used by an EAL
>> + * thread or registered non-EAL thread to access the lcore variable
>> + * instance of its own lcore id. Those are \ref RTE_LCORE_VAR_GET,
>> + * \ref RTE_LCORE_VAR_PTR, and \ref RTE_LCORE_VAR_SET.
>> + *
>> + * Although the handle (as defined by \ref RTE_LCORE_VAR_HANDLE) is a
>> + * pointer with the same type as the value, it may not be directly
>> + * dereferenced and must be treated as an opaque identifier. The
>> + * *identifier* value is common across all lcore ids.
>> + *
>> + * @b Storage
>> + *
>> + * An lcore variable's values may be of a primitive type like \c int,
>> + * but would more typically be a \c struct. An application may choose
>> + * to define an lcore variable, which it then goes on to never
>> + * allocate.
>> + *
>> + * The lcore variable handle introduces a per-variable (not
>> + * per-value/per-lcore id) overhead of \c sizeof(void *) bytes, so
>> + * there are some memory footprint gains to be made by organizing all
>> + * per-lcore id data for a particular module as one lcore variable
>> + * (e.g., as a struct).
>> + *
>> + * The sum of all lcore variables, plus any padding required, must be
>> + * less than the DPDK build-time constant \c RTE_MAX_LCORE_VAR. A
>> + * violation of this maximum results in the process being terminated.
>> + *
>> + * It's reasonable to expect that \c RTE_MAX_LCORE_VAR is on the
>> + * same order of magnitude in size as a thread stack.
>> + *
>> + * The lcore variable storage buffers are kept in the BSS section in
>> + * the resulting binary, where data generally isn't mapped in until
>> + * it's accessed. This means that unused portions of the lcore
>> + * variable storage area will not occupy any physical memory (with a
>> + * granularity of the memory page size [usually 4 kB]).
>> + *
>> + * Lcore variables should generally *not* be \ref __rte_cache_aligned
>> + * and need *not* include a \ref RTE_CACHE_GUARD field, since these
>> + * constructs are designed to avoid false sharing. In the
>> + * case of an lcore variable instance, all nearby data structures
>> + * should almost-always be written to by a single thread (the lcore
>> + * variable owner). Adding padding will increase the effective memory
>> + * working set size, potentially reducing performance.
>> + *
>> + * @b Example
>> + *
>> + * Below is an example of the use of an lcore variable:
>> + *
>> + * \code{.c}
>> + * struct foo_lcore_state {
>> + * int a;
>> + * long b;
>> + * };
>> + *
>> + * static RTE_LCORE_VAR_HANDLE(struct foo_lcore_state, lcore_states);
>> + *
>> + * long foo_get_a_plus_b(void)
>> + * {
>> + * struct foo_lcore_state *state = RTE_LCORE_VAR_PTR(lcore_states);
>> + *
>> + * return state->a + state->b;
>> + * }
>> + *
>> + * RTE_INIT(rte_foo_init)
>> + * {
>> + * unsigned int lcore_id;
>
> This variable is part of RTE_LCORE_VAR_FOREACH_VALUE(), and can be removed from here.
>
>> + *
>> + * RTE_LCORE_VAR_ALLOC(foo_state);
>
> Typo: foo_state -> lcore_states
>
Will fix.
>> + *
>> + * struct foo_lcore_state *state;
>> + * RTE_LCORE_VAR_FOREACH_VALUE(lcore_states) {
>
> Typo:
> RTE_LCORE_VAR_FOREACH_VALUE(lcore_states)
> ->
> RTE_LCORE_VAR_FOREACH_VALUE(state, lcore_states)
>
Will fix.
>> + * (initialize 'state')
>> + * }
>> + *
>> + * (other initialization)
>> + * }
>> + * \endcode
>> + *
>> + *
>> + * @b Alternatives
>> + *
>> + * Lcore variables are designed to replace a pattern exemplified below:
>> + * \code{.c}
>> + * struct foo_lcore_state {
>> + * int a;
>> + * long b;
>> + * RTE_CACHE_GUARD;
>> + * } __rte_cache_aligned;
>> + *
>> + * static struct foo_lcore_state lcore_states[RTE_MAX_LCORE];
>> + * \endcode
>> + *
>> + * This scheme is simple and effective, but has one drawback: the data
>> + * is organized so that objects related to all lcores for a particular
>> + * module are kept close in memory. At a bare minimum, this forces the
>> + * use of cache-line alignment to avoid false sharing. With CPU
>> + * hardware prefetching and memory loads resulting from speculative
>> + * execution (functions which seemingly are getting more eager faster
>> + * than they are getting more intelligent), one or more "guard" cache
>> + * lines may be required to separate one lcore's data from another's.
>> + *
>> + * Lcore variables have the upside of working with, not against, the
>> + * CPU's assumptions; for example, next-line prefetchers may well
>> + * work the way their designers intended (i.e., to the benefit, not
>> + * detriment, of system performance).
>> + *
>> + * Another alternative to \ref rte_lcore_var.h is the \ref
>> + * rte_per_lcore.h API, which makes use of thread-local storage (TLS,
>> + * e.g., GCC __thread or C11 _Thread_local). The main differences
>> + * between using the various forms of TLS (e.g., \ref
>> + * RTE_DEFINE_PER_LCORE or _Thread_local) and the use of lcore
>> + * variables are:
>> + *
>> + * * The existence and non-existence of a thread-local variable
>> + * instance follow that of a particular thread. The data cannot be
>> + * accessed before the thread has been created, nor after it has
>> + * exited. One effect of this is that thread-local variables must be
>> + * initialized in a "lazy" manner (e.g., at the point of thread
>> + * creation). Lcore variables may be accessed immediately after
>> + * having been allocated (which is usually before any thread beyond
>> + * the main thread is running).
>> + * * A thread-local variable is duplicated across all threads in the
>> + * process, including unregistered non-EAL threads (i.e.,
>> + * "regular" threads). For DPDK applications heavily relying on
>> + * multi-threading (in conjunction with DPDK's "one thread per core"
>> + * pattern), either by having many concurrent threads or
>> + * creating/destroying threads at a high rate, an excessive use of
>> + * thread-local variables may cause inefficiencies (e.g.,
>> + * increased thread creation overhead due to thread-local storage
>> + * initialization or increased total RAM footprint). Lcore
>> + * variables *only* exist for threads with an lcore id, and thus
>> + * not for such "regular" threads.
>> + * * Whether data in thread-local storage may be shared between threads
>> + * (i.e., whether a pointer to a thread-local variable can be passed to
>> + * and successfully dereferenced by a non-owning thread) depends on
>> + * the details of the TLS implementation. With GCC __thread and
>> + * GCC _Thread_local, such data sharing is supported. In the C11
>> + * standard, the result of accessing another thread's
>> + * _Thread_local object is implementation-defined. Lcore variable
>> + * instances may be accessed reliably by any thread.
>> + */
>> +
>> +#ifdef __cplusplus
>> +extern "C" {
>> +#endif
>> +
>> +#include <stddef.h>
>> +#include <stdalign.h>
>> +
>> +#include <rte_common.h>
>> +#include <rte_config.h>
>> +#include <rte_lcore.h>
>> +
>> +/**
>> + * Given the lcore variable type, produces the type of the lcore
>> + * variable handle.
>> + */
>> +#define RTE_LCORE_VAR_HANDLE_TYPE(type) \
>> + type *
>
> This macro seems superfluous.
> In RTE_LCORE_VAR_HANDLE(type, name) just use:
> type * name
> Are there other use cases for it?
>
It's just a marker, like RTE_LCORE_VAR_HANDLE(), to indicate this is not
your average pointer type.
It's not obvious these marker macros make things more clear. One could
just say in the API docs that lcore handles are opaque pointers to the
lcore variable's type, and make clear they may only be dereferenced
through the provided macros.
>> +
>> +/**
>> + * Define a lcore variable handle.
>> + *
>> + * This macro defines a variable which is used as a handle to access
>> + * the various per-lcore id instances of a per-lcore id variable.
>> + *
>> + * The aim with this macro is to make clear at the point of
>> + * declaration that this is an lcore handle, rather than a regular
>> + * pointer.
>> + *
>> + * Add @b static as a prefix in case the lcore variable is only to be
>> + * accessed from a particular translation unit.
>> + */
>> +#define RTE_LCORE_VAR_HANDLE(type, name) \
>> + RTE_LCORE_VAR_HANDLE_TYPE(type) name
>
> Thinking out loud here...
> Consider if this name should be more similar to RTE_DEFINE_PER_LCORE(type, name), e.g. RTE_DEFINE_LCORE_VAR(type, name) or RTE_LCORE_VAR_DEFINE(type, name).
> Using the common prefix RTE_LCORE_VAR is preferable.
> Using the term "handle" indicates that it is opaque and needs to be allocated by an allocation function.
> On the other hand, the "handle" is not unique per thread, so it's not really a "handle".
>
It's a handle to a variable, not a handle to a particular instance of
its values.
>> +
>> +/**
>> + * Allocate space for an lcore variable, and initialize its handle.
>> + */
>> +#define RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(name, size, align) \
>> + name = rte_lcore_var_alloc(size, align)
>> +
>> +/**
>> + * Allocate space for an lcore variable, and initialize its handle,
>> + * with values aligned for any type of object.
>> + */
>> +#define RTE_LCORE_VAR_ALLOC_SIZE(name, size) \
>> + name = rte_lcore_var_alloc(size, 0)
>> +
>> +/**
>> + * Allocate space for an lcore variable of the size and alignment
>> + * requirements suggested by the handle pointer type, and initialize
>> + * its handle.
>> + */
>> +#define RTE_LCORE_VAR_ALLOC(name) \
>> + RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(name, sizeof(*(name)), \
>> + alignof(typeof(*(name))))
>> +
>> +/**
>> + * Allocate an explicitly-sized, explicitly-aligned lcore variable by
>> + * means of a \ref RTE_INIT constructor.
>> + */
>> +#define RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, align) \
>> + RTE_INIT(rte_lcore_var_init_ ## name) \
>> + { \
>> + RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(name, size, align); \
>> + }
>> +
>> +/**
>> + * Allocate an explicitly-sized lcore variable by means of a \ref
>> + * RTE_INIT constructor.
>> + */
>> +#define RTE_LCORE_VAR_INIT_SIZE(name, size) \
>> + RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, 0)
>> +
>> +/**
>> + * Allocate an lcore variable by means of a \ref RTE_INIT constructor.
>> + */
>> +#define RTE_LCORE_VAR_INIT(name) \
>> + RTE_INIT(rte_lcore_var_init_ ## name) \
>> + { \
>> + RTE_LCORE_VAR_ALLOC(name); \
>> + }
>> +
>> +#define __RTE_LCORE_VAR_LCORE_PTR(lcore_id, name) \
>> + ((void *)(&rte_lcore_var[lcore_id][(uintptr_t)(name)]))
>
> This macro also seems superfluous.
> Doesn't RTE_LCORE_VAR_LCORE_PTR() suffice?
>
It's just functional decomposition (but for macros). To make the whole
thing a little more readable.
Maybe I should change "name" to "handle" in this and other instances
(e.g., RTE_LCORE_VAR_LCORE_PTR).
>> +
>> +/**
>> + * Get pointer to lcore variable instance with the specified lcore id.
>> + */
>> +#define RTE_LCORE_VAR_LCORE_PTR(lcore_id, name) \
>> + ((typeof(name))__RTE_LCORE_VAR_LCORE_PTR(lcore_id, name))
>
> This uses type casting.
> I wonder if additional build-time type checking would be possible...
> Nice to have: The compiler should fail if name is not a pointer, but a struct or an uint64_t, or even an uintptr_t.
>
There is no way to compare the type of the lcore variable (at the point
of declaration) with the type of the handle pointer at the point of
handle "dereferencing" (which is essentially what this macro does).
You can't cast a struct to a pointer. You could assure it's a pointer by
replacing the __RTE_LCORE_VAR_LCORE_PTR() with
static inline void *
__rte_lcore_var_lcore_ptr(unsigned int lcore_id, void *handle)
{
	return (void *)&rte_lcore_var[lcore_id][(uintptr_t)handle];
}
(Bad practice to use a macro when a function can do the job anyway.)
Maybe this function shouldn't even have the "__" prefix. There could well
be valid use cases where you want void *-typed access to an lcore variable
value.
I'll use a function in the next RFC version.
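As a rough, self-contained illustration of the function-based approach
(not the actual next RFC version): the buffer lcore_var, its dimensions
and the un-prefixed names below are stand-ins invented for the example,
and __typeof__ is the GNU spelling of typeof.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_LCORE 4       /* hypothetical stand-in for RTE_MAX_LCORE */
#define MAX_LCORE_VAR 128 /* hypothetical stand-in for RTE_MAX_LCORE_VAR */

/* Stand-in for rte_lcore_var; each row starts on a 64-byte boundary. */
static _Alignas(64) char lcore_var[MAX_LCORE][MAX_LCORE_VAR];

/*
 * Because 'handle' is a void * parameter, a struct argument is
 * rejected by the compiler, and an integer argument (e.g., uint64_t
 * or uintptr_t) requires a diagnostic, since integers do not
 * implicitly convert to pointers.
 */
static inline void *
lcore_var_lcore_ptr(unsigned int lcore_id, void *handle)
{
	assert(lcore_id < MAX_LCORE);

	return (void *)&lcore_var[lcore_id][(uintptr_t)handle];
}

/* Typed wrapper; the cast restores the handle's pointer type. */
#define LCORE_VAR_LCORE_PTR(lcore_id, handle) \
	((__typeof__(handle))lcore_var_lcore_ptr(lcore_id, handle))
```

A handle such as "uint32_t *h" then works as before, while something like
"uint64_t h" no longer passes silently through the (uintptr_t) cast.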
>> +
>> +/**
>> + * Get value of a lcore variable instance of the specified lcore id.
>> + */
>> +#define RTE_LCORE_VAR_LCORE_GET(lcore_id, name) \
>> + (*(RTE_LCORE_VAR_LCORE_PTR(lcore_id, name)))
>
> The four accessor functions, RTE_LCORE_VAR[_LCORE]_GET/SET(), seem superfluous.
> They make the API seem more complex than just using RTE_LCORE_VAR[_LCORE]_PTR() for access.
>
They are (somewhat) useful when the value is a primitive type.
RTE_LCORE_VAR_SET(my_int, 17);
versus
*RTE_LCORE_VAR_PTR(my_int) = 17;
Former is slightly more readable, imo, but I agree with you that these
macros do clutter up the API.
>> +
>> +/**
>> + * Set the value of a lcore variable instance of the specified lcore id.
>> + */
>> +#define RTE_LCORE_VAR_LCORE_SET(lcore_id, name, value) \
>> + (*(RTE_LCORE_VAR_LCORE_PTR(lcore_id, name)) = (value))
>> +
>> +/**
>> + * Get pointer to lcore variable instance of the current thread.
>> + *
>> + * May only be used by EAL threads and registered non-EAL threads.
>> + */
>> +#define RTE_LCORE_VAR_PTR(name) RTE_LCORE_VAR_LCORE_PTR(rte_lcore_id(), name)
>> +
>> +/**
>> + * Get value of lcore variable instance of the current thread.
>> + *
>> + * May only be used by EAL threads and registered non-EAL threads.
>> + */
>> +#define RTE_LCORE_VAR_GET(name) RTE_LCORE_VAR_LCORE_GET(rte_lcore_id(), name)
>> +
>> +/**
>> + * Set value of lcore variable instance of the current thread.
>> + *
>> + * May only be used by EAL threads and registered non-EAL threads.
>> + */
>> +#define RTE_LCORE_VAR_SET(name, value) \
>> + RTE_LCORE_VAR_LCORE_SET(rte_lcore_id(), name, value)
>> +
>> +/**
>> + * Iterate over each lcore id's value for a lcore variable.
>> + */
>> +#define RTE_LCORE_VAR_FOREACH_VALUE(var, name) \
>> + for (unsigned int lcore_id = \
>> + (((var) = RTE_LCORE_VAR_LCORE_PTR(0, name)), 0); \
>> + lcore_id < RTE_MAX_LCORE; \
>> + lcore_id++, (var) = RTE_LCORE_VAR_LCORE_PTR(lcore_id, name))
>
> RTE_LCORE_VAR_FOREACH_PTR(ptr, name) would be an even better name; considering that "var" is really a pointer.
>
No, it's for each value, referenced via the pointer.
RTE_LCORE_VAR_FOREACH_VALUE_PTR() is too long.
I'll change "var" -> "ptr".
> I also wonder about build-time type checking here...
> Nice to have: The compiler should fail if "ptr" is not a pointer.
>
I agree.
>> +
>> +extern char rte_lcore_var[RTE_MAX_LCORE][RTE_MAX_LCORE_VAR];
>> +
>> +/**
>> + * Allocate space in the per-lcore id buffers for a lcore variable.
>> + *
>> + * The pointer returned is only an opaque identifer of the variable. To
>> + * get an actual pointer to a particular instance of the variable use
>> + * \ref RTE_LCORE_VAR_PTR or \ref RTE_LCORE_VAR_LCORE_PTR.
>> + *
>> + * The allocation is always successful, barring a fatal exhaustion of
>> + * the per-lcore id buffer space.
>> + *
>> + * @param size
>> + * The size (in bytes) of the variable's per-lcore id value.
>> + * @param align
>> + * If 0, the values will be suitably aligned for any kind of type
>> + * (i.e., alignof(max_align_t)). Otherwise, the values will be aligned
>> + * on a multiple of *align*, which must be a power of 2 and equal or
>> + * less than \c RTE_CACHE_LINE_SIZE.
>> + * @return
>> + * The id of the variable, stored in a void pointer value.
>> + */
>> +__rte_experimental
>> +void *
>> +rte_lcore_var_alloc(size_t size, size_t align);
>> +
>> +#ifdef __cplusplus
>> +}
>> +#endif
>> +
>> +#endif /* _RTE_LCORE_VAR_H_ */
>> diff --git a/lib/eal/version.map b/lib/eal/version.map
>> index 5e0cd47c82..e90b86115a 100644
>> --- a/lib/eal/version.map
>> +++ b/lib/eal/version.map
>> @@ -393,6 +393,10 @@ EXPERIMENTAL {
>> # added in 23.07
>> rte_memzone_max_get;
>> rte_memzone_max_set;
>> +
>> + # added in 24.03
>> + rte_lcore_var_alloc;
>> + rte_lcore_var;
>> };
>>
>> INTERNAL {
>> --
>> 2.34.1
>
> Acked-by: Morten Brørup <mb@smartsharesystems.com>
>
* [RFC v4 0/6] Lcore variables
2024-02-20 8:49 ` [RFC v3 1/6] eal: add static per-lcore memory allocation facility Mattias Rönnblom
` (2 preceding siblings ...)
2024-02-22 9:22 ` Morten Brørup
@ 2024-02-25 15:03 ` Mattias Rönnblom
2024-02-25 15:03 ` [RFC v4 1/6] eal: add static per-lcore memory allocation facility Mattias Rönnblom
` (5 more replies)
3 siblings, 6 replies; 323+ messages in thread
From: Mattias Rönnblom @ 2024-02-25 15:03 UTC (permalink / raw)
To: dev; +Cc: hofors, Morten Brørup, Stephen Hemminger, Mattias Rönnblom
This RFC presents a new API <rte_lcore_var.h> for static per-lcore id
data allocation.
Please refer to the <rte_lcore_var.h> API documentation for both a
rationale for this new API, and a comparison to the alternatives
available.
The adoption of this API would affect many different DPDK modules, but
the author updated only a few, mostly to serve as examples in this
RFC, and to iron out some, but surely not all, wrinkles in the API.
The question of how to best allocate static per-lcore memory has come
up several times on the dev mailing list, for example in the thread on
the "random: use per lcore state" RFC by Stephen Hemminger.
Lcore variables are surely not the answer to all your per-lcore-data
needs, since they only allow for more-or-less static allocation. In the
author's opinion, they do however provide a reasonably simple, clean,
and seemingly very performant solution to a real problem.
One thing that is unclear to the author is how this API relates to a
potential future per-lcore dynamic allocator (e.g., a per-lcore heap).
Contrary to what the version.map edit suggests, this RFC is not meant
as a proposal for DPDK 24.03.
Mattias Rönnblom (6):
eal: add static per-lcore memory allocation facility
eal: add lcore variable test suite
random: keep PRNG state in lcore variable
power: keep per-lcore state in lcore variable
service: keep per-lcore state in lcore variable
eal: keep per-lcore power intrinsics state in lcore variable
app/test/meson.build | 1 +
app/test/test_lcore_var.c | 439 ++++++++++++++++++++++++++
config/rte_config.h | 1 +
doc/api/doxy-api-index.md | 1 +
lib/eal/common/eal_common_lcore_var.c | 68 ++++
lib/eal/common/meson.build | 1 +
lib/eal/common/rte_random.c | 30 +-
lib/eal/common/rte_service.c | 120 ++++---
lib/eal/include/meson.build | 1 +
lib/eal/include/rte_lcore_var.h | 375 ++++++++++++++++++++++
lib/eal/version.map | 4 +
lib/eal/x86/rte_power_intrinsics.c | 17 +-
lib/power/rte_power_pmd_mgmt.c | 36 +--
13 files changed, 1006 insertions(+), 88 deletions(-)
create mode 100644 app/test/test_lcore_var.c
create mode 100644 lib/eal/common/eal_common_lcore_var.c
create mode 100644 lib/eal/include/rte_lcore_var.h
--
2.34.1
* [RFC v4 1/6] eal: add static per-lcore memory allocation facility
2024-02-25 15:03 ` [RFC v4 0/6] Lcore variables Mattias Rönnblom
@ 2024-02-25 15:03 ` Mattias Rönnblom
2024-02-27 9:58 ` Morten Brørup
2024-02-28 10:09 ` [RFC v5 0/6] Lcore variables Mattias Rönnblom
2024-02-25 15:03 ` [RFC v4 2/6] eal: add lcore variable test suite Mattias Rönnblom
` (4 subsequent siblings)
5 siblings, 2 replies; 323+ messages in thread
From: Mattias Rönnblom @ 2024-02-25 15:03 UTC (permalink / raw)
To: dev; +Cc: hofors, Morten Brørup, Stephen Hemminger, Mattias Rönnblom
Introduce DPDK per-lcore id variables, or lcore variables for short.
An lcore variable has one value for every current and future lcore
id-equipped thread.
The primary <rte_lcore_var.h> use case is for statically allocating
small chunks of often-used data, which is related logically, but where
there are performance benefits to reap from having updates being local
to an lcore.
Lcore variables are similar to thread-local storage (TLS, e.g., C11
_Thread_local), but decouple the values' lifetime from that of the
threads.
Lcore variables are also similar in functionality to the FreeBSD
kernel's DPCPU_*() family of macros and the associated
build-time machinery. DPCPU uses linker scripts, which effectively
prevents the reuse of its otherwise seemingly viable approach.
The currently-prevailing way to solve the same problem as lcore
variables is to keep a module's per-lcore data as an RTE_MAX_LCORE-sized
array of cache-aligned, RTE_CACHE_GUARDed structs. The benefit of
lcore variables over this approach is that data related to the same
lcore is now close (spatially, in memory), rather than data used by
the same module. This in turn avoids excessive use of padding, which
pollutes caches with unused data.
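To make the padding cost concrete, here is a minimal sketch comparing the per-lcore footprint of the two layouts. It assumes a 64-byte cache line and one guard line; the struct names, SKETCH_CACHE_LINE, and the guard member (standing in for RTE_CACHE_GUARD) are hypothetical.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

#define SKETCH_CACHE_LINE 64 /* assumed cache line size */

/*
 * Prevailing pattern: each array element is cache-line aligned and
 * carries one guard line, so two lcores never share a line.
 */
struct padded_state {
	alignas(SKETCH_CACHE_LINE) int a;
	long b;
	char guard[SKETCH_CACHE_LINE]; /* stand-in for RTE_CACHE_GUARD */
};

/* Lcore variable value: only the payload, no alignment or guard. */
struct plain_state {
	int a;
	long b;
};

/* Bytes one module occupies per lcore with the padded-array pattern. */
static inline size_t
padded_bytes_per_lcore(void)
{
	return sizeof(struct padded_state);
}

/* Bytes the same module occupies per lcore as an lcore variable. */
static inline size_t
plain_bytes_per_lcore(void)
{
	return sizeof(struct plain_state);
}
```

Under these assumptions the padded element rounds up to two full cache lines for a payload of at most 16 bytes, while the lcore variable value is packed next to other modules' data for the same lcore.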
RFC v4:
* Replace large static array with libc heap-allocated memory. One
implication of this change is that there no longer exists a fixed upper
bound for the total amount of memory used by lcore variables.
RTE_MAX_LCORE_VAR has changed meaning, and now represents the
maximum size of any individual lcore variable value.
* Fix issues in example. (Morten Brørup)
* Improve access macro type checking. (Morten Brørup)
* Refer to the lcore variable handle as "handle" and not "name" in
various macros.
* Document lack of thread safety in rte_lcore_var_alloc().
* Provide API-level assurance the lcore variable handle is
always non-NULL, to allow applications to use NULL to mean
"not yet allocated".
* Note zero-sized allocations are not allowed.
* Give API-level guarantee the lcore variable values are zeroed.
RFC v3:
* Replace use of GCC-specific alignof(<expression>) with alignof(<type>).
* Update example to reflect FOREACH macro name change (in RFC v2).
RFC v2:
* Use alignof to derive alignment requirements. (Morten Brørup)
* Change name of FOREACH to make it distinct from <rte_lcore.h>'s
*per-EAL-thread* RTE_LCORE_FOREACH(). (Morten Brørup)
* Allow user-specified alignment, but limit max to cache line size.
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
---
config/rte_config.h | 1 +
doc/api/doxy-api-index.md | 1 +
lib/eal/common/eal_common_lcore_var.c | 68 +++++
lib/eal/common/meson.build | 1 +
lib/eal/include/meson.build | 1 +
lib/eal/include/rte_lcore_var.h | 375 ++++++++++++++++++++++++++
lib/eal/version.map | 4 +
7 files changed, 451 insertions(+)
create mode 100644 lib/eal/common/eal_common_lcore_var.c
create mode 100644 lib/eal/include/rte_lcore_var.h
diff --git a/config/rte_config.h b/config/rte_config.h
index d743a5c3d3..0dac33d3b9 100644
--- a/config/rte_config.h
+++ b/config/rte_config.h
@@ -41,6 +41,7 @@
/* EAL defines */
#define RTE_CACHE_GUARD_LINES 1
#define RTE_MAX_HEAPS 32
+#define RTE_MAX_LCORE_VAR 1048576
#define RTE_MAX_MEMSEG_LISTS 128
#define RTE_MAX_MEMSEG_PER_LIST 8192
#define RTE_MAX_MEM_MB_PER_LIST 32768
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index 8c1eb8fafa..a3b8391570 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -99,6 +99,7 @@ The public API headers are grouped by topics:
[interrupts](@ref rte_interrupts.h),
[launch](@ref rte_launch.h),
[lcore](@ref rte_lcore.h),
+ [lcore-variable](@ref rte_lcore_var.h),
[per-lcore](@ref rte_per_lcore.h),
[service cores](@ref rte_service.h),
[keepalive](@ref rte_keepalive.h),
diff --git a/lib/eal/common/eal_common_lcore_var.c b/lib/eal/common/eal_common_lcore_var.c
new file mode 100644
index 0000000000..5c353ebd46
--- /dev/null
+++ b/lib/eal/common/eal_common_lcore_var.c
@@ -0,0 +1,68 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Ericsson AB
+ */
+
+#include <inttypes.h>
+
+#include <rte_common.h>
+#include <rte_debug.h>
+#include <rte_log.h>
+
+#include <rte_lcore_var.h>
+
+#include "eal_private.h"
+
+#define LCORE_BUFFER_SIZE (RTE_MAX_LCORE_VAR * RTE_MAX_LCORE)
+
+static void *lcore_buffer;
+static size_t offset = RTE_MAX_LCORE_VAR;
+
+static void *
+lcore_var_alloc(size_t size, size_t align)
+{
+ void *handle;
+ void *value;
+
+ offset = RTE_ALIGN_CEIL(offset, align);
+
+ if (offset + size > RTE_MAX_LCORE_VAR) {
+ lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE,
+ LCORE_BUFFER_SIZE);
+ RTE_VERIFY(lcore_buffer != NULL);
+
+ offset = 0;
+ }
+
+ handle = RTE_PTR_ADD(lcore_buffer, offset);
+
+ offset += size;
+
+ RTE_LCORE_VAR_FOREACH_VALUE(value, handle)
+ memset(value, 0, size);
+
+ EAL_LOG(DEBUG, "Allocated %zu bytes of per-lcore data with a "
+ "%zu-byte alignment", size, align);
+
+ return handle;
+}
+
+void *
+rte_lcore_var_alloc(size_t size, size_t align)
+{
+ /* Having the per-lcore buffer size aligned on cache lines, as
+ * well as having the base pointer aligned on cache line size,
+ * assures that aligned offsets also translate to aligned
+ * pointers across all values.
+ */
+ RTE_BUILD_BUG_ON(RTE_MAX_LCORE_VAR % RTE_CACHE_LINE_SIZE != 0);
+ RTE_ASSERT(align <= RTE_CACHE_LINE_SIZE);
+ RTE_ASSERT(size <= RTE_MAX_LCORE_VAR);
+
+ /* '0' means asking for worst-case alignment requirements */
+ if (align == 0)
+ align = alignof(max_align_t);
+
+ RTE_ASSERT(rte_is_power_of_2(align));
+
+ return lcore_var_alloc(size, align);
+}
diff --git a/lib/eal/common/meson.build b/lib/eal/common/meson.build
index 22a626ba6f..d41403680b 100644
--- a/lib/eal/common/meson.build
+++ b/lib/eal/common/meson.build
@@ -18,6 +18,7 @@ sources += files(
'eal_common_interrupts.c',
'eal_common_launch.c',
'eal_common_lcore.c',
+ 'eal_common_lcore_var.c',
'eal_common_mcfg.c',
'eal_common_memalloc.c',
'eal_common_memory.c',
diff --git a/lib/eal/include/meson.build b/lib/eal/include/meson.build
index e94b056d46..9449253e23 100644
--- a/lib/eal/include/meson.build
+++ b/lib/eal/include/meson.build
@@ -27,6 +27,7 @@ headers += files(
'rte_keepalive.h',
'rte_launch.h',
'rte_lcore.h',
+ 'rte_lcore_var.h',
'rte_lock_annotations.h',
'rte_malloc.h',
'rte_mcslock.h',
diff --git a/lib/eal/include/rte_lcore_var.h b/lib/eal/include/rte_lcore_var.h
new file mode 100644
index 0000000000..09a7c7d4f6
--- /dev/null
+++ b/lib/eal/include/rte_lcore_var.h
@@ -0,0 +1,375 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Ericsson AB
+ */
+
+#ifndef _RTE_LCORE_VAR_H_
+#define _RTE_LCORE_VAR_H_
+
+/**
+ * @file
+ *
+ * RTE Per-lcore id variables
+ *
+ * This API provides a mechanism to create and access per-lcore id
+ * variables in a space- and cycle-efficient manner.
+ *
+ * A per-lcore id variable (or lcore variable for short) has one value
+ * for each EAL thread and registered non-EAL thread. In other words,
+ * there's one copy of its value for each and every current and future
+ * lcore id-equipped thread, with the total number of copies amounting
+ * to \c RTE_MAX_LCORE.
+ *
+ * In order to access the values of an lcore variable, a handle is
+ * used. The type of the handle is a pointer to the value's type
+ * (e.g., for a \c uint32_t lcore variable, the handle is a
+ * <code>uint32_t *</code>). A handle may be passed between modules and
+ * threads just like any pointer, but its value is not the address of
+ * any particular object, but rather just an opaque identifier, stored
+ * in a typed pointer (to inform the access macro of the value's type).
+ *
+ * @b Creation
+ *
+ * An lcore variable is created in two steps:
+ * 1. Define an lcore variable handle by using \ref RTE_LCORE_VAR_HANDLE.
+ * 2. Allocate lcore variable storage and initialize the handle with
+ * a unique identifier by \ref RTE_LCORE_VAR_ALLOC or
+ * \ref RTE_LCORE_VAR_INIT. Allocation generally occurs at the time of
+ * module initialization, but may be done at any time.
+ *
+ * An lcore variable is not tied to the owning thread's lifetime. It's
+ * available for use by any thread immediately after having been
+ * allocated, and continues to be available throughout the lifetime of
+ * the EAL.
+ *
+ * Lcore variables cannot and need not be freed.
+ *
+ * @b Access
+ *
+ * The value of any lcore variable for any lcore id may be accessed
+ * from any thread (including unregistered threads), but it should
+ * generally only be *frequently* read from or written to by the owner.
+ *
+ * Values of the same lcore variable but owned by different lcore
+ * ids *may* be frequently read or written by the owners without the
+ * risk of false sharing.
+ *
+ * An appropriate synchronization mechanism (e.g., atomics) should be
+ * employed to assure there are no data races between the owning
+ * thread and any non-owner threads accessing the same lcore variable
+ * instance.
+ *
+ * The value of the lcore variable for a particular lcore id may be
+ * retrieved with \ref RTE_LCORE_VAR_LCORE_GET. To get a pointer to the
+ * same object, use \ref RTE_LCORE_VAR_LCORE_PTR.
+ *
+ * To modify the value of an lcore variable for a particular lcore id,
+ * either access the object through the pointer retrieved by \ref
+ * RTE_LCORE_VAR_LCORE_PTR or, for primitive types, use \ref
+ * RTE_LCORE_VAR_LCORE_SET.
+ *
+ * Each access macro has a shorthand which may be used by an EAL
+ * thread or registered non-EAL thread to access the lcore variable
+ * instance of its own lcore id. Those are \ref RTE_LCORE_VAR_GET,
+ * \ref RTE_LCORE_VAR_PTR, and \ref RTE_LCORE_VAR_SET.
+ *
+ * Although the handle (as defined by \ref RTE_LCORE_VAR_HANDLE) is a
+ * pointer with the same type as the value, it may not be directly
+ * dereferenced and must be treated as an opaque identifier. The
+ * *identifier* value is common across all lcore ids.
+ *
+ * @b Storage
+ *
+ * An lcore variable's values may be of a primitive type like \c int,
+ * but would more typically be a \c struct. An application may choose
+ * to define an lcore variable, which it then goes on to never
+ * allocate.
+ *
+ * The lcore variable handle introduces a per-variable (not
+ * per-value/per-lcore id) overhead of \c sizeof(void *) bytes, so
+ * there are some memory footprint gains to be made by organizing all
+ * per-lcore id data for a particular module as one lcore variable
+ * (e.g., as a struct).
+ *
+ * The size of an lcore variable's value must not exceed the DPDK
+ * build-time constant \c RTE_MAX_LCORE_VAR.
+ *
+ * Lcore variables are stored in a series of lcore buffers, which
+ * are allocated from the libc heap. Heap allocation failures are
+ * treated as fatal.
+ *
+ * Lcore variables should generally *not* be \ref __rte_cache_aligned
+ * and need *not* include a \ref RTE_CACHE_GUARD field, since these
+ * constructs are designed to avoid false sharing. In the
+ * case of an lcore variable instance, all nearby data structures
+ * should almost always be written to by a single thread (the lcore
+ * variable owner). Adding padding will increase the effective memory
+ * working set size, potentially reducing performance.
+ *
+ * @b Example
+ *
+ * Below is an example of the use of an lcore variable:
+ *
+ * \code{.c}
+ * struct foo_lcore_state {
+ * int a;
+ * long b;
+ * };
+ *
+ * static RTE_LCORE_VAR_HANDLE(struct foo_lcore_state, lcore_states);
+ *
+ * long foo_get_a_plus_b(void)
+ * {
+ * struct foo_lcore_state *state = RTE_LCORE_VAR_PTR(lcore_states);
+ *
+ * return state->a + state->b;
+ * }
+ *
+ * RTE_INIT(rte_foo_init)
+ * {
+ * RTE_LCORE_VAR_ALLOC(lcore_states);
+ *
+ * struct foo_lcore_state *state;
+ * RTE_LCORE_VAR_FOREACH_VALUE(state, lcore_states) {
+ * (initialize 'state')
+ * }
+ *
+ * (other initialization)
+ * }
+ * \endcode
+ *
+ *
+ * @b Alternatives
+ *
+ * Lcore variables are designed to replace a pattern exemplified below:
+ * \code{.c}
+ * struct foo_lcore_state {
+ * int a;
+ * long b;
+ * RTE_CACHE_GUARD;
+ * } __rte_cache_aligned;
+ *
+ * static struct foo_lcore_state lcore_states[RTE_MAX_LCORE];
+ * \endcode
+ *
+ * This scheme is simple and effective, but has one drawback: the data
+ * is organized so that objects related to all lcores for a particular
+ * module are kept close in memory. At a bare minimum, this forces the
+ * use of cache-line alignment to avoid false sharing. With CPU
+ * hardware prefetching and memory loads resulting from speculative
+ * execution (functions which seemingly are getting more eager faster
+ * than they are getting more intelligent), one or more "guard" cache
+ * lines may be required to separate one lcore's data from another's.
+ *
+ * Lcore variables have the upside of working with, not against, the
+ * CPU's assumptions and for example next-line prefetchers may well
+ * work the way their designers intended (i.e., to the benefit, not
+ * detriment, of system performance).
+ *
+ * Another alternative to \ref rte_lcore_var.h is the \ref
+ * rte_per_lcore.h API, which makes use of thread-local storage (TLS,
+ * e.g., GCC __thread or C11 _Thread_local). The main differences
+ * between using the various forms of TLS (e.g., \ref
+ * RTE_DEFINE_PER_LCORE or _Thread_local) and the use of lcore
+ * variables are:
+ *
+ * * The existence and non-existence of a thread-local variable
+ * instance follow that of the particular thread. The data cannot be
+ * accessed before the thread has been created, nor after it has
+ * exited. One effect of this is thread-local variables must be
+ * initialized in a "lazy" manner (e.g., at the point of thread
+ * creation). Lcore variables may be accessed immediately after
+ * having been allocated (which is usually prior to any thread beyond
+ * the main thread running).
+ * * A thread-local variable is duplicated across all threads in the
+ * process, including unregistered non-EAL threads (i.e.,
+ * "regular" threads). For DPDK applications heavily relying on
+ * multi-threading (in conjunction with DPDK's "one thread per core"
+ * pattern), either by having many concurrent threads or
+ * creating/destroying threads at a high rate, an excessive use of
+ * thread-local variables may cause inefficiencies (e.g.,
+ * increased thread creation overhead due to thread-local storage
+ * initialization or increased total RAM footprint). Lcore
+ * variables *only* exist for threads with an lcore id, and thus
+ * not for such "regular" threads.
+ * * Whether data in thread-local storage may be shared between threads
+ * (i.e., whether a pointer to a thread-local variable can be passed to
+ * and successfully dereferenced by a non-owning thread) depends on
+ * the details of the TLS implementation. With GCC __thread and
+ * GCC _Thread_local, such data sharing is supported. In the C11
+ * standard, the result of accessing another thread's
+ * _Thread_local object is implementation-defined. Lcore variable
+ * instances may be accessed reliably by any thread.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stddef.h>
+#include <stdalign.h>
+
+#include <rte_common.h>
+#include <rte_config.h>
+#include <rte_lcore.h>
+
+/**
+ * Given the lcore variable type, produces the type of the lcore
+ * variable handle.
+ */
+#define RTE_LCORE_VAR_HANDLE_TYPE(type) \
+ type *
+
+/**
+ * Define an lcore variable handle.
+ *
+ * This macro defines a variable which is used as a handle to access
+ * the various per-lcore id instances of a per-lcore id variable.
+ *
+ * The aim with this macro is to make clear at the point of
+ * declaration that this is an lcore variable handle, rather than a
+ * regular pointer.
+ *
+ * Add @b static as a prefix in case the lcore variable is only to be
+ * accessed from a particular translation unit.
+ */
+#define RTE_LCORE_VAR_HANDLE(type, name) \
+ RTE_LCORE_VAR_HANDLE_TYPE(type) name
+
+/**
+ * Allocate space for an lcore variable, and initialize its handle.
+ */
+#define RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, align) \
+ handle = rte_lcore_var_alloc(size, align)
+
+/**
+ * Allocate space for an lcore variable, and initialize its handle,
+ * with values aligned for any type of object.
+ */
+#define RTE_LCORE_VAR_ALLOC_SIZE(handle, size) \
+ RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, 0)
+
+/**
+ * Allocate space for an lcore variable of the size and alignment requirements
+ * suggested by the handle pointer type, and initialize its handle.
+ */
+#define RTE_LCORE_VAR_ALLOC(handle) \
+ RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, sizeof(*(handle)), \
+ alignof(typeof(*(handle))))
+
+/**
+ * Allocate an explicitly-sized, explicitly-aligned lcore variable by
+ * means of a \ref RTE_INIT constructor.
+ */
+#define RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, align) \
+ RTE_INIT(rte_lcore_var_init_ ## name) \
+ { \
+ RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(name, size, align); \
+ }
+
+/**
+ * Allocate an explicitly-sized lcore variable by means of a \ref
+ * RTE_INIT constructor.
+ */
+#define RTE_LCORE_VAR_INIT_SIZE(name, size) \
+ RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, 0)
+
+/**
+ * Allocate an lcore variable by means of a \ref RTE_INIT constructor.
+ */
+#define RTE_LCORE_VAR_INIT(name) \
+ RTE_INIT(rte_lcore_var_init_ ## name) \
+ { \
+ RTE_LCORE_VAR_ALLOC(name); \
+ }
+
+static inline void *
+__rte_lcore_var_lcore_ptr(unsigned int lcore_id, void *handle)
+{
+ return RTE_PTR_ADD(handle, lcore_id * RTE_MAX_LCORE_VAR);
+}
+
+/**
+ * Get pointer to lcore variable instance with the specified lcore id.
+ */
+#define RTE_LCORE_VAR_LCORE_PTR(lcore_id, handle) \
+ ((typeof(handle))__rte_lcore_var_lcore_ptr(lcore_id, handle))
+
+/**
+ * Get value of a lcore variable instance of the specified lcore id.
+ */
+#define RTE_LCORE_VAR_LCORE_GET(lcore_id, handle) \
+ (*(RTE_LCORE_VAR_LCORE_PTR(lcore_id, handle)))
+
+/**
+ * Set the value of a lcore variable instance of the specified lcore id.
+ */
+#define RTE_LCORE_VAR_LCORE_SET(lcore_id, handle, value) \
+ (*(RTE_LCORE_VAR_LCORE_PTR(lcore_id, handle)) = (value))
+
+/**
+ * Get pointer to lcore variable instance of the current thread.
+ *
+ * May only be used by EAL threads and registered non-EAL threads.
+ */
+#define RTE_LCORE_VAR_PTR(handle) \
+ RTE_LCORE_VAR_LCORE_PTR(rte_lcore_id(), handle)
+
+/**
+ * Get value of lcore variable instance of the current thread.
+ *
+ * May only be used by EAL threads and registered non-EAL threads.
+ */
+#define RTE_LCORE_VAR_GET(handle) \
+ RTE_LCORE_VAR_LCORE_GET(rte_lcore_id(), handle)
+
+/**
+ * Set value of lcore variable instance of the current thread.
+ *
+ * May only be used by EAL threads and registered non-EAL threads.
+ */
+#define RTE_LCORE_VAR_SET(handle, value) \
+ RTE_LCORE_VAR_LCORE_SET(rte_lcore_id(), handle, value)
+
+/**
+ * Iterate over each lcore id's value for a lcore variable.
+ */
+#define RTE_LCORE_VAR_FOREACH_VALUE(var, handle) \
+ for (unsigned int lcore_id = \
+ (((var) = RTE_LCORE_VAR_LCORE_PTR(0, handle)), 0); \
+ lcore_id < RTE_MAX_LCORE; \
+ lcore_id++, (var) = RTE_LCORE_VAR_LCORE_PTR(lcore_id, handle))
+
+/**
+ * Allocate space in the per-lcore id buffers for an lcore variable.
+ *
+ * The pointer returned is only an opaque identifier of the variable. To
+ * get an actual pointer to a particular instance of the variable use
+ * \ref RTE_LCORE_VAR_PTR or \ref RTE_LCORE_VAR_LCORE_PTR.
+ *
+ * The lcore variable values' memory is set to zero.
+ *
+ * The allocation is always successful, barring a fatal exhaustion of
+ * the per-lcore id buffer space.
+ *
+ * rte_lcore_var_alloc() is not multi-thread safe.
+ *
+ * @param size
+ * The size (in bytes) of the variable's per-lcore id value. Must be > 0.
+ * @param align
+ * If 0, the values will be suitably aligned for any kind of type
+ * (i.e., alignof(max_align_t)). Otherwise, the values will be aligned
+ * on a multiple of *align*, which must be a power of 2 and less than
+ * or equal to \c RTE_CACHE_LINE_SIZE.
+ * @return
+ * The id of the variable, stored in a void pointer value. The value
+ * is always non-NULL.
+ */
+__rte_experimental
+void *
+rte_lcore_var_alloc(size_t size, size_t align);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_LCORE_VAR_H_ */
diff --git a/lib/eal/version.map b/lib/eal/version.map
index 5e0cd47c82..e90b86115a 100644
--- a/lib/eal/version.map
+++ b/lib/eal/version.map
@@ -393,6 +393,10 @@ EXPERIMENTAL {
# added in 23.07
rte_memzone_max_get;
rte_memzone_max_set;
+
+ # added in 24.03
+ rte_lcore_var_alloc;
+ rte_lcore_var;
};
INTERNAL {
--
2.34.1
* RE: [RFC v4 1/6] eal: add static per-lcore memory allocation facility
2024-02-25 15:03 ` [RFC v4 1/6] eal: add static per-lcore memory allocation facility Mattias Rönnblom
@ 2024-02-27 9:58 ` Morten Brørup
2024-02-27 13:44 ` Mattias Rönnblom
2024-02-28 10:09 ` [RFC v5 0/6] Lcore variables Mattias Rönnblom
1 sibling, 1 reply; 323+ messages in thread
From: Morten Brørup @ 2024-02-27 9:58 UTC (permalink / raw)
To: Mattias Rönnblom, dev; +Cc: hofors, Stephen Hemminger
> From: Mattias Rönnblom [mailto:mattias.ronnblom@ericsson.com]
> Sent: Sunday, 25 February 2024 16.03
[...]
> +static void *
> +lcore_var_alloc(size_t size, size_t align)
> +{
> + void *handle;
> + void *value;
> +
> + offset = RTE_ALIGN_CEIL(offset, align);
> +
> + if (offset + size > RTE_MAX_LCORE_VAR) {
This would be the usual comparison:
if (lcore_buffer == NULL) {
> + lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE,
> + LCORE_BUFFER_SIZE);
> + RTE_VERIFY(lcore_buffer != NULL);
> +
> + offset = 0;
> + }
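For reference, a minimal standalone sketch of the bump allocation in the hunk quoted above. It assumes plain C11 and tiny illustrative sizes; every sketch_/SKETCH_ name, including the sketch_buffer_count counter, is hypothetical. Because the offset starts out equal to the per-lcore buffer size, the offset-based check is true both on the first call and whenever the current buffer cannot fit the request, in which case a fresh buffer is allocated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_MAX_LCORE 4       /* assumed; stands in for RTE_MAX_LCORE */
#define SKETCH_MAX_LCORE_VAR 128 /* assumed; stands in for RTE_MAX_LCORE_VAR */
#define SKETCH_CACHE_LINE 64     /* assumed cache line size */

static void *sketch_buffer;
/* Starting out "full" forces the first call to allocate a buffer. */
static size_t sketch_offset = SKETCH_MAX_LCORE_VAR;
/* Only for illustration: counts how many buffers have been allocated. */
static unsigned int sketch_buffer_count;

/* 'align' must be a power of 2; 'size' must be 1..SKETCH_MAX_LCORE_VAR. */
static void *
sketch_lcore_var_alloc(size_t size, size_t align)
{
	void *handle;
	unsigned int lcore_id;

	assert(size > 0 && size <= SKETCH_MAX_LCORE_VAR);
	assert(align != 0 && (align & (align - 1)) == 0);

	/* Round the offset up to the requested alignment. */
	sketch_offset = (sketch_offset + align - 1) & ~(align - 1);

	/* True on the first call and when the current buffer is exhausted. */
	if (sketch_offset + size > SKETCH_MAX_LCORE_VAR) {
		/* Earlier buffers stay allocated; their variables remain valid. */
		sketch_buffer = aligned_alloc(SKETCH_CACHE_LINE,
			(size_t)SKETCH_MAX_LCORE_VAR * SKETCH_MAX_LCORE);
		assert(sketch_buffer != NULL);
		sketch_buffer_count++;
		sketch_offset = 0;
	}

	handle = (char *)sketch_buffer + sketch_offset;
	sketch_offset += size;

	/* Zero every lcore id's copy of the new variable. */
	for (lcore_id = 0; lcore_id < SKETCH_MAX_LCORE; lcore_id++)
		memset((char *)handle +
			(size_t)lcore_id * SKETCH_MAX_LCORE_VAR, 0, size);

	return handle;
}
```

In this sketch, a check on the buffer pointer being NULL alone would cover only the first call; once a buffer fills up, no replacement would be allocated.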
[...]
> +/**
> + * Define a lcore variable handle.
> + *
> + * This macro defines a variable which is used as a handle to access
> + * the various per-lcore id instances of a per-lcore id variable.
> + *
> + * The aim with this macro is to make clear at the point of
> + * declaration that this is an lcore handler, rather than a regular
> + * pointer.
> + *
> + * Add @b static as a prefix in case the lcore variable are only to be
> + * accessed from a particular translation unit.
> + */
> +#define RTE_LCORE_VAR_HANDLE(type, name) \
> + RTE_LCORE_VAR_HANDLE_TYPE(type) name
> +
The parameter is "name" here, and "handle" in other macros.
Just mentioning to make sure you thought about it.
> +/**
> + * Get pointer to lcore variable instance with the specified lcore id.
> + */
> +#define RTE_LCORE_VAR_LCORE_PTR(lcore_id, handle) \
> + ((typeof(handle))__rte_lcore_var_lcore_ptr(lcore_id, handle))
> +
> +/**
> + * Get value of a lcore variable instance of the specified lcore id.
> + */
> +#define RTE_LCORE_VAR_LCORE_GET(lcore_id, handle) \
> + (*(RTE_LCORE_VAR_LCORE_PTR(lcore_id, handle)))
> +
> +/**
> + * Set the value of a lcore variable instance of the specified lcore id.
> + */
> +#define RTE_LCORE_VAR_LCORE_SET(lcore_id, handle, value) \
> + (*(RTE_LCORE_VAR_LCORE_PTR(lcore_id, handle)) = (value))
I still think RTE_LCORE_VAR[_LCORE]_PTR() suffice, and RTE_LCORE_VAR[_LCORE]_GET/SET are superfluous.
But I don't insist on their removal. :-)
With or without suggested changes...
For the series,
Acked-by: Morten Brørup <mb@smartsharesystems.com>
* Re: [RFC v4 1/6] eal: add static per-lcore memory allocation facility
2024-02-27 9:58 ` Morten Brørup
@ 2024-02-27 13:44 ` Mattias Rönnblom
2024-02-27 15:05 ` Morten Brørup
0 siblings, 1 reply; 323+ messages in thread
From: Mattias Rönnblom @ 2024-02-27 13:44 UTC (permalink / raw)
To: Morten Brørup, Mattias Rönnblom, dev; +Cc: Stephen Hemminger
On 2024-02-27 10:58, Morten Brørup wrote:
>> From: Mattias Rönnblom [mailto:mattias.ronnblom@ericsson.com]
>> Sent: Sunday, 25 February 2024 16.03
>
> [...]
>
>> +static void *
>> +lcore_var_alloc(size_t size, size_t align)
>> +{
>> + void *handle;
>> + void *value;
>> +
>> + offset = RTE_ALIGN_CEIL(offset, align);
>> +
>> + if (offset + size > RTE_MAX_LCORE_VAR) {
>
> This would be the usual comparison:
> if (lcore_buffer == NULL) {
>
>> + lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE,
>> + LCORE_BUFFER_SIZE);
>> + RTE_VERIFY(lcore_buffer != NULL);
>> +
>> + offset = 0;
>> + }
>
> [...]
>
>> +/**
>> + * Define a lcore variable handle.
>> + *
>> + * This macro defines a variable which is used as a handle to access
>> + * the various per-lcore id instances of a per-lcore id variable.
>> + *
>> + * The aim with this macro is to make clear at the point of
>> + * declaration that this is an lcore handler, rather than a regular
>> + * pointer.
>> + *
>> + * Add @b static as a prefix in case the lcore variable are only to be
>> + * accessed from a particular translation unit.
>> + */
>> +#define RTE_LCORE_VAR_HANDLE(type, name) \
>> + RTE_LCORE_VAR_HANDLE_TYPE(type) name
>> +
>
> The parameter is "name" here, and "handle" in other macros.
> Just mentioning to make sure you thought about it.
>
>> +/**
>> + * Get pointer to lcore variable instance with the specified lcore id.
>> + */
>> +#define RTE_LCORE_VAR_LCORE_PTR(lcore_id, handle) \
>> + ((typeof(handle))__rte_lcore_var_lcore_ptr(lcore_id, handle))
>> +
>> +/**
>> + * Get value of a lcore variable instance of the specified lcore id.
>> + */
>> +#define RTE_LCORE_VAR_LCORE_GET(lcore_id, handle) \
>> + (*(RTE_LCORE_VAR_LCORE_PTR(lcore_id, handle)))
>> +
>> +/**
>> + * Set the value of a lcore variable instance of the specified lcore id.
>> + */
>> +#define RTE_LCORE_VAR_LCORE_SET(lcore_id, handle, value) \
>> + (*(RTE_LCORE_VAR_LCORE_PTR(lcore_id, handle)) = (value))
>
> I still think RTE_LCORE_VAR[_LCORE]_PTR() suffice, and RTE_LCORE_VAR[_LCORE]_GET/SET are superfluous.
> But I don't insist on their removal. :-)
>
I'll remove them. One can always add them later. Nothing I've seen in
the DPDK code base so far has called for their use.
Should the RTE_LCORE_VAR_PTR() be renamed RTE_LCORE_VAR_VALUE() (and
still return a pointer, obviously)? "PTR" seems a little superfluous
(Hungarian). "RTE_LCORE_VAR()" would be short, but not very descriptive.
> With or without suggested changes...
>
> For the series,
> Acked-by: Morten Brørup <mb@smartsharesystems.com>
>
Thanks for all help.
* RE: [RFC v4 1/6] eal: add static per-lcore memory allocation facility
2024-02-27 13:44 ` Mattias Rönnblom
@ 2024-02-27 15:05 ` Morten Brørup
2024-02-27 16:27 ` Mattias Rönnblom
0 siblings, 1 reply; 323+ messages in thread
From: Morten Brørup @ 2024-02-27 15:05 UTC (permalink / raw)
To: Mattias Rönnblom, Mattias Rönnblom, dev; +Cc: Stephen Hemminger
> From: Mattias Rönnblom [mailto:hofors@lysator.liu.se]
> Sent: Tuesday, 27 February 2024 14.44
>
> On 2024-02-27 10:58, Morten Brørup wrote:
> >> From: Mattias Rönnblom [mailto:mattias.ronnblom@ericsson.com]
> >> Sent: Sunday, 25 February 2024 16.03
> >
> > [...]
> >
> >> +static void *
> >> +lcore_var_alloc(size_t size, size_t align)
> >> +{
> >> + void *handle;
> >> + void *value;
> >> +
> >> + offset = RTE_ALIGN_CEIL(offset, align);
> >> +
> >> + if (offset + size > RTE_MAX_LCORE_VAR) {
> >
> > This would be the usual comparison:
> > if (lcore_buffer == NULL) {
> >
> >> + lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE,
> >> + LCORE_BUFFER_SIZE);
> >> + RTE_VERIFY(lcore_buffer != NULL);
> >> +
> >> + offset = 0;
> >> + }
> >
> > [...]
> >
> >> +/**
> >> + * Define a lcore variable handle.
> >> + *
> >> + * This macro defines a variable which is used as a handle to access
> >> + * the various per-lcore id instances of a per-lcore id variable.
> >> + *
> >> + * The aim with this macro is to make clear at the point of
> >> + * declaration that this is an lcore handler, rather than a regular
> >> + * pointer.
> >> + *
> >> + * Add @b static as a prefix in case the lcore variable are only to
> be
> >> + * accessed from a particular translation unit.
> >> + */
> >> +#define RTE_LCORE_VAR_HANDLE(type, name) \
> >> + RTE_LCORE_VAR_HANDLE_TYPE(type) name
> >> +
> >
> > The parameter is "name" here, and "handle" in other macros.
> > Just mentioning to make sure you thought about it.
> >
> >> +/**
> >> + * Get pointer to lcore variable instance with the specified lcore
> id.
> >> + */
> >> +#define RTE_LCORE_VAR_LCORE_PTR(lcore_id, handle) \
> >> + ((typeof(handle))__rte_lcore_var_lcore_ptr(lcore_id, handle))
> >> +
> >> +/**
> >> + * Get value of a lcore variable instance of the specified lcore id.
> >> + */
> >> +#define RTE_LCORE_VAR_LCORE_GET(lcore_id, handle) \
> >> + (*(RTE_LCORE_VAR_LCORE_PTR(lcore_id, handle)))
> >> +
> >> +/**
> >> + * Set the value of a lcore variable instance of the specified lcore
> id.
> >> + */
> >> +#define RTE_LCORE_VAR_LCORE_SET(lcore_id, handle, value) \
> >> + (*(RTE_LCORE_VAR_LCORE_PTR(lcore_id, handle)) = (value))
> >
> > I still think RTE_LCORE_VAR[_LCORE]_PTR() suffice, and
> RTE_LCORE_VAR[_LCORE]_GET/SET are superfluous.
> > But I don't insist on their removal. :-)
> >
>
> I'll remove them. One can always add them later. Nothing I've seen in
> the DPDK code base so far has been called for their use.
>
> Should the RTE_LCORE_VAR_PTR() be renamed RTE_LCORE_VAR_VALUE() (and
> still return a pointer, obviously)? "PTR" seems a little superfluous
> (Hungarian). "RTE_LCORE_VAR()" would be short, but not very descriptive.
Good question...
I would try to align this name and the name of the associated foreach macro, currently RTE_LCORE_VAR_FOREACH_VALUE(var, handle).
It seems confusing to have a macro named _VALUE() returning a pointer.
(Which is why I also dislike the foreach macro's current name and "var" parameter name.)
If it is supposed to be frequently used, a shorter name is preferable.
Which leans towards RTE_LCORE_VAR().
And then RTE_FOREACH_LCORE_VAR(iterator, handle) or RTE_LCORE_VAR_FOREACH(iterator, handle).
But then it is not obvious from the name that they operate on pointers.
We don't use Hungarian style in DPDK, so perhaps that is acceptable.
Your conclusion that GET/SET are not generally required gave me another idea...
Maybe returning a pointer is not the right thing to do!
I wonder if there are any obstacles to generally dereferencing the lcore variable pointer, like this:
#define RTE_LCORE_VAR_LCORE(lcore_id, handle) \
(*(typeof(handle))__rte_lcore_var_lcore_ptr(lcore_id, handle))
It would work for both get and set:
RTE_LCORE_VAR(foo) = RTE_LCORE_VAR(bar);
And also for functions being passed the address of the variable.
E.g. memset(&RTE_LCORE_VAR(foo), ...) would expand to:
memset(&(*(typeof(foo))__rte_lcore_var_lcore_ptr(rte_lcore_id(), foo)), ...);
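A minimal sketch of the dereferencing idea above, for illustration only. It does not use the real DPDK internals: sketch_lcore_ptr(), SKETCH_MAX_LCORE, SKETCH_STRIDE and SKETCH_LCORE_VAR_LCORE() are hypothetical stand-ins, and __typeof__ is the GCC/Clang spelling of typeof. It only shows that a macro expanding to *(typed pointer) is an lvalue and so covers get, set and address-of:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_MAX_LCORE 4u   /* hypothetical stand-in for RTE_MAX_LCORE */
#define SKETCH_STRIDE    64u  /* hypothetical per-lcore-id stride in bytes */

/* Stand-in for __rte_lcore_var_lcore_ptr(): the instance belonging to
 * lcore_id is located lcore_id strides after the handle. */
static void *
sketch_lcore_ptr(unsigned int lcore_id, void *handle)
{
	return (unsigned char *)handle + (size_t)lcore_id * SKETCH_STRIDE;
}

/* Dereferencing variant: expands to an lvalue of the handle's value type. */
#define SKETCH_LCORE_VAR_LCORE(lcore_id, handle) \
	(*(__typeof__(handle))sketch_lcore_ptr(lcore_id, handle))

static int
sketch_demo(void)
{
	/* Heap memory has no declared type, so it can hold int instances. */
	void *buf = calloc(SKETCH_MAX_LCORE, SKETCH_STRIDE);
	int *counter = buf; /* handle: address of the lcore id 0 instance */
	int sum;

	assert(buf != NULL);

	SKETCH_LCORE_VAR_LCORE(1, counter) = 41;                     /* set */
	SKETCH_LCORE_VAR_LCORE(1, counter) += 1;                     /* read-modify-write */
	SKETCH_LCORE_VAR_LCORE(2, counter) =
		SKETCH_LCORE_VAR_LCORE(1, counter);                  /* get */
	memset(&SKETCH_LCORE_VAR_LCORE(3, counter), 0, sizeof(int)); /* address-of */

	sum = SKETCH_LCORE_VAR_LCORE(2, counter) +
		SKETCH_LCORE_VAR_LCORE(3, counter);
	free(buf);
	return sum;
}
```

Whether this form is preferable to returning the pointer depends on how often callers need the address rather than the value, which is the open question here.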
One more thought, not related to the above discussion:
The TLS per-lcore variables are built with "per_lcore_" prefix added to the names, like this:
#define RTE_DEFINE_PER_LCORE(type, name) \
__thread __typeof__(type) per_lcore_##name
Should the lcore variables have something similar, i.e.:
#define RTE_LCORE_VAR_HANDLE(type, name) \
RTE_LCORE_VAR_HANDLE_TYPE(type) lcore_var_##name
>
> > With or without suggested changes...
> >
> > For the series,
> > Acked-by: Morten Brørup <mb@smartsharesystems.com>
> >
>
> Thanks for all help.
Thank you for the detailed consideration of my feedback.
* Re: [RFC v4 1/6] eal: add static per-lcore memory allocation facility
2024-02-27 15:05 ` Morten Brørup
@ 2024-02-27 16:27 ` Mattias Rönnblom
2024-02-27 16:51 ` Morten Brørup
0 siblings, 1 reply; 323+ messages in thread
From: Mattias Rönnblom @ 2024-02-27 16:27 UTC (permalink / raw)
To: Morten Brørup, Mattias Rönnblom, dev; +Cc: Stephen Hemminger
On 2024-02-27 16:05, Morten Brørup wrote:
>> From: Mattias Rönnblom [mailto:hofors@lysator.liu.se]
>> Sent: Tuesday, 27 February 2024 14.44
>>
>> On 2024-02-27 10:58, Morten Brørup wrote:
>>>> From: Mattias Rönnblom [mailto:mattias.ronnblom@ericsson.com]
>>>> Sent: Sunday, 25 February 2024 16.03
>>>
>>> [...]
>>>
>>>> +static void *
>>>> +lcore_var_alloc(size_t size, size_t align)
>>>> +{
>>>> + void *handle;
>>>> + void *value;
>>>> +
>>>> + offset = RTE_ALIGN_CEIL(offset, align);
>>>> +
>>>> + if (offset + size > RTE_MAX_LCORE_VAR) {
>>>
>>> This would be the usual comparison:
>>> if (lcore_buffer == NULL) {
>>>
>>>> + lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE,
>>>> + LCORE_BUFFER_SIZE);
>>>> + RTE_VERIFY(lcore_buffer != NULL);
>>>> +
>>>> + offset = 0;
>>>> + }
>>>
>>> [...]
>>>
>>>> +/**
>>>> + * Define a lcore variable handle.
>>>> + *
>>>> + * This macro defines a variable which is used as a handle to access
>>>> + * the various per-lcore id instances of a per-lcore id variable.
>>>> + *
>>>> + * The aim with this macro is to make clear at the point of
>>>> + * declaration that this is an lcore handler, rather than a regular
>>>> + * pointer.
>>>> + *
>>>> + * Add @b static as a prefix in case the lcore variable are only to
>> be
>>>> + * accessed from a particular translation unit.
>>>> + */
>>>> +#define RTE_LCORE_VAR_HANDLE(type, name) \
>>>> + RTE_LCORE_VAR_HANDLE_TYPE(type) name
>>>> +
>>>
>>> The parameter is "name" here, and "handle" in other macros.
>>> Just mentioning to make sure you thought about it.
>>>
>>>> +/**
>>>> + * Get pointer to lcore variable instance with the specified lcore
>> id.
>>>> + */
>>>> +#define RTE_LCORE_VAR_LCORE_PTR(lcore_id, handle) \
>>>> + ((typeof(handle))__rte_lcore_var_lcore_ptr(lcore_id, handle))
>>>> +
>>>> +/**
>>>> + * Get value of a lcore variable instance of the specified lcore id.
>>>> + */
>>>> +#define RTE_LCORE_VAR_LCORE_GET(lcore_id, handle) \
>>>> + (*(RTE_LCORE_VAR_LCORE_PTR(lcore_id, handle)))
>>>> +
>>>> +/**
>>>> + * Set the value of a lcore variable instance of the specified lcore
>> id.
>>>> + */
>>>> +#define RTE_LCORE_VAR_LCORE_SET(lcore_id, handle, value) \
>>>> + (*(RTE_LCORE_VAR_LCORE_PTR(lcore_id, handle)) = (value))
>>>
>>> I still think RTE_LCORE_VAR[_LCORE]_PTR() suffice, and
>> RTE_LCORE_VAR[_LCORE]_GET/SET are superfluous.
>>> But I don't insist on their removal. :-)
>>>
>>
>> I'll remove them. One can always add them later. Nothing I've seen in
>> the DPDK code base so far has been called for their use.
>>
>> Should the RTE_LCORE_VAR_PTR() be renamed RTE_LCORE_VAR_VALUE() (and
>> still return a pointer, obviously)? "PTR" seems a little superfluous
>> (Hungarian). "RTE_LCORE_VAR()" would be short, but not very descriptive.
>
> Good question...
>
> I would try to align this name and the name of the associated foreach macro, currently RTE_LCORE_VAR_FOREACH_VALUE(var, handle).
>
> It seems confusing to have a macro named _VALUE() returning a pointer.
> (Which is why I also dislike the foreach macro's current name and "var" parameter name.)
>
Not sure I agree. In C, you often ask for a value and get a pointer to
that value. I'll leave it VALUE() for now.
> If it is supposed to be frequently used, a shorter name is preferable.
> Which leans towards RTE_LCORE_VAR().
>
> And then RTE_FOREACH_LCORE_VAR(iterator, handle) or RTE_LCORE_VAR_FOREACH(iterator, handle).
>
RTE_LCORE_VAR_FOREACH was the original name, which was changed because
it was confusingly close to RTE_LCORE_FOREACH(), but had different
semantics with regard to which lcore ids are iterated over (EAL threads
only, versus all lcore ids).
> But then it is not obvious from the name that they operate on pointers.
> We don't use Hungarian style in DPDK, so perhaps that is acceptable.
>
>
> Your conclusion that GET/SET are not generally required inspired me for another idea...
> Maybe returning a pointer is not the right thing to do!
>
> I wonder if there are any obstacles to generally dereferencing the lcore variable pointer, like this:
>
> #define RTE_LCORE_VAR_LCORE(lcore_id, handle) \
> (*(typeof(handle))__rte_lcore_var_lcore_ptr(lcore_id, handle))
>
> It would work for both get and set:
> RTE_LCORE_VAR(foo) = RTE_LCORE_VAR(bar);
>
> And also for functions being passed the address of the variable.
> E.g. memset(&RTE_LCORE_VAR(foo), ...) would expand to:
> memset(&(*(typeof(foo))__rte_lcore_var_lcore_ptr(rte_lcore_id(), foo)), ...);
>
>
The value is usually accessed by means of a pointer, so no need to
return *pointer.
> One more thought, not related to the above discussion:
>
> The TLS per-lcore variables are built with "per_lcore_" prefix added to the names, like this:
> #define RTE_DEFINE_PER_LCORE(type, name) \
> __thread __typeof__(type) per_lcore_##name
>
> Should the lcore variables have something similar, i.e.:
> #define RTE_LCORE_VAR_HANDLE(type, name) \
> RTE_LCORE_VAR_HANDLE_TYPE(type) lcore_var_##name
>
I started out with a prefix, but I removed it, since you may want to
access (copy, assign) the handle pointer directly, and thus need to
know its real name. Also, I didn't see why you need a prefix.
For example, consider a section of code where you want to use one of two
variables depending on condition.
RTE_LCORE_VAR_HANDLE(int, actual);
if (something)
actual = some_handle;
else
actual = some_other_handle;
int *value = RTE_LCORE_VAR_VALUE(actual);
The above doesn't work if some_handle is actually named
rte_lcore_var_some_handle or something like that.
If you want to add a prefix (for which there shouldn't be a need), you
would need a macro RTE_LCORE_VAR_NAME() as well, so the user can derive
the actual name (including the prefix).
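The handle-selection case can be sketched in a self-contained form as follows. This is only a model under stated assumptions, not the DPDK implementation: SKETCH_HANDLE(), SKETCH_LCORE_VALUE(), SKETCH_MAX_LCORE and SKETCH_STRIDE are hypothetical names, and the storage is a plain calloc() buffer.

```c
#include <assert.h>
#include <stdlib.h>

#define SKETCH_MAX_LCORE 4u   /* hypothetical stand-in for RTE_MAX_LCORE */
#define SKETCH_STRIDE    64u  /* hypothetical per-lcore-id stride in bytes */

/* Without a name prefix, the handle is an ordinary, directly nameable pointer. */
#define SKETCH_HANDLE(type, name) type *name

/* Stand-in for the lookup: the instance for lcore_id lives lcore_id strides in. */
#define SKETCH_LCORE_VALUE(lcore_id, handle) \
	((__typeof__(handle))(void *)((unsigned char *)(handle) + \
		(size_t)(lcore_id) * SKETCH_STRIDE))

/* Pick one of two handles at run time and read the instance of lcore_id.
 * lcore_id is assumed to be less than SKETCH_MAX_LCORE. */
static int
sketch_pick(int use_first, unsigned int lcore_id)
{
	unsigned char *buf = calloc(SKETCH_MAX_LCORE, SKETCH_STRIDE);
	SKETCH_HANDLE(int, some_handle);
	SKETCH_HANDLE(int, some_other_handle);
	SKETCH_HANDLE(int, actual);
	int result;

	assert(buf != NULL);
	some_handle = (int *)(void *)buf;                       /* offset 0 */
	some_other_handle = (int *)(void *)(buf + sizeof(int)); /* next int slot */

	*SKETCH_LCORE_VALUE(lcore_id, some_handle) = 10;
	*SKETCH_LCORE_VALUE(lcore_id, some_other_handle) = 20;

	/* Plain assignment works only because the names are not mangled. */
	actual = use_first ? some_handle : some_other_handle;

	result = *SKETCH_LCORE_VALUE(lcore_id, actual);
	free(buf);
	return result;
}
```

With a prefix pasted onto the name by the definition macro, the assignment to actual would have to spell out the mangled identifiers, or go through an extra name-deriving macro.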
>
>>
>>> With or without suggested changes...
>>>
>>> For the series,
>>> Acked-by: Morten Brørup <mb@smartsharesystems.com>
>>>
>>
>> Thanks for all help.
>
> Thank you for the detailed consideration of my feedback.
>
* RE: [RFC v4 1/6] eal: add static per-lcore memory allocation facility
2024-02-27 16:27 ` Mattias Rönnblom
@ 2024-02-27 16:51 ` Morten Brørup
0 siblings, 0 replies; 323+ messages in thread
From: Morten Brørup @ 2024-02-27 16:51 UTC (permalink / raw)
To: Mattias Rönnblom, Mattias Rönnblom, dev; +Cc: Stephen Hemminger
> From: Mattias Rönnblom [mailto:hofors@lysator.liu.se]
> Sent: Tuesday, 27 February 2024 17.28
>
> On 2024-02-27 16:05, Morten Brørup wrote:
> >> From: Mattias Rönnblom [mailto:hofors@lysator.liu.se]
> >> Sent: Tuesday, 27 February 2024 14.44
> >>
> >> On 2024-02-27 10:58, Morten Brørup wrote:
> >>>> From: Mattias Rönnblom [mailto:mattias.ronnblom@ericsson.com]
> >>>> Sent: Sunday, 25 February 2024 16.03
> >>>
> >>> [...]
> >>>
> >>>> +static void *
> >>>> +lcore_var_alloc(size_t size, size_t align)
> >>>> +{
> >>>> + void *handle;
> >>>> + void *value;
> >>>> +
> >>>> + offset = RTE_ALIGN_CEIL(offset, align);
> >>>> +
> >>>> + if (offset + size > RTE_MAX_LCORE_VAR) {
> >>>
> >>> This would be the usual comparison:
> >>> if (lcore_buffer == NULL) {
> >>>
> >>>> + lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE,
> >>>> + LCORE_BUFFER_SIZE);
> >>>> + RTE_VERIFY(lcore_buffer != NULL);
> >>>> +
> >>>> + offset = 0;
> >>>> + }
> >>>
> >>> [...]
> >>>
> >>>> +/**
> >>>> + * Define a lcore variable handle.
> >>>> + *
> >>>> + * This macro defines a variable which is used as a handle to
> access
> >>>> + * the various per-lcore id instances of a per-lcore id variable.
> >>>> + *
> >>>> + * The aim with this macro is to make clear at the point of
> >>>> + * declaration that this is an lcore handler, rather than a
> regular
> >>>> + * pointer.
> >>>> + *
> >>>> + * Add @b static as a prefix in case the lcore variable are only
> to
> >> be
> >>>> + * accessed from a particular translation unit.
> >>>> + */
> >>>> +#define RTE_LCORE_VAR_HANDLE(type, name) \
> >>>> + RTE_LCORE_VAR_HANDLE_TYPE(type) name
> >>>> +
> >>>
> >>> The parameter is "name" here, and "handle" in other macros.
> >>> Just mentioning to make sure you thought about it.
> >>>
> >>>> +/**
> >>>> + * Get pointer to lcore variable instance with the specified lcore
> >> id.
> >>>> + */
> >>>> +#define RTE_LCORE_VAR_LCORE_PTR(lcore_id, handle)
> \
> >>>> + ((typeof(handle))__rte_lcore_var_lcore_ptr(lcore_id,
> handle))
> >>>> +
> >>>> +/**
> >>>> + * Get value of a lcore variable instance of the specified lcore
> id.
> >>>> + */
> >>>> +#define RTE_LCORE_VAR_LCORE_GET(lcore_id, handle) \
> >>>> + (*(RTE_LCORE_VAR_LCORE_PTR(lcore_id, handle)))
> >>>> +
> >>>> +/**
> >>>> + * Set the value of a lcore variable instance of the specified
> lcore
> >> id.
> >>>> + */
> >>>> +#define RTE_LCORE_VAR_LCORE_SET(lcore_id, handle, value)
> \
> >>>> + (*(RTE_LCORE_VAR_LCORE_PTR(lcore_id, handle)) = (value))
> >>>
> >>> I still think RTE_LCORE_VAR[_LCORE]_PTR() suffice, and
> >> RTE_LCORE_VAR[_LCORE]_GET/SET are superfluous.
> >>> But I don't insist on their removal. :-)
> >>>
> >>
> >> I'll remove them. One can always add them later. Nothing I've seen in
> >> the DPDK code base so far has been called for their use.
> >>
> >> Should the RTE_LCORE_VAR_PTR() be renamed RTE_LCORE_VAR_VALUE() (and
> >> still return a pointer, obviously)? "PTR" seems a little superfluous
> >> (Hungarian). "RTE_LCORE_VAR()" would be short, but not very
> descriptive.
> >
> > Good question...
> >
> > I would try to align this name and the name of the associated foreach
> macro, currently RTE_LCORE_VAR_FOREACH_VALUE(var, handle).
> >
> > It seems confusing to have a macro named _VALUE() returning a pointer.
> > (Which is why I also dislike the foreach macro's current name and
> "var" parameter name.)
> >
>
> Not sure I agree. In C, you often ask for a value and get a pointer to
> that value. I'll leave it VALUE() for now.
Yes, fopen() is an example of this.
But such functions don't have VALUE in their names.
(I'm not so worried about the "var" parameter name being confusing.)
You can leave it VALUE for now, just keep an open mind for changing it. :-)
>
> > If it is supposed to be frequently used, a shorter name is preferable.
> > Which leans towards RTE_LCORE_VAR().
> >
> > And then RTE_FOREACH_LCORE_VAR(iterator, handle) or
> RTE_LCORE_VAR_FOREACH(iterator, handle).
> >
>
> RTE_LCORE_VAR_FOREACH was the original name, which was changed because
> it was confusingly close to RTE_LCORE_FOREACH(), but had a different
> semantics in regards to which lcore ids are iterated over (EAL threads
> only, versus all lcore ids).
I know I was going in circles here.
Perhaps when we get used to the lcore variables, the similar name might not be confusing anymore. I suppose this happened to me during the review discussions.
I don't have a solid answer, so I'm throwing the ball around to see how it bounces.
>
> > But then it is not obvious from the name that they operate on
> pointers.
> > We don't use Hungarian style in DPDK, so perhaps that is acceptable.
> >
> >
> > Your conclusion that GET/SET are not generally required inspired me
> for another idea...
> > Maybe returning a pointer is not the right thing to do!
> >
> > I wonder if there are any obstacles to generally dereferencing the
> lcore variable pointer, like this:
> >
> > #define RTE_LCORE_VAR_LCORE(lcore_id, handle) \
> > (*(typeof(handle))__rte_lcore_var_lcore_ptr(lcore_id, handle))
> >
> > It would work for both get and set:
> > RTE_LCORE_VAR(foo) = RTE_LCORE_VAR(bar);
> >
> > And also for functions being passed the address of the variable.
> > E.g. memset(&RTE_LCORE_VAR(foo), ...) would expand to:
> > memset(&(*(typeof(foo))__rte_lcore_var_lcore_ptr(rte_lcore_id(),
> foo)), ...);
> >
> >
>
> The value is usually accessed by means of a pointer, so no need to
> return *pointer.
OK. I suppose you have a pretty good overview of the relevant use cases by now.
>
> > One more thought, not related to the above discussion:
> >
> > The TLS per-lcore variables are built with "per_lcore_" prefix added
> to the names, like this:
> > #define RTE_DEFINE_PER_LCORE(type, name) \
> > __thread __typeof__(type) per_lcore_##name
> >
> > Should the lcore variables have something similar, i.e.:
> > #define RTE_LCORE_VAR_HANDLE(type, name) \
> > RTE_LCORE_VAR_HANDLE_TYPE(type) lcore_var_##name
> >
>
> I started out with a prefix, but I removed it, since you may want to
> access (copy, assign) the handler pointer directly, and thus need to
> know it's real name. Also, I didn't see why you need a prefix.
>
> For example, consider a section of code where you want to use one of two
> variables depending on condition.
>
> RTE_LCORE_VAR_HANDLE(actual, int);
>
> if (something)
> actual = some_handle;
> else
> actual = some_other_handle;
>
> int *value = RTE_LCORE_VAR_VALUE(actual);
>
> This above doesn't work if some_handle is actually named
> rte_lcore_var_some_handle or something like that.
>
> If you want to add a prefix (for which there shouldn't be a need), you
> would need a macro RTE_LCORE_VAR_NAME() as well, so the user can derive
> the actual name (including the prefix).
Thanks for the detailed reply.
Let's not add a prefix.
>
> >
> >>
> >>> With or without suggested changes...
> >>>
> >>> For the series,
> >>> Acked-by: Morten Brørup <mb@smartsharesystems.com>
> >>>
> >>
> >> Thanks for all help.
> >
> > Thank you for the detailed consideration of my feedback.
> >
* [RFC v5 0/6] Lcore variables
2024-02-25 15:03 ` [RFC v4 1/6] eal: add static per-lcore memory allocation facility Mattias Rönnblom
2024-02-27 9:58 ` Morten Brørup
@ 2024-02-28 10:09 ` Mattias Rönnblom
2024-02-28 10:09 ` [RFC v5 1/6] eal: add static per-lcore memory allocation facility Mattias Rönnblom
` (5 more replies)
1 sibling, 6 replies; 323+ messages in thread
From: Mattias Rönnblom @ 2024-02-28 10:09 UTC (permalink / raw)
To: dev; +Cc: hofors, Morten Brørup, Stephen Hemminger, Mattias Rönnblom
This RFC presents a new API <rte_lcore_var.h> for static per-lcore id
data allocation.
Please refer to the <rte_lcore_var.h> API documentation for both a
rationale for this new API, and a comparison to the alternatives
available.
The adoption of this API would affect many different DPDK modules, but
the author updated only a few, mostly to serve as examples in this
RFC, and to iron out some, but surely not all, wrinkles in the API.
The question on how to best allocate static per-lcore memory has been
up several times on the dev mailing list, for example in the thread on
"random: use per lcore state" RFC by Stephen Hemminger.
Lcore variables are surely not the answer to all your per-lcore-data
needs, since they only allow for more-or-less static allocation. In the
author's opinion, they do however provide a reasonably simple, clean,
and seemingly very performant solution to a real problem.
One thing that is unclear to the author is how this API relates to a
potential future per-lcore dynamic allocator (e.g., a per-lcore heap).
Contrary to what the version.map edit suggests, this RFC is not meant
as a proposal for DPDK 24.03.
Mattias Rönnblom (6):
eal: add static per-lcore memory allocation facility
eal: add lcore variable test suite
random: keep PRNG state in lcore variable
power: keep per-lcore state in lcore variable
service: keep per-lcore state in lcore variable
eal: keep per-lcore power intrinsics state in lcore variable
app/test/meson.build | 1 +
app/test/test_lcore_var.c | 432 ++++++++++++++++++++++++++
config/rte_config.h | 1 +
doc/api/doxy-api-index.md | 1 +
lib/eal/common/eal_common_lcore_var.c | 68 ++++
lib/eal/common/meson.build | 1 +
lib/eal/common/rte_random.c | 30 +-
lib/eal/common/rte_service.c | 118 ++++---
lib/eal/include/meson.build | 1 +
lib/eal/include/rte_lcore_var.h | 368 ++++++++++++++++++++++
lib/eal/version.map | 4 +
lib/eal/x86/rte_power_intrinsics.c | 17 +-
lib/power/rte_power_pmd_mgmt.c | 36 +--
13 files changed, 990 insertions(+), 88 deletions(-)
create mode 100644 app/test/test_lcore_var.c
create mode 100644 lib/eal/common/eal_common_lcore_var.c
create mode 100644 lib/eal/include/rte_lcore_var.h
--
2.34.1
* [RFC v5 1/6] eal: add static per-lcore memory allocation facility
2024-02-28 10:09 ` [RFC v5 0/6] Lcore variables Mattias Rönnblom
@ 2024-02-28 10:09 ` Mattias Rönnblom
2024-03-19 12:52 ` Konstantin Ananyev
2024-05-06 8:27 ` [RFC v6 0/6] Lcore variables Mattias Rönnblom
2024-02-28 10:09 ` [RFC v5 2/6] eal: add lcore variable test suite Mattias Rönnblom
` (4 subsequent siblings)
5 siblings, 2 replies; 323+ messages in thread
From: Mattias Rönnblom @ 2024-02-28 10:09 UTC (permalink / raw)
To: dev; +Cc: hofors, Morten Brørup, Stephen Hemminger, Mattias Rönnblom
Introduce DPDK per-lcore id variables, or lcore variables for short.
An lcore variable has one value for every current and future lcore
id-equipped thread.
The primary <rte_lcore_var.h> use case is for statically allocating
small chunks of often-used data, which is related logically, but where
there are performance benefits to reap from having updates being local
to an lcore.
Lcore variables are similar to thread-local storage (TLS, e.g., C11
_Thread_local), but decouple the values' lifetime from that of the
threads.
Lcore variables are also similar in terms of functionality to that
provided by the FreeBSD kernel's DPCPU_*() family of macros and the
associated build-time machinery. DPCPU uses linker scripts, which
effectively prevents the reuse of its, otherwise seemingly viable, approach.
The currently-prevailing way to solve the same problem as lcore
variables is to keep a module's per-lcore data as an RTE_MAX_LCORE-sized
array of cache-aligned, RTE_CACHE_GUARDed structs. The benefit of
lcore variables over this approach is that data related to the same
lcore is now close (spatially, in memory), rather than data used by
the same module, which in turn avoids excessive use of padding and
the resulting pollution of caches with unused data.
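The layout described above can be modelled with a small bump allocator. The following is a simplified, self-contained sketch and not the patch itself: all sketch_* and SKETCH_* names are hypothetical, the sizes are shrunk for readability, a 64-byte cache line is assumed, and the whole buffer is zeroed at once instead of per value.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_CACHE_LINE 64u   /* assumed cache line size */
#define SKETCH_MAX_LCORE  8u    /* stands in for RTE_MAX_LCORE */
#define SKETCH_MAX_VAR    256u  /* stands in for RTE_MAX_LCORE_VAR */
#define SKETCH_BUF_SIZE   ((size_t)SKETCH_MAX_VAR * SKETCH_MAX_LCORE)

static unsigned char *sketch_buffer;
static size_t sketch_offset = SKETCH_MAX_VAR; /* forces allocation on first use */

/* Round v up to the next multiple of align (align must be a power of 2). */
static size_t
sketch_align_ceil(size_t v, size_t align)
{
	return (v + align - 1) & ~(align - 1);
}

/* Reserve 'size' bytes for every lcore id; the returned handle is the
 * address of the lcore id 0 instance. */
static void *
sketch_alloc(size_t size, size_t align)
{
	void *handle;

	assert(size > 0 && size <= SKETCH_MAX_VAR);
	assert(align > 0 && align <= SKETCH_CACHE_LINE &&
	       (align & (align - 1)) == 0);

	sketch_offset = sketch_align_ceil(sketch_offset, align);

	if (sketch_offset + size > SKETCH_MAX_VAR) {
		/* No buffer yet, or the current one is full: start a new one.
		 * Older buffers stay alive, since handles still refer to them. */
		sketch_buffer = aligned_alloc(SKETCH_CACHE_LINE, SKETCH_BUF_SIZE);
		assert(sketch_buffer != NULL);
		memset(sketch_buffer, 0, SKETCH_BUF_SIZE);
		sketch_offset = 0;
	}

	handle = sketch_buffer + sketch_offset;
	sketch_offset += size;

	return handle;
}

/* Same offset, lcore_id blocks further in: values of different variables
 * for one lcore id end up next to each other. */
static void *
sketch_lcore_ptr(unsigned int lcore_id, void *handle)
{
	return (unsigned char *)handle + (size_t)lcore_id * SKETCH_MAX_VAR;
}
```

Two variables allocated back to back are adjacent within each lcore id's block, while the instances of one variable for different lcore ids are a full block apart, so no per-struct padding or cache guard is needed to keep them on separate cache lines.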
RFC v5:
* In Doxygen, consistently use @<cmd> (and not \<cmd>).
* The RTE_LCORE_VAR_GET() and SET() convenience access macros
covered an uncommon use case, where the lcore value is of a
primitive type, rather than a struct, and are thus eliminated
from the API. (Morten Brørup)
* In the wake of the GET()/SET() removal, rename RTE_LCORE_VAR_PTR()
to RTE_LCORE_VAR_VALUE().
* The underscores are removed from __rte_lcore_var_lcore_ptr() to
signal that this function is a part of the public API.
* Macro arguments are documented.
RFC v4:
* Replace large static array with libc heap-allocated memory. One
implication of this change is that there no longer exists a fixed upper
bound for the total amount of memory used by lcore variables.
RTE_MAX_LCORE_VAR has changed meaning, and now represents the
maximum size of any individual lcore variable value.
* Fix issues in example. (Morten Brørup)
* Improve access macro type checking. (Morten Brørup)
* Refer to the lcore variable handle as "handle" and not "name" in
various macros.
* Document lack of thread safety in rte_lcore_var_alloc().
* Provide API-level assurance the lcore variable handle is
always non-NULL, to allow applications to use NULL to mean
"not yet allocated".
* Note zero-sized allocations are not allowed.
* Give API-level guarantee the lcore variable values are zeroed.
RFC v3:
* Replace use of GCC-specific alignof(<expression>) with alignof(<type>).
* Update example to reflect FOREACH macro name change (in RFC v2).
RFC v2:
* Use alignof to derive alignment requirements. (Morten Brørup)
* Change name of FOREACH to make it distinct from <rte_lcore.h>'s
*per-EAL-thread* RTE_LCORE_FOREACH(). (Morten Brørup)
* Allow user-specified alignment, but limit max to cache line size.
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
---
config/rte_config.h | 1 +
doc/api/doxy-api-index.md | 1 +
lib/eal/common/eal_common_lcore_var.c | 68 +++++
lib/eal/common/meson.build | 1 +
lib/eal/include/meson.build | 1 +
lib/eal/include/rte_lcore_var.h | 368 ++++++++++++++++++++++++++
lib/eal/version.map | 4 +
7 files changed, 444 insertions(+)
create mode 100644 lib/eal/common/eal_common_lcore_var.c
create mode 100644 lib/eal/include/rte_lcore_var.h
diff --git a/config/rte_config.h b/config/rte_config.h
index d743a5c3d3..0dac33d3b9 100644
--- a/config/rte_config.h
+++ b/config/rte_config.h
@@ -41,6 +41,7 @@
/* EAL defines */
#define RTE_CACHE_GUARD_LINES 1
#define RTE_MAX_HEAPS 32
+#define RTE_MAX_LCORE_VAR 1048576
#define RTE_MAX_MEMSEG_LISTS 128
#define RTE_MAX_MEMSEG_PER_LIST 8192
#define RTE_MAX_MEM_MB_PER_LIST 32768
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index 8c1eb8fafa..a3b8391570 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -99,6 +99,7 @@ The public API headers are grouped by topics:
[interrupts](@ref rte_interrupts.h),
[launch](@ref rte_launch.h),
[lcore](@ref rte_lcore.h),
+ [lcore-variable](@ref rte_lcore_var.h),
[per-lcore](@ref rte_per_lcore.h),
[service cores](@ref rte_service.h),
[keepalive](@ref rte_keepalive.h),
diff --git a/lib/eal/common/eal_common_lcore_var.c b/lib/eal/common/eal_common_lcore_var.c
new file mode 100644
index 0000000000..5c353ebd46
--- /dev/null
+++ b/lib/eal/common/eal_common_lcore_var.c
@@ -0,0 +1,68 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Ericsson AB
+ */
+
+#include <inttypes.h>
+
+#include <rte_common.h>
+#include <rte_debug.h>
+#include <rte_log.h>
+
+#include <rte_lcore_var.h>
+
+#include "eal_private.h"
+
+#define LCORE_BUFFER_SIZE (RTE_MAX_LCORE_VAR * RTE_MAX_LCORE)
+
+static void *lcore_buffer;
+static size_t offset = RTE_MAX_LCORE_VAR;
+
+static void *
+lcore_var_alloc(size_t size, size_t align)
+{
+ void *handle;
+ void *value;
+
+ offset = RTE_ALIGN_CEIL(offset, align);
+
+ if (offset + size > RTE_MAX_LCORE_VAR) {
+ lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE,
+ LCORE_BUFFER_SIZE);
+ RTE_VERIFY(lcore_buffer != NULL);
+
+ offset = 0;
+ }
+
+ handle = RTE_PTR_ADD(lcore_buffer, offset);
+
+ offset += size;
+
+ RTE_LCORE_VAR_FOREACH_VALUE(value, handle)
+ memset(value, 0, size);
+
+ EAL_LOG(DEBUG, "Allocated %"PRIuPTR" bytes of per-lcore data with a "
+ "%"PRIuPTR"-byte alignment", size, align);
+
+ return handle;
+}
+
+void *
+rte_lcore_var_alloc(size_t size, size_t align)
+{
+ /* Having the per-lcore buffer size aligned on cache lines, as
+ * well as having the base pointer aligned on cache line size,
+ * assures that aligned offsets also translate to aligned
+ * pointers across all values.
+ */
+ RTE_BUILD_BUG_ON(RTE_MAX_LCORE_VAR % RTE_CACHE_LINE_SIZE != 0);
+ RTE_ASSERT(align <= RTE_CACHE_LINE_SIZE);
+ RTE_ASSERT(size <= RTE_MAX_LCORE_VAR);
+
+ /* '0' means asking for worst-case alignment requirements */
+ if (align == 0)
+ align = alignof(max_align_t);
+
+ RTE_ASSERT(rte_is_power_of_2(align));
+
+ return lcore_var_alloc(size, align);
+}
diff --git a/lib/eal/common/meson.build b/lib/eal/common/meson.build
index 22a626ba6f..d41403680b 100644
--- a/lib/eal/common/meson.build
+++ b/lib/eal/common/meson.build
@@ -18,6 +18,7 @@ sources += files(
'eal_common_interrupts.c',
'eal_common_launch.c',
'eal_common_lcore.c',
+ 'eal_common_lcore_var.c',
'eal_common_mcfg.c',
'eal_common_memalloc.c',
'eal_common_memory.c',
diff --git a/lib/eal/include/meson.build b/lib/eal/include/meson.build
index e94b056d46..9449253e23 100644
--- a/lib/eal/include/meson.build
+++ b/lib/eal/include/meson.build
@@ -27,6 +27,7 @@ headers += files(
'rte_keepalive.h',
'rte_launch.h',
'rte_lcore.h',
+ 'rte_lcore_var.h',
'rte_lock_annotations.h',
'rte_malloc.h',
'rte_mcslock.h',
diff --git a/lib/eal/include/rte_lcore_var.h b/lib/eal/include/rte_lcore_var.h
new file mode 100644
index 0000000000..1db479253d
--- /dev/null
+++ b/lib/eal/include/rte_lcore_var.h
@@ -0,0 +1,368 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Ericsson AB
+ */
+
+#ifndef _RTE_LCORE_VAR_H_
+#define _RTE_LCORE_VAR_H_
+
+/**
+ * @file
+ *
+ * RTE Per-lcore id variables
+ *
+ * This API provides a mechanism to create and access per-lcore id
+ * variables in a space- and cycle-efficient manner.
+ *
+ * A per-lcore id variable (or lcore variable for short) has one value
+ * for each EAL thread and registered non-EAL thread. In other words,
+ * there's one copy of its value for each and every current and future
+ * lcore id-equipped thread, with the total number of copies amounting
+ * to @c RTE_MAX_LCORE.
+ *
+ * In order to access the values of an lcore variable, a handle is
+ * used. The type of the handle is a pointer to the value's type
+ * (e.g., for a @c uint32_t lcore variable, the handle is a
+ * <code>uint32_t *</code>). A handle may be passed between modules and
+ * threads just like any pointer, but its value is not the address of
+ * any particular object, but rather just an opaque identifier, stored
+ * in a typed pointer (to inform the access macro of the type of values).
+ *
+ * @b Creation
+ *
+ * An lcore variable is created in two steps:
+ * 1. Define an lcore variable handle by using @ref RTE_LCORE_VAR_HANDLE.
+ * 2. Allocate lcore variable storage and initialize the handle with
+ * a unique identifier by @ref RTE_LCORE_VAR_ALLOC or
+ * @ref RTE_LCORE_VAR_INIT. Allocation generally occurs at the time of
+ * module initialization, but may be done at any time.
+ *
+ * An lcore variable is not tied to the owning thread's lifetime. It's
+ * available for use by any thread immediately after having been
+ * allocated, and continues to be available throughout the lifetime of
+ * the EAL.
+ *
+ * Lcore variables cannot and need not be freed.
+ *
+ * @b Access
+ *
+ * The value of any lcore variable for any lcore id may be accessed
+ * from any thread (including unregistered threads), but it should
+ * generally only be *frequently* read from or written to by the owner.
+ *
+ * Values of the same lcore variable but owned by different lcore
+ * ids *may* be frequently read or written by the owners without the
+ * risk of false sharing.
+ *
+ * An appropriate synchronization mechanism (e.g., atomics) should
+ * employed to assure there are no data races between the owning
+ * thread and any non-owner threads accessing the same lcore variable
+ * instance.
+ *
+ * The value of the lcore variable for a particular lcore id is
+ * accessed using @ref RTE_LCORE_VAR_LCORE_VALUE.
+ *
+ * A common pattern is for an EAL thread or a registered non-EAL
+ * thread to access its own lcore variable value, for which a
+ * short-hand exists in the form of @ref RTE_LCORE_VAR_VALUE.
+ *
+ * Although the handle (as defined by @ref RTE_LCORE_VAR_HANDLE) is a
+ * pointer with the same type as the value, it may not be directly
+ * dereferenced and must be treated as an opaque identifier.
+ *
+ * Lcore variable handles and value pointers may be freely passed
+ * between different threads.
+ *
+ * @b Storage
+ *
+ * An lcore variable's values may be of a primitive type like @c int,
+ * but would more typically be a @c struct. An application may choose
+ * to define an lcore variable, which it then goes on to never
+ * allocate.
+ *
+ * The lcore variable handle introduces a per-variable (not
+ * per-value/per-lcore id) overhead of @c sizeof(void *) bytes, so
+ * there are some memory footprint gains to be made by organizing all
+ * per-lcore id data for a particular module as one lcore variable
+ * (e.g., as a struct).
+ *
+ * The size of an lcore variable's value must not exceed the DPDK
+ * build-time constant @c RTE_MAX_LCORE_VAR.
+ *
+ * Lcore variables are stored in a series of lcore buffers, which
+ * are allocated from the libc heap. Heap allocation failures are
+ * treated as fatal.
+ *
+ * Lcore variables should generally *not* be @ref __rte_cache_aligned
+ * and need *not* include a @ref RTE_CACHE_GUARD field, since these
+ * constructs are designed to avoid false sharing. In the
+ * case of an lcore variable instance, all nearby data structures
+ * should almost always be written to by a single thread (the lcore
+ * variable owner). Adding padding will increase the effective memory
+ * working set size, potentially reducing performance.
+ *
+ * @b Example
+ *
+ * Below is an example of the use of an lcore variable:
+ *
+ * @code{.c}
+ * struct foo_lcore_state {
+ * int a;
+ * long b;
+ * };
+ *
+ * static RTE_LCORE_VAR_HANDLE(struct foo_lcore_state, lcore_states);
+ *
+ * long foo_get_a_plus_b(void)
+ * {
+ * struct foo_lcore_state *state = RTE_LCORE_VAR_VALUE(lcore_states);
+ *
+ * return state->a + state->b;
+ * }
+ *
+ * RTE_INIT(rte_foo_init)
+ * {
+ * RTE_LCORE_VAR_ALLOC(lcore_states);
+ *
+ * struct foo_lcore_state *state;
+ * RTE_LCORE_VAR_FOREACH_VALUE(state, lcore_states) {
+ * (initialize 'state')
+ * }
+ *
+ * (other initialization)
+ * }
+ * @endcode
+ *
+ *
+ * @b Alternatives
+ *
+ * Lcore variables are designed to replace a pattern exemplified below:
+ * @code{.c}
+ * struct foo_lcore_state {
+ * int a;
+ * long b;
+ * RTE_CACHE_GUARD;
+ * } __rte_cache_aligned;
+ *
+ * static struct foo_lcore_state lcore_states[RTE_MAX_LCORE];
+ * @endcode
+ *
+ * This scheme is simple and effective, but has one drawback: the data
+ * is organized so that objects related to all lcores for a particular
+ * module are kept close in memory. At a bare minimum, this forces the
+ * use of cache-line alignment to avoid false sharing. With CPU
+ * hardware prefetching and memory loads resulting from speculative
+ * execution (functions which seemingly are getting more eager faster
+ * than they are getting more intelligent), one or more "guard" cache
+ * lines may be required to separate one lcore's data from another's.
+ *
+ * Lcore variables have the upside of working with, not against, the
+ * CPU's assumptions, and for example next-line prefetchers may well
+ * work the way their designers intended (i.e., to the benefit, not
+ * detriment, of system performance).
+ *
+ * Another alternative to @ref rte_lcore_var.h is the @ref
+ * rte_per_lcore.h API, which makes use of thread-local storage (TLS,
+ * e.g., GCC __thread or C11 _Thread_local). The main differences
+ * between using the various forms of TLS (e.g., @ref
+ * RTE_DEFINE_PER_LCORE or _Thread_local) and the use of lcore
+ * variables are:
+ *
+ * * The existence and non-existence of a thread-local variable
+ * instance follows that of a particular thread. The data cannot be
+ * accessed before the thread has been created, nor after it has
+ * exited. One effect of this is that thread-local variables must be
+ * initialized in a "lazy" manner (e.g., at the point of thread
+ * creation). Lcore variables may be accessed immediately after
+ * having been allocated (which is usually before any thread beyond
+ * the main thread is running).
+ * * A thread-local variable is duplicated across all threads in the
+ * process, including unregistered non-EAL threads (i.e.,
+ * "regular" threads). For DPDK applications heavily relying on
+ * multi-threading (in conjunction with DPDK's "one thread per core"
+ * pattern), either by having many concurrent threads or
+ * creating/destroying threads at a high rate, an excessive use of
+ * thread-local variables may cause inefficiencies (e.g.,
+ * increased thread creation overhead due to thread-local storage
+ * initialization or increased total RAM footprint). Lcore
+ * variables *only* exist for threads with an lcore id, and thus
+ * not for such "regular" threads.
+ * * Whether data in thread-local storage may be shared between threads
+ * (i.e., whether a pointer to a thread-local variable can be passed to
+ * and successfully dereferenced by a non-owning thread) depends on
+ * the details of the TLS implementation. With GCC __thread and
+ * GCC _Thread_local, such data sharing is supported. In the C11
+ * standard, the result of accessing another thread's
+ * _Thread_local object is implementation-defined. Lcore variable
+ * instances may be accessed reliably by any thread.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stddef.h>
+#include <stdalign.h>
+
+#include <rte_common.h>
+#include <rte_config.h>
+#include <rte_lcore.h>
+
+/**
+ * Given the lcore variable type, produces the type of the lcore
+ * variable handle.
+ */
+#define RTE_LCORE_VAR_HANDLE_TYPE(type) \
+ type *
+
+/**
+ * Define an lcore variable handle.
+ *
+ * This macro defines a variable which is used as a handle to access
+ * the various per-lcore id instances of a per-lcore id variable.
+ *
+ * The aim with this macro is to make clear at the point of
+ * declaration that this is an lcore variable handle, rather than a
+ * regular pointer.
+ *
+ * Add @b static as a prefix in case the lcore variable is only to be
+ * accessed from a particular translation unit.
+ */
+#define RTE_LCORE_VAR_HANDLE(type, name) \
+ RTE_LCORE_VAR_HANDLE_TYPE(type) name
+
+/**
+ * Allocate space for an lcore variable, and initialize its handle.
+ */
+#define RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, align) \
+ handle = rte_lcore_var_alloc(size, align)
+
+/**
+ * Allocate space for an lcore variable, and initialize its handle,
+ * with values aligned for any type of object.
+ */
+#define RTE_LCORE_VAR_ALLOC_SIZE(handle, size) \
+ RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, 0)
+
+/**
+ * Allocate space for an lcore variable of the size and alignment requirements
+ * suggested by the handle pointer type, and initialize its handle.
+ */
+#define RTE_LCORE_VAR_ALLOC(handle) \
+ RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, sizeof(*(handle)), \
+ alignof(typeof(*(handle))))
+
+/**
+ * Allocate an explicitly-sized, explicitly-aligned lcore variable by
+ * means of a @ref RTE_INIT constructor.
+ */
+#define RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, align) \
+ RTE_INIT(rte_lcore_var_init_ ## name) \
+ { \
+ RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(name, size, align); \
+ }
+
+/**
+ * Allocate an explicitly-sized lcore variable by means of a @ref
+ * RTE_INIT constructor.
+ */
+#define RTE_LCORE_VAR_INIT_SIZE(name, size) \
+ RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, 0)
+
+/**
+ * Allocate an lcore variable by means of a @ref RTE_INIT constructor.
+ */
+#define RTE_LCORE_VAR_INIT(name) \
+ RTE_INIT(rte_lcore_var_init_ ## name) \
+ { \
+ RTE_LCORE_VAR_ALLOC(name); \
+ }
+
+/**
+ * Get void pointer to lcore variable instance with the specified
+ * lcore id.
+ *
+ * @param lcore_id
+ * The lcore id specifying which of the @c RTE_MAX_LCORE value
+ * instances should be accessed. The lcore id need not be valid
+ * (e.g., may be @ref LCORE_ID_ANY), but in such a case, the pointer
+ * is also not valid (and thus should not be dereferenced).
+ * @param handle
+ * The lcore variable handle.
+ */
+static inline void *
+rte_lcore_var_lcore_ptr(unsigned int lcore_id, void *handle)
+{
+ return RTE_PTR_ADD(handle, lcore_id * RTE_MAX_LCORE_VAR);
+}
+
+/**
+ * Get pointer to lcore variable instance with the specified lcore id.
+ *
+ * @param lcore_id
+ * The lcore id specifying which of the @c RTE_MAX_LCORE value
+ * instances should be accessed. The lcore id need not be valid
+ * (e.g., may be @ref LCORE_ID_ANY), but in such a case, the pointer
+ * is also not valid (and thus should not be dereferenced).
+ * @param handle
+ * The lcore variable handle.
+ */
+#define RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle) \
+ ((typeof(handle))rte_lcore_var_lcore_ptr(lcore_id, handle))
+
+/**
+ * Get pointer to lcore variable instance of the current thread.
+ *
+ * May only be used by EAL threads and registered non-EAL threads.
+ */
+#define RTE_LCORE_VAR_VALUE(handle) \
+ RTE_LCORE_VAR_LCORE_VALUE(rte_lcore_id(), handle)
+
+/**
+ * Iterate over each lcore id's value for an lcore variable.
+ *
+ * @param value
+ * A pointer successively set to point to the lcore variable value
+ * corresponding to every lcore id (up to @c RTE_MAX_LCORE).
+ * @param handle
+ * The lcore variable handle.
+ */
+#define RTE_LCORE_VAR_FOREACH_VALUE(value, handle) \
+ for (unsigned int lcore_id = \
+ (((value) = RTE_LCORE_VAR_LCORE_VALUE(0, handle)), 0); \
+ lcore_id < RTE_MAX_LCORE; \
+ lcore_id++, (value) = RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle))
+
+/**
+ * Allocate space in the per-lcore id buffers for an lcore variable.
+ *
+ * The pointer returned is only an opaque identifier of the variable. To
+ * get an actual pointer to a particular instance of the variable use
+ * @ref RTE_LCORE_VAR_VALUE or @ref RTE_LCORE_VAR_LCORE_VALUE.
+ *
+ * The lcore variable values' memory is set to zero.
+ *
+ * The allocation is always successful, barring a fatal exhaustion of
+ * the per-lcore id buffer space.
+ *
+ * rte_lcore_var_alloc() is not multi-thread safe.
+ *
+ * @param size
+ * The size (in bytes) of the variable's per-lcore id value. Must be > 0.
+ * @param align
+ * If 0, the values will be suitably aligned for any kind of type
+ * (i.e., alignof(max_align_t)). Otherwise, the values will be aligned
+ * on a multiple of *align*, which must be a power of 2 and equal to
+ * or less than @c RTE_CACHE_LINE_SIZE.
+ * @return
+ * The id of the variable, stored in a void pointer value. The value
+ * is always non-NULL.
+ */
+__rte_experimental
+void *
+rte_lcore_var_alloc(size_t size, size_t align);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_LCORE_VAR_H_ */
diff --git a/lib/eal/version.map b/lib/eal/version.map
index 5e0cd47c82..e90b86115a 100644
--- a/lib/eal/version.map
+++ b/lib/eal/version.map
@@ -393,6 +393,10 @@ EXPERIMENTAL {
# added in 23.07
rte_memzone_max_get;
rte_memzone_max_set;
+
+ # added in 24.03
+ rte_lcore_var_alloc;
+ rte_lcore_var;
};
INTERNAL {
--
2.34.1
* RE: [RFC v5 1/6] eal: add static per-lcore memory allocation facility
2024-02-28 10:09 ` [RFC v5 1/6] eal: add static per-lcore memory allocation facility Mattias Rönnblom
@ 2024-03-19 12:52 ` Konstantin Ananyev
2024-03-20 10:24 ` Mattias Rönnblom
2024-05-06 8:27 ` [RFC v6 0/6] Lcore variables Mattias Rönnblom
1 sibling, 1 reply; 323+ messages in thread
From: Konstantin Ananyev @ 2024-03-19 12:52 UTC (permalink / raw)
To: Mattias Rönnblom, dev; +Cc: hofors, Morten Brørup, Stephen Hemminger
Hi Mattias,
> Introduce DPDK per-lcore id variables, or lcore variables for short.
>
> An lcore variable has one value for every current and future lcore
> id-equipped thread.
>
> The primary <rte_lcore_var.h> use case is for statically allocating
> small chunks of often-used data, which is related logically, but where
> there are performance benefits to reap from having updates being local
> to an lcore.
>
> Lcore variables are similar to thread-local storage (TLS, e.g., C11
> _Thread_local), but decouple the values' lifetime from that of the
> threads.
>
> Lcore variables are also similar in functionality to that provided by
> the FreeBSD kernel's DPCPU_*() family of macros and the associated
> build-time machinery. DPCPU uses linker scripts, which effectively
> prevents the reuse of its, otherwise seemingly viable, approach.
>
> The currently-prevailing way to solve the same problem as lcore
> variables is to keep a module's per-lcore data as an RTE_MAX_LCORE-sized
> array of cache-aligned, RTE_CACHE_GUARDed structs. The benefit of
> lcore variables over this approach is that data related to the same
> lcore now is close (spatially, in memory), rather than data used by
> the same module, which in turn avoids excessive use of padding,
> polluting caches with unused data.
Thanks for the RFC, very interesting one.
Few comments/questions below.
> RFC v5:
> * In Doxygen, consistently use @<cmd> (and not \<cmd>).
> * The RTE_LCORE_VAR_GET() and SET() convenience access macros
> covered an uncommon use case, where the lcore value is of a
> primitive type, rather than a struct, and are thus eliminated
> from the API. (Morten Brørup)
> * In the wake of the GET()/SET() removal, rename RTE_LCORE_VAR_PTR()
> to RTE_LCORE_VAR_VALUE().
> * The underscores are removed from __rte_lcore_var_lcore_ptr() to
> signal that this function is a part of the public API.
> * Macro arguments are documented.
>
> RFC v4:
> * Replace large static array with libc heap-allocated memory. One
> implication of this change is there no longer exists a fixed upper
> bound for the total amount of memory used by lcore variables.
> RTE_MAX_LCORE_VAR has changed meaning, and now represents the
> maximum size of any individual lcore variable value.
> * Fix issues in example. (Morten Brørup)
> * Improve access macro type checking. (Morten Brørup)
> * Refer to the lcore variable handle as "handle" and not "name" in
> various macros.
> * Document lack of thread safety in rte_lcore_var_alloc().
> * Provide API-level assurance the lcore variable handle is
> always non-NULL, to allow applications to use NULL to mean
> "not yet allocated".
> * Note zero-sized allocations are not allowed.
> * Give API-level guarantee the lcore variable values are zeroed.
>
> RFC v3:
> * Replace use of GCC-specific alignof(<expression>) with alignof(<type>).
> * Update example to reflect FOREACH macro name change (in RFC v2).
>
> RFC v2:
> * Use alignof to derive alignment requirements. (Morten Brørup)
> * Change name of FOREACH to make it distinct from <rte_lcore.h>'s
> *per-EAL-thread* RTE_LCORE_FOREACH(). (Morten Brørup)
> * Allow user-specified alignment, but limit max to cache line size.
>
> Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> Acked-by: Morten Brørup <mb@smartsharesystems.com>
> ---
> config/rte_config.h | 1 +
> doc/api/doxy-api-index.md | 1 +
> lib/eal/common/eal_common_lcore_var.c | 68 +++++
> lib/eal/common/meson.build | 1 +
> lib/eal/include/meson.build | 1 +
> lib/eal/include/rte_lcore_var.h | 368 ++++++++++++++++++++++++++
> lib/eal/version.map | 4 +
> 7 files changed, 444 insertions(+)
> create mode 100644 lib/eal/common/eal_common_lcore_var.c
> create mode 100644 lib/eal/include/rte_lcore_var.h
>
> diff --git a/config/rte_config.h b/config/rte_config.h
> index d743a5c3d3..0dac33d3b9 100644
> --- a/config/rte_config.h
> +++ b/config/rte_config.h
> @@ -41,6 +41,7 @@
> /* EAL defines */
> #define RTE_CACHE_GUARD_LINES 1
> #define RTE_MAX_HEAPS 32
> +#define RTE_MAX_LCORE_VAR 1048576
> #define RTE_MAX_MEMSEG_LISTS 128
> #define RTE_MAX_MEMSEG_PER_LIST 8192
> #define RTE_MAX_MEM_MB_PER_LIST 32768
> diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
> index 8c1eb8fafa..a3b8391570 100644
> --- a/doc/api/doxy-api-index.md
> +++ b/doc/api/doxy-api-index.md
> @@ -99,6 +99,7 @@ The public API headers are grouped by topics:
> [interrupts](@ref rte_interrupts.h),
> [launch](@ref rte_launch.h),
> [lcore](@ref rte_lcore.h),
> + [lcore-variable](@ref rte_lcore_var.h),
> [per-lcore](@ref rte_per_lcore.h),
> [service cores](@ref rte_service.h),
> [keepalive](@ref rte_keepalive.h),
> diff --git a/lib/eal/common/eal_common_lcore_var.c b/lib/eal/common/eal_common_lcore_var.c
> new file mode 100644
> index 0000000000..5c353ebd46
> --- /dev/null
> +++ b/lib/eal/common/eal_common_lcore_var.c
> @@ -0,0 +1,68 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2024 Ericsson AB
> + */
> +
> +#include <inttypes.h>
> +
> +#include <rte_common.h>
> +#include <rte_debug.h>
> +#include <rte_log.h>
> +
> +#include <rte_lcore_var.h>
> +
> +#include "eal_private.h"
> +
> +#define LCORE_BUFFER_SIZE (RTE_MAX_LCORE_VAR * RTE_MAX_LCORE)
> +
> +static void *lcore_buffer;
> +static size_t offset = RTE_MAX_LCORE_VAR;
> +
> +static void *
> +lcore_var_alloc(size_t size, size_t align)
> +{
> + void *handle;
> + void *value;
> +
> + offset = RTE_ALIGN_CEIL(offset, align);
> +
> + if (offset + size > RTE_MAX_LCORE_VAR) {
> + lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE,
> + LCORE_BUFFER_SIZE);
Hmm... do I get it right: if offset is <= RTE_MAX_LCORE_VAR, and offset + size > RTE_MAX_LCORE_VAR,
we simply overwrite lcore_buffer with a newly allocated buffer of the same size?
I understand that you expect it just never to happen (total size of all lcore vars never exceeds 1MB), but still
I think we need to handle it in some better way than just ignoring such a possibility...
Maybe an RTE_VERIFY() at least?
As a more generic question - do we need to support LCORE_VAR for dlopen()s that could happen after rte_eal_init()
is called and LCORE threads were created?
Because, if not, then we probably can make this construction much more flexible:
one buffer per LCORE, allocate on demand, etc.
> + RTE_VERIFY(lcore_buffer != NULL);
> +
> + offset = 0;
> + }
> +
> + handle = RTE_PTR_ADD(lcore_buffer, offset);
> +
> + offset += size;
> +
> + RTE_LCORE_VAR_FOREACH_VALUE(value, handle)
> + memset(value, 0, size);
> +
> + EAL_LOG(DEBUG, "Allocated %"PRIuPTR" bytes of per-lcore data with a "
> + "%"PRIuPTR"-byte alignment", size, align);
> +
> + return handle;
> +}
> +
> +void *
> +rte_lcore_var_alloc(size_t size, size_t align)
> +{
> + /* Having the per-lcore buffer size aligned on cache lines,
> + * as well as having the base pointer aligned on cache line
> + * size, assures that aligned offsets also translate to aligned
> + * pointers across all values.
> + */
> + RTE_BUILD_BUG_ON(RTE_MAX_LCORE_VAR % RTE_CACHE_LINE_SIZE != 0);
> + RTE_ASSERT(align <= RTE_CACHE_LINE_SIZE);
> + RTE_ASSERT(size <= RTE_MAX_LCORE_VAR);
> +
> + /* '0' means asking for worst-case alignment requirements */
> + if (align == 0)
> + align = alignof(max_align_t);
> +
> + RTE_ASSERT(rte_is_power_of_2(align));
> +
> + return lcore_var_alloc(size, align);
> +}
* Re: [RFC v5 1/6] eal: add static per-lcore memory allocation facility
2024-03-19 12:52 ` Konstantin Ananyev
@ 2024-03-20 10:24 ` Mattias Rönnblom
2024-03-20 14:18 ` Konstantin Ananyev
0 siblings, 1 reply; 323+ messages in thread
From: Mattias Rönnblom @ 2024-03-20 10:24 UTC (permalink / raw)
To: Konstantin Ananyev, Mattias Rönnblom, dev
Cc: Morten Brørup, Stephen Hemminger
On 2024-03-19 13:52, Konstantin Ananyev wrote:
>
> Hi Mattias,
>> Introduce DPDK per-lcore id variables, or lcore variables for short.
>>
>> An lcore variable has one value for every current and future lcore
>> id-equipped thread.
>>
>> The primary <rte_lcore_var.h> use case is for statically allocating
>> small chunks of often-used data, which is related logically, but where
>> there are performance benefits to reap from having updates being local
>> to an lcore.
>>
>> Lcore variables are similar to thread-local storage (TLS, e.g., C11
>> _Thread_local), but decouple the values' lifetime from that of the
>> threads.
>>
>> Lcore variables are also similar in functionality to that provided by
>> the FreeBSD kernel's DPCPU_*() family of macros and the associated
>> build-time machinery. DPCPU uses linker scripts, which effectively
>> prevents the reuse of its, otherwise seemingly viable, approach.
>>
>> The currently-prevailing way to solve the same problem as lcore
>> variables is to keep a module's per-lcore data as an RTE_MAX_LCORE-sized
>> array of cache-aligned, RTE_CACHE_GUARDed structs. The benefit of
>> lcore variables over this approach is that data related to the same
>> lcore now is close (spatially, in memory), rather than data used by
>> the same module, which in turn avoids excessive use of padding,
>> polluting caches with unused data.
>
> Thanks for the RFC, very interesting one.
> Few comments/questions below.
>
>
>> RFC v5:
>> * In Doxygen, consistently use @<cmd> (and not \<cmd>).
>> * The RTE_LCORE_VAR_GET() and SET() convenience access macros
>> covered an uncommon use case, where the lcore value is of a
>> primitive type, rather than a struct, and are thus eliminated
>> from the API. (Morten Brørup)
>> * In the wake of the GET()/SET() removal, rename RTE_LCORE_VAR_PTR()
>> to RTE_LCORE_VAR_VALUE().
>> * The underscores are removed from __rte_lcore_var_lcore_ptr() to
>> signal that this function is a part of the public API.
>> * Macro arguments are documented.
>>
>> RFC v4:
>> * Replace large static array with libc heap-allocated memory. One
>> implication of this change is there no longer exists a fixed upper
>> bound for the total amount of memory used by lcore variables.
>> RTE_MAX_LCORE_VAR has changed meaning, and now represents the
>> maximum size of any individual lcore variable value.
>> * Fix issues in example. (Morten Brørup)
>> * Improve access macro type checking. (Morten Brørup)
>> * Refer to the lcore variable handle as "handle" and not "name" in
>> various macros.
>> * Document lack of thread safety in rte_lcore_var_alloc().
>> * Provide API-level assurance the lcore variable handle is
>> always non-NULL, to allow applications to use NULL to mean
>> "not yet allocated".
>> * Note zero-sized allocations are not allowed.
>> * Give API-level guarantee the lcore variable values are zeroed.
>>
>> RFC v3:
>> * Replace use of GCC-specific alignof(<expression>) with alignof(<type>).
>> * Update example to reflect FOREACH macro name change (in RFC v2).
>>
>> RFC v2:
>> * Use alignof to derive alignment requirements. (Morten Brørup)
>> * Change name of FOREACH to make it distinct from <rte_lcore.h>'s
>> *per-EAL-thread* RTE_LCORE_FOREACH(). (Morten Brørup)
>> * Allow user-specified alignment, but limit max to cache line size.
>>
>> Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
>> Acked-by: Morten Brørup <mb@smartsharesystems.com>
>> ---
>> config/rte_config.h | 1 +
>> doc/api/doxy-api-index.md | 1 +
>> lib/eal/common/eal_common_lcore_var.c | 68 +++++
>> lib/eal/common/meson.build | 1 +
>> lib/eal/include/meson.build | 1 +
>> lib/eal/include/rte_lcore_var.h | 368 ++++++++++++++++++++++++++
>> lib/eal/version.map | 4 +
>> 7 files changed, 444 insertions(+)
>> create mode 100644 lib/eal/common/eal_common_lcore_var.c
>> create mode 100644 lib/eal/include/rte_lcore_var.h
>>
>> diff --git a/config/rte_config.h b/config/rte_config.h
>> index d743a5c3d3..0dac33d3b9 100644
>> --- a/config/rte_config.h
>> +++ b/config/rte_config.h
>> @@ -41,6 +41,7 @@
>> /* EAL defines */
>> #define RTE_CACHE_GUARD_LINES 1
>> #define RTE_MAX_HEAPS 32
>> +#define RTE_MAX_LCORE_VAR 1048576
>> #define RTE_MAX_MEMSEG_LISTS 128
>> #define RTE_MAX_MEMSEG_PER_LIST 8192
>> #define RTE_MAX_MEM_MB_PER_LIST 32768
>> diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
>> index 8c1eb8fafa..a3b8391570 100644
>> --- a/doc/api/doxy-api-index.md
>> +++ b/doc/api/doxy-api-index.md
>> @@ -99,6 +99,7 @@ The public API headers are grouped by topics:
>> [interrupts](@ref rte_interrupts.h),
>> [launch](@ref rte_launch.h),
>> [lcore](@ref rte_lcore.h),
>> + [lcore-variable](@ref rte_lcore_var.h),
>> [per-lcore](@ref rte_per_lcore.h),
>> [service cores](@ref rte_service.h),
>> [keepalive](@ref rte_keepalive.h),
>> diff --git a/lib/eal/common/eal_common_lcore_var.c b/lib/eal/common/eal_common_lcore_var.c
>> new file mode 100644
>> index 0000000000..5c353ebd46
>> --- /dev/null
>> +++ b/lib/eal/common/eal_common_lcore_var.c
>> @@ -0,0 +1,68 @@
>> +/* SPDX-License-Identifier: BSD-3-Clause
>> + * Copyright(c) 2024 Ericsson AB
>> + */
>> +
>> +#include <inttypes.h>
>> +
>> +#include <rte_common.h>
>> +#include <rte_debug.h>
>> +#include <rte_log.h>
>> +
>> +#include <rte_lcore_var.h>
>> +
>> +#include "eal_private.h"
>> +
>> +#define LCORE_BUFFER_SIZE (RTE_MAX_LCORE_VAR * RTE_MAX_LCORE)
>> +
>> +static void *lcore_buffer;
>> +static size_t offset = RTE_MAX_LCORE_VAR;
>> +
>> +static void *
>> +lcore_var_alloc(size_t size, size_t align)
>> +{
>> + void *handle;
>> + void *value;
>> +
>> + offset = RTE_ALIGN_CEIL(offset, align);
>> +
>> + if (offset + size > RTE_MAX_LCORE_VAR) {
>> + lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE,
>> + LCORE_BUFFER_SIZE);
>
> Hmm... do I get it right: if offset is <= RTE_MAX_LCORE_VAR, and offset + size > RTE_MAX_LCORE_VAR,
> we simply overwrite lcore_buffer with a newly allocated buffer of the same size?
No, it's just the pointer that is overwritten. The old buffer will
remain in memory.
> I understand that you expect it just never to happen (total size of all lcore vars never exceeds 1MB), but still
> I think we need to handle it in some better way than just ignoring such a possibility...
> Maybe an RTE_VERIFY() at least?
>
In this revision of the patch set, RTE_MAX_LCORE_VAR does not represent
an upper bound for the sum of all lcore variables' sizes, but rather only
the maximum size of a single lcore variable.
Variable alignment and size constraints are RTE_ASSERT()ed at the point
of allocation. One could argue they should be RTE_VERIFY()-ed instead,
since there aren't any performance constraints.
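To make the rollover behaviour concrete, here is a minimal standalone sketch of the allocation logic quoted above. It is a model, not the patch: MODEL_MAX_LCORE_VAR, MODEL_MAX_LCORE, MODEL_CACHE_LINE, the model_* function names and the tiny sizes are made up for illustration, plain assert() stands in for RTE_ASSERT()/RTE_VERIFY(), and the logging is left out.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, scaled-down stand-ins for the DPDK build-time constants. */
#define MODEL_MAX_LCORE_VAR 256	/* max size of one variable's value */
#define MODEL_MAX_LCORE 4
#define MODEL_CACHE_LINE 64

static void *model_buffer;
/* Starting "full" forces a buffer allocation on the first call. */
static size_t model_offset = MODEL_MAX_LCORE_VAR;

/* Same stride-based address computation as rte_lcore_var_lcore_ptr(). */
static void *
model_lcore_ptr(unsigned int lcore_id, void *handle)
{
	return (char *)handle + (size_t)lcore_id * MODEL_MAX_LCORE_VAR;
}

static void *
model_lcore_var_alloc(size_t size, size_t align)
{
	void *handle;
	unsigned int lcore_id;

	if (align == 0)
		align = alignof(max_align_t);

	/* Per-variable constraints; nothing limits the sum of all sizes. */
	assert(size > 0 && size <= MODEL_MAX_LCORE_VAR);
	assert(align <= MODEL_CACHE_LINE && (align & (align - 1)) == 0);

	model_offset = (model_offset + align - 1) & ~(align - 1);

	if (model_offset + size > MODEL_MAX_LCORE_VAR) {
		/* Only the pointer is replaced; earlier buffers stay allocated. */
		model_buffer = aligned_alloc(MODEL_CACHE_LINE,
				(size_t)MODEL_MAX_LCORE_VAR * MODEL_MAX_LCORE);
		assert(model_buffer != NULL);
		model_offset = 0;
	}

	handle = (char *)model_buffer + model_offset;
	model_offset += size;

	for (lcore_id = 0; lcore_id < MODEL_MAX_LCORE; lcore_id++)
		memset(model_lcore_ptr(lcore_id, handle), 0, size);

	return handle;
}

/* Returns 1 if a handle from a retired buffer still works after rollover. */
static int
model_demo(void)
{
	uint8_t *a = model_lcore_var_alloc(200, 0);
	uint8_t *a3 = model_lcore_ptr(3, a);
	uint8_t *b;

	a3[0] = 42;
	/* 200 + 100 > 256, so this allocation starts a fresh buffer. */
	b = model_lcore_var_alloc(100, 0);

	return a3[0] == 42 && ((uint8_t *)model_lcore_ptr(0, b))[0] == 0;
}
```

In model_demo(), the second variable does not fit in what is left of the first buffer, so a new buffer is started, while the handle into the first buffer keeps pointing at live, untouched memory.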
> As a more generic question - do we need to support LCORE_VAR for dlopen()s that could happen after rte_eal_init()
> is called and LCORE threads were created?
Yes, allocations after rte_eal_init() (caused by dlopen() or otherwise)
must be allowed imo, and are allowed. Otherwise applications sitting on
top of DPDK can't use this facility.
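A rough sketch of what such a late allocation can look like from the application side, relying on the guarantee that a handle is never NULL once allocated. This is a standalone model rather than DPDK code: model_alloc() is a hypothetical, simplified stand-in for rte_lcore_var_alloc() (single buffer, abort on exhaustion), and the MODEL_* constants and foo_state() are made up. Like rte_lcore_var_alloc(), it is not multi-thread safe, so the first call must not race with other callers.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, scaled-down stand-ins for the DPDK build-time constants. */
#define MODEL_MAX_LCORE 4
#define MODEL_MAX_LCORE_VAR 256

/* Simplified stand-in allocator: zeroed memory, never NULL, never freed. */
static void *
model_alloc(size_t size)
{
	static char *buffer;
	static size_t offset;
	const size_t align = alignof(max_align_t);
	void *handle;

	if (buffer == NULL) {
		/* calloc() zeroes every lcore id's area up front. */
		buffer = calloc(MODEL_MAX_LCORE, MODEL_MAX_LCORE_VAR);
		if (buffer == NULL)
			abort();
	}

	offset = (offset + align - 1) & ~(align - 1);
	assert(size > 0 && offset + size <= MODEL_MAX_LCORE_VAR);

	handle = buffer + offset;
	offset += size;

	return handle;
}

struct foo_lcore_state {
	int a;
	long b;
};

/* NULL means "not yet allocated". */
static struct foo_lcore_state *foo_states;

/* Allocate on first use, long after "init" time, then access by lcore id. */
static struct foo_lcore_state *
foo_state(unsigned int lcore_id)
{
	if (foo_states == NULL)
		foo_states = model_alloc(sizeof(*foo_states));

	return (struct foo_lcore_state *)
		((char *)foo_states + (size_t)lcore_id * MODEL_MAX_LCORE_VAR);
}
```

With the real API, the same pattern would presumably use RTE_LCORE_VAR_ALLOC() for the first-use allocation and RTE_LCORE_VAR_LCORE_VALUE() for the access.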
> Because, if not, then we probably can make this construction much more flexible:
> one buffer per LCORE, allocate on demand, etc.
>
On-demand allocations are already supported, but one can't do free().
That's why I've called what this module provides "static allocation",
while it may be more appropriately described as "dynamic allocation
without deallocation".
"True" dynamic memory allocation of per-lcore memory would be very
useful, but is an entirely different beast in terms of complexity and
(if to be usable in the packet processing fast path) performance
requirements.
"True" dynamic memory allocation would also result in something less
compact (at least if you use the usual pattern with a per-object heap
header).
>> + RTE_VERIFY(lcore_buffer != NULL);
>> +
>> + offset = 0;
>> + }
>> +
>> + handle = RTE_PTR_ADD(lcore_buffer, offset);
>> +
>> + offset += size;
>> +
>> + RTE_LCORE_VAR_FOREACH_VALUE(value, handle)
>> + memset(value, 0, size);
>> +
>> + EAL_LOG(DEBUG, "Allocated %"PRIuPTR" bytes of per-lcore data with a "
>> + "%"PRIuPTR"-byte alignment", size, align);
>> +
>> + return handle;
>> +}
>> +
>> +void *
>> +rte_lcore_var_alloc(size_t size, size_t align)
>> +{
>> + /* Having the per-lcore buffer size aligned on cache lines,
>> + * as well as having the base pointer aligned on cache line
>> + * size, assures that aligned offsets also translate to aligned
>> + * pointers across all values.
>> + */
>> + RTE_BUILD_BUG_ON(RTE_MAX_LCORE_VAR % RTE_CACHE_LINE_SIZE != 0);
>> + RTE_ASSERT(align <= RTE_CACHE_LINE_SIZE);
>> + RTE_ASSERT(size <= RTE_MAX_LCORE_VAR);
>> +
>> + /* '0' means asking for worst-case alignment requirements */
>> + if (align == 0)
>> + align = alignof(max_align_t);
>> +
>> + RTE_ASSERT(rte_is_power_of_2(align));
>> +
>> + return lcore_var_alloc(size, align);
>> +}
* RE: [RFC v5 1/6] eal: add static per-lcore memory allocation facility
2024-03-20 10:24 ` Mattias Rönnblom
@ 2024-03-20 14:18 ` Konstantin Ananyev
0 siblings, 0 replies; 323+ messages in thread
From: Konstantin Ananyev @ 2024-03-20 14:18 UTC (permalink / raw)
To: Mattias Rönnblom, Mattias Rönnblom, dev
Cc: Morten Brørup, Stephen Hemminger
> >> Introduce DPDK per-lcore id variables, or lcore variables for short.
> >>
> >> An lcore variable has one value for every current and future lcore
> >> id-equipped thread.
> >>
> >> The primary <rte_lcore_var.h> use case is for statically allocating
> >> small chunks of often-used data, which is related logically, but where
> >> there are performance benefits to reap from having updates being local
> >> to an lcore.
> >>
> >> Lcore variables are similar to thread-local storage (TLS, e.g., C11
> >> _Thread_local), but decouple the values' lifetime from that of the
> >> threads.
> >>
> >> Lcore variables are also similar in functionality to that provided by
> >> the FreeBSD kernel's DPCPU_*() family of macros and the associated
> >> build-time machinery. DPCPU uses linker scripts, which effectively
> >> prevents the reuse of its, otherwise seemingly viable, approach.
> >>
> >> The currently-prevailing way to solve the same problem as lcore
> >> variables is to keep a module's per-lcore data as an RTE_MAX_LCORE-sized
> >> array of cache-aligned, RTE_CACHE_GUARDed structs. The benefit of
> >> lcore variables over this approach is that data related to the same
> >> lcore now is close (spatially, in memory), rather than data used by
> >> the same module, which in turn avoids excessive use of padding,
> >> polluting caches with unused data.
> >
> > Thanks for the RFC, very interesting one.
> > Few comments/questions below.
> >
> >
> >> RFC v5:
> >> * In Doxygen, consistently use @<cmd> (and not \<cmd>).
> >> * The RTE_LCORE_VAR_GET() and SET() convenience access macros
> >> covered an uncommon use case, where the lcore value is of a
> >> primitive type, rather than a struct, and are thus eliminated
> >> from the API. (Morten Brørup)
> >> * In the wake of the GET()/SET() removal, rename RTE_LCORE_VAR_PTR()
> >> to RTE_LCORE_VAR_VALUE().
> >> * The underscores are removed from __rte_lcore_var_lcore_ptr() to
> >> signal that this function is a part of the public API.
> >> * Macro arguments are documented.
> >>
> >> RFC v4:
> >> * Replace large static array with libc heap-allocated memory. One
> >> implication of this change is there no longer exists a fixed upper
> >> bound for the total amount of memory used by lcore variables.
> >> RTE_MAX_LCORE_VAR has changed meaning, and now represents the
> >> maximum size of any individual lcore variable value.
> >> * Fix issues in example. (Morten Brørup)
> >> * Improve access macro type checking. (Morten Brørup)
> >> * Refer to the lcore variable handle as "handle" and not "name" in
> >> various macros.
> >> * Document lack of thread safety in rte_lcore_var_alloc().
> >> * Provide API-level assurance the lcore variable handle is
> >> always non-NULL, to allow applications to use NULL to mean
> >> "not yet allocated".
> >> * Note zero-sized allocations are not allowed.
> >> * Give API-level guarantee the lcore variable values are zeroed.
> >>
> >> RFC v3:
> >> * Replace use of GCC-specific alignof(<expression>) with alignof(<type>).
> >> * Update example to reflect FOREACH macro name change (in RFC v2).
> >>
> >> RFC v2:
> >> * Use alignof to derive alignment requirements. (Morten Brørup)
> >> * Change name of FOREACH to make it distinct from <rte_lcore.h>'s
> >> *per-EAL-thread* RTE_LCORE_FOREACH(). (Morten Brørup)
> >> * Allow user-specified alignment, but limit max to cache line size.
> >>
> >> Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> >> Acked-by: Morten Brørup <mb@smartsharesystems.com>
> >> ---
> >> config/rte_config.h | 1 +
> >> doc/api/doxy-api-index.md | 1 +
> >> lib/eal/common/eal_common_lcore_var.c | 68 +++++
> >> lib/eal/common/meson.build | 1 +
> >> lib/eal/include/meson.build | 1 +
> >> lib/eal/include/rte_lcore_var.h | 368 ++++++++++++++++++++++++++
> >> lib/eal/version.map | 4 +
> >> 7 files changed, 444 insertions(+)
> >> create mode 100644 lib/eal/common/eal_common_lcore_var.c
> >> create mode 100644 lib/eal/include/rte_lcore_var.h
> >>
> >> diff --git a/config/rte_config.h b/config/rte_config.h
> >> index d743a5c3d3..0dac33d3b9 100644
> >> --- a/config/rte_config.h
> >> +++ b/config/rte_config.h
> >> @@ -41,6 +41,7 @@
> >> /* EAL defines */
> >> #define RTE_CACHE_GUARD_LINES 1
> >> #define RTE_MAX_HEAPS 32
> >> +#define RTE_MAX_LCORE_VAR 1048576
> >> #define RTE_MAX_MEMSEG_LISTS 128
> >> #define RTE_MAX_MEMSEG_PER_LIST 8192
> >> #define RTE_MAX_MEM_MB_PER_LIST 32768
> >> diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
> >> index 8c1eb8fafa..a3b8391570 100644
> >> --- a/doc/api/doxy-api-index.md
> >> +++ b/doc/api/doxy-api-index.md
> >> @@ -99,6 +99,7 @@ The public API headers are grouped by topics:
> >> [interrupts](@ref rte_interrupts.h),
> >> [launch](@ref rte_launch.h),
> >> [lcore](@ref rte_lcore.h),
> >> + [lcore-variable](@ref rte_lcore_var.h),
> >> [per-lcore](@ref rte_per_lcore.h),
> >> [service cores](@ref rte_service.h),
> >> [keepalive](@ref rte_keepalive.h),
> >> diff --git a/lib/eal/common/eal_common_lcore_var.c b/lib/eal/common/eal_common_lcore_var.c
> >> new file mode 100644
> >> index 0000000000..5c353ebd46
> >> --- /dev/null
> >> +++ b/lib/eal/common/eal_common_lcore_var.c
> >> @@ -0,0 +1,68 @@
> >> +/* SPDX-License-Identifier: BSD-3-Clause
> >> + * Copyright(c) 2024 Ericsson AB
> >> + */
> >> +
> >> +#include <inttypes.h>
> >> +
> >> +#include <rte_common.h>
> >> +#include <rte_debug.h>
> >> +#include <rte_log.h>
> >> +
> >> +#include <rte_lcore_var.h>
> >> +
> >> +#include "eal_private.h"
> >> +
> >> +#define LCORE_BUFFER_SIZE (RTE_MAX_LCORE_VAR * RTE_MAX_LCORE)
> >> +
> >> +static void *lcore_buffer;
> >> +static size_t offset = RTE_MAX_LCORE_VAR;
> >> +
> >> +static void *
> >> +lcore_var_alloc(size_t size, size_t align)
> >> +{
> >> + void *handle;
> >> + void *value;
> >> +
> >> + offset = RTE_ALIGN_CEIL(offset, align);
> >> +
> >> + if (offset + size > RTE_MAX_LCORE_VAR) {
> >> + lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE,
> >> + LCORE_BUFFER_SIZE);
> >
> > Hmm... do I get it right: if offset is <= RTE_MAX_LCORE_VAR, and offset + size > RTE_MAX_LCORE_VAR,
> > we simply overwrite lcore_buffer with a newly allocated buffer of the same size?
>
> No, it's just the pointer that is overwritten. The old buffer will
> remain in memory.
Ah ok, I missed that you changed the handle-to-pointer conversion in the new version too.
Now the handle is not just an offset, but an actual pointer to the lcore 0 value, so all we have to do is add
the lcore_idx offset.
Makes sense, thanks for clarifying.
LGTM then.
>
> > I understand that you expect it just never to happen (total size of all lcore vars never exceeds 1MB), but still
> > I think we need to handle it in some better way than just ignoring such a possibility...
> > Maybe RTE_VERIFY() at least?
> >
>
> In this revision of the patch set, RTE_MAX_LCORE_VAR does not represent
> an upper bound for the sum of all lcore variables' size, but rather only
> the maximum size of a single lcore variable.
>
> Variable alignment and size constraints are RTE_ASSERT()ed at the point
> of allocation. One could argue they should be RTE_VERIFY()-ed instead,
> since there aren't any performance constraints.
>
> > As a more generic question - do we need to support LCORE_VAR for dlopen()s that could happen after rte_eal_init()
> > is called and LCORE threads were created?
>
> Yes, allocations after rte_eal_init() (caused by dlopen() or otherwise)
> must be allowed imo, and are allowed. Otherwise applications sitting on
> top of DPDK can't use this facility.
>
> > Because, if no, then we probably can make this construction much more flexible:
> > one buffer per LCORE, allocate on demand, etc.
> >
>
> On-demand allocations are already supported, but one can't do free().
> That's why I've called what this module provides "static allocation",
> while it may be more appropriately described as "dynamic allocation
> without deallocation".
>
> "True" dynamic memory allocation of per-lcore memory would be very
> useful, but is an entirely different beast in terms of complexity and
> (if to be usable in the packet processing fast path) performance
> requirements.
>
> "True" dynamic memory allocation would also result in something less
> compact (at least if you use the usual pattern with a per-object heap
> header).
>
> >> + RTE_VERIFY(lcore_buffer != NULL);
> >> +
> >> + offset = 0;
> >> + }
> >> +
> >> + handle = RTE_PTR_ADD(lcore_buffer, offset);
> >> +
> >> + offset += size;
> >> +
> >> + RTE_LCORE_VAR_FOREACH_VALUE(value, handle)
> >> + memset(value, 0, size);
> >> +
> >> + EAL_LOG(DEBUG, "Allocated %"PRIuPTR" bytes of per-lcore data with a "
> >> + "%"PRIuPTR"-byte alignment", size, align);
> >> +
> >> + return handle;
> >> +}
> >> +
> >> +void *
> >> +rte_lcore_var_alloc(size_t size, size_t align)
> >> +{
> >> + /* Having the per-lcore buffer size aligned on cache lines,
> >> + * as well as having the base pointer aligned on cache line
> >> + * size, assures that aligned offsets also translate to aligned
> >> + * pointers across all values.
> >> + */
> >> + RTE_BUILD_BUG_ON(RTE_MAX_LCORE_VAR % RTE_CACHE_LINE_SIZE != 0);
> >> + RTE_ASSERT(align <= RTE_CACHE_LINE_SIZE);
> >> + RTE_ASSERT(size <= RTE_MAX_LCORE_VAR);
> >> +
> >> + /* '0' means asking for worst-case alignment requirements */
> >> + if (align == 0)
> >> + align = alignof(max_align_t);
> >> +
> >> + RTE_ASSERT(rte_is_power_of_2(align));
> >> +
> >> + return lcore_var_alloc(size, align);
> >> +}
^ permalink raw reply [flat|nested] 323+ messages in thread
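To illustrate the handle-to-pointer conversion and the buffer rollover discussed in the exchange above, here is a minimal standalone sketch. It is not the patch's code: the sketch_* names and the scaled-down SKETCH_MAX_LCORE / SKETCH_MAX_LCORE_VAR constants are assumptions made purely for illustration, and no DPDK headers are used. It mirrors the logic of lcore_var_alloc() as quoted: when an allocation does not fit, only the buffer pointer is replaced, so handles pointing into the old buffer stay valid.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, scaled-down stand-ins for RTE_MAX_LCORE and
 * RTE_MAX_LCORE_VAR; the real values are far larger. */
#define SKETCH_MAX_LCORE 4
#define SKETCH_MAX_LCORE_VAR 64

static void *sketch_buffer;
/* Starting "full" forces a buffer allocation on first use. */
static size_t sketch_offset = SKETCH_MAX_LCORE_VAR;

/* Round v up to a multiple of align (align must be a power of 2). */
static size_t
sketch_align_ceil(size_t v, size_t align)
{
	return (v + align - 1) & ~(align - 1);
}

/* The handle points at the lcore 0 value; other lcore ids are reached
 * by adding a fixed per-lcore stride. */
static void *
sketch_lcore_ptr(unsigned int lcore_id, void *handle)
{
	return (char *)handle + (size_t)lcore_id * SKETCH_MAX_LCORE_VAR;
}

static void *
sketch_alloc(size_t size, size_t align)
{
	void *handle;
	unsigned int i;

	sketch_offset = sketch_align_ceil(sketch_offset, align);

	if (sketch_offset + size > SKETCH_MAX_LCORE_VAR) {
		/* Only the pointer is overwritten. The previous buffer
		 * is never freed, so older handles remain usable. */
		sketch_buffer = aligned_alloc(64,
			SKETCH_MAX_LCORE_VAR * SKETCH_MAX_LCORE);
		if (sketch_buffer == NULL)
			abort();
		sketch_offset = 0;
	}

	handle = (char *)sketch_buffer + sketch_offset;
	sketch_offset += size;

	/* Zero the value of every lcore id. */
	for (i = 0; i < SKETCH_MAX_LCORE; i++)
		memset(sketch_lcore_ptr(i, handle), 0, size);

	return handle;
}
```

With these sizes, a third allocation of 16 bytes after 4-byte and 48-byte allocations no longer fits in the 64-byte per-lcore slice, so it lands at the start of a fresh buffer while the first two handles keep pointing into the old one.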
* [RFC v6 0/6] Lcore variables
2024-02-28 10:09 ` [RFC v5 1/6] eal: add static per-lcore memory allocation facility Mattias Rönnblom
2024-03-19 12:52 ` Konstantin Ananyev
@ 2024-05-06 8:27 ` Mattias Rönnblom
2024-05-06 8:27 ` [RFC v6 1/6] eal: add static per-lcore memory allocation facility Mattias Rönnblom
` (6 more replies)
1 sibling, 7 replies; 323+ messages in thread
From: Mattias Rönnblom @ 2024-05-06 8:27 UTC (permalink / raw)
To: dev
Cc: hofors, Morten Brørup, Stephen Hemminger,
Konstantin Ananyev, Mattias Rönnblom
This RFC presents a new API <rte_lcore_var.h> for static per-lcore id
data allocation.
Please refer to the <rte_lcore_var.h> API documentation for both a
rationale for this new API, and a comparison to the alternatives
available.
The adoption of this API would affect many different DPDK modules, but
the author updated only a few, mostly to serve as examples in this
RFC, and to iron out some, but surely not all, wrinkles in the API.
The question on how to best allocate static per-lcore memory has been
up several times on the dev mailing list, for example in the thread on
"random: use per lcore state" RFC by Stephen Hemminger.
Lcore variables are surely not the answer to all your per-lcore-data
needs, since it only allows for more-or-less static allocation. In the
author's opinion, it does however provide a reasonably simple and
clean and seemingly very much performant solution to a real problem.
One thing unclear to the author is how this API relates to a
potential future per-lcore dynamic allocator (e.g., a per-lcore heap).
Mattias Rönnblom (6):
eal: add static per-lcore memory allocation facility
eal: add lcore variable test suite
random: keep PRNG state in lcore variable
power: keep per-lcore state in lcore variable
service: keep per-lcore state in lcore variable
eal: keep per-lcore power intrinsics state in lcore variable
app/test/meson.build | 1 +
app/test/test_lcore_var.c | 432 ++++++++++++++++++++++++++
config/rte_config.h | 1 +
doc/api/doxy-api-index.md | 1 +
lib/eal/common/eal_common_lcore_var.c | 69 ++++
lib/eal/common/meson.build | 1 +
lib/eal/common/rte_random.c | 28 +-
lib/eal/common/rte_service.c | 115 +++----
lib/eal/include/meson.build | 1 +
lib/eal/include/rte_lcore_var.h | 384 +++++++++++++++++++++++
lib/eal/version.map | 3 +
lib/eal/x86/rte_power_intrinsics.c | 17 +-
lib/power/rte_power_pmd_mgmt.c | 34 +-
13 files changed, 1000 insertions(+), 87 deletions(-)
create mode 100644 app/test/test_lcore_var.c
create mode 100644 lib/eal/common/eal_common_lcore_var.c
create mode 100644 lib/eal/include/rte_lcore_var.h
--
2.34.1
^ permalink raw reply [flat|nested] 323+ messages in thread
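As a rough illustration of the padding argument in the rationale for this series, the following standalone sketch compares the per-lcore footprint of a cache-aligned, guard-padded struct with that of a plain struct. The struct names, the PAD_CACHE_LINE constant (a 64-byte cache line is assumed) and the char-array guard are hypothetical stand-ins, not DPDK definitions.

```c
#include <assert.h>
#include <stddef.h>

#define PAD_CACHE_LINE 64 /* assumed cache line size */

/* Prevailing pattern: cache-aligned struct with a trailing guard,
 * meant to be used as an element of an RTE_MAX_LCORE-sized array. */
struct padded_state {
	_Alignas(PAD_CACHE_LINE) int a;
	long b;
	char guard[PAD_CACHE_LINE]; /* stand-in for RTE_CACHE_GUARD */
};

/* Lcore variable pattern: the value type carries no padding of its
 * own; values of the same lcore id are packed next to each other. */
struct plain_state {
	int a;
	long b;
};

/* Bytes consumed per lcore id by each layout. */
static size_t
padded_bytes_per_lcore(void)
{
	return sizeof(struct padded_state);
}

static size_t
plain_bytes_per_lcore(void)
{
	return sizeof(struct plain_state);
}
```

Under these assumptions the padded layout spends two full cache lines per lcore id on at most 16 bytes of payload, which is the overhead lcore variables aim to avoid.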
* [RFC v6 1/6] eal: add static per-lcore memory allocation facility
2024-05-06 8:27 ` [RFC v6 0/6] Lcore variables Mattias Rönnblom
@ 2024-05-06 8:27 ` Mattias Rönnblom
2024-09-10 7:03 ` [PATCH 0/6] Lcore variables Mattias Rönnblom
2024-05-06 8:27 ` [RFC v6 2/6] eal: add lcore variable test suite Mattias Rönnblom
` (5 subsequent siblings)
6 siblings, 1 reply; 323+ messages in thread
From: Mattias Rönnblom @ 2024-05-06 8:27 UTC (permalink / raw)
To: dev
Cc: hofors, Morten Brørup, Stephen Hemminger,
Konstantin Ananyev, Mattias Rönnblom
Introduce DPDK per-lcore id variables, or lcore variables for short.
An lcore variable has one value for every current and future lcore
id-equipped thread.
The primary <rte_lcore_var.h> use case is for statically allocating
small chunks of often-used data, which is related logically, but where
there are performance benefits to reap from having updates being local
to an lcore.
Lcore variables are similar to thread-local storage (TLS, e.g., C11
_Thread_local), but decouple the values' lifetime from that of the
threads.
Lcore variables are also similar in functionality to the FreeBSD
kernel's DPCPU_*() family of macros and the associated
build-time machinery. DPCPU uses linker scripts, which effectively
prevents the reuse of its, otherwise seemingly viable, approach.
The currently-prevailing way to solve the same problem as lcore
variables is to keep a module's per-lcore data as RTE_MAX_LCORE-sized
array of cache-aligned, RTE_CACHE_GUARDed structs. The benefit of
lcore variables over this approach is that data related to the same
lcore now is close (spatially, in memory), rather than data used by
the same module, which in turn avoids excessive use of padding,
polluting caches with unused data.
RFC v6:
* Include <stdlib.h> to get aligned_alloc().
* Tweak documentation (grammar).
* Provide API-level guarantees that lcore variable values take on an
initial value of zero.
* Fix misplaced __rte_cache_aligned in the API doc example.
RFC v5:
* In Doxygen, consistently use @<cmd> (and not \<cmd>).
* The RTE_LCORE_VAR_GET() and SET() convenience access macros
covered an uncommon use case, where the lcore value is of a
primitive type, rather than a struct, and are thus eliminated
from the API. (Morten Brørup)
* In the wake of the GET()/SET() removal, rename RTE_LCORE_VAR_PTR()
to RTE_LCORE_VAR_VALUE().
* The underscores are removed from __rte_lcore_var_lcore_ptr() to
signal that this function is a part of the public API.
* Macro arguments are documented.
RFC v4:
* Replace large static array with libc heap-allocated memory. One
implication of this change is there no longer exists a fixed upper
bound for the total amount of memory used by lcore variables.
RTE_MAX_LCORE_VAR has changed meaning, and now represents the
maximum size of any individual lcore variable value.
* Fix issues in example. (Morten Brørup)
* Improve access macro type checking. (Morten Brørup)
* Refer to the lcore variable handle as "handle" and not "name" in
various macros.
* Document lack of thread safety in rte_lcore_var_alloc().
* Provide API-level assurance the lcore variable handle is
always non-NULL, to allow applications to use NULL to mean
"not yet allocated".
* Note zero-sized allocations are not allowed.
* Give API-level guarantee the lcore variable values are zeroed.
RFC v3:
* Replace use of GCC-specific alignof(<expression>) with alignof(<type>).
* Update example to reflect FOREACH macro name change (in RFC v2).
RFC v2:
* Use alignof to derive alignment requirements. (Morten Brørup)
* Change name of FOREACH to make it distinct from <rte_lcore.h>'s
*per-EAL-thread* RTE_LCORE_FOREACH(). (Morten Brørup)
* Allow user-specified alignment, but limit max to cache line size.
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
---
config/rte_config.h | 1 +
doc/api/doxy-api-index.md | 1 +
lib/eal/common/eal_common_lcore_var.c | 69 +++++
lib/eal/common/meson.build | 1 +
lib/eal/include/meson.build | 1 +
lib/eal/include/rte_lcore_var.h | 384 ++++++++++++++++++++++++++
lib/eal/version.map | 3 +
7 files changed, 460 insertions(+)
create mode 100644 lib/eal/common/eal_common_lcore_var.c
create mode 100644 lib/eal/include/rte_lcore_var.h
diff --git a/config/rte_config.h b/config/rte_config.h
index dd7bb0d35b..311692e498 100644
--- a/config/rte_config.h
+++ b/config/rte_config.h
@@ -41,6 +41,7 @@
/* EAL defines */
#define RTE_CACHE_GUARD_LINES 1
#define RTE_MAX_HEAPS 32
+#define RTE_MAX_LCORE_VAR 1048576
#define RTE_MAX_MEMSEG_LISTS 128
#define RTE_MAX_MEMSEG_PER_LIST 8192
#define RTE_MAX_MEM_MB_PER_LIST 32768
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index 8c1eb8fafa..a3b8391570 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -99,6 +99,7 @@ The public API headers are grouped by topics:
[interrupts](@ref rte_interrupts.h),
[launch](@ref rte_launch.h),
[lcore](@ref rte_lcore.h),
+ [lcore-variable](@ref rte_lcore_var.h),
[per-lcore](@ref rte_per_lcore.h),
[service cores](@ref rte_service.h),
[keepalive](@ref rte_keepalive.h),
diff --git a/lib/eal/common/eal_common_lcore_var.c b/lib/eal/common/eal_common_lcore_var.c
new file mode 100644
index 0000000000..74ad8272ec
--- /dev/null
+++ b/lib/eal/common/eal_common_lcore_var.c
@@ -0,0 +1,69 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Ericsson AB
+ */
+
+#include <inttypes.h>
+#include <stdlib.h>
+
+#include <rte_common.h>
+#include <rte_debug.h>
+#include <rte_log.h>
+
+#include <rte_lcore_var.h>
+
+#include "eal_private.h"
+
+#define LCORE_BUFFER_SIZE (RTE_MAX_LCORE_VAR * RTE_MAX_LCORE)
+
+static void *lcore_buffer;
+static size_t offset = RTE_MAX_LCORE_VAR;
+
+static void *
+lcore_var_alloc(size_t size, size_t align)
+{
+ void *handle;
+ void *value;
+
+ offset = RTE_ALIGN_CEIL(offset, align);
+
+ if (offset + size > RTE_MAX_LCORE_VAR) {
+ lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE,
+ LCORE_BUFFER_SIZE);
+ RTE_VERIFY(lcore_buffer != NULL);
+
+ offset = 0;
+ }
+
+ handle = RTE_PTR_ADD(lcore_buffer, offset);
+
+ offset += size;
+
+ RTE_LCORE_VAR_FOREACH_VALUE(value, handle)
+ memset(value, 0, size);
+
+ EAL_LOG(DEBUG, "Allocated %"PRIuPTR" bytes of per-lcore data with a "
+ "%"PRIuPTR"-byte alignment", size, align);
+
+ return handle;
+}
+
+void *
+rte_lcore_var_alloc(size_t size, size_t align)
+{
+ /* Having the per-lcore buffer size aligned on cache lines,
+ * as well as having the base pointer aligned on cache line
+ * size, assures that aligned offsets also translate to aligned
+ * pointers across all values.
+ */
+ RTE_BUILD_BUG_ON(RTE_MAX_LCORE_VAR % RTE_CACHE_LINE_SIZE != 0);
+ RTE_ASSERT(align <= RTE_CACHE_LINE_SIZE);
+ RTE_ASSERT(size <= RTE_MAX_LCORE_VAR);
+
+ /* '0' means asking for worst-case alignment requirements */
+ if (align == 0)
+ align = alignof(max_align_t);
+
+ RTE_ASSERT(rte_is_power_of_2(align));
+
+ return lcore_var_alloc(size, align);
+}
diff --git a/lib/eal/common/meson.build b/lib/eal/common/meson.build
index 22a626ba6f..d41403680b 100644
--- a/lib/eal/common/meson.build
+++ b/lib/eal/common/meson.build
@@ -18,6 +18,7 @@ sources += files(
'eal_common_interrupts.c',
'eal_common_launch.c',
'eal_common_lcore.c',
+ 'eal_common_lcore_var.c',
'eal_common_mcfg.c',
'eal_common_memalloc.c',
'eal_common_memory.c',
diff --git a/lib/eal/include/meson.build b/lib/eal/include/meson.build
index e94b056d46..9449253e23 100644
--- a/lib/eal/include/meson.build
+++ b/lib/eal/include/meson.build
@@ -27,6 +27,7 @@ headers += files(
'rte_keepalive.h',
'rte_launch.h',
'rte_lcore.h',
+ 'rte_lcore_var.h',
'rte_lock_annotations.h',
'rte_malloc.h',
'rte_mcslock.h',
diff --git a/lib/eal/include/rte_lcore_var.h b/lib/eal/include/rte_lcore_var.h
new file mode 100644
index 0000000000..cfbcac41dd
--- /dev/null
+++ b/lib/eal/include/rte_lcore_var.h
@@ -0,0 +1,384 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Ericsson AB
+ */
+
+#ifndef _RTE_LCORE_VAR_H_
+#define _RTE_LCORE_VAR_H_
+
+/**
+ * @file
+ *
+ * RTE Per-lcore id variables
+ *
+ * This API provides a mechanism to create and access per-lcore id
+ * variables in a space- and cycle-efficient manner.
+ *
+ * A per-lcore id variable (or lcore variable for short) has one value
+ * for each EAL thread and registered non-EAL thread. There is one
+ * copy for each current and future lcore id-equipped thread, with the
+ * total number of copies amounting to @c RTE_MAX_LCORE. The value of
+ * an lcore variable for a particular lcore id is independent from
+ * other values (for other lcore ids) within the same lcore variable.
+ *
+ * In order to access the values of an lcore variable, a handle is
+ * used. The type of the handle is a pointer to the value's type
+ * (e.g., for a @c uint32_t lcore variable, the handle is a
+ * <code>uint32_t *</code>). The handle type is used to inform the
+ * access macros of the type of the values. A handle may be passed
+ * between modules and threads just like any pointer, but its value
+ * must be treated as an opaque identifier. An allocated handle
+ * never has the value NULL.
+ *
+ * @b Creation
+ *
+ * An lcore variable is created in two steps:
+ * 1. Define an lcore variable handle by using @ref RTE_LCORE_VAR_HANDLE.
+ * 2. Allocate lcore variable storage and initialize the handle with
+ * a unique identifier by @ref RTE_LCORE_VAR_ALLOC or
+ * @ref RTE_LCORE_VAR_INIT. Allocation generally occurs at the time
+ * of module initialization, but may be done at any time.
+ *
+ * An lcore variable is not tied to the owning thread's lifetime. It's
+ * available for use by any thread immediately after having been
+ * allocated, and continues to be available throughout the lifetime of
+ * the EAL.
+ *
+ * Lcore variables cannot and need not be freed.
+ *
+ * @b Access
+ *
+ * The value of any lcore variable for any lcore id may be accessed
+ * from any thread (including unregistered threads), but it should
+ * only be *frequently* read from or written to by the owner.
+ *
+ * Values of the same lcore variable but owned by different lcore
+ * ids may be frequently read or written by the owners without risking
+ * false sharing.
+ *
+ * An appropriate synchronization mechanism (e.g., atomic loads and
+ * stores) should be employed to assure there are no data races between
+ * the owning thread and any non-owner threads accessing the same
+ * lcore variable instance.
+ *
+ * The value of the lcore variable for a particular lcore id is
+ * accessed using @ref RTE_LCORE_VAR_LCORE_VALUE.
+ *
+ * A common pattern is for an EAL thread or a registered non-EAL
+ * thread to access its own lcore variable value. For this purpose, a
+ * short-hand exists in the form of @ref RTE_LCORE_VAR_VALUE.
+ *
+ * Although the handle (as defined by @ref RTE_LCORE_VAR_HANDLE) is a
+ * pointer with the same type as the value, it may not be directly
+ * dereferenced and must be treated as an opaque identifier.
+ *
+ * Lcore variable handles and value pointers may be freely passed
+ * between different threads.
+ *
+ * @b Storage
+ *
+ * An lcore variable's values may be of a primitive type like @c int,
+ * but would more typically be a @c struct.
+ *
+ * The lcore variable handle introduces a per-variable (not
+ * per-value/per-lcore id) overhead of @c sizeof(void *) bytes, so
+ * there are some memory footprint gains to be made by organizing all
+ * per-lcore id data for a particular module as one lcore variable
+ * (e.g., as a struct).
+ *
+ * An application may choose to define an lcore variable handle, which
+ * it then goes on to never allocate.
+ *
+ * The size of an lcore variable's value must be less than the DPDK
+ * build-time constant @c RTE_MAX_LCORE_VAR.
+ *
+ * Lcore variables are stored in a series of lcore buffers, which
+ * are allocated from the libc heap. Heap allocation failures are
+ * treated as fatal.
+ *
+ * Lcore variables should generally *not* be @ref __rte_cache_aligned
+ * and need *not* include a @ref RTE_CACHE_GUARD field, since these
+ * constructs are designed to avoid false sharing. In the
+ * case of an lcore variable instance, the thread most recently
+ * accessing nearby data structures should almost always be the lcore
+ * variable's owner. Adding padding will increase the effective memory
+ * working set size, potentially reducing performance.
+ *
+ * Lcore variable values take on an initial value of zero.
+ *
+ * @b Example
+ *
+ * Below is an example of the use of an lcore variable:
+ *
+ * @code{.c}
+ * struct foo_lcore_state {
+ * int a;
+ * long b;
+ * };
+ *
+ * static RTE_LCORE_VAR_HANDLE(struct foo_lcore_state, lcore_states);
+ *
+ * long foo_get_a_plus_b(void)
+ * {
+ * struct foo_lcore_state *state = RTE_LCORE_VAR_VALUE(lcore_states);
+ *
+ * return state->a + state->b;
+ * }
+ *
+ * RTE_INIT(rte_foo_init)
+ * {
+ * RTE_LCORE_VAR_ALLOC(lcore_states);
+ *
+ * struct foo_lcore_state *state;
+ * RTE_LCORE_VAR_FOREACH_VALUE(state, lcore_states) {
+ * (initialize 'state')
+ * }
+ *
+ * (other initialization)
+ * }
+ * @endcode
+ *
+ *
+ * @b Alternatives
+ *
+ * Lcore variables are designed to replace a pattern exemplified below:
+ * @code{.c}
+ * struct __rte_cache_aligned foo_lcore_state {
+ * int a;
+ * long b;
+ * RTE_CACHE_GUARD;
+ * };
+ *
+ * static struct foo_lcore_state lcore_states[RTE_MAX_LCORE];
+ * @endcode
+ *
+ * This scheme is simple and effective, but has one drawback: the data
+ * is organized so that objects related to all lcores for a particular
+ * module are kept close in memory. At a bare minimum, this forces the
+ * use of cache-line alignment to avoid false sharing. With CPU
+ * hardware prefetching and memory loads resulting from speculative
+ * execution (functions which seemingly are getting more eager faster
+ * than they are getting more intelligent), one or more "guard" cache
+ * lines may be required to separate one lcore's data from another's.
+ *
+ * Lcore variables have the upside of working with, not against, the
+ * CPU's assumptions and for example next-line prefetchers may well
+ * work the way their designers intended (i.e., to the benefit, not
+ * detriment, of system performance).
+ *
+ * Another alternative to @ref rte_lcore_var.h is the @ref
+ * rte_per_lcore.h API, which makes use of thread-local storage (TLS,
+ * e.g., GCC __thread or C11 _Thread_local). The main differences
+ * between using the various forms of TLS (e.g., @ref
+ * RTE_DEFINE_PER_LCORE or _Thread_local) and the use of lcore
+ * variables are:
+ *
+ * * The existence and non-existence of a thread-local variable
+ * instance follow that of a particular thread. The data cannot be
+ * accessed before the thread has been created, nor after it has
+ * exited. As a result, thread-local variables must be initialized in
+ * a "lazy" manner (e.g., at the point of thread creation). Lcore
+ * variables may be accessed immediately after having been
+ * allocated (which may be prior to any thread beyond the main
+ * thread running).
+ * * A thread-local variable is duplicated across all threads in the
+ * process, including unregistered non-EAL threads (i.e.,
+ * "regular" threads). For DPDK applications heavily relying on
+ * multi-threading (in conjunction with DPDK's "one thread per core"
+ * pattern), either by having many concurrent threads or
+ * creating/destroying threads at a high rate, an excessive use of
+ * thread-local variables may cause inefficiencies (e.g.,
+ * increased thread creation overhead due to thread-local storage
+ * initialization or increased total RAM footprint usage). Lcore
+ * variables *only* exist for threads with an lcore id.
+ * * Whether data in thread-local storage may be shared between threads
+ * (i.e., whether a pointer to a thread-local variable can be passed to
+ * and successfully dereferenced by a non-owning thread) depends on
+ * the details of the TLS implementation. With GCC __thread and
+ * GCC _Thread_local, such data sharing is supported. In the C11
+ * standard, the result of accessing another thread's
+ * _Thread_local object is implementation-defined. Lcore variable
+ * instances may be accessed reliably by any thread.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stddef.h>
+#include <stdalign.h>
+
+#include <rte_common.h>
+#include <rte_config.h>
+#include <rte_lcore.h>
+
+/**
+ * Given the lcore variable type, produces the type of the lcore
+ * variable handle.
+ */
+#define RTE_LCORE_VAR_HANDLE_TYPE(type) \
+ type *
+
+/**
+ * Define an lcore variable handle.
+ *
+ * This macro defines a variable which is used as a handle to access
+ * the various per-lcore id instances of a per-lcore id variable.
+ *
+ * The aim with this macro is to make clear at the point of
+ * declaration that this is an lcore handle, rather than a regular
+ * pointer.
+ *
+ * Add @b static as a prefix in case the lcore variable is only to be
+ * accessed from a particular translation unit.
+ */
+#define RTE_LCORE_VAR_HANDLE(type, name) \
+ RTE_LCORE_VAR_HANDLE_TYPE(type) name
+
+/**
+ * Allocate space for an lcore variable, and initialize its handle.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, align) \
+ handle = rte_lcore_var_alloc(size, align)
+
+/**
+ * Allocate space for an lcore variable, and initialize its handle,
+ * with values aligned for any type of object.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_ALLOC_SIZE(handle, size) \
+ RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, 0)
+
+/**
+ * Allocate space for an lcore variable of the size and alignment requirements
+ * suggested by the handle pointer type, and initialize its handle.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_ALLOC(handle) \
+ RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, sizeof(*(handle)), \
+ alignof(typeof(*(handle))))
+
+/**
+ * Allocate an explicitly-sized, explicitly-aligned lcore variable by
+ * means of a @ref RTE_INIT constructor.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, align) \
+ RTE_INIT(rte_lcore_var_init_ ## name) \
+ { \
+ RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(name, size, align); \
+ }
+
+/**
+ * Allocate an explicitly-sized lcore variable by means of a @ref
+ * RTE_INIT constructor.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_INIT_SIZE(name, size) \
+ RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, 0)
+
+/**
+ * Allocate an lcore variable by means of a @ref RTE_INIT constructor.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_INIT(name) \
+ RTE_INIT(rte_lcore_var_init_ ## name) \
+ { \
+ RTE_LCORE_VAR_ALLOC(name); \
+ }
+
+/**
+ * Get void pointer to lcore variable instance with the specified
+ * lcore id.
+ *
+ * @param lcore_id
+ * The lcore id specifying which of the @c RTE_MAX_LCORE value
+ * instances should be accessed. The lcore id need not be valid
+ * (e.g., may be @ref LCORE_ID_ANY), but in such a case, the pointer
+ * is also not valid (and thus should not be dereferenced).
+ * @param handle
+ * The lcore variable handle.
+ */
+static inline void *
+rte_lcore_var_lcore_ptr(unsigned int lcore_id, void *handle)
+{
+ return RTE_PTR_ADD(handle, lcore_id * RTE_MAX_LCORE_VAR);
+}
+
+/**
+ * Get pointer to lcore variable instance with the specified lcore id.
+ *
+ * @param lcore_id
+ * The lcore id specifying which of the @c RTE_MAX_LCORE value
+ * instances should be accessed. The lcore id need not be valid
+ * (e.g., may be @ref LCORE_ID_ANY), but in such a case, the pointer
+ * is also not valid (and thus should not be dereferenced).
+ * @param handle
+ * The lcore variable handle.
+ */
+#define RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle) \
+ ((typeof(handle))rte_lcore_var_lcore_ptr(lcore_id, handle))
+
+/**
+ * Get pointer to lcore variable instance of the current thread.
+ *
+ * May only be used by EAL threads and registered non-EAL threads.
+ */
+#define RTE_LCORE_VAR_VALUE(handle) \
+ RTE_LCORE_VAR_LCORE_VALUE(rte_lcore_id(), handle)
+
+/**
+ * Iterate over each lcore id's value for an lcore variable.
+ *
+ * @param value
+ * A pointer successively set to point to the lcore variable value
+ * corresponding to every lcore id (up to @c RTE_MAX_LCORE).
+ * @param handle
+ * The lcore variable handle.
+ */
+#define RTE_LCORE_VAR_FOREACH_VALUE(value, handle) \
+ for (unsigned int lcore_id = \
+ (((value) = RTE_LCORE_VAR_LCORE_VALUE(0, handle)), 0); \
+ lcore_id < RTE_MAX_LCORE; \
+ lcore_id++, (value) = RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle))
+
+/**
+ * Allocate space in the per-lcore id buffers for an lcore variable.
+ *
+ * The pointer returned is only an opaque identifier of the variable. To
+ * get an actual pointer to a particular instance of the variable use
+ * @ref RTE_LCORE_VAR_VALUE or @ref RTE_LCORE_VAR_LCORE_VALUE.
+ *
+ * The lcore variable values' memory is set to zero.
+ *
+ * The allocation is always successful, barring a fatal exhaustion of
+ * the per-lcore id buffer space.
+ *
+ * rte_lcore_var_alloc() is not multi-thread safe.
+ *
+ * @param size
+ * The size (in bytes) of the variable's per-lcore id value. Must be > 0.
+ * @param align
+ * If 0, the values will be suitably aligned for any kind of type
+ * (i.e., alignof(max_align_t)). Otherwise, the values will be aligned
+ * on a multiple of *align*, which must be a power of 2 and equal to
+ * or less than @c RTE_CACHE_LINE_SIZE.
+ * @return
+ * The id of the variable, stored in a void pointer value. The value
+ * is always non-NULL.
+ */
+__rte_experimental
+void *
+rte_lcore_var_alloc(size_t size, size_t align);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_LCORE_VAR_H_ */
diff --git a/lib/eal/version.map b/lib/eal/version.map
index 3df50c3fbb..7702642785 100644
--- a/lib/eal/version.map
+++ b/lib/eal/version.map
@@ -396,6 +396,9 @@ EXPERIMENTAL {
# added in 24.03
rte_vfio_get_device_info; # WINDOWS_NO_EXPORT
+
+ rte_lcore_var_alloc;
+ rte_lcore_var;
};
INTERNAL {
--
2.34.1
^ permalink raw reply [flat|nested] 323+ messages in thread
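The access macros in the patch above can be approximated outside of DPDK to see how the typed handle and the per-lcore stride interact. The sketch below is a hedged re-creation, not the header itself: the DEMO_* macros, the small DEMO_MAX_LCORE / DEMO_MAX_LCORE_VAR constants and the calloc()-backed buffer are assumptions for illustration, and __typeof__ (a GCC/Clang extension) stands in for the typeof used in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, scaled-down stand-ins for the DPDK build constants. */
#define DEMO_MAX_LCORE 4
#define DEMO_MAX_LCORE_VAR 64

/* Untyped access: handle plus a fixed stride per lcore id. */
static inline void *
demo_lcore_ptr(unsigned int lcore_id, void *handle)
{
	return (char *)handle + (size_t)lcore_id * DEMO_MAX_LCORE_VAR;
}

/* Typed access: the handle's pointer type is reused for the result. */
#define DEMO_LCORE_VALUE(lcore_id, handle) \
	((__typeof__(handle))demo_lcore_ptr(lcore_id, handle))

/* Visit the value of every lcore id in turn. */
#define DEMO_FOREACH_VALUE(value, handle) \
	for (unsigned int demo_id = \
		     (((value) = DEMO_LCORE_VALUE(0, handle)), 0); \
	     demo_id < DEMO_MAX_LCORE; \
	     demo_id++, (value) = DEMO_LCORE_VALUE(demo_id, handle))

struct foo_lcore_state {
	int a;
	long b;
};

/* The handle; it is only ever used through the access macros. */
static struct foo_lcore_state *foo_states;

/* Stand-in for allocation: one zeroed slice per lcore id. */
static void
foo_init(void)
{
	foo_states = calloc(DEMO_MAX_LCORE, DEMO_MAX_LCORE_VAR);
	if (foo_states == NULL)
		abort();
}

/* Aggregate over all lcore ids, e.g., from a control thread. */
static long
foo_sum_all(void)
{
	struct foo_lcore_state *state;
	long sum = 0;

	DEMO_FOREACH_VALUE(state, foo_states)
		sum += state->a + state->b;

	return sum;
}
```

Note that the handle is never dereferenced directly: indexing it as foo_states[1] would step by sizeof(struct foo_lcore_state) rather than by the per-lcore stride, which is why the documentation asks for it to be treated as an opaque identifier.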
* [PATCH 0/6] Lcore variables
2024-05-06 8:27 ` [RFC v6 1/6] eal: add static per-lcore memory allocation facility Mattias Rönnblom
@ 2024-09-10 7:03 ` Mattias Rönnblom
2024-09-10 7:03 ` [PATCH 1/6] eal: add static per-lcore memory allocation facility Mattias Rönnblom
` (5 more replies)
0 siblings, 6 replies; 323+ messages in thread
From: Mattias Rönnblom @ 2024-09-10 7:03 UTC (permalink / raw)
To: dev
Cc: hofors, Morten Brørup, Stephen Hemminger,
Konstantin Ananyev, David Marchand, Mattias Rönnblom
This patch set introduces a new API <rte_lcore_var.h> for static
per-lcore id data allocation.
Please refer to the <rte_lcore_var.h> API documentation for both a
rationale for this new API, and a comparison to the alternatives
available.
The adoption of this API would affect many different DPDK modules, but
the author updated only a few, mostly to serve as examples in this
RFC, and to iron out some, but surely not all, wrinkles in the API.
The question on how to best allocate static per-lcore memory has been
up several times on the dev mailing list, for example in the thread on
"random: use per lcore state" RFC by Stephen Hemminger.
Lcore variables are surely not the answer to all your per-lcore-data
needs, since they only allow for more-or-less static allocation. In the
author's opinion, they do however provide a reasonably simple,
clean, and seemingly very performant solution to a real problem.
Mattias Rönnblom (6):
eal: add static per-lcore memory allocation facility
eal: add lcore variable test suite
random: keep PRNG state in lcore variable
power: keep per-lcore state in lcore variable
service: keep per-lcore state in lcore variable
eal: keep per-lcore power intrinsics state in lcore variable
MAINTAINERS | 6 +
app/test/meson.build | 1 +
app/test/test_lcore_var.c | 432 +++++++++++++++++++++++++
config/rte_config.h | 1 +
doc/api/doxy-api-index.md | 1 +
doc/guides/rel_notes/release_24_11.rst | 14 +
lib/eal/common/eal_common_lcore_var.c | 69 ++++
lib/eal/common/meson.build | 1 +
lib/eal/common/rte_random.c | 28 +-
lib/eal/common/rte_service.c | 115 ++++---
lib/eal/include/meson.build | 1 +
lib/eal/include/rte_lcore_var.h | 384 ++++++++++++++++++++++
lib/eal/version.map | 3 +
lib/eal/x86/rte_power_intrinsics.c | 17 +-
lib/power/rte_power_pmd_mgmt.c | 34 +-
15 files changed, 1020 insertions(+), 87 deletions(-)
create mode 100644 app/test/test_lcore_var.c
create mode 100644 lib/eal/common/eal_common_lcore_var.c
create mode 100644 lib/eal/include/rte_lcore_var.h
--
2.34.1
^ permalink raw reply [flat|nested] 323+ messages in thread
* [PATCH 1/6] eal: add static per-lcore memory allocation facility
2024-09-10 7:03 ` [PATCH 0/6] Lcore variables Mattias Rönnblom
@ 2024-09-10 7:03 ` Mattias Rönnblom
2024-09-10 9:32 ` Morten Brørup
` (2 more replies)
2024-09-10 7:03 ` [PATCH 2/6] eal: add lcore variable test suite Mattias Rönnblom
` (4 subsequent siblings)
5 siblings, 3 replies; 323+ messages in thread
From: Mattias Rönnblom @ 2024-09-10 7:03 UTC (permalink / raw)
To: dev
Cc: hofors, Morten Brørup, Stephen Hemminger,
Konstantin Ananyev, David Marchand, Mattias Rönnblom
Introduce DPDK per-lcore id variables, or lcore variables for short.
An lcore variable has one value for every current and future lcore
id-equipped thread.
The primary <rte_lcore_var.h> use case is for statically allocating
small, frequently-accessed data structures, for which one instance
should exist for each lcore.
Lcore variables are similar to thread-local storage (TLS, e.g., C11
_Thread_local), but decouple the values' lifetime from that of the
threads.
Lcore variables are also similar in terms of functionality provided by
FreeBSD kernel's DPCPU_*() family of macros and the associated
build-time machinery. DPCPU uses linker scripts, which effectively
prevents the reuse of its, otherwise seemingly viable, approach.
The currently-prevailing way to solve the same problem as lcore
variables is to keep a module's per-lcore data as RTE_MAX_LCORE-sized
array of cache-aligned, RTE_CACHE_GUARDed structs. The benefit of
lcore variables over this approach is that data related to the same
lcore now is close (spatially, in memory), rather than data used by
the same module, which in turn avoids excessive use of padding,
polluting caches with unused data.
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
--
PATCH:
* Update MAINTAINERS and release notes.
* Stop covering included files in extern "C" {}.
RFC v6:
* Include <stdlib.h> to get aligned_alloc().
* Tweak documentation (grammar).
* Provide API-level guarantees that lcore variable values take on an
initial value of zero.
* Fix misplaced __rte_cache_aligned in the API doc example.
RFC v5:
* In Doxygen, consistently use @<cmd> (and not \<cmd>).
* The RTE_LCORE_VAR_GET() and SET() convenience access macros
covered an uncommon use case, where the lcore value is of a
primitive type, rather than a struct, and are thus eliminated
from the API. (Morten Brørup)
* In the wake of the GET()/SET() removal, rename RTE_LCORE_VAR_PTR()
to RTE_LCORE_VAR_VALUE().
* The underscores are removed from __rte_lcore_var_lcore_ptr() to
signal that this function is a part of the public API.
* Macro arguments are documented.
RFC v4:
* Replace large static array with libc heap-allocated memory. One
implication of this change is that there no longer exists a fixed upper
bound for the total amount of memory used by lcore variables.
RTE_MAX_LCORE_VAR has changed meaning, and now represents the
maximum size of any individual lcore variable value.
* Fix issues in example. (Morten Brørup)
* Improve access macro type checking. (Morten Brørup)
* Refer to the lcore variable handle as "handle" and not "name" in
various macros.
* Document lack of thread safety in rte_lcore_var_alloc().
* Provide API-level assurance the lcore variable handle is
always non-NULL, to allow applications to use NULL to mean
"not yet allocated".
* Note zero-sized allocations are not allowed.
* Give API-level guarantee the lcore variable values are zeroed.
RFC v3:
* Replace use of GCC-specific alignof(<expression>) with alignof(<type>).
* Update example to reflect FOREACH macro name change (in RFC v2).
RFC v2:
* Use alignof to derive alignment requirements. (Morten Brørup)
* Change name of FOREACH to make it distinct from <rte_lcore.h>'s
*per-EAL-thread* RTE_LCORE_FOREACH(). (Morten Brørup)
* Allow user-specified alignment, but limit max to cache line size.
---
MAINTAINERS | 6 +
config/rte_config.h | 1 +
doc/api/doxy-api-index.md | 1 +
doc/guides/rel_notes/release_24_11.rst | 14 +
lib/eal/common/eal_common_lcore_var.c | 69 +++++
lib/eal/common/meson.build | 1 +
lib/eal/include/meson.build | 1 +
lib/eal/include/rte_lcore_var.h | 384 +++++++++++++++++++++++++
lib/eal/version.map | 3 +
9 files changed, 480 insertions(+)
create mode 100644 lib/eal/common/eal_common_lcore_var.c
create mode 100644 lib/eal/include/rte_lcore_var.h
diff --git a/MAINTAINERS b/MAINTAINERS
index c5a703b5c0..362d9a3f28 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -282,6 +282,12 @@ F: lib/eal/include/rte_random.h
F: lib/eal/common/rte_random.c
F: app/test/test_rand_perf.c
+Lcore Variables
+M: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
+F: lib/eal/include/rte_lcore_var.h
+F: lib/eal/common/eal_common_lcore_var.c
+F: app/test/test_lcore_var.c
+
ARM v7
M: Wathsala Vithanage <wathsala.vithanage@arm.com>
F: config/arm/
diff --git a/config/rte_config.h b/config/rte_config.h
index dd7bb0d35b..311692e498 100644
--- a/config/rte_config.h
+++ b/config/rte_config.h
@@ -41,6 +41,7 @@
/* EAL defines */
#define RTE_CACHE_GUARD_LINES 1
#define RTE_MAX_HEAPS 32
+#define RTE_MAX_LCORE_VAR 1048576
#define RTE_MAX_MEMSEG_LISTS 128
#define RTE_MAX_MEMSEG_PER_LIST 8192
#define RTE_MAX_MEM_MB_PER_LIST 32768
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index f9f0300126..07d7cbc66c 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -99,6 +99,7 @@ The public API headers are grouped by topics:
[interrupts](@ref rte_interrupts.h),
[launch](@ref rte_launch.h),
[lcore](@ref rte_lcore.h),
+ [lcore-varible](@ref rte_lcore_var.h),
[per-lcore](@ref rte_per_lcore.h),
[service cores](@ref rte_service.h),
[keepalive](@ref rte_keepalive.h),
diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
index 0ff70d9057..adb8eb404d 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -55,6 +55,20 @@ New Features
Also, make sure to start the actual text at the margin.
=======================================================
+* **Added EAL per-lcore static memory allocation facility.**
+
+ Added EAL API <rte_lcore_var.h> for statically allocating small,
+ frequently-accessed data structures, for which one instance should
+ exist for each lcore.
+
+ With lcore variables, data is organized spatially on a per-lcore
+ basis, rather than per library or PMD, avoiding the need for cache
+ aligning (or RTE_CACHE_GUARDing) data structures, which in turn
+ reduces CPU cache internal fragmentation, improving performance.
+
+ Lcore variables are similar to thread-local storage (TLS, e.g.,
+ C11 _Thread_local), but decoupling the values' life time from that
+ of the threads.
Removed Items
-------------
diff --git a/lib/eal/common/eal_common_lcore_var.c b/lib/eal/common/eal_common_lcore_var.c
new file mode 100644
index 0000000000..74ad8272ec
--- /dev/null
+++ b/lib/eal/common/eal_common_lcore_var.c
@@ -0,0 +1,69 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Ericsson AB
+ */
+
+#include <inttypes.h>
+#include <stdlib.h>
+
+#include <rte_common.h>
+#include <rte_debug.h>
+#include <rte_log.h>
+
+#include <rte_lcore_var.h>
+
+#include "eal_private.h"
+
+#define LCORE_BUFFER_SIZE (RTE_MAX_LCORE_VAR * RTE_MAX_LCORE)
+
+static void *lcore_buffer;
+static size_t offset = RTE_MAX_LCORE_VAR;
+
+static void *
+lcore_var_alloc(size_t size, size_t align)
+{
+ void *handle;
+ void *value;
+
+ offset = RTE_ALIGN_CEIL(offset, align);
+
+ if (offset + size > RTE_MAX_LCORE_VAR) {
+ lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE,
+ LCORE_BUFFER_SIZE);
+ RTE_VERIFY(lcore_buffer != NULL);
+
+ offset = 0;
+ }
+
+ handle = RTE_PTR_ADD(lcore_buffer, offset);
+
+ offset += size;
+
+ RTE_LCORE_VAR_FOREACH_VALUE(value, handle)
+ memset(value, 0, size);
+
+ EAL_LOG(DEBUG, "Allocated %"PRIuPTR" bytes of per-lcore data with a "
+ "%"PRIuPTR"-byte alignment", size, align);
+
+ return handle;
+}
+
+void *
+rte_lcore_var_alloc(size_t size, size_t align)
+{
+ /* Having the per-lcore buffer size aligned on cache lines, as
+ * well as having the base pointer aligned on cache line size,
+ * assures that aligned offsets also translate to aligned
+ * pointers across all values.
+ */
+ RTE_BUILD_BUG_ON(RTE_MAX_LCORE_VAR % RTE_CACHE_LINE_SIZE != 0);
+ RTE_ASSERT(align <= RTE_CACHE_LINE_SIZE);
+ RTE_ASSERT(size <= RTE_MAX_LCORE_VAR);
+
+ /* '0' means asking for worst-case alignment requirements */
+ if (align == 0)
+ align = alignof(max_align_t);
+
+ RTE_ASSERT(rte_is_power_of_2(align));
+
+ return lcore_var_alloc(size, align);
+}
diff --git a/lib/eal/common/meson.build b/lib/eal/common/meson.build
index 22a626ba6f..d41403680b 100644
--- a/lib/eal/common/meson.build
+++ b/lib/eal/common/meson.build
@@ -18,6 +18,7 @@ sources += files(
'eal_common_interrupts.c',
'eal_common_launch.c',
'eal_common_lcore.c',
+ 'eal_common_lcore_var.c',
'eal_common_mcfg.c',
'eal_common_memalloc.c',
'eal_common_memory.c',
diff --git a/lib/eal/include/meson.build b/lib/eal/include/meson.build
index e94b056d46..9449253e23 100644
--- a/lib/eal/include/meson.build
+++ b/lib/eal/include/meson.build
@@ -27,6 +27,7 @@ headers += files(
'rte_keepalive.h',
'rte_launch.h',
'rte_lcore.h',
+ 'rte_lcore_var.h',
'rte_lock_annotations.h',
'rte_malloc.h',
'rte_mcslock.h',
diff --git a/lib/eal/include/rte_lcore_var.h b/lib/eal/include/rte_lcore_var.h
new file mode 100644
index 0000000000..7d3178c424
--- /dev/null
+++ b/lib/eal/include/rte_lcore_var.h
@@ -0,0 +1,384 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Ericsson AB
+ */
+
+#ifndef _RTE_LCORE_VAR_H_
+#define _RTE_LCORE_VAR_H_
+
+/**
+ * @file
+ *
+ * RTE Per-lcore id variables
+ *
+ * This API provides a mechanism to create and access per-lcore id
+ * variables in a space- and cycle-efficient manner.
+ *
+ * A per-lcore id variable (or lcore variable for short) has one value
+ * for each EAL thread and registered non-EAL thread. There is one
+ * copy for each current and future lcore id-equipped thread, with the
+ * total number of copies amounting to @c RTE_MAX_LCORE. The value of
+ * an lcore variable for a particular lcore id is independent from
+ * other values (for other lcore ids) within the same lcore variable.
+ *
+ * In order to access the values of an lcore variable, a handle is
+ * used. The type of the handle is a pointer to the value's type
+ * (e.g., for @c uint32_t lcore variable, the handle is a
+ * <code>uint32_t *</code>. The handler type is used to inform the
+ * access macros the type of the values. A handle may be passed
+ * between modules and threads just like any pointer, but its value
+ * must be treated as a an opaque identifier. An allocated handle
+ * never has the value NULL.
+ *
+ * @b Creation
+ *
+ * An lcore variable is created in two steps:
+ * 1. Define a lcore variable handle by using @ref RTE_LCORE_VAR_HANDLE.
+ * 2. Allocate lcore variable storage and initialize the handle with
+ * a unique identifier by @ref RTE_LCORE_VAR_ALLOC or
+ * @ref RTE_LCORE_VAR_INIT. Allocation generally occurs the time of
+ * module initialization, but may be done at any time.
+ *
+ * An lcore variable is not tied to the owning thread's lifetime. It's
+ * available for use by any thread immediately after having been
+ * allocated, and continues to be available throughout the lifetime of
+ * the EAL.
+ *
+ * Lcore variables cannot and need not be freed.
+ *
+ * @b Access
+ *
+ * The value of any lcore variable for any lcore id may be accessed
+ * from any thread (including unregistered threads), but it should
+ * only be *frequently* read from or written to by the owner.
+ *
+ * Values of the same lcore variable but owned by to different lcore
+ * ids may be frequently read or written by the owners without risking
+ * false sharing.
+ *
+ * An appropriate synchronization mechanism (e.g., atomic loads and
+ * stores) should employed to assure there are no data races between
+ * the owning thread and any non-owner threads accessing the same
+ * lcore variable instance.
+ *
+ * The value of the lcore variable for a particular lcore id is
+ * accessed using @ref RTE_LCORE_VAR_LCORE_VALUE.
+ *
+ * A common pattern is for an EAL thread or a registered non-EAL
+ * thread to access its own lcore variable value. For this purpose, a
+ * short-hand exists in the form of @ref RTE_LCORE_VAR_VALUE.
+ *
+ * Although the handle (as defined by @ref RTE_LCORE_VAR_HANDLE) is a
+ * pointer with the same type as the value, it may not be directly
+ * dereferenced and must be treated as an opaque identifier.
+ *
+ * Lcore variable handles and value pointers may be freely passed
+ * between different threads.
+ *
+ * @b Storage
+ *
+ * An lcore variable's values may be of a primitive type like @c int,
+ * but would more typically be a @c struct.
+ *
+ * The lcore variable handle introduces a per-variable (not
+ * per-value/per-lcore id) overhead of @c sizeof(void *) bytes, so
+ * there are some memory footprint gains to be made by organizing all
+ * per-lcore id data for a particular module as one lcore variable
+ * (e.g., as a struct).
+ *
+ * An application may choose to define an lcore variable handle, which
+ * it then goes on to never allocate.
+ *
+ * The size of a lcore variable's value must be less than the DPDK
+ * build-time constant @c RTE_MAX_LCORE_VAR.
+ *
+ * Lcore variables are stored in a series of lcore buffers, which
+ * are allocated from the libc heap. Heap allocation failures are
+ * treated as fatal.
+ *
+ * Lcore variables should generally *not* be @ref __rte_cache_aligned
+ * and need *not* include a @ref RTE_CACHE_GUARD field, since these
+ * constructs are designed to avoid false sharing. In the
+ * case of an lcore variable instance, the thread most recently
+ * accessing nearby data structures should almost always be the lcore
+ * variable's owner. Adding padding will increase the effective memory
+ * working set size, potentially reducing performance.
+ *
+ * Lcore variable values take on an initial value of zero.
+ *
+ * @b Example
+ *
+ * Below is an example of the use of an lcore variable:
+ *
+ * @code{.c}
+ * struct foo_lcore_state {
+ * int a;
+ * long b;
+ * };
+ *
+ * static RTE_LCORE_VAR_HANDLE(struct foo_lcore_state, lcore_states);
+ *
+ * long foo_get_a_plus_b(void)
+ * {
+ * struct foo_lcore_state *state = RTE_LCORE_VAR_VALUE(lcore_states);
+ *
+ * return state->a + state->b;
+ * }
+ *
+ * RTE_INIT(rte_foo_init)
+ * {
+ * RTE_LCORE_VAR_ALLOC(lcore_states);
+ *
+ * struct foo_lcore_state *state;
+ * RTE_LCORE_VAR_FOREACH_VALUE(state, lcore_states) {
+ * (initialize 'state')
+ * }
+ *
+ * (other initialization)
+ * }
+ * @endcode
+ *
+ *
+ * @b Alternatives
+ *
+ * Lcore variables are designed to replace a pattern exemplified below:
+ * @code{.c}
+ * struct __rte_cache_aligned foo_lcore_state {
+ * int a;
+ * long b;
+ * RTE_CACHE_GUARD;
+ * };
+ *
+ * static struct foo_lcore_state lcore_states[RTE_MAX_LCORE];
+ * @endcode
+ *
+ * This scheme is simple and effective, but has one drawback: the data
+ * is organized so that objects related to all lcores for a particular
+ * module are kept close in memory. At a bare minimum, this forces the
+ * use of cache-line alignment to avoid false sharing. With CPU
+ * hardware prefetching and memory loads resulting from speculative
+ * execution (functions which seemingly are getting more eager faster
+ * than they are getting more intelligent), one or more "guard" cache
+ * lines may be required to separate one lcore's data from another's.
+ *
+ * Lcore variables have the upside of working with, not against, the
+ * CPU's assumptions and for example next-line prefetchers may well
+ * work the way their designers intended (i.e., to the benefit, not
+ * detriment, of system performance).
+ *
+ * Another alternative to @ref rte_lcore_var.h is the @ref
+ * rte_per_lcore.h API, which makes use of thread-local storage (TLS,
+ * e.g., GCC __thread or C11 _Thread_local). The main differences
+ * between using the various forms of TLS (e.g., @ref
+ * RTE_DEFINE_PER_LCORE or _Thread_local) and the use of lcore
+ * variables are:
+ *
+ * * The existence and non-existence of a thread-local variable
+ * instance follow that of a particular thread. The data cannot be
+ * accessed before the thread has been created, nor after it has
+ * exited. As a result, thread-local variables must be initialized in
+ * a "lazy" manner (e.g., at the point of thread creation). Lcore
+ * variables may be accessed immediately after having been
+ * allocated (which may be prior to any thread beyond the main
+ * thread is running).
+ * * A thread-local variable is duplicated across all threads in the
+ * process, including unregistered non-EAL threads (i.e.,
+ * "regular" threads). For DPDK applications heavily relying on
+ * multi-threading (in conjunction to DPDK's "one thread per core"
+ * pattern), either by having many concurrent threads or
+ * creating/destroying threads at a high rate, an excessive use of
+ * thread-local variables may cause inefficiencies (e.g.,
+ * increased thread creation overhead due to thread-local storage
+ * initialization or increased total RAM footprint usage). Lcore
+ * variables *only* exist for threads with an lcore id.
+ * * Whether data in thread-local storage may be shared between threads
+ * (i.e., whether a pointer to a thread-local variable can be passed to
+ * and successfully dereferenced by a non-owning thread) depends on
+ * the details of the TLS implementation. With GCC __thread and
+ * GCC _Thread_local, such data sharing is supported. In the C11
+ * standard, the result of accessing another thread's
+ * _Thread_local object is implementation-defined. Lcore variable
+ * instances may be accessed reliably by any thread.
+ */
+
+#include <stddef.h>
+#include <stdalign.h>
+
+#include <rte_common.h>
+#include <rte_config.h>
+#include <rte_lcore.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Given the lcore variable type, produces the type of the lcore
+ * variable handle.
+ */
+#define RTE_LCORE_VAR_HANDLE_TYPE(type) \
+ type *
+
+/**
+ * Define a lcore variable handle.
+ *
+ * This macro defines a variable which is used as a handle to access
+ * the various per-lcore id instances of a per-lcore id variable.
+ *
+ * The aim with this macro is to make clear at the point of
+ * declaration that this is an lcore handler, rather than a regular
+ * pointer.
+ *
+ * Add @b static as a prefix in case the lcore variable is only to be
+ * accessed from a particular translation unit.
+ */
+#define RTE_LCORE_VAR_HANDLE(type, name) \
+ RTE_LCORE_VAR_HANDLE_TYPE(type) name
+
+/**
+ * Allocate space for an lcore variable, and initialize its handle.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, align) \
+ handle = rte_lcore_var_alloc(size, align)
+
+/**
+ * Allocate space for an lcore variable, and initialize its handle,
+ * with values aligned for any type of object.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_ALLOC_SIZE(handle, size) \
+ RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, 0)
+
+/**
+ * Allocate space for an lcore variable of the size and alignment requirements
+ * suggested by the handler pointer type, and initialize its handle.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_ALLOC(handle) \
+ RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, sizeof(*(handle)), \
+ alignof(typeof(*(handle))))
+
+/**
+ * Allocate an explicitly-sized, explicitly-aligned lcore variable by
+ * means of a @ref RTE_INIT constructor.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, align) \
+ RTE_INIT(rte_lcore_var_init_ ## name) \
+ { \
+ RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(name, size, align); \
+ }
+
+/**
+ * Allocate an explicitly-sized lcore variable by means of a @ref
+ * RTE_INIT constructor.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_INIT_SIZE(name, size) \
+ RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, 0)
+
+/**
+ * Allocate an lcore variable by means of a @ref RTE_INIT constructor.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_INIT(name) \
+ RTE_INIT(rte_lcore_var_init_ ## name) \
+ { \
+ RTE_LCORE_VAR_ALLOC(name); \
+ }
+
+/**
+ * Get void pointer to lcore variable instance with the specified
+ * lcore id.
+ *
+ * @param lcore_id
+ * The lcore id specifying which of the @c RTE_MAX_LCORE value
+ * instances should be accessed. The lcore id need not be valid
+ * (e.g., may be @ref LCORE_ID_ANY), but in such a case, the pointer
+ * is also not valid (and thus should not be dereferenced).
+ * @param handle
+ * The lcore variable handle.
+ */
+static inline void *
+rte_lcore_var_lcore_ptr(unsigned int lcore_id, void *handle)
+{
+ return RTE_PTR_ADD(handle, lcore_id * RTE_MAX_LCORE_VAR);
+}
+
+/**
+ * Get pointer to lcore variable instance with the specified lcore id.
+ *
+ * @param lcore_id
+ * The lcore id specifying which of the @c RTE_MAX_LCORE value
+ * instances should be accessed. The lcore id need not be valid
+ * (e.g., may be @ref LCORE_ID_ANY), but in such a case, the pointer
+ * is also not valid (and thus should not be dereferenced).
+ * @param handle
+ * The lcore variable handle.
+ */
+#define RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle) \
+ ((typeof(handle))rte_lcore_var_lcore_ptr(lcore_id, handle))
+
+/**
+ * Get pointer to lcore variable instance of the current thread.
+ *
+ * May only be used by EAL threads and registered non-EAL threads.
+ */
+#define RTE_LCORE_VAR_VALUE(handle) \
+ RTE_LCORE_VAR_LCORE_VALUE(rte_lcore_id(), handle)
+
+/**
+ * Iterate over each lcore id's value for a lcore variable.
+ *
+ * @param value
+ * A pointer successively set to point to the lcore variable value
+ * corresponding to every lcore id (up to @c RTE_MAX_LCORE).
+ * @param handle
+ * The lcore variable handle.
+ */
+#define RTE_LCORE_VAR_FOREACH_VALUE(value, handle) \
+ for (unsigned int lcore_id = \
+ (((value) = RTE_LCORE_VAR_LCORE_VALUE(0, handle)), 0); \
+ lcore_id < RTE_MAX_LCORE; \
+ lcore_id++, (value) = RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle))
+
+/**
+ * Allocate space in the per-lcore id buffers for a lcore variable.
+ *
+ * The pointer returned is only an opaque identifier of the variable. To
+ * get an actual pointer to a particular instance of the variable use
+ * @ref RTE_LCORE_VAR_VALUE or @ref RTE_LCORE_VAR_LCORE_VALUE.
+ *
+ * The lcore variable values' memory is set to zero.
+ *
+ * The allocation is always successful, barring a fatal exhaustion of
+ * the per-lcore id buffer space.
+ *
+ * rte_lcore_var_alloc() is not multi-thread safe.
+ *
+ * @param size
+ * The size (in bytes) of the variable's per-lcore id value. Must be > 0.
+ * @param align
+ * If 0, the values will be suitably aligned for any kind of type
+ * (i.e., alignof(max_align_t)). Otherwise, the values will be aligned
+ * on a multiple of *align*, which must be a power of 2 and equal or
+ * less than @c RTE_CACHE_LINE_SIZE.
+ * @return
+ * The id of the variable, stored in a void pointer value. The value
+ * is always non-NULL.
+ */
+__rte_experimental
+void *
+rte_lcore_var_alloc(size_t size, size_t align);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_LCORE_VAR_H_ */
diff --git a/lib/eal/version.map b/lib/eal/version.map
index e3ff412683..5f5a3522c0 100644
--- a/lib/eal/version.map
+++ b/lib/eal/version.map
@@ -396,6 +396,9 @@ EXPERIMENTAL {
# added in 24.03
rte_vfio_get_device_info; # WINDOWS_NO_EXPORT
+
+ rte_lcore_var_alloc;
+ rte_lcore_var;
};
INTERNAL {
--
2.34.1
^ permalink raw reply [flat|nested] 323+ messages in thread
* RE: [PATCH 1/6] eal: add static per-lcore memory allocation facility
2024-09-10 7:03 ` [PATCH 1/6] eal: add static per-lcore memory allocation facility Mattias Rönnblom
@ 2024-09-10 9:32 ` Morten Brørup
2024-09-10 10:44 ` Mattias Rönnblom
2024-09-11 10:32 ` Morten Brørup
2024-09-11 17:04 ` [PATCH v2 0/6] Lcore variables Mattias Rönnblom
2 siblings, 1 reply; 323+ messages in thread
From: Morten Brørup @ 2024-09-10 9:32 UTC (permalink / raw)
To: Mattias Rönnblom, dev
Cc: hofors, Stephen Hemminger, Konstantin Ananyev, David Marchand
> From: Mattias Rönnblom [mailto:mattias.ronnblom@ericsson.com]
> Sent: Tuesday, 10 September 2024 09.04
>
> Introduce DPDK per-lcore id variables, or lcore variables for short.
Throughout the descriptions and comments,
please replace "lcore id" with "lcore" (e.g. "per-lcore variables"),
when referring to the lcore, and not the index of the lcore.
(Your intention might be to highlight that it only covers threads with an lcore id,
but if that wasn't the case, you would refer to them as "threads" not "lcores".)
Except, of course, when referring to an actual lcore id, e.g. lcore_id function parameters.
Paraphrasing:
Consider the type of what you are referring to;
use "lcore" if its type is "thread", and
use "lcore id" if its type is "int".
I might be wrong here, but please think hard about it.
>
> An lcore variable has one value for every current and future lcore
> id-equipped thread.
>
> The primary <rte_lcore_var.h> use case is for statically allocating
> small, frequently-accessed data structures, for which one instance
> should exist for each lcore.
>
> Lcore variables are similar to thread-local storage (TLS, e.g., C11
> _Thread_local), but decoupling the values' life time with that of the
> threads.
>
> Lcore variables are also similar in terms of functionality provided by
> FreeBSD kernel's DPCPU_*() family of macros and the associated
> build-time machinery. DPCPU uses linker scripts, which effectively
> prevents the reuse of its, otherwise seemingly viable, approach.
>
> The currently-prevailing way to solve the same problem as lcore
> variables is to keep a module's per-lcore data as RTE_MAX_LCORE-sized
> array of cache-aligned, RTE_CACHE_GUARDed structs. The benefit of
> lcore variables over this approach is that data related to the same
> lcore now is close (spatially, in memory), rather than data used by
> the same module, which in turn avoid excessive use of padding,
> polluting caches with unused data.
>
> Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> Acked-by: Morten Brørup <mb@smartsharesystems.com>
>
> --
> +++ b/doc/api/doxy-api-index.md
> @@ -99,6 +99,7 @@ The public API headers are grouped by topics:
> [interrupts](@ref rte_interrupts.h),
> [launch](@ref rte_launch.h),
> [lcore](@ref rte_lcore.h),
> + [lcore-varible](@ref rte_lcore_var.h),
Typo: varible -> variable
> +++ b/doc/guides/rel_notes/release_24_11.rst
> @@ -55,6 +55,20 @@ New Features
> Also, make sure to start the actual text at the margin.
> =======================================================
>
> +* **Added EAL per-lcore static memory allocation facility.**
> +
> + Added EAL API <rte_lcore_var.h> for statically allocating small,
> + frequently-accessed data structures, for which one instance should
> + exist for each lcore.
> +
> + With lcore variables, data is organized spatially on a per-lcore
> + basis, rather than per library or PMD, avoiding the need for cache
> + aligning (or RTE_CACHE_GUARDing) data structures, which in turn
> + reduces CPU cache internal fragmentation, improving performance.
> +
> + Lcore variables are similar to thread-local storage (TLS, e.g.,
> + C11 _Thread_local), but decoupling the values' life time from that
> + of the threads.
When referring to TLS, you might want to clarify that lcore variables are not instantiated for unregistered threads.
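To illustrate the point about unregistered threads: values are indexed by lcore id, so a thread without one has no slot. The following is only a sketch. The SKETCH_* constants and sketch_lcore_value() are hypothetical stand-ins, and unlike rte_lcore_var_lcore_ptr() in the patch, this helper checks the id and returns NULL instead of an invalid pointer.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-ins, deliberately small; not the DPDK values. */
#define SKETCH_MAX_LCORE 4
#define SKETCH_MAX_LCORE_VAR 64
/* Stand-in for LCORE_ID_ANY, the id seen by unregistered threads. */
#define SKETCH_LCORE_ID_ANY UINT32_MAX

/*
 * Hypothetical checked accessor. The handle is the address of lcore
 * id 0's value; every other value sits a fixed stride further on.
 * An id outside [0, SKETCH_MAX_LCORE) has no value at all.
 */
static void *
sketch_lcore_value(void *handle, unsigned int lcore_id)
{
	if (lcore_id >= SKETCH_MAX_LCORE)
		return NULL;

	return (char *)handle + (size_t)lcore_id * SKETCH_MAX_LCORE_VAR;
}
```

With TLS, by contrast, every thread in the process gets its own instance, registered or not.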
> +static void *lcore_buffer;
> +static size_t offset = RTE_MAX_LCORE_VAR;
> +
> +static void *
> +lcore_var_alloc(size_t size, size_t align)
> +{
> + void *handle;
> + void *value;
> +
> + offset = RTE_ALIGN_CEIL(offset, align);
> +
> + if (offset + size > RTE_MAX_LCORE_VAR) {
> + lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE,
> + LCORE_BUFFER_SIZE);
> + RTE_VERIFY(lcore_buffer != NULL);
> +
> + offset = 0;
> + }
To determine if the lcore_buffer memory should be allocated, why not just check if lcore_buffer == NULL?
Then offset wouldn't need an initial value of RTE_MAX_LCORE_VAR.
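A minimal sketch of how that could look, assuming the size check is kept so that a fresh buffer is still started once the current one is full. This is not the patch's code: the sketch_* names and SKETCH_* constants are hypothetical stand-ins for the EAL symbols, and the values are illustrative only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative stand-ins for the DPDK build-time constants. */
#define SKETCH_MAX_LCORE 4
#define SKETCH_MAX_LCORE_VAR 256
#define SKETCH_CACHE_LINE 64

static void *sketch_buffer;  /* NULL until the first allocation */
static size_t sketch_offset; /* no special initial value needed */

/* Returns a handle, i.e. the address of lcore id 0's value. */
static void *
sketch_lcore_var_alloc(size_t size, size_t align)
{
	void *handle;
	unsigned int lcore_id;

	assert(size > 0 && size <= SKETCH_MAX_LCORE_VAR);
	assert(align > 0 && (align & (align - 1)) == 0);
	assert(align <= SKETCH_CACHE_LINE);

	/* Round the offset up to the requested power-of-2 alignment. */
	sketch_offset = (sketch_offset + align - 1) & ~(align - 1);

	/*
	 * The NULL test covers the very first call. The size test is
	 * still needed to move on to a new buffer when the current one
	 * cannot fit the value. Old buffers are kept on purpose, since
	 * lcore variables are never freed.
	 */
	if (sketch_buffer == NULL ||
	    sketch_offset + size > SKETCH_MAX_LCORE_VAR) {
		sketch_buffer = aligned_alloc(SKETCH_CACHE_LINE,
			(size_t)SKETCH_MAX_LCORE_VAR * SKETCH_MAX_LCORE);
		if (sketch_buffer == NULL)
			abort();
		sketch_offset = 0;
	}

	handle = (char *)sketch_buffer + sketch_offset;
	sketch_offset += size;

	/* Zero the value of every lcore id. */
	for (lcore_id = 0; lcore_id < SKETCH_MAX_LCORE; lcore_id++)
		memset((char *)handle +
		       (size_t)lcore_id * SKETCH_MAX_LCORE_VAR, 0, size);

	return handle;
}
```

Whether the explicit NULL test reads better than seeding offset with RTE_MAX_LCORE_VAR is a matter of taste; the second condition is needed in both variants.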
> +
> + handle = RTE_PTR_ADD(lcore_buffer, offset);
> +
> + offset += size;
> +
> + RTE_LCORE_VAR_FOREACH_VALUE(value, handle)
> + memset(value, 0, size);
> +
> + EAL_LOG(DEBUG, "Allocated %"PRIuPTR" bytes of per-lcore data with a "
> + "%"PRIuPTR"-byte alignment", size, align);
> +
> + return handle;
> +}
> +/**
> + * @file
> + *
> + * RTE Per-lcore id variables
Suggest mentioning the short form too, e.g.:
"RTE Per-lcore id variables (RTE Lcore variables)"
> + *
> + * This API provides a mechanism to create and access per-lcore id
> + * variables in a space- and cycle-efficient manner.
> + *
> + * A per-lcore id variable (or lcore variable for short) has one value
> + * for each EAL thread and registered non-EAL thread.
And service thread.
> + * There is one
> + * copy for each current and future lcore id-equipped thread, with the
"one copy" -> "one instance"
> + * total number of copies amounting to @c RTE_MAX_LCORE. The value of
"copies" -> "instances"
> + * an lcore variable for a particular lcore id is independent from
> + * other values (for other lcore ids) within the same lcore variable.
> + *
> + * In order to access the values of an lcore variable, a handle is
> + * used. The type of the handle is a pointer to the value's type
> + * (e.g., for @c uint32_t lcore variable, the handle is a
> + * <code>uint32_t *</code>. The handler type is used to inform the
Typo: "handler" -> "handle", I think :-/
Found this typo multiple times; search-replace.
> + * access macros the type of the values. A handle may be passed
> + * between modules and threads just like any pointer, but its value
> + * must be treated as a an opaque identifier. An allocated handle
> + * never has the value NULL.
> + *
> + * @b Creation
> + *
> + * An lcore variable is created in two steps:
> + * 1. Define a lcore variable handle by using @ref RTE_LCORE_VAR_HANDLE.
> + * 2. Allocate lcore variable storage and initialize the handle with
> + * a unique identifier by @ref RTE_LCORE_VAR_ALLOC or
> + * @ref RTE_LCORE_VAR_INIT. Allocation generally occurs the time of
> + * module initialization, but may be done at any time.
> + *
> + * An lcore variable is not tied to the owning thread's lifetime. It's
> + * available for use by any thread immediately after having been
> + * allocated, and continues to be available throughout the lifetime of
> + * the EAL.
> + *
> + * Lcore variables cannot and need not be freed.
> + *
> + * @b Access
> + *
> + * The value of any lcore variable for any lcore id may be accessed
> + * from any thread (including unregistered threads), but it should
> + * only be *frequently* read from or written to by the owner.
> + *
> + * Values of the same lcore variable but owned by to different lcore
Typo: to -> two
> + * ids may be frequently read or written by the owners without risking
> + * false sharing.
> + *
> + * An appropriate synchronization mechanism (e.g., atomic loads and
> + * stores) should employed to assure there are no data races between
> + * the owning thread and any non-owner threads accessing the same
> + * lcore variable instance.
> + *
> + * The value of the lcore variable for a particular lcore id is
> + * accessed using @ref RTE_LCORE_VAR_LCORE_VALUE.
> + *
> + * A common pattern is for an EAL thread or a registered non-EAL
> + * thread to access its own lcore variable value. For this purpose, a
> + * short-hand exists in the form of @ref RTE_LCORE_VAR_VALUE.
> + *
> + * Although the handle (as defined by @ref RTE_LCORE_VAR_HANDLE) is a
> + * pointer with the same type as the value, it may not be directly
> + * dereferenced and must be treated as an opaque identifier.
> + *
> + * Lcore variable handles and value pointers may be freely passed
> + * between different threads.
> + *
> + * @b Storage
> + *
> + * An lcore variable's values may by of a primitive type like @c int,
Two typos: "values may by" -> "value may be"
> + * but would more typically be a @c struct.
> + *
> + * The lcore variable handle introduces a per-variable (not
> + * per-value/per-lcore id) overhead of @c sizeof(void *) bytes, so
> + * there are some memory footprint gains to be made by organizing all
> + * per-lcore id data for a particular module as one lcore variable
> + * (e.g., as a struct).
> + *
> + * An application may choose to define an lcore variable handle, which
> + * it then it goes on to never allocate.
> + *
> + * The size of a lcore variable's value must be less than the DPDK
> + * build-time constant @c RTE_MAX_LCORE_VAR.
> + *
> + * The lcore variable are stored in a series of lcore buffers, which
> + * are allocated from the libc heap. Heap allocation failures are
> + * treated as fatal.
> + *
> + * Lcore variables should generally *not* be @ref __rte_cache_aligned
> + * and need *not* include a @ref RTE_CACHE_GUARD field, since the use
> + * of these constructs are designed to avoid false sharing. In the
> + * case of an lcore variable instance, the thread most recently
> + * accessing nearby data structures should almost-always the lcore
Missing word: should almost-always *be* the lcore variables' owner.
> + * variables' owner. Adding padding will increase the effective memory
> + * working set size, potentially reducing performance.
> + *
> + * Lcore variable values take on an initial value of zero.
> + *
> + * @b Example
> + *
> + * Below is an example of the use of an lcore variable:
> + *
> + * @code{.c}
> + * struct foo_lcore_state {
> + * int a;
> + * long b;
> + * };
> + *
> + * static RTE_LCORE_VAR_HANDLE(struct foo_lcore_state, lcore_states);
> + *
> + * long foo_get_a_plus_b(void)
> + * {
> + * struct foo_lcore_state *state = RTE_LCORE_VAR_VALUE(lcore_states);
> + *
> + * return state->a + state->b;
> + * }
> + *
> + * RTE_INIT(rte_foo_init)
> + * {
> + * RTE_LCORE_VAR_ALLOC(lcore_states);
> + *
> + * struct foo_lcore_state *state;
> + * RTE_LCORE_VAR_FOREACH_VALUE(state, lcore_states) {
> + * (initialize 'state')
Consider: (initialize 'state') -> /* initialize 'state' */
> + * }
> + *
> + * (other initialization)
Consider: (other initialization) -> /* other initialization */
> + * }
> + * @endcode
> + *
> + *
> + * @b Alternatives
> + *
> + * Lcore variables are designed to replace a pattern exemplified below:
> + * @code{.c}
> + * struct __rte_cache_aligned foo_lcore_state {
> + * int a;
> + * long b;
> + * RTE_CACHE_GUARD;
> + * };
> + *
> + * static struct foo_lcore_state lcore_states[RTE_MAX_LCORE];
> + * @endcode
> + *
> + * This scheme is simple and effective, but has one drawback: the data
> + * is organized so that objects related to all lcores for a particular
> + * module is kept close in memory. At a bare minimum, this forces the
> + * use of cache-line alignment to avoid false sharing. With CPU
Consider adding: use of *padding to* cache-line alignment
My point here is:
This sentence should somehow include the word "padding".
This paragraph is not only about cache line alignment, it is primarily about padding.
> + * hardware prefetching and memory loads resulting from speculative
> + * execution (functions which seemingly are getting more eager faster
> + * than they are getting more intelligent), one or more "guard" cache
> + * lines may be required to separate one lcore's data from another's.
> + *
> + * Lcore variables has the upside of working with, not against, the
Typo: has -> have
> + * CPU's assumptions and for example next-line prefetchers may well
> + * work the way its designers intended (i.e., to the benefit, not
> + * detriment, of system performance).
> + *
> + * Another alternative to @ref rte_lcore_var.h is the @ref
> + * rte_per_lcore.h API, which make use of thread-local storage (TLS,
Typo: make -> makes
> + * e.g., GCC __thread or C11 _Thread_local). The main differences
> + * between by using the various forms of TLS (e.g., @ref
> + * RTE_DEFINE_PER_LCORE or _Thread_local) and the use of lcore
> + * variables are:
> + *
> + * * The existence and non-existence of a thread-local variable
> + * instance follow that of particular thread's. The data cannot be
Typo: "thread's" -> "threads", I think. :-/
> + * accessed before the thread has been created, nor after it has
> + * exited. As a result, thread-local variables must initialized in
Missing word: must *be* initialized
> + * a "lazy" manner (e.g., at the point of thread creation). Lcore
> + * variables may be accessed immediately after having been
> + * allocated (which may be prior any thread beyond the main
> + * thread is running).
> + * * A thread-local variable is duplicated across all threads in the
> + * process, including unregistered non-EAL threads (i.e.,
> + * "regular" threads). For DPDK applications heavily relying on
> + * multi-threading (in conjunction to DPDK's "one thread per core"
> + * pattern), either by having many concurrent threads or
> + * creating/destroying threads at a high rate, an excessive use of
> + * thread-local variables may cause inefficiencies (e.g.,
> + * increased thread creation overhead due to thread-local storage
> + * initialization or increased total RAM footprint usage). Lcore
> + * variables *only* exist for threads with an lcore id.
> + * * If data in thread-local storage may be shared between threads
> + * (i.e., can a pointer to a thread-local variable be passed to
> + * and successfully dereferenced by non-owning thread) depends on
> + * the details of the TLS implementation. With GCC __thread and
> + * GCC _Thread_local, such data sharing is supported. In the C11
> + * standard, the result of accessing another thread's
> + * _Thread_local object is implementation-defined. Lcore variable
> + * instances may be accessed reliably by any thread.
> + */
> +
> +#include <stddef.h>
> +#include <stdalign.h>
> +
> +#include <rte_common.h>
> +#include <rte_config.h>
> +#include <rte_lcore.h>
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/**
> + * Given the lcore variable type, produces the type of the lcore
> + * variable handle.
> + */
> +#define RTE_LCORE_VAR_HANDLE_TYPE(type) \
> + type *
> +
> +/**
> + * Define a lcore variable handle.
Typo: "a lcore" -> "an lcore"
Found this typo multiple times; search-replace "a lcore".
> + *
> + * This macro defines a variable which is used as a handle to access
> + * the various per-lcore id instances of a per-lcore id variable.
Suggest:
"the various per-lcore id instances of a per-lcore id variable" ->
"the various instances of a per-lcore id variable"
> + *
> + * The aim with this macro is to make clear at the point of
> + * declaration that this is an lcore handler, rather than a regular
> + * pointer.
> + *
> + * Add @b static as a prefix in case the lcore variable are only to be
Typo: are -> is
> + * accessed from a particular translation unit.
> + */
> +#define RTE_LCORE_VAR_HANDLE(type, name) \
> + RTE_LCORE_VAR_HANDLE_TYPE(type) name
> +
> +/**
> + * Allocate space for an lcore variable, and initialize its handle.
> + *
> + * The values of the lcore variable are initialized to zero.
Consider adding: "the lcore variable *instances* are initialized"
Found this typo multiple times; search-replace.
> + */
> +#define RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, align) \
> + handle = rte_lcore_var_alloc(size, align)
> +
> +/**
> + * Allocate space for an lcore variable, and initialize its handle,
> + * with values aligned for any type of object.
> + *
> + * The values of the lcore variable are initialized to zero.
> + */
> +#define RTE_LCORE_VAR_ALLOC_SIZE(handle, size) \
> + RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, 0)
> +
> +/**
> + * Allocate space for an lcore variable of the size and alignment requirements
> + * suggested by the handler pointer type, and initialize its handle.
> + *
> + * The values of the lcore variable are initialized to zero.
> + */
> +#define RTE_LCORE_VAR_ALLOC(handle) \
> + RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, sizeof(*(handle)), \
> + alignof(typeof(*(handle))))
> +
> +/**
> + * Allocate an explicitly-sized, explicitly-aligned lcore variable by
> + * means of a @ref RTE_INIT constructor.
> + *
> + * The values of the lcore variable are initialized to zero.
> + */
> +#define RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, align) \
> + RTE_INIT(rte_lcore_var_init_ ## name) \
> + { \
> + RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(name, size, align); \
> + }
> +
> +/**
> + * Allocate an explicitly-sized lcore variable by means of a @ref
> + * RTE_INIT constructor.
> + *
> + * The values of the lcore variable are initialized to zero.
> + */
> +#define RTE_LCORE_VAR_INIT_SIZE(name, size) \
> + RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, 0)
> +
> +/**
> + * Allocate an lcore variable by means of a @ref RTE_INIT constructor.
> + *
> + * The values of the lcore variable are initialized to zero.
> + */
> +#define RTE_LCORE_VAR_INIT(name) \
> + RTE_INIT(rte_lcore_var_init_ ## name) \
> + { \
> + RTE_LCORE_VAR_ALLOC(name); \
> + }
> +
> +/**
> + * Get void pointer to lcore variable instance with the specified
> + * lcore id.
> + *
> + * @param lcore_id
> + * The lcore id specifying which of the @c RTE_MAX_LCORE value
> + * instances should be accessed. The lcore id need not be valid
> + * (e.g., may be @ref LCORE_ID_ANY), but in such a case, the pointer
> + * is also not valid (and thus should not be dereferenced).
> + * @param handle
> + * The lcore variable handle.
> + */
> +static inline void *
> +rte_lcore_var_lcore_ptr(unsigned int lcore_id, void *handle)
> +{
> + return RTE_PTR_ADD(handle, lcore_id * RTE_MAX_LCORE_VAR);
> +}
> +
> +/**
> + * Get pointer to lcore variable instance with the specified lcore id.
> + *
> + * @param lcore_id
> + * The lcore id specifying which of the @c RTE_MAX_LCORE value
> + * instances should be accessed. The lcore id need not be valid
> + * (e.g., may be @ref LCORE_ID_ANY), but in such a case, the pointer
> + * is also not valid (and thus should not be dereferenced).
> + * @param handle
> + * The lcore variable handle.
> + */
> +#define RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle) \
> + ((typeof(handle))rte_lcore_var_lcore_ptr(lcore_id, handle))
> +
> +/**
> + * Get pointer to lcore variable instance of the current thread.
> + *
> + * May only be used by EAL threads and registered non-EAL threads.
> + */
> +#define RTE_LCORE_VAR_VALUE(handle) \
> + RTE_LCORE_VAR_LCORE_VALUE(rte_lcore_id(), handle)
> +
> +/**
> + * Iterate over each lcore id's value for a lcore variable.
> + *
> + * @param value
> + * A pointer set successivly set to point to lcore variable value
"set successivly set" -> "successively set"
Thinking out loud, ignore at your preference:
During the RFC discussions, the term used for referring to an lcore variable was discussed;
we considered "pointer", but settled for "value".
Perhaps "instance" would be usable in comments like the one describing this function...
"A pointer set successivly set to point to lcore variable value" ->
"A pointer set successivly set to point to lcore variable instance".
I don't know.
> + * corresponding to every lcore id (up to @c RTE_MAX_LCORE).
> + * @param handle
> + * The lcore variable handle.
> + */
> +#define RTE_LCORE_VAR_FOREACH_VALUE(value, handle) \
> + for (unsigned int lcore_id = \
> + (((value) = RTE_LCORE_VAR_LCORE_VALUE(0, handle)), 0); \
> + lcore_id < RTE_MAX_LCORE; \
> + lcore_id++, (value) = RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle))
> +
> +/**
> + * Allocate space in the per-lcore id buffers for a lcore variable.
> + *
> + * The pointer returned is only an opaque identifer of the variable. To
> + * get an actual pointer to a particular instance of the variable use
> + * @ref RTE_LCORE_VAR_VALUE or @ref RTE_LCORE_VAR_LCORE_VALUE.
> + *
> + * The lcore variable values' memory is set to zero.
> + *
> + * The allocation is always successful, barring a fatal exhaustion of
> + * the per-lcore id buffer space.
> + *
> + * rte_lcore_var_alloc() is not multi-thread safe.
> + *
> + * @param size
> + * The size (in bytes) of the variable's per-lcore id value. Must be > 0.
> + * @param align
> + * If 0, the values will be suitably aligned for any kind of type
> + * (i.e., alignof(max_align_t)). Otherwise, the values will be aligned
> + * on a multiple of *align*, which must be a power of 2 and equal or
> + * less than @c RTE_CACHE_LINE_SIZE.
> + * @return
> + * The id of the variable, stored in a void pointer value. The value
"id" -> "handle"
> + * is always non-NULL.
> + */
> +__rte_experimental
> +void *
> +rte_lcore_var_alloc(size_t size, size_t align);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_LCORE_VAR_H_ */
> diff --git a/lib/eal/version.map b/lib/eal/version.map
> index e3ff412683..5f5a3522c0 100644
> --- a/lib/eal/version.map
> +++ b/lib/eal/version.map
> @@ -396,6 +396,9 @@ EXPERIMENTAL {
>
> # added in 24.03
> rte_vfio_get_device_info; # WINDOWS_NO_EXPORT
> +
> + rte_lcore_var_alloc;
> + rte_lcore_var;
No such function: rte_lcore_var
> };
>
> INTERNAL {
> --
> 2.34.1
^ permalink raw reply [flat|nested] 323+ messages in thread
* Re: [PATCH 1/6] eal: add static per-lcore memory allocation facility
2024-09-10 9:32 ` Morten Brørup
@ 2024-09-10 10:44 ` Mattias Rönnblom
2024-09-10 13:07 ` Morten Brørup
2024-09-10 15:55 ` Stephen Hemminger
0 siblings, 2 replies; 323+ messages in thread
From: Mattias Rönnblom @ 2024-09-10 10:44 UTC (permalink / raw)
To: Morten Brørup, Mattias Rönnblom, dev
Cc: Stephen Hemminger, Konstantin Ananyev, David Marchand
On 2024-09-10 11:32, Morten Brørup wrote:
>> From: Mattias Rönnblom [mailto:mattias.ronnblom@ericsson.com]
>> Sent: Tuesday, 10 September 2024 09.04
>>
>> Introduce DPDK per-lcore id variables, or lcore variables for short.
>
> Throughout the descriptions and comments,
> please replace "lcore id" with "lcore" (e.g. "per-lcore variables"),
> when referring to the lcore, and not the index of the lcore.
> (Your intention might be to highlight that it only covers threads with an lcore id,
> but if that wasn't the case, you would refer to them as "threads" not "lcores".)
> Except, of course, when referring to an actual lcore id, e.g. lcore_id function parameters.
"lcore" is just another word for "EAL thread." The lcore variables exist
in one instance for every thread with an lcore id, thus also for
registered non-EAL threads (i.e., threads which are not lcores).
I've tried to summarize the (very confusing) terminology of DPDK's
threading model here:
https://ericsson.github.io/dataplanebook/threading/threading.html#eal-threads
So, in my world, "per-lcore id variables" is pretty accurate. You could
say "variables with per-lcore id values" if you want to make it even
clearer what's going on.
>
> Paraphrasing:
> Consider the type of what you are referring to;
> use "lcore" if its type is "thread", and
> use "lcore id" if its type is "int".
>
> I might be wrong here, but please think hard about it.
>
>>
>> An lcore variable has one value for every current and future lcore
>> id-equipped thread.
>>
>> The primary <rte_lcore_var.h> use case is for statically allocating
>> small, frequently-accessed data structures, for which one instance
>> should exist for each lcore.
>>
>> Lcore variables are similar to thread-local storage (TLS, e.g., C11
>> _Thread_local), but decoupling the values' life time with that of the
>> threads.
>>
>> Lcore variables are also similar in terms of functionality provided by
>> FreeBSD kernel's DPCPU_*() family of macros and the associated
>> build-time machinery. DPCPU uses linker scripts, which effectively
>> prevents the reuse of its, otherwise seemingly viable, approach.
>>
>> The currently-prevailing way to solve the same problem as lcore
>> variables is to keep a module's per-lcore data as RTE_MAX_LCORE-sized
>> array of cache-aligned, RTE_CACHE_GUARDed structs. The benefit of
>> lcore variables over this approach is that data related to the same
>> lcore now is close (spatially, in memory), rather than data used by
>> the same module, which in turn avoid excessive use of padding,
>> polluting caches with unused data.
>>
>> Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
>> Acked-by: Morten Brørup <mb@smartsharesystems.com>
>>
>> --
>
>> +++ b/doc/api/doxy-api-index.md
>> @@ -99,6 +99,7 @@ The public API headers are grouped by topics:
>> [interrupts](@ref rte_interrupts.h),
>> [launch](@ref rte_launch.h),
>> [lcore](@ref rte_lcore.h),
>> + [lcore-varible](@ref rte_lcore_var.h),
>
> Typo: varible -> variable
>
>
I'll change it to "lcore variables" (no dash, plural).
>> +++ b/doc/guides/rel_notes/release_24_11.rst
>> @@ -55,6 +55,20 @@ New Features
>> Also, make sure to start the actual text at the margin.
>> =======================================================
>>
>> +* **Added EAL per-lcore static memory allocation facility.**
>> +
>> + Added EAL API <rte_lcore_var.h> for statically allocating small,
>> + frequently-accessed data structures, for which one instance should
>> + exist for each lcore.
>> +
>> + With lcore variables, data is organized spatially on a per-lcore
>> + basis, rather than per library or PMD, avoiding the need for cache
>> + aligning (or RTE_CACHE_GUARDing) data structures, which in turn
>> + reduces CPU cache internal fragmentation, improving performance.
>> +
>> + Lcore variables are similar to thread-local storage (TLS, e.g.,
>> + C11 _Thread_local), but decoupling the values' life time from that
>> + of the threads.
>
> When referring to TLS, you might want to clarify that lcore variables are not instantiated for unregistered threads.
>
Isn't that clear from the first paragraph? Although it should say "per
lcore id", rather than "per lcore."
>
>> +static void *lcore_buffer;
>> +static size_t offset = RTE_MAX_LCORE_VAR;
>> +
>> +static void *
>> +lcore_var_alloc(size_t size, size_t align)
>> +{
>> + void *handle;
>> + void *value;
>> +
>> + offset = RTE_ALIGN_CEIL(offset, align);
>> +
>> + if (offset + size > RTE_MAX_LCORE_VAR) {
>> + lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE,
>> + LCORE_BUFFER_SIZE);
>> + RTE_VERIFY(lcore_buffer != NULL);
>> +
>> + offset = 0;
>> + }
>
> To determine if the lcore_buffer memory should be allocated, why not just check if lcore_buffer == NULL?
Because it may be the case that lcore_buffer is non-NULL, but the
remaining space is too small to service the allocation.
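
To illustrate, below is a minimal standalone sketch of this kind of
bump allocator. It is plain C11 rather than the patch itself:
MAX_LCORE, MAX_LCORE_VAR, var_alloc() and buffers_allocated are
hypothetical stand-ins chosen for the example, the sizes are
deliberately tiny, and (unlike the patch) an align of 0 is not
supported.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the DPDK build-time constants. */
#define MAX_LCORE 4
#define MAX_LCORE_VAR 64 /* per-lcore stride within one buffer */
#define BUFFER_SIZE (MAX_LCORE * MAX_LCORE_VAR)

static void *lcore_buffer;
/* Starting out "full" forces the first call to allocate a buffer. */
static size_t offset = MAX_LCORE_VAR;
/* Only here so the example can observe when a new buffer is taken. */
static unsigned int buffers_allocated;

static size_t
align_ceil(size_t v, size_t align)
{
	return (v + align - 1) & ~(align - 1);
}

static void *
var_alloc(size_t size, size_t align)
{
	/* This sketch requires a non-zero power-of-2 alignment. */
	assert(align != 0 && (align & (align - 1)) == 0);
	assert(size > 0 && size <= MAX_LCORE_VAR);

	offset = align_ceil(offset, align);

	/*
	 * A single test covers both "no buffer yet" and "buffer exists,
	 * but the remaining space is too small". The previous buffer is
	 * intentionally kept, since variables are never freed.
	 */
	if (offset + size > MAX_LCORE_VAR) {
		lcore_buffer = aligned_alloc(64, BUFFER_SIZE);
		if (lcore_buffer == NULL)
			abort();
		buffers_allocated++;
		offset = 0;
	}

	void *handle = (char *)lcore_buffer + offset;

	offset += size;

	/* Zero the value belonging to every lcore id. */
	for (unsigned int id = 0; id < MAX_LCORE; id++)
		memset((char *)handle + id * MAX_LCORE_VAR, 0, size);

	return handle;
}
```

With these toy sizes, allocating 48 bytes and then 32 bytes takes a
second buffer even though lcore_buffer is already non-NULL, while a
following 16-byte allocation fits in that second buffer. A plain NULL
check would have placed the 32-byte variable across the per-lcore
stride boundary.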
> Then offset wouldn't need an initial value of RTE_MAX_LCORE_VAR.
>
>> +
>> + handle = RTE_PTR_ADD(lcore_buffer, offset);
>> +
>> + offset += size;
>> +
>> + RTE_LCORE_VAR_FOREACH_VALUE(value, handle)
>> + memset(value, 0, size);
>> +
>> + EAL_LOG(DEBUG, "Allocated %"PRIuPTR" bytes of per-lcore data with a "
>> + "%"PRIuPTR"-byte alignment", size, align);
>> +
>> + return handle;
>> +}
>
>
>> +/**
>> + * @file
>> + *
>> + * RTE Per-lcore id variables
>
> Suggest mentioning the short form too, e.g.:
> "RTE Per-lcore id variables (RTE Lcore variables)"
What about just "RTE Lcore variables"?
Exactly what they are is thoroughly described in the text that follows.
>
>> + *
>> + * This API provides a mechanism to create and access per-lcore id
>> + * variables in a space- and cycle-efficient manner.
>> + *
>> + * A per-lcore id variable (or lcore variable for short) has one value
>> + * for each EAL thread and registered non-EAL thread.
>
> And service thread.
Service threads are EAL threads, or, at a bare minimum, must have an
lcore id, and thus must be registered.
>
>> + * There is one
>> + * copy for each current and future lcore id-equipped thread, with the
>
> "one copy" -> "one instance"
>
Fixed.
>> + * total number of copies amounting to @c RTE_MAX_LCORE. The value of
>
> "copies" -> "instances"
>
OK, I'll rephrase that sentence.
>> + * an lcore variable for a particular lcore id is independent from
>> + * other values (for other lcore ids) within the same lcore variable.
>> + *
>> + * In order to access the values of an lcore variable, a handle is
>> + * used. The type of the handle is a pointer to the value's type
>> + * (e.g., for @c uint32_t lcore variable, the handle is a
>> + * <code>uint32_t *</code>. The handler type is used to inform the
>
> Typo: "handler" -> "handle", I think :-/
> Found this typo multiple times; search-replace.
Fixed.
>
>> + * access macros the type of the values. A handle may be passed
>> + * between modules and threads just like any pointer, but its value
>> + * must be treated as a an opaque identifier. An allocated handle
>> + * never has the value NULL.
>> + *
>> + * @b Creation
>> + *
>> + * An lcore variable is created in two steps:
>> + * 1. Define a lcore variable handle by using @ref RTE_LCORE_VAR_HANDLE.
>> + * 2. Allocate lcore variable storage and initialize the handle with
>> + * a unique identifier by @ref RTE_LCORE_VAR_ALLOC or
>> + * @ref RTE_LCORE_VAR_INIT. Allocation generally occurs the time of
>> + * module initialization, but may be done at any time.
>> + *
>> + * An lcore variable is not tied to the owning thread's lifetime. It's
>> + * available for use by any thread immediately after having been
>> + * allocated, and continues to be available throughout the lifetime of
>> + * the EAL.
>> + *
>> + * Lcore variables cannot and need not be freed.
>> + *
>> + * @b Access
>> + *
>> + * The value of any lcore variable for any lcore id may be accessed
>> + * from any thread (including unregistered threads), but it should
>> + * only be *frequently* read from or written to by the owner.
>> + *
>> + * Values of the same lcore variable but owned by to different lcore
>
> Typo: to -> two
>
Fixed.
>> + * ids may be frequently read or written by the owners without risking
>> + * false sharing.
>> + *
>> + * An appropriate synchronization mechanism (e.g., atomic loads and
>> + * stores) should employed to assure there are no data races between
>> + * the owning thread and any non-owner threads accessing the same
>> + * lcore variable instance.
>> + *
>> + * The value of the lcore variable for a particular lcore id is
>> + * accessed using @ref RTE_LCORE_VAR_LCORE_VALUE.
>> + *
>> + * A common pattern is for an EAL thread or a registered non-EAL
>> + * thread to access its own lcore variable value. For this purpose, a
>> + * short-hand exists in the form of @ref RTE_LCORE_VAR_VALUE.
>> + *
>> + * Although the handle (as defined by @ref RTE_LCORE_VAR_HANDLE) is a
>> + * pointer with the same type as the value, it may not be directly
>> + * dereferenced and must be treated as an opaque identifier.
>> + *
>> + * Lcore variable handles and value pointers may be freely passed
>> + * between different threads.
>> + *
>> + * @b Storage
>> + *
>> + * An lcore variable's values may by of a primitive type like @c int,
>
> Two typos: "values may by" -> "value may be"
>
That's not a typo. An lcore variable takes on multiple values, one for
each lcore id. That said, I guess you could refer to the whole thing
(the set of values) as the "value" as well.
>> + * but would more typically be a @c struct.
>> + *
>> + * The lcore variable handle introduces a per-variable (not
>> + * per-value/per-lcore id) overhead of @c sizeof(void *) bytes, so
>> + * there are some memory footprint gains to be made by organizing all
>> + * per-lcore id data for a particular module as one lcore variable
>> + * (e.g., as a struct).
>> + *
>> + * An application may choose to define an lcore variable handle, which
>> + * it then it goes on to never allocate.
>> + *
>> + * The size of a lcore variable's value must be less than the DPDK
>> + * build-time constant @c RTE_MAX_LCORE_VAR.
>> + *
>> + * The lcore variable are stored in a series of lcore buffers, which
>> + * are allocated from the libc heap. Heap allocation failures are
>> + * treated as fatal.
>> + *
>> + * Lcore variables should generally *not* be @ref __rte_cache_aligned
>> + * and need *not* include a @ref RTE_CACHE_GUARD field, since the use
>> + * of these constructs are designed to avoid false sharing. In the
>> + * case of an lcore variable instance, the thread most recently
>> + * accessing nearby data structures should almost-always the lcore
>
> Missing word: should almost-always *be* the lcore variables' owner.
>
Fixed.
>
>> + * variables' owner. Adding padding will increase the effective memory
>> + * working set size, potentially reducing performance.
>> + *
>> + * Lcore variable values take on an initial value of zero.
>> + *
>> + * @b Example
>> + *
>> + * Below is an example of the use of an lcore variable:
>> + *
>> + * @code{.c}
>> + * struct foo_lcore_state {
>> + * int a;
>> + * long b;
>> + * };
>> + *
>> + * static RTE_LCORE_VAR_HANDLE(struct foo_lcore_state, lcore_states);
>> + *
>> + * long foo_get_a_plus_b(void)
>> + * {
>> + * struct foo_lcore_state *state = RTE_LCORE_VAR_VALUE(lcore_states);
>> + *
>> + * return state->a + state->b;
>> + * }
>> + *
>> + * RTE_INIT(rte_foo_init)
>> + * {
>> + * RTE_LCORE_VAR_ALLOC(lcore_states);
>> + *
>> + * struct foo_lcore_state *state;
>> + * RTE_LCORE_VAR_FOREACH_VALUE(state, lcore_states) {
>> + * (initialize 'state')
>
> Consider: (initialize 'state') -> /* initialize 'state' */
>
I think I tried that, and it failed because the compiler didn't like
nested comments.
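
For reference, the underlying issue is that C block comments do not
nest: the first comment terminator ends the whole Doxygen block. One
possible workaround (a sketch only, not necessarily what the next
version will use) is a line comment inside the @code section;
foo_documented() is a made-up name for the example.

```c
#include <assert.h>

/**
 * Example usage, as it might appear in a Doxygen block:
 *
 * @code{.c}
 * RTE_INIT(rte_foo_init)
 * {
 *     // initialize 'state' (a line comment is inert in here)
 * }
 * @endcode
 *
 * Writing the placeholder as a block comment instead would put a
 * comment terminator on that line, closing this documentation comment
 * early and leaving the remaining lines to be parsed as code.
 */
static int
foo_documented(void)
{
	return 42;
}
```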
>> + * }
>> + *
>> + * (other initialization)
>
> Consider: (other initialization) -> /* other initialization */
>
>> + * }
>> + * @endcode
>> + *
>> + *
>> + * @b Alternatives
>> + *
>> + * Lcore variables are designed to replace a pattern exemplified below:
>> + * @code{.c}
>> + * struct __rte_cache_aligned foo_lcore_state {
>> + * int a;
>> + * long b;
>> + * RTE_CACHE_GUARD;
>> + * };
>> + *
>> + * static struct foo_lcore_state lcore_states[RTE_MAX_LCORE];
>> + * @endcode
>> + *
>> + * This scheme is simple and effective, but has one drawback: the data
>> + * is organized so that objects related to all lcores for a particular
>> + * module is kept close in memory. At a bare minimum, this forces the
>> + * use of cache-line alignment to avoid false sharing. With CPU
>
> Consider adding: use of *padding to* cache-line alignment
> My point here is:
> This sentence should somehow include the word "padding".
I'm not sure everyone thinks about __rte_cache_aligned or cache-aligned
heap allocations as "padded."
> This paragraph is not only about cache line alignment, it is primarily about padding.
>
"At a bare minimum, this requires sizing data structures (e.g., using
`__rte_cache_aligned`) to an even number of cache lines to avoid false
sharing."
How about this?
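
To make the sizing effect concrete, here is a small standalone sketch
in plain C11 (not DPDK code). CACHE_LINE, the struct names and
padding_per_element() are made up for illustration, and alignas()
stands in for __rte_cache_aligned. Aligning the struct also rounds its
size up to a multiple of the alignment, which is where the padding
comes from.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

#define CACHE_LINE 64 /* assumed cache line size */

/* The per-lcore state as it would look as an lcore variable. */
struct foo_state {
	int a;
	long b;
};

/* The same state, aligned for use in a per-lcore static array. */
struct foo_state_aligned {
	alignas(CACHE_LINE) int a;
	long b;
};

/* Bytes of padding added to every array element by the alignment. */
static size_t
padding_per_element(void)
{
	return sizeof(struct foo_state_aligned) - sizeof(struct foo_state);
}
```

In an array with one element per lcore id, that padding is paid once
per element, before any RTE_CACHE_GUARD lines are added on top.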
>> + * hardware prefetching and memory loads resulting from speculative
>> + * execution (functions which seemingly are getting more eager faster
>> + * than they are getting more intelligent), one or more "guard" cache
>> + * lines may be required to separate one lcore's data from another's.
>> + *
>> + * Lcore variables has the upside of working with, not against, the
>
> Typo: has -> have
>
Fixed.
>> + * CPU's assumptions and for example next-line prefetchers may well
>> + * work the way its designers intended (i.e., to the benefit, not
>> + * detriment, of system performance).
>> + *
>> + * Another alternative to @ref rte_lcore_var.h is the @ref
>> + * rte_per_lcore.h API, which make use of thread-local storage (TLS,
>
> Typo: make -> makes
>
Fixed.
>> + * e.g., GCC __thread or C11 _Thread_local). The main differences
>> + * between by using the various forms of TLS (e.g., @ref
>> + * RTE_DEFINE_PER_LCORE or _Thread_local) and the use of lcore
>> + * variables are:
>> + *
>> + * * The existence and non-existence of a thread-local variable
>> + * instance follow that of particular thread's. The data cannot be
>
> Typo: "thread's" -> "threads", I think. :-/
>
It's not a typo.
>> + * accessed before the thread has been created, nor after it has
>> + * exited. As a result, thread-local variables must initialized in
>
> Missing word: must *be* initialized
>
Fixed.
>> + * a "lazy" manner (e.g., at the point of thread creation). Lcore
>> + * variables may be accessed immediately after having been
>> + * allocated (which may be prior any thread beyond the main
>> + * thread is running).
>> + * * A thread-local variable is duplicated across all threads in the
>> + * process, including unregistered non-EAL threads (i.e.,
>> + * "regular" threads). For DPDK applications heavily relying on
>> + * multi-threading (in conjunction to DPDK's "one thread per core"
>> + * pattern), either by having many concurrent threads or
>> + * creating/destroying threads at a high rate, an excessive use of
>> + * thread-local variables may cause inefficiencies (e.g.,
>> + * increased thread creation overhead due to thread-local storage
>> + * initialization or increased total RAM footprint usage). Lcore
>> + * variables *only* exist for threads with an lcore id.
>> + * * If data in thread-local storage may be shared between threads
>> + * (i.e., can a pointer to a thread-local variable be passed to
>> + * and successfully dereferenced by non-owning thread) depends on
>> + * the details of the TLS implementation. With GCC __thread and
>> + * GCC _Thread_local, such data sharing is supported. In the C11
>> + * standard, the result of accessing another thread's
>> + * _Thread_local object is implementation-defined. Lcore variable
>> + * instances may be accessed reliably by any thread.
>> + */
>> +
>> +#include <stddef.h>
>> +#include <stdalign.h>
>> +
>> +#include <rte_common.h>
>> +#include <rte_config.h>
>> +#include <rte_lcore.h>
>> +
>> +#ifdef __cplusplus
>> +extern "C" {
>> +#endif
>> +
>> +/**
>> + * Given the lcore variable type, produces the type of the lcore
>> + * variable handle.
>> + */
>> +#define RTE_LCORE_VAR_HANDLE_TYPE(type) \
>> + type *
>> +
>> +/**
>> + * Define a lcore variable handle.
>
> Typo: "a lcore" -> "an lcore"
> Found this typo multiple times; search-replace "a lcore".
>
Yes, fixed.
>> + *
>> + * This macro defines a variable which is used as a handle to access
>> + * the various per-lcore id instances of a per-lcore id variable.
>
> Suggest:
> "the various per-lcore id instances of a per-lcore id variable" ->
> "the various instances of a per-lcore id variable"
>
Sounds good.
>> + *
>> + * The aim with this macro is to make clear at the point of
>> + * declaration that this is an lcore handler, rather than a regular
>> + * pointer.
>> + *
>> + * Add @b static as a prefix in case the lcore variable are only to be
>
> Typo: are -> is
>
Fixed.
>> + * accessed from a particular translation unit.
>> + */
>> +#define RTE_LCORE_VAR_HANDLE(type, name) \
>> + RTE_LCORE_VAR_HANDLE_TYPE(type) name
>> +
>> +/**
>> + * Allocate space for an lcore variable, and initialize its handle.
>> + *
>> + * The values of the lcore variable are initialized to zero.
>
> Consider adding: "the lcore variable *instances* are initialized"
> Found this typo multiple times; search-replace.
>
It's not a typo. "Values" is just short for "instances of the value",
just like "instances" is. Using "instances" everywhere may mislead the
reader into thinking that an instance is both a name and a value, which
is not the case.
I don't know, maybe I should be using "values" everywhere instead of
"instances".
I agree there's some lack of consistency here and potential room for
improvement, but I'm not sure exactly what an improvement would look like.
>> + */
>> +#define RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, align) \
>> + handle = rte_lcore_var_alloc(size, align)
>> +
>> +/**
>> + * Allocate space for an lcore variable, and initialize its handle,
>> + * with values aligned for any type of object.
>> + *
>> + * The values of the lcore variable are initialized to zero.
>> + */
>> +#define RTE_LCORE_VAR_ALLOC_SIZE(handle, size) \
>> + RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, 0)
>> +
>> +/**
>> + * Allocate space for an lcore variable of the size and alignment requirements
>> + * suggested by the handler pointer type, and initialize its handle.
>> + *
>> + * The values of the lcore variable are initialized to zero.
>> + */
>> +#define RTE_LCORE_VAR_ALLOC(handle) \
>> + RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, sizeof(*(handle)), \
>> + alignof(typeof(*(handle))))
>> +
>> +/**
>> + * Allocate an explicitly-sized, explicitly-aligned lcore variable by
>> + * means of a @ref RTE_INIT constructor.
>> + *
>> + * The values of the lcore variable are initialized to zero.
>> + */
>> +#define RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, align) \
>> + RTE_INIT(rte_lcore_var_init_ ## name) \
>> + { \
>> + RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(name, size, align); \
>> + }
>> +
>> +/**
>> + * Allocate an explicitly-sized lcore variable by means of a @ref
>> + * RTE_INIT constructor.
>> + *
>> + * The values of the lcore variable are initialized to zero.
>> + */
>> +#define RTE_LCORE_VAR_INIT_SIZE(name, size) \
>> + RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, 0)
>> +
>> +/**
>> + * Allocate an lcore variable by means of a @ref RTE_INIT constructor.
>> + *
>> + * The values of the lcore variable are initialized to zero.
>> + */
>> +#define RTE_LCORE_VAR_INIT(name) \
>> + RTE_INIT(rte_lcore_var_init_ ## name) \
>> + { \
>> + RTE_LCORE_VAR_ALLOC(name); \
>> + }
>> +
>> +/**
>> + * Get void pointer to lcore variable instance with the specified
>> + * lcore id.
>> + *
>> + * @param lcore_id
>> + * The lcore id specifying which of the @c RTE_MAX_LCORE value
>> + * instances should be accessed. The lcore id need not be valid
>> + * (e.g., may be @ref LCORE_ID_ANY), but in such a case, the pointer
>> + * is also not valid (and thus should not be dereferenced).
>> + * @param handle
>> + * The lcore variable handle.
>> + */
>> +static inline void *
>> +rte_lcore_var_lcore_ptr(unsigned int lcore_id, void *handle)
>> +{
>> + return RTE_PTR_ADD(handle, lcore_id * RTE_MAX_LCORE_VAR);
>> +}
>> +
>> +/**
>> + * Get pointer to lcore variable instance with the specified lcore id.
>> + *
>> + * @param lcore_id
>> + * The lcore id specifying which of the @c RTE_MAX_LCORE value
>> + * instances should be accessed. The lcore id need not be valid
>> + * (e.g., may be @ref LCORE_ID_ANY), but in such a case, the pointer
>> + * is also not valid (and thus should not be dereferenced).
>> + * @param handle
>> + * The lcore variable handle.
>> + */
>> +#define RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle) \
>> + ((typeof(handle))rte_lcore_var_lcore_ptr(lcore_id, handle))
>> +
>> +/**
>> + * Get pointer to lcore variable instance of the current thread.
>> + *
>> + * May only be used by EAL threads and registered non-EAL threads.
>> + */
>> +#define RTE_LCORE_VAR_VALUE(handle) \
>> + RTE_LCORE_VAR_LCORE_VALUE(rte_lcore_id(), handle)
>> +
>> +/**
>> + * Iterate over each lcore id's value for a lcore variable.
>> + *
>> + * @param value
>> + * A pointer set successivly set to point to lcore variable value
>
> "set successivly set" -> "successivly set"
>
> Thinking out loud, ignore at your preference:
> During the RFC discussions, the term used for referring to an lcore variable was discussed;
> we considered "pointer", but settled for "value".
> Perhaps "instance" would be usable in comments like the one describing this function...
> "A pointer set successivly set to point to lcore variable value" ->
> "A pointer set successivly set to point to lcore variable instance".
> I don't know.
>
I also don't know.
>
>> + * corresponding to every lcore id (up to @c RTE_MAX_LCORE).
>> + * @param handle
>> + * The lcore variable handle.
>> + */
>> +#define RTE_LCORE_VAR_FOREACH_VALUE(value, handle) \
>> + for (unsigned int lcore_id = \
>> + (((value) = RTE_LCORE_VAR_LCORE_VALUE(0, handle)), 0); \
>> + lcore_id < RTE_MAX_LCORE; \
>> + lcore_id++, (value) = RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle))
>> +
>> +/**
>> + * Allocate space in the per-lcore id buffers for a lcore variable.
>> + *
>> + * The pointer returned is only an opaque identifer of the variable. To
>> + * get an actual pointer to a particular instance of the variable use
>> + * @ref RTE_LCORE_VAR_VALUE or @ref RTE_LCORE_VAR_LCORE_VALUE.
>> + *
>> + * The lcore variable values' memory is set to zero.
>> + *
>> + * The allocation is always successful, barring a fatal exhaustion of
>> + * the per-lcore id buffer space.
>> + *
>> + * rte_lcore_var_alloc() is not multi-thread safe.
>> + *
>> + * @param size
>> + * The size (in bytes) of the variable's per-lcore id value. Must be > 0.
>> + * @param align
>> + * If 0, the values will be suitably aligned for any kind of type
>> + * (i.e., alignof(max_align_t)). Otherwise, the values will be aligned
>> + * on a multiple of *align*, which must be a power of 2 and equal or
>> + * less than @c RTE_CACHE_LINE_SIZE.
>> + * @return
>> + * The id of the variable, stored in a void pointer value. The value
>
> "id" -> "handle"
>
Fixed.
>> + * is always non-NULL.
>> + */
>> +__rte_experimental
>> +void *
>> +rte_lcore_var_alloc(size_t size, size_t align);
>> +
>> +#ifdef __cplusplus
>> +}
>> +#endif
>> +
>> +#endif /* _RTE_LCORE_VAR_H_ */
>> diff --git a/lib/eal/version.map b/lib/eal/version.map
>> index e3ff412683..5f5a3522c0 100644
>> --- a/lib/eal/version.map
>> +++ b/lib/eal/version.map
>> @@ -396,6 +396,9 @@ EXPERIMENTAL {
>>
>> # added in 24.03
>> rte_vfio_get_device_info; # WINDOWS_NO_EXPORT
>> +
>> + rte_lcore_var_alloc;
>> + rte_lcore_var;
>
> No such function: rte_lcore_var
Indeed. That variable is gone. Fixed.
Thanks a lot for your review, Morten.
>
>> };
>>
>> INTERNAL {
>> --
>> 2.34.1
>
* RE: [PATCH 1/6] eal: add static per-lcore memory allocation facility
2024-09-10 10:44 ` Mattias Rönnblom
@ 2024-09-10 13:07 ` Morten Brørup
2024-09-10 15:55 ` Stephen Hemminger
1 sibling, 0 replies; 323+ messages in thread
From: Morten Brørup @ 2024-09-10 13:07 UTC (permalink / raw)
To: Mattias Rönnblom, Mattias Rönnblom, dev
Cc: Stephen Hemminger, Konstantin Ananyev, David Marchand
> From: Mattias Rönnblom [mailto:hofors@lysator.liu.se]
> Sent: Tuesday, 10 September 2024 12.45
>
> On 2024-09-10 11:32, Morten Brørup wrote:
> >> From: Mattias Rönnblom [mailto:mattias.ronnblom@ericsson.com]
> >> Sent: Tuesday, 10 September 2024 09.04
> >>
> >> Introduce DPDK per-lcore id variables, or lcore variables for short.
> >
> > Throughout the descriptions and comments,
> > please replace "lcore id" with "lcore" (e.g. "per-lcore variables"),
> > when referring to the lcore, and not the index of the lcore.
> > (Your intention might be to highlight that it only covers threads with an lcore id,
> > but if that wasn't the case, you would refer to them as "threads" not "lcores".)
> > Except, of course, when referring to an actual lcore id, e.g. lcore_id function parameters.
>
> "lcore" is just another word for "EAL thread." The lcore variables exist
> in one instance for every thread with an lcore id, thus also for
> registered non-EAL threads (i.e., threads which are not lcores).
>
> I've tried to summarize the (very confusing) terminology of DPDK's
> threading model here:
> https://ericsson.github.io/dataplanebook/threading/threading.html#eal-threads
>
> So, in my world, "per-lcore id variables" is pretty accurate. You could
> say "variables with per-lcore id values" if you want to make it even
> more clear, what's going on.
With your reference terminology in mind, "per-lcore id variables" is OK with me.
<rant>
DPDK's lcore terminology has drifted quite far away from its original 1:1 meaning, but I'm not going to try to clean it up.
It also seems the meaning of "socket" is drifting.
And the DPDK project's API/ABI compatibility ambitions seem to favor bolting new features onto the pile, rather than replacing APIs that have grown misleading with new APIs serving new requirements.
</rant>
>
> >
> > Paraphrasing:
> > Consider the type of what you are referring to;
> > use "lcore" if its type is "thread", and
> > use "lcore id" if its type is "int".
> >
> > I might be wrong here, but please think hard about it.
> >
> >>
> >> An lcore variable has one value for every current and future lcore
> >> id-equipped thread.
> >>
> >> The primary <rte_lcore_var.h> use case is for statically allocating
> >> small, frequently-accessed data structures, for which one instance
> >> should exist for each lcore.
> >>
> >> Lcore variables are similar to thread-local storage (TLS, e.g., C11
> >> _Thread_local), but decoupling the values' life time with that of the
> >> threads.
> >>
> >> Lcore variables are also similar in terms of functionality provided by
> >> FreeBSD kernel's DPCPU_*() family of macros and the associated
> >> build-time machinery. DPCPU uses linker scripts, which effectively
> >> prevents the reuse of its, otherwise seemingly viable, approach.
> >>
> >> The currently-prevailing way to solve the same problem as lcore
> >> variables is to keep a module's per-lcore data as RTE_MAX_LCORE-sized
> >> array of cache-aligned, RTE_CACHE_GUARDed structs. The benefit of
> >> lcore variables over this approach is that data related to the same
> >> lcore now is close (spatially, in memory), rather than data used by
> >> the same module, which in turn avoid excessive use of padding,
> >> polluting caches with unused data.
> >>
> >> Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> >> Acked-by: Morten Brørup <mb@smartsharesystems.com>
> >>
> >> --
> >
> >> +++ b/doc/api/doxy-api-index.md
> >> @@ -99,6 +99,7 @@ The public API headers are grouped by topics:
> >> [interrupts](@ref rte_interrupts.h),
> >> [launch](@ref rte_launch.h),
> >> [lcore](@ref rte_lcore.h),
> >> + [lcore-varible](@ref rte_lcore_var.h),
> >
> > Typo: varible -> variable
> >
> >
>
> I'll change it to "lcore variables" (no dash, plural).
+1
>
> >> +++ b/doc/guides/rel_notes/release_24_11.rst
> >> @@ -55,6 +55,20 @@ New Features
> >> Also, make sure to start the actual text at the margin.
> >> =======================================================
> >>
> >> +* **Added EAL per-lcore static memory allocation facility.**
> >> +
> >> + Added EAL API <rte_lcore_var.h> for statically allocating small,
> >> + frequently-accessed data structures, for which one instance should
> >> + exist for each lcore.
> >> +
> >> + With lcore variables, data is organized spatially on a per-lcore
> >> + basis, rather than per library or PMD, avoiding the need for cache
> >> + aligning (or RTE_CACHE_GUARDing) data structures, which in turn
> >> + reduces CPU cache internal fragmentation, improving performance.
> >> +
> >> + Lcore variables are similar to thread-local storage (TLS, e.g.,
> >> + C11 _Thread_local), but decoupling the values' life time from that
> >> + of the threads.
> >
> > When referring to TLS, you might want to clarify that lcore variables are not instantiated for unregistered threads.
> >
>
> Isn't that clear from the first paragraph? Although it should say "per
> lcore id", rather than "per lcore."
Yes, almost.
But in this paragraph, when you mention that they are similar to TLS, someone might not catch that it still applies (that they only are instantiated for lcores and not all threads). So clarify one extra time, just to ensure that everyone gets it.
>
> >
> >> +static void *lcore_buffer;
> >> +static size_t offset = RTE_MAX_LCORE_VAR;
> >> +
> >> +static void *
> >> +lcore_var_alloc(size_t size, size_t align)
> >> +{
> >> + void *handle;
> >> + void *value;
> >> +
> >> + offset = RTE_ALIGN_CEIL(offset, align);
> >> +
> >> + if (offset + size > RTE_MAX_LCORE_VAR) {
> >> + lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE,
> >> + LCORE_BUFFER_SIZE);
> >> + RTE_VERIFY(lcore_buffer != NULL);
> >> +
> >> + offset = 0;
> >> + }
> >
> > To determine if the lcore_buffer memory should be allocated, why not just check if lcore_buffer == NULL?
>
> Because it may be the case, lcore_buffer is non-NULL but the remaining
> space is too small to service the allocation.
There's no error handling of that case. You simply forget about the allocated memory, and behave like initial allocation/initialization.
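To make the point concrete, here is a minimal stand-alone sketch of the allocation scheme quoted above. It is not the patch itself: sketch_alloc(), SKETCH_MAX_LCORE, SKETCH_MAX_LCORE_VAR and the buffers_allocated counter are names made up for illustration, plain calloc() replaces aligned_alloc(), and align is assumed to be a non-zero power of two. It shows that the "offset + size" check covers both the initial call and the "too little room left" case, and that in the latter case the old buffer simply stops being tracked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for RTE_MAX_LCORE and RTE_MAX_LCORE_VAR. */
#define SKETCH_MAX_LCORE 4
#define SKETCH_MAX_LCORE_VAR 64

static char *buffer;                          /* current buffer, one slice per lcore id */
static size_t offset = SKETCH_MAX_LCORE_VAR;  /* starts "full", forcing the first allocation */
static unsigned int buffers_allocated;        /* illustration only */

/* Returns a handle; align is assumed to be a non-zero power of two. */
static void *
sketch_alloc(size_t size, size_t align)
{
	assert(size > 0 && size <= SKETCH_MAX_LCORE_VAR);
	assert(align != 0 && (align & (align - 1)) == 0);

	/* Round the offset up to the requested alignment (relative to buffer start). */
	offset = (offset + align - 1) & ~(align - 1);

	if (offset + size > SKETCH_MAX_LCORE_VAR) {
		/* Taken on the first call and whenever the remaining space is too
		 * small. The old buffer stays alive, since earlier handles point
		 * into it, but the allocator no longer references it. */
		buffer = calloc(SKETCH_MAX_LCORE, SKETCH_MAX_LCORE_VAR);
		assert(buffer != NULL);
		buffers_allocated++;
		offset = 0;
	}

	void *handle = buffer + offset;
	offset += size;

	return handle;
}
```

With these made-up sizes, a 40-byte and a 16-byte allocation share the first buffer, while a further 16-byte allocation does not fit and triggers a second buffer. A plain "buffer == NULL" test would miss that second case.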
>
> > Then offset wouldn't need an initial value of RTE_MAX_LCORE_VAR.
> >
> >> +
> >> + handle = RTE_PTR_ADD(lcore_buffer, offset);
> >> +
> >> + offset += size;
> >> +
> >> + RTE_LCORE_VAR_FOREACH_VALUE(value, handle)
> >> + memset(value, 0, size);
> >> +
> >> + EAL_LOG(DEBUG, "Allocated %"PRIuPTR" bytes of per-lcore data with a "
> >> + "%"PRIuPTR"-byte alignment", size, align);
> >> +
> >> + return handle;
> >> +}
> >
> >
> >> +/**
> >> + * @file
> >> + *
> >> + * RTE Per-lcore id variables
> >
> > Suggest mentioning the short form too, e.g.:
> > "RTE Per-lcore id variables (RTE Lcore variables)"
>
> What about just "RTE Lcore variables"?
+1
>
> Exactly what they are is thoroughly described in the text that follows.
>
> >
> >> + *
> >> + * This API provides a mechanism to create and access per-lcore id
> >> + * variables in a space- and cycle-efficient manner.
> >> + *
> >> + * A per-lcore id variable (or lcore variable for short) has one value
> >> + * for each EAL thread and registered non-EAL thread.
> >
> > And service thread.
>
> Service threads are EAL threads, or, at a bare minimum, must have a
> lcore id, and thus must be registered.
Service threads have an lcore id, yes, but they have rte_lcore_role_t enum value ROLE_SERVICE, which differs from that of EAL threads (ROLE_RTE). Registered non-EAL threads have yet another role, ROLE_NON_EAL.
>
> >
> >> + * There is one
> >> + * copy for each current and future lcore id-equipped thread, with the
> >
> > "one copy" -> "one instance"
> >
>
> Fixed.
>
> >> + * total number of copies amounting to @c RTE_MAX_LCORE. The value of
> >
> > "copies" -> "instances"
> >
>
> OK, I'll rephrase that sentence.
>
> >> + * an lcore variable for a particular lcore id is independent from
> >> + * other values (for other lcore ids) within the same lcore variable.
> >> + *
> >> + * In order to access the values of an lcore variable, a handle is
> >> + * used. The type of the handle is a pointer to the value's type
> >> + * (e.g., for @c uint32_t lcore variable, the handle is a
> >> + * <code>uint32_t *</code>. The handler type is used to inform the
> >
> > Typo: "handler" -> "handle", I think :-/
> > Found this typo multiple times; search-replace.
>
> Fixed.
>
> >
> >> + * access macros the type of the values. A handle may be passed
> >> + * between modules and threads just like any pointer, but its value
> >> + * must be treated as a an opaque identifier. An allocated handle
> >> + * never has the value NULL.
> >> + *
> >> + * @b Creation
> >> + *
> >> + * An lcore variable is created in two steps:
> >> + * 1. Define a lcore variable handle by using @ref RTE_LCORE_VAR_HANDLE.
> >> + * 2. Allocate lcore variable storage and initialize the handle with
> >> + * a unique identifier by @ref RTE_LCORE_VAR_ALLOC or
> >> + * @ref RTE_LCORE_VAR_INIT. Allocation generally occurs the time of
> >> + * module initialization, but may be done at any time.
> >> + *
> >> + * An lcore variable is not tied to the owning thread's lifetime. It's
> >> + * available for use by any thread immediately after having been
> >> + * allocated, and continues to be available throughout the lifetime of
> >> + * the EAL.
> >> + *
> >> + * Lcore variables cannot and need not be freed.
> >> + *
> >> + * @b Access
> >> + *
> >> + * The value of any lcore variable for any lcore id may be accessed
> >> + * from any thread (including unregistered threads), but it should
> >> + * only be *frequently* read from or written to by the owner.
> >> + *
> >> + * Values of the same lcore variable but owned by to different lcore
> >
> > Typo: to -> two
> >
>
> Fixed.
>
> >> + * ids may be frequently read or written by the owners without risking
> >> + * false sharing.
> >> + *
> >> + * An appropriate synchronization mechanism (e.g., atomic loads and
> >> + * stores) should employed to assure there are no data races between
> >> + * the owning thread and any non-owner threads accessing the same
> >> + * lcore variable instance.
> >> + *
> >> + * The value of the lcore variable for a particular lcore id is
> >> + * accessed using @ref RTE_LCORE_VAR_LCORE_VALUE.
> >> + *
> >> + * A common pattern is for an EAL thread or a registered non-EAL
> >> + * thread to access its own lcore variable value. For this purpose, a
> >> + * short-hand exists in the form of @ref RTE_LCORE_VAR_VALUE.
> >> + *
> >> + * Although the handle (as defined by @ref RTE_LCORE_VAR_HANDLE) is a
> >> + * pointer with the same type as the value, it may not be directly
> >> + * dereferenced and must be treated as an opaque identifier.
> >> + *
> >> + * Lcore variable handles and value pointers may be freely passed
> >> + * between different threads.
> >> + *
> >> + * @b Storage
> >> + *
> >> + * An lcore variable's values may by of a primitive type like @c int,
> >
> > Two typos: "values may by" -> "value may be"
> >
>
> That's not a typo. An lcore variable take on multiple values, one for
> each lcore id. That said, I guess you could refer to the whole thing
> (the set of values) as the "value" as well.
OK. Reading it the way you explain, I get it. No typo.
>
> >> + * but would more typically be a @c struct.
> >> + *
> >> + * The lcore variable handle introduces a per-variable (not
> >> + * per-value/per-lcore id) overhead of @c sizeof(void *) bytes, so
> >> + * there are some memory footprint gains to be made by organizing all
> >> + * per-lcore id data for a particular module as one lcore variable
> >> + * (e.g., as a struct).
> >> + *
> >> + * An application may choose to define an lcore variable handle, which
> >> + * it then it goes on to never allocate.
> >> + *
> >> + * The size of a lcore variable's value must be less than the DPDK
> >> + * build-time constant @c RTE_MAX_LCORE_VAR.
> >> + *
> >> + * The lcore variable are stored in a series of lcore buffers, which
> >> + * are allocated from the libc heap. Heap allocation failures are
> >> + * treated as fatal.
> >> + *
> >> + * Lcore variables should generally *not* be @ref __rte_cache_aligned
> >> + * and need *not* include a @ref RTE_CACHE_GUARD field, since the use
> >> + * of these constructs are designed to avoid false sharing. In the
> >> + * case of an lcore variable instance, the thread most recently
> >> + * accessing nearby data structures should almost-always the lcore
> >
> > Missing word: should almost-always *be* the lcore variables' owner.
> >
>
> Fixed.
>
> >
> >> + * variables' owner. Adding padding will increase the effective memory
> >> + * working set size, potentially reducing performance.
> >> + *
> >> + * Lcore variable values take on an initial value of zero.
> >> + *
> >> + * @b Example
> >> + *
> >> + * Below is an example of the use of an lcore variable:
> >> + *
> >> + * @code{.c}
> >> + * struct foo_lcore_state {
> >> + * int a;
> >> + * long b;
> >> + * };
> >> + *
> >> + * static RTE_LCORE_VAR_HANDLE(struct foo_lcore_state, lcore_states);
> >> + *
> >> + * long foo_get_a_plus_b(void)
> >> + * {
> >> + * struct foo_lcore_state *state = RTE_LCORE_VAR_VALUE(lcore_states);
> >> + *
> >> + * return state->a + state->b;
> >> + * }
> >> + *
> >> + * RTE_INIT(rte_foo_init)
> >> + * {
> >> + * RTE_LCORE_VAR_ALLOC(lcore_states);
> >> + *
> >> + * struct foo_lcore_state *state;
> >> + * RTE_LCORE_VAR_FOREACH_VALUE(state, lcore_states) {
> >> + * (initialize 'state')
> >
> > Consider: (initialize 'state') -> /* initialize 'state' */
> >
>
> I think I tried that, and it failed because the compiler didn't like
> nested comments.
OK, no objections. Just leave it as is.
>
> >> + * }
> >> + *
> >> + * (other initialization)
> >
> > Consider: (other initialization) -> /* other initialization */
> >
> >> + * }
> >> + * @endcode
> >> + *
> >> + *
> >> + * @b Alternatives
> >> + *
> >> + * Lcore variables are designed to replace a pattern exemplified below:
> >> + * @code{.c}
> >> + * struct __rte_cache_aligned foo_lcore_state {
> >> + * int a;
> >> + * long b;
> >> + * RTE_CACHE_GUARD;
> >> + * };
> >> + *
> >> + * static struct foo_lcore_state lcore_states[RTE_MAX_LCORE];
> >> + * @endcode
> >> + *
> >> + * This scheme is simple and effective, but has one drawback: the data
> >> + * is organized so that objects related to all lcores for a particular
> >> + * module is kept close in memory. At a bare minimum, this forces the
> >> + * use of cache-line alignment to avoid false sharing. With CPU
> >
> > Consider adding: use of *padding to* cache-line alignment
> > My point here is:
> > This sentence should somehow include the word "padding".
>
> I'm not sure everyone thinks about __rte_cache_aligned or cache-aligned
> heap allocations as "padded."
>
> > This paragraph is not only about cache line alignment, it is primarily about padding.
> >
>
> "At a bare minimum, this requires sizing data structures (e.g., using
> `__rte_cache_aligned`) to an even number of cache lines to avoid false
> sharing."
>
> How about this?
OK. Sizing might imply padding, so it serves the point I was targeting.
But "even number" -> "whole number". The number might be odd. :-)
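To illustrate the "whole number" remark, a minimal sketch in plain C11. It assumes a 64-byte cache line; SKETCH_CACHE_LINE, the two structs and sketch_cache_lines() are hypothetical names, and alignas() stands in for __rte_cache_aligned. Aligning a struct to the cache line makes the compiler pad sizeof up to a whole number of lines, which may well be odd.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Assumed cache line size; a stand-in for RTE_CACHE_LINE_SIZE. */
#define SKETCH_CACHE_LINE 64

/* Small state, cache-line aligned: sizeof is padded up to one full line. */
struct small_state {
	alignas(SKETCH_CACHE_LINE) int a;
	long b;
};

/* Larger state: 130 bytes of payload are padded up to three lines (an odd count). */
struct big_state {
	alignas(SKETCH_CACHE_LINE) char payload[130];
};

/* Number of cache lines covered by an object of the given size. */
static size_t
sketch_cache_lines(size_t size)
{
	assert(size > 0);
	return (size + SKETCH_CACHE_LINE - 1) / SKETCH_CACHE_LINE;
}
```

Under these assumptions, small_state occupies one line although its fields need far less, and big_state occupies three lines. The difference between the payload and the padded size is exactly the padding this paragraph is about.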
>
> >> + * hardware prefetching and memory loads resulting from speculative
> >> + * execution (functions which seemingly are getting more eager faster
> >> + * than they are getting more intelligent), one or more "guard" cache
> >> + * lines may be required to separate one lcore's data from another's.
> >> + *
> >> + * Lcore variables has the upside of working with, not against, the
> >
> > Typo: has -> have
> >
>
> Fixed.
>
> >> + * CPU's assumptions and for example next-line prefetchers may well
> >> + * work the way its designers intended (i.e., to the benefit, not
> >> + * detriment, of system performance).
> >> + *
> >> + * Another alternative to @ref rte_lcore_var.h is the @ref
> >> + * rte_per_lcore.h API, which make use of thread-local storage (TLS,
> >
> > Typo: make -> makes
> >
>
> Fixed.
>
> >> + * e.g., GCC __thread or C11 _Thread_local). The main differences
> >> + * between by using the various forms of TLS (e.g., @ref
> >> + * RTE_DEFINE_PER_LCORE or _Thread_local) and the use of lcore
> >> + * variables are:
> >> + *
> >> + * * The existence and non-existence of a thread-local variable
> >> + * instance follow that of particular thread's. The data cannot be
> >
> > Typo: "thread's" -> "threads", I think. :-/
> >
>
> It's not a typo.
OK.
>
> >> + * accessed before the thread has been created, nor after it has
> >> + * exited. As a result, thread-local variables must initialized in
> >
> > Missing word: must *be* initialized
> >
>
> Fixed.
>
> >> + * a "lazy" manner (e.g., at the point of thread creation). Lcore
> >> + * variables may be accessed immediately after having been
> >> + * allocated (which may be prior any thread beyond the main
> >> + * thread is running).
> >> + * * A thread-local variable is duplicated across all threads in the
> >> + * process, including unregistered non-EAL threads (i.e.,
> >> + * "regular" threads). For DPDK applications heavily relying on
> >> + * multi-threading (in conjunction to DPDK's "one thread per core"
> >> + * pattern), either by having many concurrent threads or
> >> + * creating/destroying threads at a high rate, an excessive use of
> >> + * thread-local variables may cause inefficiencies (e.g.,
> >> + * increased thread creation overhead due to thread-local storage
> >> + * initialization or increased total RAM footprint usage). Lcore
> >> + * variables *only* exist for threads with an lcore id.
> >> + * * If data in thread-local storage may be shared between threads
> >> + * (i.e., can a pointer to a thread-local variable be passed to
> >> + * and successfully dereferenced by non-owning thread) depends on
> >> + * the details of the TLS implementation. With GCC __thread and
> >> + * GCC _Thread_local, such data sharing is supported. In the C11
> >> + * standard, the result of accessing another thread's
> >> + * _Thread_local object is implementation-defined. Lcore variable
> >> + * instances may be accessed reliably by any thread.
> >> + */
> >> +
> >> +#include <stddef.h>
> >> +#include <stdalign.h>
> >> +
> >> +#include <rte_common.h>
> >> +#include <rte_config.h>
> >> +#include <rte_lcore.h>
> >> +
> >> +#ifdef __cplusplus
> >> +extern "C" {
> >> +#endif
> >> +
> >> +/**
> >> + * Given the lcore variable type, produces the type of the lcore
> >> + * variable handle.
> >> + */
> >> +#define RTE_LCORE_VAR_HANDLE_TYPE(type) \
> >> + type *
> >> +
> >> +/**
> >> + * Define a lcore variable handle.
> >
> > Typo: "a lcore" -> "an lcore"
> > Found this typo multiple times; search-replace "a lcore".
> >
>
> Yes, fixed.
>
> >> + *
> >> + * This macro defines a variable which is used as a handle to access
> >> + * the various per-lcore id instances of a per-lcore id variable.
> >
> > Suggest:
> > "the various per-lcore id instances of a per-lcore id variable" ->
> > "the various instances of a per-lcore id variable"
> >
>
> Sounds good.
>
> >> + *
> >> + * The aim with this macro is to make clear at the point of
> >> + * declaration that this is an lcore handler, rather than a regular
> >> + * pointer.
> >> + *
> >> + * Add @b static as a prefix in case the lcore variable are only to be
> >
> > Typo: are -> is
> >
>
> Fixed.
>
> >> + * accessed from a particular translation unit.
> >> + */
> >> +#define RTE_LCORE_VAR_HANDLE(type, name) \
> >> + RTE_LCORE_VAR_HANDLE_TYPE(type) name
> >> +
> >> +/**
> >> + * Allocate space for an lcore variable, and initialize its handle.
> >> + *
> >> + * The values of the lcore variable are initialized to zero.
> >
> > Consider adding: "the lcore variable *instances* are initialized"
> > Found this typo multiple times; search-replace.
> >
>
> It's not a typo. "Values" is just short for "instances of the value",
> just like "instances" is. Using "instances" everywhere may mislead the
> reader into thinking that an instance is both a name and a value, which
> is not the case.
> I don't know, maybe I should be using "values" everywhere instead of
> "instances".
>
> I agree there's some lack of consistency here and potential room for
> improvement, but I'm not sure exactly what an improvement would look like.
Yes, perhaps using "values" (instead of "instances of the value") everywhere,
and avoiding "instances", might be better.
If you repeat/paraphrase your above explanation in the documentation and/or source code, it should cover it.
>
> >> + */
> >> +#define RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, align) \
> >> + handle = rte_lcore_var_alloc(size, align)
> >> +
> >> +/**
> >> + * Allocate space for an lcore variable, and initialize its handle,
> >> + * with values aligned for any type of object.
> >> + *
> >> + * The values of the lcore variable are initialized to zero.
> >> + */
> >> +#define RTE_LCORE_VAR_ALLOC_SIZE(handle, size) \
> >> + RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, 0)
> >> +
> >> +/**
> >> + * Allocate space for an lcore variable of the size and alignment requirements
> >> + * suggested by the handler pointer type, and initialize its handle.
> >> + *
> >> + * The values of the lcore variable are initialized to zero.
> >> + */
> >> +#define RTE_LCORE_VAR_ALLOC(handle) \
> >> + RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, sizeof(*(handle)), \
> >> + alignof(typeof(*(handle))))
> >> +
> >> +/**
> >> + * Allocate an explicitly-sized, explicitly-aligned lcore variable by
> >> + * means of a @ref RTE_INIT constructor.
> >> + *
> >> + * The values of the lcore variable are initialized to zero.
> >> + */
> >> +#define RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, align) \
> >> + RTE_INIT(rte_lcore_var_init_ ## name) \
> >> + { \
> >> + RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(name, size, align); \
> >> + }
> >> +
> >> +/**
> >> + * Allocate an explicitly-sized lcore variable by means of a @ref
> >> + * RTE_INIT constructor.
> >> + *
> >> + * The values of the lcore variable are initialized to zero.
> >> + */
> >> +#define RTE_LCORE_VAR_INIT_SIZE(name, size) \
> >> + RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, 0)
> >> +
> >> +/**
> >> + * Allocate an lcore variable by means of a @ref RTE_INIT constructor.
> >> + *
> >> + * The values of the lcore variable are initialized to zero.
> >> + */
> >> +#define RTE_LCORE_VAR_INIT(name) \
> >> + RTE_INIT(rte_lcore_var_init_ ## name) \
> >> + { \
> >> + RTE_LCORE_VAR_ALLOC(name); \
> >> + }
> >> +
> >> +/**
> >> + * Get void pointer to lcore variable instance with the specified
> >> + * lcore id.
> >> + *
> >> + * @param lcore_id
> >> + * The lcore id specifying which of the @c RTE_MAX_LCORE value
> >> + * instances should be accessed. The lcore id need not be valid
> >> + * (e.g., may be @ref LCORE_ID_ANY), but in such a case, the pointer
> >> + * is also not valid (and thus should not be dereferenced).
> >> + * @param handle
> >> + * The lcore variable handle.
> >> + */
> >> +static inline void *
> >> +rte_lcore_var_lcore_ptr(unsigned int lcore_id, void *handle)
> >> +{
> >> + return RTE_PTR_ADD(handle, lcore_id * RTE_MAX_LCORE_VAR);
> >> +}
> >> +
> >> +/**
> >> + * Get pointer to lcore variable instance with the specified lcore id.
> >> + *
> >> + * @param lcore_id
> >> + * The lcore id specifying which of the @c RTE_MAX_LCORE value
> >> + * instances should be accessed. The lcore id need not be valid
> >> + * (e.g., may be @ref LCORE_ID_ANY), but in such a case, the pointer
> >> + * is also not valid (and thus should not be dereferenced).
> >> + * @param handle
> >> + * The lcore variable handle.
> >> + */
> >> +#define RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle) \
> >> + ((typeof(handle))rte_lcore_var_lcore_ptr(lcore_id, handle))
> >> +
> >> +/**
> >> + * Get pointer to lcore variable instance of the current thread.
> >> + *
> >> + * May only be used by EAL threads and registered non-EAL threads.
> >> + */
> >> +#define RTE_LCORE_VAR_VALUE(handle) \
> >> + RTE_LCORE_VAR_LCORE_VALUE(rte_lcore_id(), handle)
> >> +
> >> +/**
> >> + * Iterate over each lcore id's value for a lcore variable.
> >> + *
> >> + * @param value
> >> + * A pointer set successivly set to point to lcore variable value
> >
> > "set successivly set" -> "successivly set"
Don't forget.
> >
> > Thinking out loud, ignore at your preference:
> > During the RFC discussions, the term used for referring to an lcore
> variable was discussed;
> > we considered "pointer", but settled for "value".
> > Perhaps "instance" would be usable in comments like the one
> describing this function...
> > "A pointer set successivly set to point to lcore variable value" ->
> > "A pointer set successivly set to point to lcore variable instance".
> > I don't know.
> >
>
> I also don't know.
Referring to the terminology above, if you go for "value" rather than "instance" (or "instance of the value"), stick with "value" here too.
>
> >
> >> + * corresponding to every lcore id (up to @c RTE_MAX_LCORE).
> >> + * @param handle
> >> + * The lcore variable handle.
> >> + */
> >> +#define RTE_LCORE_VAR_FOREACH_VALUE(value, handle) \
> >> + for (unsigned int lcore_id = \
> >> + (((value) = RTE_LCORE_VAR_LCORE_VALUE(0, handle)), 0); \
> >> + lcore_id < RTE_MAX_LCORE; \
> >> + lcore_id++, (value) = RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle))
> >> +
> >> +/**
> >> + * Allocate space in the per-lcore id buffers for a lcore variable.
> >> + *
> >> + * The pointer returned is only an opaque identifier of the variable.
> To
> >> + * get an actual pointer to a particular instance of the variable
> use
> >> + * @ref RTE_LCORE_VAR_VALUE or @ref RTE_LCORE_VAR_LCORE_VALUE.
> >> + *
> >> + * The lcore variable values' memory is set to zero.
> >> + *
> >> + * The allocation is always successful, barring a fatal exhaustion
> of
> >> + * the per-lcore id buffer space.
> >> + *
> >> + * rte_lcore_var_alloc() is not multi-thread safe.
> >> + *
> >> + * @param size
> >> + * The size (in bytes) of the variable's per-lcore id value. Must
> be > 0.
> >> + * @param align
> >> + * If 0, the values will be suitably aligned for any kind of type
> >> + * (i.e., alignof(max_align_t)). Otherwise, the values will be
> aligned
> >> + * on a multiple of *align*, which must be a power of 2 and equal
> or
> >> + * less than @c RTE_CACHE_LINE_SIZE.
> >> + * @return
> >> + * The id of the variable, stored in a void pointer value. The
> value
> >
> > "id" -> "handle"
> >
>
> Fixed.
>
> >> + * is always non-NULL.
> >> + */
> >> +__rte_experimental
> >> +void *
> >> +rte_lcore_var_alloc(size_t size, size_t align);
> >> +
> >> +#ifdef __cplusplus
> >> +}
> >> +#endif
> >> +
> >> +#endif /* _RTE_LCORE_VAR_H_ */
> >> diff --git a/lib/eal/version.map b/lib/eal/version.map
> >> index e3ff412683..5f5a3522c0 100644
> >> --- a/lib/eal/version.map
> >> +++ b/lib/eal/version.map
> >> @@ -396,6 +396,9 @@ EXPERIMENTAL {
> >>
> >> # added in 24.03
> >> rte_vfio_get_device_info; # WINDOWS_NO_EXPORT
> >> +
> >> + rte_lcore_var_alloc;
> >> + rte_lcore_var;
> >
> > No such function: rte_lcore_var
>
> Indeed. That variable is gone. Fixed.
>
> Thanks a lot for your review, Morten.
Thanks a lot for your contribution, Mattias. :-)
>
> >
> >> };
> >>
> >> INTERNAL {
> >> --
> >> 2.34.1
> >
^ permalink raw reply [flat|nested] 323+ messages in thread
* Re: [PATCH 1/6] eal: add static per-lcore memory allocation facility
2024-09-10 10:44 ` Mattias Rönnblom
2024-09-10 13:07 ` Morten Brørup
@ 2024-09-10 15:55 ` Stephen Hemminger
1 sibling, 0 replies; 323+ messages in thread
From: Stephen Hemminger @ 2024-09-10 15:55 UTC (permalink / raw)
To: Mattias Rönnblom
Cc: Morten Brørup, Mattias Rönnblom, dev,
Konstantin Ananyev, David Marchand, Nandini Persad
On Tue, 10 Sep 2024 12:44:49 +0200
Mattias Rönnblom <hofors@lysator.liu.se> wrote:
> "lcore" is just another word for "EAL thread." The lcore variables exist
> in one instance for every thread with an lcore id, thus also for
> registered non-EAL threads (i.e., threads which are not lcores).
>
> I've tried to summarize the (very confusing) terminology of DPDK's
> threading model here:
> https://ericsson.github.io/dataplanebook/threading/threading.html#eal-threads
>
> So, in my world, "per-lcore id variables" is pretty accurate. You could
> say "variables with per-lcore id values" if you want to make it even
> more clear, what's going on.
This is good and should be in DPDK documentation along with references
to other Intel/Arm documentation.
I don't see a glossary section in current documentation.
The issue goes deeper: there is no clear introduction in the current DPDK documentation.
My suggestion would be something similar to Fd.io VPP and other projects:
About DPDK
- Introduction
- Glossary
- Supported platforms
- Release notes
- FAQ
Getting started
- Getting started on Linux
...
- Sample Applications
Developer documentation
- Programmer’s Guide
- HowTo Guides
- DPDK Tools User Guides
- Testpmd Application User Guide
- Drivers
- Network Interface
- Baseband
...
^ permalink raw reply [flat|nested] 323+ messages in thread
* RE: [PATCH 1/6] eal: add static per-lcore memory allocation facility
2024-09-10 7:03 ` [PATCH 1/6] eal: add static per-lcore memory allocation facility Mattias Rönnblom
2024-09-10 9:32 ` Morten Brørup
@ 2024-09-11 10:32 ` Morten Brørup
2024-09-11 15:05 ` Mattias Rönnblom
2024-09-11 17:04 ` [PATCH v2 0/6] Lcore variables Mattias Rönnblom
2 siblings, 1 reply; 323+ messages in thread
From: Morten Brørup @ 2024-09-11 10:32 UTC (permalink / raw)
To: Mattias Rönnblom, dev
Cc: hofors, Stephen Hemminger, Konstantin Ananyev, David Marchand,
Tyler Retzlaff
> +static void *lcore_buffer;
[...]
> + lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE,
> + LCORE_BUFFER_SIZE);
Since lcore_buffer is never freed again, it is easy to support Windows:
#ifdef RTE_EXEC_ENV_WINDOWS
#include <malloc.h>
#endif
#ifndef RTE_EXEC_ENV_WINDOWS
lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE,
LCORE_BUFFER_SIZE);
#else
/* Never freed again, so don't worry about _aligned_free(). */
lcore_buffer = _aligned_malloc(LCORE_BUFFER_SIZE,
RTE_CACHE_LINE_SIZE);
#endif
Ref:
https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/aligned-malloc?view=msvc-170
NB: Note the opposite parameter order.
^ permalink raw reply [flat|nested] 323+ messages in thread
* Re: [PATCH 1/6] eal: add static per-lcore memory allocation facility
2024-09-11 10:32 ` Morten Brørup
@ 2024-09-11 15:05 ` Mattias Rönnblom
2024-09-11 15:07 ` Morten Brørup
0 siblings, 1 reply; 323+ messages in thread
From: Mattias Rönnblom @ 2024-09-11 15:05 UTC (permalink / raw)
To: Morten Brørup, Mattias Rönnblom, dev
Cc: Stephen Hemminger, Konstantin Ananyev, David Marchand, Tyler Retzlaff
On 2024-09-11 12:32, Morten Brørup wrote:
>> +static void *lcore_buffer;
> [...]
>> + lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE,
>> + LCORE_BUFFER_SIZE);
>
> Since lcore_buffer is never freed again, it is easy to support Windows:
>
> #ifdef RTE_EXEC_ENV_WINDOWS
> #include <malloc.h>
> #endif
>
> #ifndef RTE_EXEC_ENV_WINDOWS
> lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE,
> LCORE_BUFFER_SIZE);
> #else
> /* Never freed again, so don't worry about _aligned_free(). */
What is the reason for this comment? It seems like it addresses the
Windows code path in particular.
> lcore_buffer = _aligned_malloc(LCORE_BUFFER_SIZE,
> RTE_CACHE_LINE_SIZE);
> #endif
>
> Ref:
> https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/aligned-malloc?view=msvc-170
>
> NB: Note the opposite parameter order.
>
Thanks. I will add something like this.
^ permalink raw reply [flat|nested] 323+ messages in thread
* RE: [PATCH 1/6] eal: add static per-lcore memory allocation facility
2024-09-11 15:05 ` Mattias Rönnblom
@ 2024-09-11 15:07 ` Morten Brørup
0 siblings, 0 replies; 323+ messages in thread
From: Morten Brørup @ 2024-09-11 15:07 UTC (permalink / raw)
To: Mattias Rönnblom, Mattias Rönnblom, dev
Cc: Stephen Hemminger, Konstantin Ananyev, David Marchand, Tyler Retzlaff
> From: Mattias Rönnblom [mailto:hofors@lysator.liu.se]
> Sent: Wednesday, 11 September 2024 17.05
>
> On 2024-09-11 12:32, Morten Brørup wrote:
> >> +static void *lcore_buffer;
> > [...]
> >> + lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE,
> >> + LCORE_BUFFER_SIZE);
> >
> > Since lcore_buffer is never freed again, it is easy to support
> Windows:
> >
> > #ifdef RTE_EXEC_ENV_WINDOWS
> > #include <malloc.h>
> > #endif
> >
> > #ifndef RTE_EXEC_ENV_WINDOWS
> > lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE,
> > LCORE_BUFFER_SIZE);
> > #else
> > /* Never freed again, so don't worry about _aligned_free(). */
>
> What is the reason for this comment? It seems like it addresses the
> Windows code path in particular.
It is Windows specific.
Memory allocated with _aligned_malloc() cannot be freed with free(); it needs to be freed with _aligned_free().
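
To illustrate the pairing described above, here is a minimal sketch of an allocate/free wrapper pair. It is not part of the patch: the names sketch_aligned_alloc() and sketch_aligned_free() are hypothetical, and the sketch tests the compiler-provided _WIN32 macro instead of RTE_EXEC_ENV_WINDOWS so that it builds outside a DPDK tree. It also rounds the size up to a multiple of the alignment, which C11 aligned_alloc() may require.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#ifdef _WIN32
#include <malloc.h>
#endif

/* Hypothetical helper: allocate 'size' bytes aligned on 'align' bytes.
 * The argument order follows C11 aligned_alloc(); note that
 * _aligned_malloc() takes (size, alignment) instead.
 */
static void *
sketch_aligned_alloc(size_t align, size_t size)
{
	/* Both APIs require a power-of-2 alignment. */
	assert(align != 0 && (align & (align - 1)) == 0);

	/* Round the size up to a multiple of the alignment. */
	size = (size + align - 1) & ~(align - 1);

#ifdef _WIN32
	return _aligned_malloc(size, align);
#else
	return aligned_alloc(align, size);
#endif
}

/* Hypothetical helper: release memory from sketch_aligned_alloc().
 * On Windows, free() must not be used on _aligned_malloc() memory.
 */
static void
sketch_aligned_free(void *ptr)
{
#ifdef _WIN32
	_aligned_free(ptr);
#else
	free(ptr);
#endif
}
```

Since lcore_buffer is never freed, the patch itself would only need the allocation half; the free half is shown only to make the Windows constraint concrete.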
>
> > lcore_buffer = _aligned_malloc(LCORE_BUFFER_SIZE,
> > RTE_CACHE_LINE_SIZE);
> > #endif
> >
> > Ref:
> > https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/aligned-malloc?view=msvc-170
> >
> > NB: Note the opposite parameter order.
> >
>
> Thanks. I will add something like this.
^ permalink raw reply [flat|nested] 323+ messages in thread
* [PATCH v2 0/6] Lcore variables
2024-09-10 7:03 ` [PATCH 1/6] eal: add static per-lcore memory allocation facility Mattias Rönnblom
2024-09-10 9:32 ` Morten Brørup
2024-09-11 10:32 ` Morten Brørup
@ 2024-09-11 17:04 ` Mattias Rönnblom
2024-09-11 17:04 ` [PATCH v2 1/6] eal: add static per-lcore memory allocation facility Mattias Rönnblom
` (5 more replies)
2 siblings, 6 replies; 323+ messages in thread
From: Mattias Rönnblom @ 2024-09-11 17:04 UTC (permalink / raw)
To: dev
Cc: hofors, Morten Brørup, Stephen Hemminger,
Konstantin Ananyev, David Marchand, Mattias Rönnblom
This patch set introduces a new API <rte_lcore_var.h> for static
per-lcore id data allocation.
Please refer to the <rte_lcore_var.h> API documentation for both a
rationale for this new API, and a comparison to the alternatives
available.
The adoption of this API would affect many different DPDK modules, but
the author updated only a few, mostly to serve as examples in this
RFC, and to iron out some, but surely not all, wrinkles in the API.
The question on how to best allocate static per-lcore memory has been
up several times on the dev mailing list, for example in the thread on
"random: use per lcore state" RFC by Stephen Hemminger.
Lcore variables are surely not the answer to all your per-lcore-data
needs, since it only allows for more-or-less static allocation. In the
author's opinion, it does however provide a reasonably simple and
clean and seemingly very much performant solution to a real problem.
Mattias Rönnblom (6):
eal: add static per-lcore memory allocation facility
eal: add lcore variable test suite
random: keep PRNG state in lcore variable
power: keep per-lcore state in lcore variable
service: keep per-lcore state in lcore variable
eal: keep per-lcore power intrinsics state in lcore variable
MAINTAINERS | 6 +
app/test/meson.build | 1 +
app/test/test_lcore_var.c | 432 +++++++++++++++++++++++++
config/rte_config.h | 1 +
doc/api/doxy-api-index.md | 1 +
doc/guides/rel_notes/release_24_11.rst | 14 +
lib/eal/common/eal_common_lcore_var.c | 78 +++++
lib/eal/common/meson.build | 1 +
lib/eal/common/rte_random.c | 28 +-
lib/eal/common/rte_service.c | 115 ++++---
lib/eal/include/meson.build | 1 +
lib/eal/include/rte_lcore_var.h | 385 ++++++++++++++++++++++
lib/eal/version.map | 2 +
lib/eal/x86/rte_power_intrinsics.c | 17 +-
lib/power/rte_power_pmd_mgmt.c | 34 +-
15 files changed, 1029 insertions(+), 87 deletions(-)
create mode 100644 app/test/test_lcore_var.c
create mode 100644 lib/eal/common/eal_common_lcore_var.c
create mode 100644 lib/eal/include/rte_lcore_var.h
--
2.34.1
^ permalink raw reply [flat|nested] 323+ messages in thread
* [PATCH v2 1/6] eal: add static per-lcore memory allocation facility
2024-09-11 17:04 ` [PATCH v2 0/6] Lcore variables Mattias Rönnblom
@ 2024-09-11 17:04 ` Mattias Rönnblom
2024-09-12 2:33 ` fengchengwen
` (2 more replies)
2024-09-11 17:04 ` [PATCH v2 2/6] eal: add lcore variable test suite Mattias Rönnblom
` (4 subsequent siblings)
5 siblings, 3 replies; 323+ messages in thread
From: Mattias Rönnblom @ 2024-09-11 17:04 UTC (permalink / raw)
To: dev
Cc: hofors, Morten Brørup, Stephen Hemminger,
Konstantin Ananyev, David Marchand, Mattias Rönnblom
Introduce DPDK per-lcore id variables, or lcore variables for short.
An lcore variable has one value for every current and future lcore
id-equipped thread.
The primary <rte_lcore_var.h> use case is for statically allocating
small, frequently-accessed data structures, for which one instance
should exist for each lcore.
Lcore variables are similar to thread-local storage (TLS, e.g., C11
_Thread_local), but decouple the values' lifetime from that of the
threads.
Lcore variables are also similar in terms of functionality to that
provided by the FreeBSD kernel's DPCPU_*() family of macros and the
associated build-time machinery. DPCPU uses linker scripts, which
effectively prevents the reuse of its otherwise seemingly viable
approach.
The currently-prevailing way to solve the same problem as lcore
variables is to keep a module's per-lcore data as RTE_MAX_LCORE-sized
array of cache-aligned, RTE_CACHE_GUARDed structs. The benefit of
lcore variables over this approach is that data related to the same
lcore now is close (spatially, in memory), rather than data used by
the same module, which in turn avoids excessive use of padding that
pollutes caches with unused data.
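
The layout difference can be modelled outside DPDK. The sketch below is illustrative only: the MODEL_* constants are small stand-ins for RTE_MAX_LCORE, RTE_MAX_LCORE_VAR and RTE_CACHE_LINE_SIZE, the buffer is static rather than heap-allocated, and model_handle() is a hypothetical helper that places 8-byte variables back to back. Only the address arithmetic in model_value() mirrors rte_lcore_var_lcore_ptr() from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_LCORE 4	/* stand-in for RTE_MAX_LCORE */
#define MODEL_STRIDE 256	/* stand-in for RTE_MAX_LCORE_VAR */
#define MODEL_CACHE_LINE 64	/* assumed cache line size */

/* Conventional pattern: every per-lcore element is padded to a full
 * cache line, and one module's elements sit next to each other.
 */
struct padded_counter {
	_Alignas(MODEL_CACHE_LINE) uint64_t count;
};

static struct padded_counter padded_counters[MODEL_MAX_LCORE];

/* Lcore-variable-style storage: one contiguous block per lcore id. */
static _Alignas(MODEL_CACHE_LINE) unsigned char
model_buffer[MODEL_MAX_LCORE * MODEL_STRIDE];

/* Hypothetical helper: handle of the index-th 8-byte variable, which
 * is the address of its value for lcore id 0.
 */
static void *
model_handle(size_t index)
{
	return model_buffer + index * sizeof(uint64_t);
}

/* Same arithmetic as rte_lcore_var_lcore_ptr(). */
static void *
model_value(unsigned int lcore_id, void *handle)
{
	assert(lcore_id < MODEL_MAX_LCORE);
	return (unsigned char *)handle + (size_t)lcore_id * MODEL_STRIDE;
}

/* Distance in bytes between two addresses. */
static ptrdiff_t
model_distance(const void *from, const void *to)
{
	return (const unsigned char *)to - (const unsigned char *)from;
}
```

In this model, two modules' values for the same lcore id end up 8 bytes apart, one module's values for neighboring lcore ids are a full MODEL_STRIDE apart, and the conventional array spends a whole cache line on each 8-byte counter.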
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
--
PATCH v2:
* Add Windows support. (Morten Brørup)
* Fix lcore variables API index reference. (Morten Brørup)
* Various improvements of the API documentation. (Morten Brørup)
* Elimination of unused symbol in version.map. (Morten Brørup)
PATCH:
* Update MAINTAINERS and release notes.
* Stop covering included files in extern "C" {}.
RFC v6:
* Include <stdlib.h> to get aligned_alloc().
* Tweak documentation (grammar).
* Provide API-level guarantees that lcore variable values take on an
initial value of zero.
* Fix misplaced __rte_cache_aligned in the API doc example.
RFC v5:
* In Doxygen, consistently use @<cmd> (and not \<cmd>).
* The RTE_LCORE_VAR_GET() and SET() convenience access macros
covered an uncommon use case, where the lcore value is of a
primitive type, rather than a struct, and are thus eliminated
from the API. (Morten Brørup)
* In the wake of the GET()/SET() removal, rename RTE_LCORE_VAR_PTR()
to RTE_LCORE_VAR_VALUE().
* The underscores are removed from __rte_lcore_var_lcore_ptr() to
signal that this function is a part of the public API.
* Macro arguments are documented.
RFC v4:
* Replace large static array with libc heap-allocated memory. One
implication of this change is there no longer exists a fixed upper
bound for the total amount of memory used by lcore variables.
RTE_MAX_LCORE_VAR has changed meaning, and now represents the
maximum size of any individual lcore variable value.
* Fix issues in example. (Morten Brørup)
* Improve access macro type checking. (Morten Brørup)
* Refer to the lcore variable handle as "handle" and not "name" in
various macros.
* Document lack of thread safety in rte_lcore_var_alloc().
* Provide API-level assurance the lcore variable handle is
always non-NULL, to allow applications to use NULL to mean
"not yet allocated".
* Note zero-sized allocations are not allowed.
* Give API-level guarantee the lcore variable values are zeroed.
RFC v3:
* Replace use of GCC-specific alignof(<expression>) with alignof(<type>).
* Update example to reflect FOREACH macro name change (in RFC v2).
RFC v2:
* Use alignof to derive alignment requirements. (Morten Brørup)
* Change name of FOREACH to make it distinct from <rte_lcore.h>'s
*per-EAL-thread* RTE_LCORE_FOREACH(). (Morten Brørup)
* Allow user-specified alignment, but limit max to cache line size.
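
For readers skimming the changelog, the allocation scheme that results from the RFC v4 and RFC v2 changes can be sketched as a small standalone model. This is an approximation under stated assumptions, not the patch code: the MODEL_* constants are small stand-ins for the RTE_* build-time constants, model_alloc() and model_value() are hypothetical names, and C11 aligned_alloc() is assumed to be available (i.e., a non-Windows libc).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_MAX_LCORE 4	/* stand-in for RTE_MAX_LCORE */
#define MODEL_MAX_LCORE_VAR 128	/* stand-in for RTE_MAX_LCORE_VAR */
#define MODEL_CACHE_LINE 64	/* stand-in for RTE_CACHE_LINE_SIZE */

static unsigned char *model_buffer;
/* Starting at the maximum forces a buffer allocation on first use. */
static size_t model_offset = MODEL_MAX_LCORE_VAR;

/* Address of the value belonging to 'lcore_id'. */
static void *
model_value(unsigned int lcore_id, void *handle)
{
	return (unsigned char *)handle +
		(size_t)lcore_id * MODEL_MAX_LCORE_VAR;
}

/* Bump allocator: returns the handle, i.e., lcore id 0's value. */
static void *
model_alloc(size_t size, size_t align)
{
	void *handle;
	unsigned int lcore_id;

	assert(size > 0 && size <= MODEL_MAX_LCORE_VAR);
	assert(align != 0 && (align & (align - 1)) == 0);
	assert(align <= MODEL_CACHE_LINE);

	/* Round the offset up to the requested alignment. */
	model_offset = (model_offset + align - 1) & ~(align - 1);

	if (model_offset + size > MODEL_MAX_LCORE_VAR) {
		/* Current buffer exhausted: start a new one. The old
		 * buffer stays allocated, since values are never freed.
		 */
		model_buffer = aligned_alloc(MODEL_CACHE_LINE,
			(size_t)MODEL_MAX_LCORE * MODEL_MAX_LCORE_VAR);
		if (model_buffer == NULL)
			abort();
		model_offset = 0;
	}

	handle = model_buffer + model_offset;
	model_offset += size;

	/* Zero the value of every lcore id. */
	for (lcore_id = 0; lcore_id < MODEL_MAX_LCORE; lcore_id++)
		memset(model_value(lcore_id, handle), 0, size);

	return handle;
}
```

With these model constants, a 100-byte allocation followed by a 20-byte one shares a buffer, while a third 16-byte allocation no longer fits and lands at the start of a fresh buffer. That is why there is no fixed upper bound on total memory, only on the size of a single value.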
---
MAINTAINERS | 6 +
config/rte_config.h | 1 +
doc/api/doxy-api-index.md | 1 +
doc/guides/rel_notes/release_24_11.rst | 14 +
lib/eal/common/eal_common_lcore_var.c | 78 +++++
lib/eal/common/meson.build | 1 +
lib/eal/include/meson.build | 1 +
lib/eal/include/rte_lcore_var.h | 385 +++++++++++++++++++++++++
lib/eal/version.map | 2 +
9 files changed, 489 insertions(+)
create mode 100644 lib/eal/common/eal_common_lcore_var.c
create mode 100644 lib/eal/include/rte_lcore_var.h
diff --git a/MAINTAINERS b/MAINTAINERS
index c5a703b5c0..362d9a3f28 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -282,6 +282,12 @@ F: lib/eal/include/rte_random.h
F: lib/eal/common/rte_random.c
F: app/test/test_rand_perf.c
+Lcore Variables
+M: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
+F: lib/eal/include/rte_lcore_var.h
+F: lib/eal/common/eal_common_lcore_var.c
+F: app/test/test_lcore_var.c
+
ARM v7
M: Wathsala Vithanage <wathsala.vithanage@arm.com>
F: config/arm/
diff --git a/config/rte_config.h b/config/rte_config.h
index dd7bb0d35b..311692e498 100644
--- a/config/rte_config.h
+++ b/config/rte_config.h
@@ -41,6 +41,7 @@
/* EAL defines */
#define RTE_CACHE_GUARD_LINES 1
#define RTE_MAX_HEAPS 32
+#define RTE_MAX_LCORE_VAR 1048576
#define RTE_MAX_MEMSEG_LISTS 128
#define RTE_MAX_MEMSEG_PER_LIST 8192
#define RTE_MAX_MEM_MB_PER_LIST 32768
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index f9f0300126..ed577f14ee 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -99,6 +99,7 @@ The public API headers are grouped by topics:
[interrupts](@ref rte_interrupts.h),
[launch](@ref rte_launch.h),
[lcore](@ref rte_lcore.h),
+ [lcore variables](@ref rte_lcore_var.h),
[per-lcore](@ref rte_per_lcore.h),
[service cores](@ref rte_service.h),
[keepalive](@ref rte_keepalive.h),
diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
index 0ff70d9057..a3884f7491 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -55,6 +55,20 @@ New Features
Also, make sure to start the actual text at the margin.
=======================================================
+* **Added EAL per-lcore static memory allocation facility.**
+
+ Added EAL API <rte_lcore_var.h> for statically allocating small,
+ frequently-accessed data structures, for which one instance should
+ exist for each EAL thread and registered non-EAL thread.
+
+ With lcore variables, data is organized spatially on a per-lcore id
+ basis, rather than per library or PMD, avoiding the need for cache
+ aligning (or RTE_CACHE_GUARDing) data structures, which in turn
+ reduces CPU cache internal fragmentation, improving performance.
+
+ Lcore variables are similar to thread-local storage (TLS, e.g.,
+ C11 _Thread_local), but decouple the values' lifetime from that
+ of the threads.
Removed Items
-------------
diff --git a/lib/eal/common/eal_common_lcore_var.c b/lib/eal/common/eal_common_lcore_var.c
new file mode 100644
index 0000000000..309822039b
--- /dev/null
+++ b/lib/eal/common/eal_common_lcore_var.c
@@ -0,0 +1,78 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Ericsson AB
+ */
+
+#include <inttypes.h>
+#include <stdlib.h>
+
+#ifdef RTE_EXEC_ENV_WINDOWS
+#include <malloc.h>
+#endif
+
+#include <rte_common.h>
+#include <rte_debug.h>
+#include <rte_log.h>
+
+#include <rte_lcore_var.h>
+
+#include "eal_private.h"
+
+#define LCORE_BUFFER_SIZE (RTE_MAX_LCORE_VAR * RTE_MAX_LCORE)
+
+static void *lcore_buffer;
+static size_t offset = RTE_MAX_LCORE_VAR;
+
+static void *
+lcore_var_alloc(size_t size, size_t align)
+{
+ void *handle;
+ void *value;
+
+ offset = RTE_ALIGN_CEIL(offset, align);
+
+ if (offset + size > RTE_MAX_LCORE_VAR) {
+#ifdef RTE_EXEC_ENV_WINDOWS
+ lcore_buffer = _aligned_malloc(LCORE_BUFFER_SIZE,
+ RTE_CACHE_LINE_SIZE);
+#else
+ lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE,
+ LCORE_BUFFER_SIZE);
+#endif
+ RTE_VERIFY(lcore_buffer != NULL);
+
+ offset = 0;
+ }
+
+ handle = RTE_PTR_ADD(lcore_buffer, offset);
+
+ offset += size;
+
+ RTE_LCORE_VAR_FOREACH_VALUE(value, handle)
+ memset(value, 0, size);
+
+ EAL_LOG(DEBUG, "Allocated %"PRIuPTR" bytes of per-lcore data with a "
+ "%"PRIuPTR"-byte alignment", size, align);
+
+ return handle;
+}
+
+void *
+rte_lcore_var_alloc(size_t size, size_t align)
+{
+ /* Having the per-lcore buffer size aligned on cache lines, as
+ * well as having the base pointer aligned on cache line size,
+ * assures that aligned offsets also translate to aligned
+ * pointers across all values.
+ */
+ RTE_BUILD_BUG_ON(RTE_MAX_LCORE_VAR % RTE_CACHE_LINE_SIZE != 0);
+ RTE_ASSERT(align <= RTE_CACHE_LINE_SIZE);
+ RTE_ASSERT(size <= RTE_MAX_LCORE_VAR);
+
+ /* '0' means asking for worst-case alignment requirements */
+ if (align == 0)
+ align = alignof(max_align_t);
+
+ RTE_ASSERT(rte_is_power_of_2(align));
+
+ return lcore_var_alloc(size, align);
+}
diff --git a/lib/eal/common/meson.build b/lib/eal/common/meson.build
index 22a626ba6f..d41403680b 100644
--- a/lib/eal/common/meson.build
+++ b/lib/eal/common/meson.build
@@ -18,6 +18,7 @@ sources += files(
'eal_common_interrupts.c',
'eal_common_launch.c',
'eal_common_lcore.c',
+ 'eal_common_lcore_var.c',
'eal_common_mcfg.c',
'eal_common_memalloc.c',
'eal_common_memory.c',
diff --git a/lib/eal/include/meson.build b/lib/eal/include/meson.build
index e94b056d46..9449253e23 100644
--- a/lib/eal/include/meson.build
+++ b/lib/eal/include/meson.build
@@ -27,6 +27,7 @@ headers += files(
'rte_keepalive.h',
'rte_launch.h',
'rte_lcore.h',
+ 'rte_lcore_var.h',
'rte_lock_annotations.h',
'rte_malloc.h',
'rte_mcslock.h',
diff --git a/lib/eal/include/rte_lcore_var.h b/lib/eal/include/rte_lcore_var.h
new file mode 100644
index 0000000000..ec3ab714a8
--- /dev/null
+++ b/lib/eal/include/rte_lcore_var.h
@@ -0,0 +1,385 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Ericsson AB
+ */
+
+#ifndef _RTE_LCORE_VAR_H_
+#define _RTE_LCORE_VAR_H_
+
+/**
+ * @file
+ *
+ * RTE Lcore variables
+ *
+ * This API provides a mechanism to create and access per-lcore id
+ * variables in a space- and cycle-efficient manner.
+ *
+ * A per-lcore id variable (or lcore variable for short) has one value
+ * for each EAL thread and registered non-EAL thread. There is one
+ * instance for each current and future lcore id-equipped thread, with
+ * a total of RTE_MAX_LCORE instances. The value of an lcore variable
+ * for a particular lcore id is independent from other values (for
+ * other lcore ids) within the same lcore variable.
+ *
+ * In order to access the values of an lcore variable, a handle is
+ * used. The type of the handle is a pointer to the value's type
+ * (e.g., for a @c uint32_t lcore variable, the handle is a
+ * <code>uint32_t *</code>). The handle type is used to inform the
+ * access macros of the type of the values. A handle may be passed
+ * between modules and threads just like any pointer, but its value
+ * must be treated as an opaque identifier. An allocated handle
+ * never has the value NULL.
+ *
+ * @b Creation
+ *
+ * An lcore variable is created in two steps:
+ * 1. Define an lcore variable handle by using @ref RTE_LCORE_VAR_HANDLE.
+ * 2. Allocate lcore variable storage and initialize the handle with
+ * a unique identifier by @ref RTE_LCORE_VAR_ALLOC or
+ * @ref RTE_LCORE_VAR_INIT. Allocation generally occurs at the time of
+ * module initialization, but may be done at any time.
+ *
+ * An lcore variable is not tied to the owning thread's lifetime. It's
+ * available for use by any thread immediately after having been
+ * allocated, and continues to be available throughout the lifetime of
+ * the EAL.
+ *
+ * Lcore variables cannot and need not be freed.
+ *
+ * @b Access
+ *
+ * The value of any lcore variable for any lcore id may be accessed
+ * from any thread (including unregistered threads), but it should
+ * only be *frequently* read from or written to by the owner.
+ *
+ * Values of the same lcore variable but owned by two different lcore
+ * ids may be frequently read or written by the owners without risking
+ * false sharing.
+ *
+ * An appropriate synchronization mechanism (e.g., atomic loads and
+ * stores) should be employed to assure there are no data races between
+ * the owning thread and any non-owner threads accessing the same
+ * lcore variable instance.
+ *
+ * The value of the lcore variable for a particular lcore id is
+ * accessed using @ref RTE_LCORE_VAR_LCORE_VALUE.
+ *
+ * A common pattern is for an EAL thread or a registered non-EAL
+ * thread to access its own lcore variable value. For this purpose, a
+ * short-hand exists in the form of @ref RTE_LCORE_VAR_VALUE.
+ *
+ * Although the handle (as defined by @ref RTE_LCORE_VAR_HANDLE) is a
+ * pointer with the same type as the value, it may not be directly
+ * dereferenced and must be treated as an opaque identifier.
+ *
+ * Lcore variable handles and value pointers may be freely passed
+ * between different threads.
+ *
+ * @b Storage
+ *
+ * An lcore variable's values may be of a primitive type like @c int,
+ * but would more typically be a @c struct.
+ *
+ * The lcore variable handle introduces a per-variable (not
+ * per-value/per-lcore id) overhead of @c sizeof(void *) bytes, so
+ * there are some memory footprint gains to be made by organizing all
+ * per-lcore id data for a particular module as one lcore variable
+ * (e.g., as a struct).
+ *
+ * An application may choose to define an lcore variable handle, which
+ * it then goes on to never allocate.
+ *
+ * The size of an lcore variable's value must not exceed the DPDK
+ * build-time constant @c RTE_MAX_LCORE_VAR.
+ *
+ * Lcore variables are stored in a series of lcore buffers, which
+ * are allocated from the libc heap. Heap allocation failures are
+ * treated as fatal.
+ *
+ * Lcore variables should generally *not* be @ref __rte_cache_aligned
+ * and need *not* include a @ref RTE_CACHE_GUARD field, since these
+ * constructs are designed to avoid false sharing. In the
+ * case of an lcore variable instance, the thread most recently
+ * accessing nearby data structures should almost-always be the lcore
+ * variables' owner. Adding padding will increase the effective memory
+ * working set size, potentially reducing performance.
+ *
+ * Lcore variable values take on an initial value of zero.
+ *
+ * @b Example
+ *
+ * Below is an example of the use of an lcore variable:
+ *
+ * @code{.c}
+ * struct foo_lcore_state {
+ * int a;
+ * long b;
+ * };
+ *
+ * static RTE_LCORE_VAR_HANDLE(struct foo_lcore_state, lcore_states);
+ *
+ * long foo_get_a_plus_b(void)
+ * {
+ * struct foo_lcore_state *state = RTE_LCORE_VAR_VALUE(lcore_states);
+ *
+ * return state->a + state->b;
+ * }
+ *
+ * RTE_INIT(rte_foo_init)
+ * {
+ * RTE_LCORE_VAR_ALLOC(lcore_states);
+ *
+ * struct foo_lcore_state *state;
+ * RTE_LCORE_VAR_FOREACH_VALUE(state, lcore_states) {
+ * (initialize 'state')
+ * }
+ *
+ * (other initialization)
+ * }
+ * @endcode
+ *
+ *
+ * @b Alternatives
+ *
+ * Lcore variables are designed to replace a pattern exemplified below:
+ * @code{.c}
+ * struct __rte_cache_aligned foo_lcore_state {
+ * int a;
+ * long b;
+ * RTE_CACHE_GUARD;
+ * };
+ *
+ * static struct foo_lcore_state lcore_states[RTE_MAX_LCORE];
+ * @endcode
+ *
+ * This scheme is simple and effective, but has one drawback: the data
+ * is organized so that objects related to all lcores for a particular
+ * module are kept close in memory. At a bare minimum, this requires
+ * sizing data structures (e.g., using `__rte_cache_aligned`) to an
+ * even number of cache lines to avoid false sharing. With CPU
+ * hardware prefetching and memory loads resulting from speculative
+ * execution (functions which seemingly are getting more eager faster
+ * than they are getting more intelligent), one or more "guard" cache
+ * lines may be required to separate one lcore's data from another's.
+ *
+ * Lcore variables have the upside of working with, not against, the
+ * CPU's assumptions; for example, next-line prefetchers may well
+ * work the way their designers intended (i.e., to the benefit, not
+ * detriment, of system performance).
+ *
+ * Another alternative to @ref rte_lcore_var.h is the @ref
+ * rte_per_lcore.h API, which makes use of thread-local storage (TLS,
+ * e.g., GCC __thread or C11 _Thread_local). The main differences
+ * between using the various forms of TLS (e.g., @ref
+ * RTE_DEFINE_PER_LCORE or _Thread_local) and the use of lcore
+ * variables are:
+ *
+ * * The existence and non-existence of a thread-local variable
+ * instance follow that of a particular thread. The data cannot be
+ * accessed before the thread has been created, nor after it has
+ * exited. As a result, thread-local variables must be initialized in
+ * a "lazy" manner (e.g., at the point of thread creation). Lcore
+ * variables may be accessed immediately after having been
+ * allocated (which may be prior to any thread beyond the main
+ * thread is running).
+ * * A thread-local variable is duplicated across all threads in the
+ * process, including unregistered non-EAL threads (i.e.,
+ * "regular" threads). For DPDK applications heavily relying on
+ * multi-threading (in conjunction with DPDK's "one thread per core"
+ * pattern), either by having many concurrent threads or
+ * creating/destroying threads at a high rate, an excessive use of
+ * thread-local variables may cause inefficiencies (e.g.,
+ * increased thread creation overhead due to thread-local storage
+ * initialization or increased total RAM footprint usage). Lcore
+ * variables *only* exist for threads with an lcore id.
+ * * Whether data in thread-local storage may be shared between threads
+ * (i.e., whether a pointer to a thread-local variable can be passed to
+ * and successfully dereferenced by a non-owning thread) depends on
+ * the details of the TLS implementation. With GCC __thread and
+ * GCC _Thread_local, such data sharing is supported. In the C11
+ * standard, the result of accessing another thread's
+ * _Thread_local object is implementation-defined. Lcore variable
+ * instances may be accessed reliably by any thread.
+ */
+
+#include <stddef.h>
+#include <stdalign.h>
+
+#include <rte_common.h>
+#include <rte_config.h>
+#include <rte_lcore.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Given the lcore variable type, produces the type of the lcore
+ * variable handle.
+ */
+#define RTE_LCORE_VAR_HANDLE_TYPE(type) \
+ type *
+
+/**
+ * Define an lcore variable handle.
+ *
+ * This macro defines a variable which is used as a handle to access
+ * the various instances of a per-lcore id variable.
+ *
+ * The aim with this macro is to make clear at the point of
+ * declaration that this is an lcore handle, rather than a regular
+ * pointer.
+ *
+ * Add @b static as a prefix in case the lcore variable is only to be
+ * accessed from a particular translation unit.
+ */
+#define RTE_LCORE_VAR_HANDLE(type, name) \
+ RTE_LCORE_VAR_HANDLE_TYPE(type) name
+
+/**
+ * Allocate space for an lcore variable, and initialize its handle.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, align) \
+ handle = rte_lcore_var_alloc(size, align)
+
+/**
+ * Allocate space for an lcore variable, and initialize its handle,
+ * with values aligned for any type of object.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_ALLOC_SIZE(handle, size) \
+ RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, 0)
+
+/**
+ * Allocate space for an lcore variable of the size and alignment requirements
+ * suggested by the handle pointer type, and initialize its handle.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_ALLOC(handle) \
+ RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, sizeof(*(handle)), \
+ alignof(typeof(*(handle))))
+
+/**
+ * Allocate an explicitly-sized, explicitly-aligned lcore variable by
+ * means of a @ref RTE_INIT constructor.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, align) \
+ RTE_INIT(rte_lcore_var_init_ ## name) \
+ { \
+ RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(name, size, align); \
+ }
+
+/**
+ * Allocate an explicitly-sized lcore variable by means of a @ref
+ * RTE_INIT constructor.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_INIT_SIZE(name, size) \
+ RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, 0)
+
+/**
+ * Allocate an lcore variable by means of a @ref RTE_INIT constructor.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_INIT(name) \
+ RTE_INIT(rte_lcore_var_init_ ## name) \
+ { \
+ RTE_LCORE_VAR_ALLOC(name); \
+ }
+
+/**
+ * Get void pointer to lcore variable instance with the specified
+ * lcore id.
+ *
+ * @param lcore_id
+ * The lcore id specifying which of the @c RTE_MAX_LCORE value
+ * instances should be accessed. The lcore id need not be valid
+ * (e.g., may be @ref LCORE_ID_ANY), but in such a case, the pointer
+ * is also not valid (and thus should not be dereferenced).
+ * @param handle
+ * The lcore variable handle.
+ */
+static inline void *
+rte_lcore_var_lcore_ptr(unsigned int lcore_id, void *handle)
+{
+ return RTE_PTR_ADD(handle, lcore_id * RTE_MAX_LCORE_VAR);
+}
+
+/**
+ * Get pointer to lcore variable instance with the specified lcore id.
+ *
+ * @param lcore_id
+ * The lcore id specifying which of the @c RTE_MAX_LCORE value
+ * instances should be accessed. The lcore id need not be valid
+ * (e.g., may be @ref LCORE_ID_ANY), but in such a case, the pointer
+ * is also not valid (and thus should not be dereferenced).
+ * @param handle
+ * The lcore variable handle.
+ */
+#define RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle) \
+ ((typeof(handle))rte_lcore_var_lcore_ptr(lcore_id, handle))
+
+/**
+ * Get pointer to lcore variable instance of the current thread.
+ *
+ * May only be used by EAL threads and registered non-EAL threads.
+ */
+#define RTE_LCORE_VAR_VALUE(handle) \
+ RTE_LCORE_VAR_LCORE_VALUE(rte_lcore_id(), handle)
+
+/**
+ * Iterate over each lcore id's value for an lcore variable.
+ *
+ * @param value
+ * A pointer successively set to point to lcore variable value
+ * corresponding to every lcore id (up to @c RTE_MAX_LCORE).
+ * @param handle
+ * The lcore variable handle.
+ */
+#define RTE_LCORE_VAR_FOREACH_VALUE(value, handle) \
+ for (unsigned int lcore_id = \
+ (((value) = RTE_LCORE_VAR_LCORE_VALUE(0, handle)), 0); \
+ lcore_id < RTE_MAX_LCORE; \
+ lcore_id++, (value) = RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle))
+
+/**
+ * Allocate space in the per-lcore id buffers for an lcore variable.
+ *
+ * The pointer returned is only an opaque identifier of the variable. To
+ * get an actual pointer to a particular instance of the variable use
+ * @ref RTE_LCORE_VAR_VALUE or @ref RTE_LCORE_VAR_LCORE_VALUE.
+ *
+ * The lcore variable values' memory is set to zero.
+ *
+ * The allocation is always successful, barring a fatal exhaustion of
+ * the per-lcore id buffer space.
+ *
+ * rte_lcore_var_alloc() is not multi-thread safe.
+ *
+ * @param size
+ * The size (in bytes) of the variable's per-lcore id value. Must be > 0.
+ * @param align
+ * If 0, the values will be suitably aligned for any kind of type
+ * (i.e., alignof(max_align_t)). Otherwise, the values will be aligned
+ * on a multiple of *align*, which must be a power of 2 and equal or
+ * less than @c RTE_CACHE_LINE_SIZE.
+ * @return
+ * The variable's handle, stored in a void pointer value. The value
+ * is always non-NULL.
+ */
+__rte_experimental
+void *
+rte_lcore_var_alloc(size_t size, size_t align);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_LCORE_VAR_H_ */
diff --git a/lib/eal/version.map b/lib/eal/version.map
index e3ff412683..0c80bf7331 100644
--- a/lib/eal/version.map
+++ b/lib/eal/version.map
@@ -396,6 +396,8 @@ EXPERIMENTAL {
# added in 24.03
rte_vfio_get_device_info; # WINDOWS_NO_EXPORT
+
+ rte_lcore_var_alloc;
};
INTERNAL {
--
2.34.1
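
For readers following the header above: the handle is simply the address of the lcore id 0 instance, and every other instance sits a fixed stride further on. Below is a minimal standalone sketch of that addressing scheme, not the DPDK code itself. All SKETCH_* constants and sketch_* functions are hypothetical stand-ins (for RTE_MAX_LCORE, RTE_MAX_LCORE_VAR and rte_lcore_var_lcore_ptr()), and the sizes are shrunk so the layout is easy to inspect.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_MAX_LCORE      4u   /* hypothetical; stands in for RTE_MAX_LCORE */
#define SKETCH_MAX_LCORE_VAR  64u  /* hypothetical; stands in for RTE_MAX_LCORE_VAR */

/* Zeroed buffer holding one SKETCH_MAX_LCORE_VAR-byte slot per lcore id. */
static void *
sketch_buffer_create(void)
{
	return calloc(SKETCH_MAX_LCORE, SKETCH_MAX_LCORE_VAR);
}

/* Same arithmetic as in the patch: step one whole slot per lcore id. */
static void *
sketch_lcore_ptr(unsigned int lcore_id, void *handle)
{
	return (unsigned char *)handle + (size_t)lcore_id * SKETCH_MAX_LCORE_VAR;
}

/* Aggregate a per-lcore uint32_t counter across all lcore ids. */
static uint64_t
sketch_sum_counters(uint32_t *handle)
{
	uint64_t sum = 0;
	unsigned int lcore_id;

	assert(handle != NULL);

	for (lcore_id = 0; lcore_id < SKETCH_MAX_LCORE; lcore_id++)
		sum += *(uint32_t *)sketch_lcore_ptr(lcore_id, handle);

	return sum;
}
```

A handle placed at byte offset 8 of the buffer yields instances at offsets 8, 72, 136 and 200, so values belonging to the same lcore id but to different variables end up next to each other in that lcore's slot.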
^ permalink raw reply [flat|nested] 323+ messages in thread
* Re: [PATCH v2 1/6] eal: add static per-lcore memory allocation facility
2024-09-11 17:04 ` [PATCH v2 1/6] eal: add static per-lcore memory allocation facility Mattias Rönnblom
@ 2024-09-12 2:33 ` fengchengwen
2024-09-12 5:35 ` Mattias Rönnblom
2024-09-12 8:44 ` [PATCH v3 0/7] Lcore variables Mattias Rönnblom
2024-09-12 9:10 ` [PATCH v2 1/6] eal: add static per-lcore memory allocation facility Morten Brørup
2 siblings, 1 reply; 323+ messages in thread
From: fengchengwen @ 2024-09-12 2:33 UTC (permalink / raw)
To: Mattias Rönnblom, dev
Cc: hofors, Morten Brørup, Stephen Hemminger,
Konstantin Ananyev, David Marchand
On 2024/9/12 1:04, Mattias Rönnblom wrote:
> Introduce DPDK per-lcore id variables, or lcore variables for short.
>
> An lcore variable has one value for every current and future lcore
> id-equipped thread.
>
> The primary <rte_lcore_var.h> use case is for statically allocating
> small, frequently-accessed data structures, for which one instance
> should exist for each lcore.
>
> Lcore variables are similar to thread-local storage (TLS, e.g., C11
> _Thread_local), but decoupling the values' life time with that of the
> threads.
>
> Lcore variables are also similar in terms of functionality provided by
> FreeBSD kernel's DPCPU_*() family of macros and the associated
> build-time machinery. DPCPU uses linker scripts, which effectively
> prevents the reuse of its, otherwise seemingly viable, approach.
>
> The currently-prevailing way to solve the same problem as lcore
> variables is to keep a module's per-lcore data as RTE_MAX_LCORE-sized
> array of cache-aligned, RTE_CACHE_GUARDed structs. The benefit of
> lcore variables over this approach is that data related to the same
> lcore now is close (spatially, in memory), rather than data used by
> the same module, which in turn avoid excessive use of padding,
> polluting caches with unused data.
>
> Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> Acked-by: Morten Brørup <mb@smartsharesystems.com>
>
> --
>
> PATCH v2:
> * Add Windows support. (Morten Brørup)
> * Fix lcore variables API index reference. (Morten Brørup)
> * Various improvements of the API documentation. (Morten Brørup)
> * Elimination of unused symbol in version.map. (Morten Brørup)
This history could be moved to the cover letter.
>
> PATCH:
> * Update MAINTAINERS and release notes.
> * Stop covering included files in extern "C" {}.
>
> RFC v6:
> * Include <stdlib.h> to get aligned_alloc().
> * Tweak documentation (grammar).
> * Provide API-level guarantees that lcore variable values take on an
> initial value of zero.
> * Fix misplaced __rte_cache_aligned in the API doc example.
>
> RFC v5:
> * In Doxygen, consistently use @<cmd> (and not \<cmd>).
> * The RTE_LCORE_VAR_GET() and SET() convenience access macros
> covered an uncommon use case, where the lcore value is of a
> primitive type, rather than a struct, and are thus eliminated
> from the API. (Morten Brørup)
> * In the wake of the GET()/SET() removal, rename RTE_LCORE_VAR_PTR()
> to RTE_LCORE_VAR_VALUE().
> * The underscores are removed from __rte_lcore_var_lcore_ptr() to
> signal that this function is a part of the public API.
> * Macro arguments are documented.
>
> RFC v4:
> * Replace large static array with libc heap-allocated memory. One
> implication of this change is there no longer exists a fixed upper
> bound for the total amount of memory used by lcore variables.
> RTE_MAX_LCORE_VAR has changed meaning, and now represents the
> maximum size of any individual lcore variable value.
> * Fix issues in example. (Morten Brørup)
> * Improve access macro type checking. (Morten Brørup)
> * Refer to the lcore variable handle as "handle" and not "name" in
> various macros.
> * Document lack of thread safety in rte_lcore_var_alloc().
> * Provide API-level assurance the lcore variable handle is
> always non-NULL, to allow applications to use NULL to mean
> "not yet allocated".
> * Note zero-sized allocations are not allowed.
> * Give API-level guarantee the lcore variable values are zeroed.
>
> RFC v3:
> * Replace use of GCC-specific alignof(<expression>) with alignof(<type>).
> * Update example to reflect FOREACH macro name change (in RFC v2).
>
> RFC v2:
> * Use alignof to derive alignment requirements. (Morten Brørup)
> * Change name of FOREACH to make it distinct from <rte_lcore.h>'s
> *per-EAL-thread* RTE_LCORE_FOREACH(). (Morten Brørup)
> * Allow user-specified alignment, but limit max to cache line size.
> ---
> MAINTAINERS | 6 +
> config/rte_config.h | 1 +
> doc/api/doxy-api-index.md | 1 +
> doc/guides/rel_notes/release_24_11.rst | 14 +
> lib/eal/common/eal_common_lcore_var.c | 78 +++++
> lib/eal/common/meson.build | 1 +
> lib/eal/include/meson.build | 1 +
> lib/eal/include/rte_lcore_var.h | 385 +++++++++++++++++++++++++
> lib/eal/version.map | 2 +
> 9 files changed, 489 insertions(+)
> create mode 100644 lib/eal/common/eal_common_lcore_var.c
> create mode 100644 lib/eal/include/rte_lcore_var.h
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index c5a703b5c0..362d9a3f28 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -282,6 +282,12 @@ F: lib/eal/include/rte_random.h
> F: lib/eal/common/rte_random.c
> F: app/test/test_rand_perf.c
>
> +Lcore Variables
> +M: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> +F: lib/eal/include/rte_lcore_var.h
> +F: lib/eal/common/eal_common_lcore_var.c
> +F: app/test/test_lcore_var.c
> +
> ARM v7
> M: Wathsala Vithanage <wathsala.vithanage@arm.com>
> F: config/arm/
> diff --git a/config/rte_config.h b/config/rte_config.h
> index dd7bb0d35b..311692e498 100644
> --- a/config/rte_config.h
> +++ b/config/rte_config.h
> @@ -41,6 +41,7 @@
> /* EAL defines */
> #define RTE_CACHE_GUARD_LINES 1
> #define RTE_MAX_HEAPS 32
> +#define RTE_MAX_LCORE_VAR 1048576
> #define RTE_MAX_MEMSEG_LISTS 128
> #define RTE_MAX_MEMSEG_PER_LIST 8192
> #define RTE_MAX_MEM_MB_PER_LIST 32768
> diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
> index f9f0300126..ed577f14ee 100644
> --- a/doc/api/doxy-api-index.md
> +++ b/doc/api/doxy-api-index.md
> @@ -99,6 +99,7 @@ The public API headers are grouped by topics:
> [interrupts](@ref rte_interrupts.h),
> [launch](@ref rte_launch.h),
> [lcore](@ref rte_lcore.h),
> + [lcore variables](@ref rte_lcore_var.h),
> [per-lcore](@ref rte_per_lcore.h),
> [service cores](@ref rte_service.h),
> [keepalive](@ref rte_keepalive.h),
> diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
> index 0ff70d9057..a3884f7491 100644
> --- a/doc/guides/rel_notes/release_24_11.rst
> +++ b/doc/guides/rel_notes/release_24_11.rst
> @@ -55,6 +55,20 @@ New Features
> Also, make sure to start the actual text at the margin.
> =======================================================
>
> +* **Added EAL per-lcore static memory allocation facility.**
> +
> + Added EAL API <rte_lcore_var.h> for statically allocating small,
> + frequently-accessed data structures, for which one instance should
> + exist for each EAL thread and registered non-EAL thread.
> +
> + With lcore variables, data is organized spatially on a per-lcore id
> + basis, rather than per library or PMD, avoiding the need for cache
> + aligning (or RTE_CACHE_GUARDing) data structures, which in turn
> + reduces CPU cache internal fragmentation, improving performance.
> +
> + Lcore variables are similar to thread-local storage (TLS, e.g.,
> + C11 _Thread_local), but decoupling the values' life time from that
> + of the threads.
>
> Removed Items
> -------------
> diff --git a/lib/eal/common/eal_common_lcore_var.c b/lib/eal/common/eal_common_lcore_var.c
> new file mode 100644
> index 0000000000..309822039b
> --- /dev/null
> +++ b/lib/eal/common/eal_common_lcore_var.c
> @@ -0,0 +1,78 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2024 Ericsson AB
> + */
> +
> +#include <inttypes.h>
> +#include <stdlib.h>
> +
> +#ifdef RTE_EXEC_ENV_WINDOWS
> +#include <malloc.h>
> +#endif
> +
> +#include <rte_common.h>
> +#include <rte_debug.h>
> +#include <rte_log.h>
> +
> +#include <rte_lcore_var.h>
> +
> +#include "eal_private.h"
> +
> +#define LCORE_BUFFER_SIZE (RTE_MAX_LCORE_VAR * RTE_MAX_LCORE)
> +
> +static void *lcore_buffer;
> +static size_t offset = RTE_MAX_LCORE_VAR;
> +
> +static void *
> +lcore_var_alloc(size_t size, size_t align)
> +{
> + void *handle;
> + void *value;
> +
> + offset = RTE_ALIGN_CEIL(offset, align);
> +
> + if (offset + size > RTE_MAX_LCORE_VAR) {
> +#ifdef RTE_EXEC_ENV_WINDOWS
> + lcore_buffer = _aligned_malloc(LCORE_BUFFER_SIZE,
> + RTE_CACHE_LINE_SIZE);
> +#else
> + lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE,
> + LCORE_BUFFER_SIZE);
> +#endif
> + RTE_VERIFY(lcore_buffer != NULL);
> +
> + offset = 0;
> + }
> +
> + handle = RTE_PTR_ADD(lcore_buffer, offset);
> +
> + offset += size;
> +
> + RTE_LCORE_VAR_FOREACH_VALUE(value, handle)
> + memset(value, 0, size);
> +
> + EAL_LOG(DEBUG, "Allocated %"PRIuPTR" bytes of per-lcore data with a "
> + "%"PRIuPTR"-byte alignment", size, align);
Currently the data is allocated with a libc function; I think that is mainly to support the INIT macros, which run before main().
But it introduces the following problems:
1. It can't benefit from huge pages. This patch may reserve many 1 MB chunks for each lcore; placing them in huge pages would reduce the TLB miss rate, especially since this is frequently accessed data.
2. It can't be shared across processes. Much of the current per-lcore data doesn't support multi-process either, but I think it is worth doing, as it would help with service recovery when a sub-process fails and restarts.
...
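
For context on the comment above, the quoted lcore_var_alloc() behaves like a bump allocator: it aligns a running offset, grabs a fresh libc buffer once the current one cannot fit the request, and zeroes the new value in every lcore id's slot. Here is a standalone sketch of that pattern, assuming a C11 libc that provides aligned_alloc(). Every sketch_* and SKETCH_* name is hypothetical, the sizes are shrunk, and unlike the public API this sketch does not accept an alignment of 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_MAX_LCORE      4u    /* hypothetical; stands in for RTE_MAX_LCORE */
#define SKETCH_MAX_LCORE_VAR  256u  /* hypothetical; stands in for RTE_MAX_LCORE_VAR */
#define SKETCH_CACHE_LINE     64u   /* hypothetical; stands in for RTE_CACHE_LINE_SIZE */
#define SKETCH_BUFFER_SIZE    ((size_t)SKETCH_MAX_LCORE * SKETCH_MAX_LCORE_VAR)

static unsigned char *sketch_buffer;
/* Start out "full", so that the first allocation obtains a buffer. */
static size_t sketch_offset = SKETCH_MAX_LCORE_VAR;

/* Round v up to a multiple of align (align must be a power of 2). */
static size_t
sketch_align_ceil(size_t v, size_t align)
{
	return (v + align - 1) & ~(align - 1);
}

/* Not thread safe, like the function it imitates. */
static void *
sketch_lcore_var_alloc(size_t size, size_t align)
{
	unsigned char *handle;
	unsigned int lcore_id;

	assert(size > 0 && size <= SKETCH_MAX_LCORE_VAR);
	assert(align > 0 && align <= SKETCH_CACHE_LINE);
	assert((align & (align - 1)) == 0);

	sketch_offset = sketch_align_ceil(sketch_offset, align);

	if (sketch_offset + size > SKETCH_MAX_LCORE_VAR) {
		/* Earlier buffers stay alive; existing handles point into them. */
		sketch_buffer = aligned_alloc(SKETCH_CACHE_LINE, SKETCH_BUFFER_SIZE);
		if (sketch_buffer == NULL)
			abort();
		sketch_offset = 0;
	}

	handle = sketch_buffer + sketch_offset;
	sketch_offset += size;

	/* Zero this variable's value in every lcore id's slot. */
	for (lcore_id = 0; lcore_id < SKETCH_MAX_LCORE; lcore_id++)
		memset(handle + (size_t)lcore_id * SKETCH_MAX_LCORE_VAR, 0, size);

	return handle;
}
```

Because the buffer comes from the libc heap, this works from constructors that run before main(), which is the trade-off against huge-page backing discussed in this thread.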
^ permalink raw reply [flat|nested] 323+ messages in thread
* Re: [PATCH v2 1/6] eal: add static per-lcore memory allocation facility
2024-09-12 2:33 ` fengchengwen
@ 2024-09-12 5:35 ` Mattias Rönnblom
2024-09-12 7:05 ` fengchengwen
2024-09-12 7:28 ` Jerin Jacob
0 siblings, 2 replies; 323+ messages in thread
From: Mattias Rönnblom @ 2024-09-12 5:35 UTC (permalink / raw)
To: fengchengwen, Mattias Rönnblom, dev
Cc: Morten Brørup, Stephen Hemminger, Konstantin Ananyev,
David Marchand
On 2024-09-12 04:33, fengchengwen wrote:
> On 2024/9/12 1:04, Mattias Rönnblom wrote:
>> Introduce DPDK per-lcore id variables, or lcore variables for short.
>>
>> An lcore variable has one value for every current and future lcore
>> id-equipped thread.
>>
>> The primary <rte_lcore_var.h> use case is for statically allocating
>> small, frequently-accessed data structures, for which one instance
>> should exist for each lcore.
>>
>> Lcore variables are similar to thread-local storage (TLS, e.g., C11
>> _Thread_local), but decoupling the values' life time with that of the
>> threads.
>>
>> Lcore variables are also similar in terms of functionality provided by
>> FreeBSD kernel's DPCPU_*() family of macros and the associated
>> build-time machinery. DPCPU uses linker scripts, which effectively
>> prevents the reuse of its, otherwise seemingly viable, approach.
>>
>> The currently-prevailing way to solve the same problem as lcore
>> variables is to keep a module's per-lcore data as RTE_MAX_LCORE-sized
>> array of cache-aligned, RTE_CACHE_GUARDed structs. The benefit of
>> lcore variables over this approach is that data related to the same
>> lcore now is close (spatially, in memory), rather than data used by
>> the same module, which in turn avoid excessive use of padding,
>> polluting caches with unused data.
>>
>> Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
>> Acked-by: Morten Brørup <mb@smartsharesystems.com>
>>
>> --
>>
>> PATCH v2:
>> * Add Windows support. (Morten Brørup)
>> * Fix lcore variables API index reference. (Morten Brørup)
>> * Various improvements of the API documentation. (Morten Brørup)
>> * Elimination of unused symbol in version.map. (Morten Brørup)
>
> This history could be moved to the cover letter.
>
>>
>> PATCH:
>> * Update MAINTAINERS and release notes.
>> * Stop covering included files in extern "C" {}.
>>
>> RFC v6:
>> * Include <stdlib.h> to get aligned_alloc().
>> * Tweak documentation (grammar).
>> * Provide API-level guarantees that lcore variable values take on an
>> initial value of zero.
>> * Fix misplaced __rte_cache_aligned in the API doc example.
>>
>> RFC v5:
>> * In Doxygen, consistently use @<cmd> (and not \<cmd>).
>> * The RTE_LCORE_VAR_GET() and SET() convenience access macros
>> covered an uncommon use case, where the lcore value is of a
>> primitive type, rather than a struct, and are thus eliminated
>> from the API. (Morten Brørup)
>> * In the wake of the GET()/SET() removal, rename RTE_LCORE_VAR_PTR()
>> to RTE_LCORE_VAR_VALUE().
>> * The underscores are removed from __rte_lcore_var_lcore_ptr() to
>> signal that this function is a part of the public API.
>> * Macro arguments are documented.
>>
>> RFC v4:
>> * Replace large static array with libc heap-allocated memory. One
>> implication of this change is there no longer exists a fixed upper
>> bound for the total amount of memory used by lcore variables.
>> RTE_MAX_LCORE_VAR has changed meaning, and now represents the
>> maximum size of any individual lcore variable value.
>> * Fix issues in example. (Morten Brørup)
>> * Improve access macro type checking. (Morten Brørup)
>> * Refer to the lcore variable handle as "handle" and not "name" in
>> various macros.
>> * Document lack of thread safety in rte_lcore_var_alloc().
>> * Provide API-level assurance the lcore variable handle is
>> always non-NULL, to allow applications to use NULL to mean
>> "not yet allocated".
>> * Note zero-sized allocations are not allowed.
>> * Give API-level guarantee the lcore variable values are zeroed.
>>
>> RFC v3:
>> * Replace use of GCC-specific alignof(<expression>) with alignof(<type>).
>> * Update example to reflect FOREACH macro name change (in RFC v2).
>>
>> RFC v2:
>> * Use alignof to derive alignment requirements. (Morten Brørup)
>> * Change name of FOREACH to make it distinct from <rte_lcore.h>'s
>> *per-EAL-thread* RTE_LCORE_FOREACH(). (Morten Brørup)
>> * Allow user-specified alignment, but limit max to cache line size.
>> ---
>> MAINTAINERS | 6 +
>> config/rte_config.h | 1 +
>> doc/api/doxy-api-index.md | 1 +
>> doc/guides/rel_notes/release_24_11.rst | 14 +
>> lib/eal/common/eal_common_lcore_var.c | 78 +++++
>> lib/eal/common/meson.build | 1 +
>> lib/eal/include/meson.build | 1 +
>> lib/eal/include/rte_lcore_var.h | 385 +++++++++++++++++++++++++
>> lib/eal/version.map | 2 +
>> 9 files changed, 489 insertions(+)
>> create mode 100644 lib/eal/common/eal_common_lcore_var.c
>> create mode 100644 lib/eal/include/rte_lcore_var.h
>>
>> diff --git a/MAINTAINERS b/MAINTAINERS
>> index c5a703b5c0..362d9a3f28 100644
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -282,6 +282,12 @@ F: lib/eal/include/rte_random.h
>> F: lib/eal/common/rte_random.c
>> F: app/test/test_rand_perf.c
>>
>> +Lcore Variables
>> +M: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
>> +F: lib/eal/include/rte_lcore_var.h
>> +F: lib/eal/common/eal_common_lcore_var.c
>> +F: app/test/test_lcore_var.c
>> +
>> ARM v7
>> M: Wathsala Vithanage <wathsala.vithanage@arm.com>
>> F: config/arm/
>> diff --git a/config/rte_config.h b/config/rte_config.h
>> index dd7bb0d35b..311692e498 100644
>> --- a/config/rte_config.h
>> +++ b/config/rte_config.h
>> @@ -41,6 +41,7 @@
>> /* EAL defines */
>> #define RTE_CACHE_GUARD_LINES 1
>> #define RTE_MAX_HEAPS 32
>> +#define RTE_MAX_LCORE_VAR 1048576
>> #define RTE_MAX_MEMSEG_LISTS 128
>> #define RTE_MAX_MEMSEG_PER_LIST 8192
>> #define RTE_MAX_MEM_MB_PER_LIST 32768
>> diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
>> index f9f0300126..ed577f14ee 100644
>> --- a/doc/api/doxy-api-index.md
>> +++ b/doc/api/doxy-api-index.md
>> @@ -99,6 +99,7 @@ The public API headers are grouped by topics:
>> [interrupts](@ref rte_interrupts.h),
>> [launch](@ref rte_launch.h),
>> [lcore](@ref rte_lcore.h),
>> + [lcore variables](@ref rte_lcore_var.h),
>> [per-lcore](@ref rte_per_lcore.h),
>> [service cores](@ref rte_service.h),
>> [keepalive](@ref rte_keepalive.h),
>> diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
>> index 0ff70d9057..a3884f7491 100644
>> --- a/doc/guides/rel_notes/release_24_11.rst
>> +++ b/doc/guides/rel_notes/release_24_11.rst
>> @@ -55,6 +55,20 @@ New Features
>> Also, make sure to start the actual text at the margin.
>> =======================================================
>>
>> +* **Added EAL per-lcore static memory allocation facility.**
>> +
>> + Added EAL API <rte_lcore_var.h> for statically allocating small,
>> + frequently-accessed data structures, for which one instance should
>> + exist for each EAL thread and registered non-EAL thread.
>> +
>> + With lcore variables, data is organized spatially on a per-lcore id
>> + basis, rather than per library or PMD, avoiding the need for cache
>> + aligning (or RTE_CACHE_GUARDing) data structures, which in turn
>> + reduces CPU cache internal fragmentation, improving performance.
>> +
>> + Lcore variables are similar to thread-local storage (TLS, e.g.,
>> + C11 _Thread_local), but decoupling the values' life time from that
>> + of the threads.
>>
>> Removed Items
>> -------------
>> diff --git a/lib/eal/common/eal_common_lcore_var.c b/lib/eal/common/eal_common_lcore_var.c
>> new file mode 100644
>> index 0000000000..309822039b
>> --- /dev/null
>> +++ b/lib/eal/common/eal_common_lcore_var.c
>> @@ -0,0 +1,78 @@
>> +/* SPDX-License-Identifier: BSD-3-Clause
>> + * Copyright(c) 2024 Ericsson AB
>> + */
>> +
>> +#include <inttypes.h>
>> +#include <stdlib.h>
>> +
>> +#ifdef RTE_EXEC_ENV_WINDOWS
>> +#include <malloc.h>
>> +#endif
>> +
>> +#include <rte_common.h>
>> +#include <rte_debug.h>
>> +#include <rte_log.h>
>> +
>> +#include <rte_lcore_var.h>
>> +
>> +#include "eal_private.h"
>> +
>> +#define LCORE_BUFFER_SIZE (RTE_MAX_LCORE_VAR * RTE_MAX_LCORE)
>> +
>> +static void *lcore_buffer;
>> +static size_t offset = RTE_MAX_LCORE_VAR;
>> +
>> +static void *
>> +lcore_var_alloc(size_t size, size_t align)
>> +{
>> + void *handle;
>> + void *value;
>> +
>> + offset = RTE_ALIGN_CEIL(offset, align);
>> +
>> + if (offset + size > RTE_MAX_LCORE_VAR) {
>> +#ifdef RTE_EXEC_ENV_WINDOWS
>> + lcore_buffer = _aligned_malloc(LCORE_BUFFER_SIZE,
>> + RTE_CACHE_LINE_SIZE);
>> +#else
>> + lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE,
>> + LCORE_BUFFER_SIZE);
>> +#endif
>> + RTE_VERIFY(lcore_buffer != NULL);
>> +
>> + offset = 0;
>> + }
>> +
>> + handle = RTE_PTR_ADD(lcore_buffer, offset);
>> +
>> + offset += size;
>> +
>> + RTE_LCORE_VAR_FOREACH_VALUE(value, handle)
>> + memset(value, 0, size);
>> +
>> + EAL_LOG(DEBUG, "Allocated %"PRIuPTR" bytes of per-lcore data with a "
>> + "%"PRIuPTR"-byte alignment", size, align);
>
> Currently the data is allocated with a libc function; I think that is mainly to support the INIT macros, which run before main().
> But it introduces the following problems:
> 1. It can't benefit from huge pages. This patch may reserve many 1 MB chunks for each lcore; placing them in huge pages would reduce the TLB miss rate, especially since this is frequently accessed data.
This mechanism is for small allocations, the sum of which is also
expected to be small (although the system won't break if they aren't).
If you have large allocations, you are better off using lazy huge page
allocations further down the initialization process. Otherwise, you will
end up using memory for RTE_MAX_LCORE instances, rather than the actual
lcore count, which could be substantially smaller.
But sure, everything else being equal, you could have used huge pages
for these lcore variable values. But everything isn't equal.
> 2. It can't be shared across processes. Much of the current per-lcore data doesn't support multi-process either, but I think it is worth doing, as it would help with service recovery when a sub-process fails and restarts.
>
> ...
>
I'm not sure that's a downside. Further cementing that anti-pattern
into DPDK seems to be a bad idea to me.
Lcore variables don't *introduce* any of these issues, since the
mechanisms they're replacing also have these shortcomings (if you think
of them as such; I'm not sure I do).
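
To put rough numbers on the memory argument above, here is a back-of-the-envelope sketch. The helper names are hypothetical. RTE_MAX_LCORE_VAR is 1048576 in the quoted rte_config.h hunk, while an RTE_MAX_LCORE of 128 is only an assumed value here, since it is a build-time setting.

```c
#include <stddef.h>

/* Address space reserved for one backing buffer: one slot per lcore id. */
static size_t
sketch_buffer_bytes(size_t max_lcore, size_t max_lcore_var)
{
	return max_lcore * max_lcore_var;
}

/* Bytes occupied by one variable's values, for a given instance count. */
static size_t
sketch_value_bytes(size_t instances, size_t value_size)
{
	return instances * value_size;
}
```

Under those assumptions, sketch_buffer_bytes(128, 1048576) gives 134217728 bytes (128 MiB) of reserved address space per buffer. A 64-byte variable occupies sketch_value_bytes(128, 64) = 8192 bytes across all lcore ids, against 512 bytes if only 8 lcores are actually in use. That ratio is tolerable for small values and wasteful for large ones, which is the case for deferring large per-lcore data to a later, huge-page-backed allocation sized by the real lcore count. How much of the reserved range becomes resident depends on the OS and the libc allocator.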
^ permalink raw reply [flat|nested] 323+ messages in thread
* Re: [PATCH v2 1/6] eal: add static per-lcore memory allocation facility
2024-09-12 5:35 ` Mattias Rönnblom
@ 2024-09-12 7:05 ` fengchengwen
2024-09-12 7:28 ` Jerin Jacob
1 sibling, 0 replies; 323+ messages in thread
From: fengchengwen @ 2024-09-12 7:05 UTC (permalink / raw)
To: Mattias Rönnblom, Mattias Rönnblom, dev
Cc: Morten Brørup, Stephen Hemminger, Konstantin Ananyev,
David Marchand
On 2024/9/12 13:35, Mattias Rönnblom wrote:
> On 2024-09-12 04:33, fengchengwen wrote:
>> On 2024/9/12 1:04, Mattias Rönnblom wrote:
>>> Introduce DPDK per-lcore id variables, or lcore variables for short.
>>>
>>> An lcore variable has one value for every current and future lcore
>>> id-equipped thread.
>>>
>>> The primary <rte_lcore_var.h> use case is for statically allocating
>>> small, frequently-accessed data structures, for which one instance
>>> should exist for each lcore.
>>>
>>> Lcore variables are similar to thread-local storage (TLS, e.g., C11
>>> _Thread_local), but decoupling the values' life time with that of the
>>> threads.
>>>
>>> Lcore variables are also similar in terms of functionality provided by
>>> FreeBSD kernel's DPCPU_*() family of macros and the associated
>>> build-time machinery. DPCPU uses linker scripts, which effectively
>>> prevents the reuse of its, otherwise seemingly viable, approach.
>>>
>>> The currently-prevailing way to solve the same problem as lcore
>>> variables is to keep a module's per-lcore data as RTE_MAX_LCORE-sized
>>> array of cache-aligned, RTE_CACHE_GUARDed structs. The benefit of
>>> lcore variables over this approach is that data related to the same
>>> lcore now is close (spatially, in memory), rather than data used by
>>> the same module, which in turn avoid excessive use of padding,
>>> polluting caches with unused data.
>>>
>>> Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
>>> Acked-by: Morten Brørup <mb@smartsharesystems.com>
>>>
>>> --
>>>
>>> PATCH v2:
>>> * Add Windows support. (Morten Brørup)
>>> * Fix lcore variables API index reference. (Morten Brørup)
>>> * Various improvements of the API documentation. (Morten Brørup)
>>> * Elimination of unused symbol in version.map. (Morten Brørup)
>>
>> This history could be moved to the cover letter.
>>
>>>
>>> PATCH:
>>> * Update MAINTAINERS and release notes.
>>> * Stop covering included files in extern "C" {}.
>>>
>>> RFC v6:
>>> * Include <stdlib.h> to get aligned_alloc().
>>> * Tweak documentation (grammar).
>>> * Provide API-level guarantees that lcore variable values take on an
>>> initial value of zero.
>>> * Fix misplaced __rte_cache_aligned in the API doc example.
>>>
>>> RFC v5:
>>> * In Doxygen, consistently use @<cmd> (and not \<cmd>).
>>> * The RTE_LCORE_VAR_GET() and SET() convenience access macros
>>> covered an uncommon use case, where the lcore value is of a
>>> primitive type, rather than a struct, and are thus eliminated
>>> from the API. (Morten Brørup)
>>> * In the wake of the GET()/SET() removal, rename RTE_LCORE_VAR_PTR()
>>> to RTE_LCORE_VAR_VALUE().
>>> * The underscores are removed from __rte_lcore_var_lcore_ptr() to
>>> signal that this function is a part of the public API.
>>> * Macro arguments are documented.
>>>
>>> RFC v4:
>>> * Replace large static array with libc heap-allocated memory. One
>>> implication of this change is there no longer exists a fixed upper
>>> bound for the total amount of memory used by lcore variables.
>>> RTE_MAX_LCORE_VAR has changed meaning, and now represents the
>>> maximum size of any individual lcore variable value.
>>> * Fix issues in example. (Morten Brørup)
>>> * Improve access macro type checking. (Morten Brørup)
>>> * Refer to the lcore variable handle as "handle" and not "name" in
>>> various macros.
>>> * Document lack of thread safety in rte_lcore_var_alloc().
>>> * Provide API-level assurance the lcore variable handle is
>>> always non-NULL, to allow applications to use NULL to mean
>>> "not yet allocated".
>>> * Note zero-sized allocations are not allowed.
>>> * Give API-level guarantee the lcore variable values are zeroed.
>>>
>>> RFC v3:
>>> * Replace use of GCC-specific alignof(<expression>) with alignof(<type>).
>>> * Update example to reflect FOREACH macro name change (in RFC v2).
>>>
>>> RFC v2:
>>> * Use alignof to derive alignment requirements. (Morten Brørup)
>>> * Change name of FOREACH to make it distinct from <rte_lcore.h>'s
>>> *per-EAL-thread* RTE_LCORE_FOREACH(). (Morten Brørup)
>>> * Allow user-specified alignment, but limit max to cache line size.
>>> ---
>>> MAINTAINERS | 6 +
>>> config/rte_config.h | 1 +
>>> doc/api/doxy-api-index.md | 1 +
>>> doc/guides/rel_notes/release_24_11.rst | 14 +
>>> lib/eal/common/eal_common_lcore_var.c | 78 +++++
>>> lib/eal/common/meson.build | 1 +
>>> lib/eal/include/meson.build | 1 +
>>> lib/eal/include/rte_lcore_var.h | 385 +++++++++++++++++++++++++
>>> lib/eal/version.map | 2 +
>>> 9 files changed, 489 insertions(+)
>>> create mode 100644 lib/eal/common/eal_common_lcore_var.c
>>> create mode 100644 lib/eal/include/rte_lcore_var.h
>>>
>>> diff --git a/MAINTAINERS b/MAINTAINERS
>>> index c5a703b5c0..362d9a3f28 100644
>>> --- a/MAINTAINERS
>>> +++ b/MAINTAINERS
>>> @@ -282,6 +282,12 @@ F: lib/eal/include/rte_random.h
>>> F: lib/eal/common/rte_random.c
>>> F: app/test/test_rand_perf.c
>>> +Lcore Variables
>>> +M: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
>>> +F: lib/eal/include/rte_lcore_var.h
>>> +F: lib/eal/common/eal_common_lcore_var.c
>>> +F: app/test/test_lcore_var.c
>>> +
>>> ARM v7
>>> M: Wathsala Vithanage <wathsala.vithanage@arm.com>
>>> F: config/arm/
>>> diff --git a/config/rte_config.h b/config/rte_config.h
>>> index dd7bb0d35b..311692e498 100644
>>> --- a/config/rte_config.h
>>> +++ b/config/rte_config.h
>>> @@ -41,6 +41,7 @@
>>> /* EAL defines */
>>> #define RTE_CACHE_GUARD_LINES 1
>>> #define RTE_MAX_HEAPS 32
>>> +#define RTE_MAX_LCORE_VAR 1048576
>>> #define RTE_MAX_MEMSEG_LISTS 128
>>> #define RTE_MAX_MEMSEG_PER_LIST 8192
>>> #define RTE_MAX_MEM_MB_PER_LIST 32768
>>> diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
>>> index f9f0300126..ed577f14ee 100644
>>> --- a/doc/api/doxy-api-index.md
>>> +++ b/doc/api/doxy-api-index.md
>>> @@ -99,6 +99,7 @@ The public API headers are grouped by topics:
>>> [interrupts](@ref rte_interrupts.h),
>>> [launch](@ref rte_launch.h),
>>> [lcore](@ref rte_lcore.h),
>>> + [lcore variables](@ref rte_lcore_var.h),
>>> [per-lcore](@ref rte_per_lcore.h),
>>> [service cores](@ref rte_service.h),
>>> [keepalive](@ref rte_keepalive.h),
>>> diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
>>> index 0ff70d9057..a3884f7491 100644
>>> --- a/doc/guides/rel_notes/release_24_11.rst
>>> +++ b/doc/guides/rel_notes/release_24_11.rst
>>> @@ -55,6 +55,20 @@ New Features
>>> Also, make sure to start the actual text at the margin.
>>> =======================================================
>>> +* **Added EAL per-lcore static memory allocation facility.**
>>> +
>>> + Added EAL API <rte_lcore_var.h> for statically allocating small,
>>> + frequently-accessed data structures, for which one instance should
>>> + exist for each EAL thread and registered non-EAL thread.
>>> +
>>> + With lcore variables, data is organized spatially on a per-lcore id
>>> + basis, rather than per library or PMD, avoiding the need for cache
>>> + aligning (or RTE_CACHE_GUARDing) data structures, which in turn
>>> + reduces CPU cache internal fragmentation, improving performance.
>>> +
>>> + Lcore variables are similar to thread-local storage (TLS, e.g.,
>>> + C11 _Thread_local), but decoupling the values' life time from that
>>> + of the threads.
>>> Removed Items
>>> -------------
>>> diff --git a/lib/eal/common/eal_common_lcore_var.c b/lib/eal/common/eal_common_lcore_var.c
>>> new file mode 100644
>>> index 0000000000..309822039b
>>> --- /dev/null
>>> +++ b/lib/eal/common/eal_common_lcore_var.c
>>> @@ -0,0 +1,78 @@
>>> +/* SPDX-License-Identifier: BSD-3-Clause
>>> + * Copyright(c) 2024 Ericsson AB
>>> + */
>>> +
>>> +#include <inttypes.h>
>>> +#include <stdlib.h>
>>> +
>>> +#ifdef RTE_EXEC_ENV_WINDOWS
>>> +#include <malloc.h>
>>> +#endif
>>> +
>>> +#include <rte_common.h>
>>> +#include <rte_debug.h>
>>> +#include <rte_log.h>
>>> +
>>> +#include <rte_lcore_var.h>
>>> +
>>> +#include "eal_private.h"
>>> +
>>> +#define LCORE_BUFFER_SIZE (RTE_MAX_LCORE_VAR * RTE_MAX_LCORE)
>>> +
>>> +static void *lcore_buffer;
>>> +static size_t offset = RTE_MAX_LCORE_VAR;
>>> +
>>> +static void *
>>> +lcore_var_alloc(size_t size, size_t align)
>>> +{
>>> + void *handle;
>>> + void *value;
>>> +
>>> + offset = RTE_ALIGN_CEIL(offset, align);
>>> +
>>> + if (offset + size > RTE_MAX_LCORE_VAR) {
>>> +#ifdef RTE_EXEC_ENV_WINDOWS
>>> + lcore_buffer = _aligned_malloc(LCORE_BUFFER_SIZE,
>>> + RTE_CACHE_LINE_SIZE);
>>> +#else
>>> + lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE,
>>> + LCORE_BUFFER_SIZE);
>>> +#endif
>>> + RTE_VERIFY(lcore_buffer != NULL);
>>> +
>>> + offset = 0;
>>> + }
>>> +
>>> + handle = RTE_PTR_ADD(lcore_buffer, offset);
>>> +
>>> + offset += size;
>>> +
>>> + RTE_LCORE_VAR_FOREACH_VALUE(value, handle)
>>> + memset(value, 0, size);
>>> +
>>> + EAL_LOG(DEBUG, "Allocated %"PRIuPTR" bytes of per-lcore data with a "
>>> + "%"PRIuPTR"-byte alignment", size, align);
>>
>> Currently the data is allocated with a libc function. I think that is mainly so that the INIT macros can run before main().
>> But it introduces the following problems:
>> 1\ It can't benefit from huge pages. This patch may reserve many 1 MB regions, one per lcore; placing them in huge pages would reduce the TLB miss rate, especially for frequently accessed data.
>
> This mechanism is for small allocations, which the sum of is also expected to be small (although the system won't break if they aren't).
>
> If you have large allocations, you are better off using lazy huge page allocations further down the initialization process. Otherwise, you will end up using memory for RTE_MAX_LCORE instances, rather than the actual lcore count, which could be substantially smaller.
Yes, it may cost too much memory if allocated from hugepage memory.
>
> But sure, everything else being equal, you could have used huge pages for these lcore variable values. But everything isn't equal.
>
>> 2\ It doesn't work across multiple processes. Much of the current lcore data doesn't support multi-process either, but I think it is worth doing, and it would help with service recovery when a sub-process fails and restarts.
>>
>> ...
>>
>
> Not sure I think that's a downside. Further cementing that anti-pattern into DPDK seems to be a bad idea to me.
>
> Lcore variables don't *introduce* any of these issues, since the mechanisms they're replacing also have these shortcomings (if you think about them as such - I'm not sure I do).
Got it.
This feature is an enhancement of the current per-lcore data handling, which brings together scattered data from the point of view of a single core.
And currently it seems hard to extend it to support hugepage memory.
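To put rough numbers on the memory discussion above, here is a minimal sketch of the arithmetic. It is not DPDK code: touched_bytes() is a hypothetical helper, and it assumes a page-aligned buffer base, a stride and page size that are multiples of one another, and an OS that only makes written pages resident. The figures in the note below additionally assume RTE_MAX_LCORE is 128, a value not stated in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of DPDK: estimates how many bytes of an
 * lcore buffer become resident once 'used' bytes of lcore variables have
 * been allocated and zeroed for each of 'nlcores' lcore ids.
 *
 * Assumes the buffer base is page-aligned, and that 'stride' (the
 * per-lcore slice, RTE_MAX_LCORE_VAR in the patch) and 'page' are
 * multiples of one another.
 */
static size_t
touched_bytes(size_t used, size_t stride, size_t nlcores, size_t page)
{
	assert(used <= stride);
	assert(page != 0 && (stride % page == 0 || page % stride == 0));

	if (used == 0)
		return 0;

	if (page >= stride) {
		/* Every page holds the start of at least one lcore's
		 * slice, so zeroing the values touches all pages.
		 */
		size_t total = stride * nlcores;

		return (total + page - 1) / page * page;
	}

	/* Each lcore's slice spans several pages; only the leading
	 * pages, which hold the allocated values, are touched.
	 */
	return nlcores * ((used + page - 1) / page) * page;
}
```

Under these assumptions, with a 1 MiB stride, 128 lcore ids and 256 bytes of lcore variables, the estimate is 512 KiB resident with 4 KiB pages, but the full 128 MiB with 2 MiB huge pages.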
* Re: [PATCH v2 1/6] eal: add static per-lcore memory allocation facility
2024-09-12 5:35 ` Mattias Rönnblom
2024-09-12 7:05 ` fengchengwen
@ 2024-09-12 7:28 ` Jerin Jacob
1 sibling, 0 replies; 323+ messages in thread
From: Jerin Jacob @ 2024-09-12 7:28 UTC (permalink / raw)
To: Mattias Rönnblom, Anatoly Burakov
Cc: fengchengwen, Mattias Rönnblom, dev, Morten Brørup,
Stephen Hemminger, Konstantin Ananyev, David Marchand
On Thu, Sep 12, 2024 at 11:05 AM Mattias Rönnblom <hofors@lysator.liu.se> wrote:
>
> On 2024-09-12 04:33, fengchengwen wrote:
> > On 2024/9/12 1:04, Mattias Rönnblom wrote:
> >> Introduce DPDK per-lcore id variables, or lcore variables for short.
> >>
> >> An lcore variable has one value for every current and future lcore
> >> id-equipped thread.
> >>
> >> The primary <rte_lcore_var.h> use case is for statically allocating
> >> small, frequently-accessed data structures, for which one instance
> >> should exist for each lcore.
> >>
> >> Lcore variables are similar to thread-local storage (TLS, e.g., C11
> >> _Thread_local), but decouple the values' lifetime from that of the
> >> threads.
> >>
> >> Lcore variables are also similar in functionality to the FreeBSD
> >> kernel's DPCPU_*() family of macros and the associated build-time
> >> machinery. DPCPU uses linker scripts, which effectively prevents
> >> the reuse of its, otherwise seemingly viable, approach.
> >>
> >> The currently-prevailing way to solve the same problem as lcore
> >> variables is to keep a module's per-lcore data as an
> >> RTE_MAX_LCORE-sized array of cache-aligned, RTE_CACHE_GUARDed
> >> structs. The benefit of lcore variables over this approach is that
> >> data related to the same lcore is now close (spatially, in memory),
> >> rather than data used by the same module, which in turn avoids
> >> excessive use of padding that pollutes caches with unused data.
> >>
> >> Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> >> Acked-by: Morten Brørup <mb@smartsharesystems.com>
> >>
> >> --
> >>
> >> +
> >> +#define LCORE_BUFFER_SIZE (RTE_MAX_LCORE_VAR * RTE_MAX_LCORE)
> >> +
> >> +static void *lcore_buffer;
> >> +static size_t offset = RTE_MAX_LCORE_VAR;
> >> +
> >> +static void *
> >> +lcore_var_alloc(size_t size, size_t align)
> >> +{
> >> + void *handle;
> >> + void *value;
> >> +
> >> + offset = RTE_ALIGN_CEIL(offset, align);
> >> +
> >> + if (offset + size > RTE_MAX_LCORE_VAR) {
> >> +#ifdef RTE_EXEC_ENV_WINDOWS
> >> + lcore_buffer = _aligned_malloc(LCORE_BUFFER_SIZE,
> >> + RTE_CACHE_LINE_SIZE);
> >> +#else
> >> + lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE,
> >> + LCORE_BUFFER_SIZE);
> >> +#endif
> >> + RTE_VERIFY(lcore_buffer != NULL);
> >> +
> >> + offset = 0;
> >> + }
> >> +
> >> + handle = RTE_PTR_ADD(lcore_buffer, offset);
> >> +
> >> + offset += size;
> >> +
> >> + RTE_LCORE_VAR_FOREACH_VALUE(value, handle)
> >> + memset(value, 0, size);
> >> +
> >> + EAL_LOG(DEBUG, "Allocated %"PRIuPTR" bytes of per-lcore data with a "
> >> + "%"PRIuPTR"-byte alignment", size, align);
> >
> > Currently the data is allocated with a libc function. I think that is mainly so that the INIT macros can run before main().
> > But it introduces the following problems:
> > 1\ It can't benefit from huge pages. This patch may reserve many 1 MB regions, one per lcore; placing them in huge pages would reduce the TLB miss rate, especially for frequently accessed data.
>
> This mechanism is for small allocations, which the sum of is also
> expected to be small (although the system won't break if they aren't).
>
> If you have large allocations, you are better off using lazy huge page
> allocations further down the initialization process. Otherwise, you will
> end up using memory for RTE_MAX_LCORE instances, rather than the actual
> lcore count, which could be substantially smaller.
+ @Anatoly Burakov
If I am not wrong, the DPDK huge page memory allocator (rte_malloc()) may
have overhead similar to the glibc one. Meaning, a hugepage is allocated
only when needed and the existing space is used up.
If so, why not use rte_malloc() if available?
>
> But sure, everything else being equal, you could have used huge pages
> for these lcore variable values. But everything isn't equal.
>
> > 2\ It doesn't work across multiple processes. Much of the current lcore data doesn't support multi-process either, but I think it is worth doing, and it would help with service recovery when a sub-process fails and restarts.
> >
> > ...
> >
>
> Not sure I think that's a downside. Further cementing that anti-pattern
> into DPDK seems to be a bad idea to me.
>
> Lcore variables don't *introduce* any of these issues, since the
> mechanisms they're replacing also have these shortcomings (if you think
> about them as such - I'm not sure I do).
* [PATCH v3 0/7] Lcore variables
2024-09-11 17:04 ` [PATCH v2 1/6] eal: add static per-lcore memory allocation facility Mattias Rönnblom
2024-09-12 2:33 ` fengchengwen
@ 2024-09-12 8:44 ` Mattias Rönnblom
2024-09-12 8:44 ` [PATCH v3 1/7] eal: add static per-lcore memory allocation facility Mattias Rönnblom
` (6 more replies)
2024-09-12 9:10 ` [PATCH v2 1/6] eal: add static per-lcore memory allocation facility Morten Brørup
2 siblings, 7 replies; 323+ messages in thread
From: Mattias Rönnblom @ 2024-09-12 8:44 UTC (permalink / raw)
To: dev
Cc: hofors, Morten Brørup, Stephen Hemminger,
Konstantin Ananyev, David Marchand, Jerin Jacob,
Mattias Rönnblom
This patch set introduces a new API <rte_lcore_var.h> for static
per-lcore id data allocation.
Please refer to the <rte_lcore_var.h> API documentation for both a
rationale for this new API, and a comparison to the alternatives
available.
The adoption of this API would affect many different DPDK modules, but
the author updated only a few, mostly to serve as examples in this
RFC, and to iron out some, but surely not all, wrinkles in the API.
The question on how to best allocate static per-lcore memory has been
up several times on the dev mailing list, for example in the thread on
"random: use per lcore state" RFC by Stephen Hemminger.
Lcore variables are surely not the answer to all your per-lcore-data
needs, since they only allow for more-or-less static allocation. In
the author's opinion, they do however provide a reasonably simple,
clean, and seemingly very performant solution to a real problem.
Mattias Rönnblom (7):
eal: add static per-lcore memory allocation facility
eal: add lcore variable functional tests
eal: add lcore variable performance test
random: keep PRNG state in lcore variable
power: keep per-lcore state in lcore variable
service: keep per-lcore state in lcore variable
eal: keep per-lcore power intrinsics state in lcore variable
MAINTAINERS | 6 +
app/test/meson.build | 2 +
app/test/test_lcore_var.c | 432 +++++++++++++++++++++++++
app/test/test_lcore_var_perf.c | 160 +++++++++
config/rte_config.h | 1 +
doc/api/doxy-api-index.md | 1 +
doc/guides/rel_notes/release_24_11.rst | 14 +
lib/eal/common/eal_common_lcore_var.c | 78 +++++
lib/eal/common/meson.build | 1 +
lib/eal/common/rte_random.c | 28 +-
lib/eal/common/rte_service.c | 115 ++++---
lib/eal/include/meson.build | 1 +
lib/eal/include/rte_lcore_var.h | 385 ++++++++++++++++++++++
lib/eal/version.map | 2 +
lib/eal/x86/rte_power_intrinsics.c | 17 +-
lib/power/rte_power_pmd_mgmt.c | 34 +-
16 files changed, 1190 insertions(+), 87 deletions(-)
create mode 100644 app/test/test_lcore_var.c
create mode 100644 app/test/test_lcore_var_perf.c
create mode 100644 lib/eal/common/eal_common_lcore_var.c
create mode 100644 lib/eal/include/rte_lcore_var.h
--
2.34.1
* [PATCH v3 1/7] eal: add static per-lcore memory allocation facility
2024-09-12 8:44 ` [PATCH v3 0/7] Lcore variables Mattias Rönnblom
@ 2024-09-12 8:44 ` Mattias Rönnblom
2024-09-16 10:52 ` [PATCH v4 0/7] Lcore variables Mattias Rönnblom
2024-09-12 8:44 ` [PATCH v3 2/7] eal: add lcore variable functional tests Mattias Rönnblom
` (5 subsequent siblings)
6 siblings, 1 reply; 323+ messages in thread
From: Mattias Rönnblom @ 2024-09-12 8:44 UTC (permalink / raw)
To: dev
Cc: hofors, Morten Brørup, Stephen Hemminger,
Konstantin Ananyev, David Marchand, Jerin Jacob,
Mattias Rönnblom
Introduce DPDK per-lcore id variables, or lcore variables for short.
An lcore variable has one value for every current and future lcore
id-equipped thread.
The primary <rte_lcore_var.h> use case is for statically allocating
small, frequently-accessed data structures, for which one instance
should exist for each lcore.
Lcore variables are similar to thread-local storage (TLS, e.g., C11
_Thread_local), but decouple the values' lifetime from that of the
threads.
Lcore variables are also similar in functionality to the FreeBSD
kernel's DPCPU_*() family of macros and the associated build-time
machinery. DPCPU uses linker scripts, which effectively prevents
the reuse of its, otherwise seemingly viable, approach.
The currently-prevailing way to solve the same problem as lcore
variables is to keep a module's per-lcore data as an
RTE_MAX_LCORE-sized array of cache-aligned, RTE_CACHE_GUARDed
structs. The benefit of lcore variables over this approach is that
data related to the same lcore is now close (spatially, in memory),
rather than data used by the same module, which in turn avoids
excessive use of padding that pollutes caches with unused data.
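The layout argument above can be illustrated with a small standalone model of the allocation scheme in this patch. It is only a sketch: it does not use DPDK, the model_* names and the tiny MODEL_* constants are made up for illustration, and calloc() stands in for the aligned allocation plus memset() in the real code.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Illustrative constants; the patch uses RTE_MAX_LCORE and
 * RTE_MAX_LCORE_VAR (1 MiB) instead.
 */
#define MODEL_MAX_LCORE 4
#define MODEL_MAX_LCORE_VAR 1024

static unsigned char *model_buffer;
/* Starting at the limit forces a buffer allocation on first use. */
static size_t model_offset = MODEL_MAX_LCORE_VAR;

/* Bump-allocate 'size' bytes in every lcore's slice; returns the handle. */
static void *
model_alloc(size_t size, size_t align)
{
	assert(size > 0 && size <= MODEL_MAX_LCORE_VAR);
	assert(align != 0 && (align & (align - 1)) == 0);
	/* calloc() only guarantees alignment up to max_align_t. */
	assert(align <= alignof(max_align_t));

	model_offset = (model_offset + align - 1) & ~(align - 1);

	if (model_offset + size > MODEL_MAX_LCORE_VAR) {
		/* The previous buffer is intentionally not freed, since
		 * existing handles still point into it.
		 */
		model_buffer = calloc(MODEL_MAX_LCORE, MODEL_MAX_LCORE_VAR);
		assert(model_buffer != NULL);
		model_offset = 0;
	}

	void *handle = model_buffer + model_offset;

	model_offset += size;

	return handle;
}

/* The value for 'lcore_id' lives one slice-sized stride per lcore id
 * past the handle.
 */
static void *
model_value(unsigned int lcore_id, void *handle)
{
	return (unsigned char *)handle +
		(size_t)lcore_id * MODEL_MAX_LCORE_VAR;
}
```

In this model, allocating a 4-byte variable followed by an 8-byte variable with 8-byte alignment places the two handles 8 bytes apart within the same lcore's slice, while the values of one variable for two adjacent lcore ids are MODEL_MAX_LCORE_VAR bytes apart.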
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
--
PATCH v2:
* Add Windows support. (Morten Brørup)
* Fix lcore variables API index reference. (Morten Brørup)
* Various improvements of the API documentation. (Morten Brørup)
* Elimination of unused symbol in version.map. (Morten Brørup)
PATCH:
* Update MAINTAINERS and release notes.
* Stop covering included files in extern "C" {}.
RFC v6:
* Include <stdlib.h> to get aligned_alloc().
* Tweak documentation (grammar).
* Provide API-level guarantees that lcore variable values take on an
initial value of zero.
* Fix misplaced __rte_cache_aligned in the API doc example.
RFC v5:
* In Doxygen, consistently use @<cmd> (and not \<cmd>).
* The RTE_LCORE_VAR_GET() and SET() convenience access macros
covered an uncommon use case, where the lcore value is of a
primitive type, rather than a struct, and are thus eliminated
from the API. (Morten Brørup)
* In the wake of the GET()/SET() removal, rename RTE_LCORE_VAR_PTR()
to RTE_LCORE_VAR_VALUE().
* The underscores are removed from __rte_lcore_var_lcore_ptr() to
signal that this function is a part of the public API.
* Macro arguments are documented.
RFC v4:
* Replace large static array with libc heap-allocated memory. One
implication of this change is that there no longer exists a fixed
upper bound for the total amount of memory used by lcore variables.
RTE_MAX_LCORE_VAR has changed meaning, and now represents the
maximum size of any individual lcore variable value.
* Fix issues in example. (Morten Brørup)
* Improve access macro type checking. (Morten Brørup)
* Refer to the lcore variable handle as "handle" and not "name" in
various macros.
* Document lack of thread safety in rte_lcore_var_alloc().
* Provide API-level assurance the lcore variable handle is
always non-NULL, to allow applications to use NULL to mean
"not yet allocated".
* Note zero-sized allocations are not allowed.
* Give API-level guarantee the lcore variable values are zeroed.
RFC v3:
* Replace use of GCC-specific alignof(<expression>) with alignof(<type>).
* Update example to reflect FOREACH macro name change (in RFC v2).
RFC v2:
* Use alignof to derive alignment requirements. (Morten Brørup)
* Change name of FOREACH to make it distinct from <rte_lcore.h>'s
*per-EAL-thread* RTE_LCORE_FOREACH(). (Morten Brørup)
* Allow user-specified alignment, but limit max to cache line size.
---
MAINTAINERS | 6 +
config/rte_config.h | 1 +
doc/api/doxy-api-index.md | 1 +
doc/guides/rel_notes/release_24_11.rst | 14 +
lib/eal/common/eal_common_lcore_var.c | 78 +++++
lib/eal/common/meson.build | 1 +
lib/eal/include/meson.build | 1 +
lib/eal/include/rte_lcore_var.h | 385 +++++++++++++++++++++++++
lib/eal/version.map | 2 +
9 files changed, 489 insertions(+)
create mode 100644 lib/eal/common/eal_common_lcore_var.c
create mode 100644 lib/eal/include/rte_lcore_var.h
diff --git a/MAINTAINERS b/MAINTAINERS
index c5a703b5c0..362d9a3f28 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -282,6 +282,12 @@ F: lib/eal/include/rte_random.h
F: lib/eal/common/rte_random.c
F: app/test/test_rand_perf.c
+Lcore Variables
+M: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
+F: lib/eal/include/rte_lcore_var.h
+F: lib/eal/common/eal_common_lcore_var.c
+F: app/test/test_lcore_var.c
+
ARM v7
M: Wathsala Vithanage <wathsala.vithanage@arm.com>
F: config/arm/
diff --git a/config/rte_config.h b/config/rte_config.h
index dd7bb0d35b..311692e498 100644
--- a/config/rte_config.h
+++ b/config/rte_config.h
@@ -41,6 +41,7 @@
/* EAL defines */
#define RTE_CACHE_GUARD_LINES 1
#define RTE_MAX_HEAPS 32
+#define RTE_MAX_LCORE_VAR 1048576
#define RTE_MAX_MEMSEG_LISTS 128
#define RTE_MAX_MEMSEG_PER_LIST 8192
#define RTE_MAX_MEM_MB_PER_LIST 32768
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index f9f0300126..ed577f14ee 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -99,6 +99,7 @@ The public API headers are grouped by topics:
[interrupts](@ref rte_interrupts.h),
[launch](@ref rte_launch.h),
[lcore](@ref rte_lcore.h),
+ [lcore variables](@ref rte_lcore_var.h),
[per-lcore](@ref rte_per_lcore.h),
[service cores](@ref rte_service.h),
[keepalive](@ref rte_keepalive.h),
diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
index 0ff70d9057..a3884f7491 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -55,6 +55,20 @@ New Features
Also, make sure to start the actual text at the margin.
=======================================================
+* **Added EAL per-lcore static memory allocation facility.**
+
+ Added EAL API <rte_lcore_var.h> for statically allocating small,
+ frequently-accessed data structures, for which one instance should
+ exist for each EAL thread and registered non-EAL thread.
+
+ With lcore variables, data is organized spatially on a per-lcore id
+ basis, rather than per library or PMD, avoiding the need for cache
+ aligning (or RTE_CACHE_GUARDing) data structures, which in turn
+ reduces CPU cache internal fragmentation, improving performance.
+
+ Lcore variables are similar to thread-local storage (TLS, e.g.,
+ C11 _Thread_local), but decouple the values' lifetime from that
+ of the threads.
Removed Items
-------------
diff --git a/lib/eal/common/eal_common_lcore_var.c b/lib/eal/common/eal_common_lcore_var.c
new file mode 100644
index 0000000000..309822039b
--- /dev/null
+++ b/lib/eal/common/eal_common_lcore_var.c
@@ -0,0 +1,78 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Ericsson AB
+ */
+
+#include <inttypes.h>
+#include <stdlib.h>
+
+#ifdef RTE_EXEC_ENV_WINDOWS
+#include <malloc.h>
+#endif
+
+#include <rte_common.h>
+#include <rte_debug.h>
+#include <rte_log.h>
+
+#include <rte_lcore_var.h>
+
+#include "eal_private.h"
+
+#define LCORE_BUFFER_SIZE (RTE_MAX_LCORE_VAR * RTE_MAX_LCORE)
+
+static void *lcore_buffer;
+static size_t offset = RTE_MAX_LCORE_VAR;
+
+static void *
+lcore_var_alloc(size_t size, size_t align)
+{
+ void *handle;
+ void *value;
+
+ offset = RTE_ALIGN_CEIL(offset, align);
+
+ if (offset + size > RTE_MAX_LCORE_VAR) {
+#ifdef RTE_EXEC_ENV_WINDOWS
+ lcore_buffer = _aligned_malloc(LCORE_BUFFER_SIZE,
+ RTE_CACHE_LINE_SIZE);
+#else
+ lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE,
+ LCORE_BUFFER_SIZE);
+#endif
+ RTE_VERIFY(lcore_buffer != NULL);
+
+ offset = 0;
+ }
+
+ handle = RTE_PTR_ADD(lcore_buffer, offset);
+
+ offset += size;
+
+ RTE_LCORE_VAR_FOREACH_VALUE(value, handle)
+ memset(value, 0, size);
+
+ EAL_LOG(DEBUG, "Allocated %"PRIuPTR" bytes of per-lcore data with a "
+ "%"PRIuPTR"-byte alignment", size, align);
+
+ return handle;
+}
+
+void *
+rte_lcore_var_alloc(size_t size, size_t align)
+{
+ /* Having the per-lcore buffer size aligned on cache lines, as
+ * well as having the base pointer aligned on cache line size,
+ * assures that aligned offsets also translate to aligned
+ * pointers across all values.
+ */
+ RTE_BUILD_BUG_ON(RTE_MAX_LCORE_VAR % RTE_CACHE_LINE_SIZE != 0);
+ RTE_ASSERT(align <= RTE_CACHE_LINE_SIZE);
+ RTE_ASSERT(size <= RTE_MAX_LCORE_VAR);
+
+ /* '0' means asking for worst-case alignment requirements */
+ if (align == 0)
+ align = alignof(max_align_t);
+
+ RTE_ASSERT(rte_is_power_of_2(align));
+
+ return lcore_var_alloc(size, align);
+}
diff --git a/lib/eal/common/meson.build b/lib/eal/common/meson.build
index 22a626ba6f..d41403680b 100644
--- a/lib/eal/common/meson.build
+++ b/lib/eal/common/meson.build
@@ -18,6 +18,7 @@ sources += files(
'eal_common_interrupts.c',
'eal_common_launch.c',
'eal_common_lcore.c',
+ 'eal_common_lcore_var.c',
'eal_common_mcfg.c',
'eal_common_memalloc.c',
'eal_common_memory.c',
diff --git a/lib/eal/include/meson.build b/lib/eal/include/meson.build
index e94b056d46..9449253e23 100644
--- a/lib/eal/include/meson.build
+++ b/lib/eal/include/meson.build
@@ -27,6 +27,7 @@ headers += files(
'rte_keepalive.h',
'rte_launch.h',
'rte_lcore.h',
+ 'rte_lcore_var.h',
'rte_lock_annotations.h',
'rte_malloc.h',
'rte_mcslock.h',
diff --git a/lib/eal/include/rte_lcore_var.h b/lib/eal/include/rte_lcore_var.h
new file mode 100644
index 0000000000..ec3ab714a8
--- /dev/null
+++ b/lib/eal/include/rte_lcore_var.h
@@ -0,0 +1,385 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Ericsson AB
+ */
+
+#ifndef _RTE_LCORE_VAR_H_
+#define _RTE_LCORE_VAR_H_
+
+/**
+ * @file
+ *
+ * RTE Lcore variables
+ *
+ * This API provides a mechanism to create and access per-lcore id
+ * variables in a space- and cycle-efficient manner.
+ *
+ * A per-lcore id variable (or lcore variable for short) has one value
+ * for each EAL thread and registered non-EAL thread. There is one
+ * instance for each current and future lcore id-equipped thread, with
+ * a total of RTE_MAX_LCORE instances. The value of an lcore variable
+ * for a particular lcore id is independent from other values (for
+ * other lcore ids) within the same lcore variable.
+ *
+ * In order to access the values of an lcore variable, a handle is
+ * used. The type of the handle is a pointer to the value's type
+ * (e.g., for a @c uint32_t lcore variable, the handle is a
+ * <code>uint32_t *</code>). The handle type is used to inform the
+ * access macros of the type of the values. A handle may be passed
+ * between modules and threads just like any pointer, but its value
+ * must be treated as an opaque identifier. An allocated handle
+ * never has the value NULL.
+ *
+ * @b Creation
+ *
+ * An lcore variable is created in two steps:
+ * 1. Define an lcore variable handle by using @ref RTE_LCORE_VAR_HANDLE.
+ * 2. Allocate lcore variable storage and initialize the handle with
+ * a unique identifier by @ref RTE_LCORE_VAR_ALLOC or
+ * @ref RTE_LCORE_VAR_INIT. Allocation generally occurs at the time
+ * of module initialization, but may be done at any time.
+ *
+ * An lcore variable is not tied to the owning thread's lifetime. It's
+ * available for use by any thread immediately after having been
+ * allocated, and continues to be available throughout the lifetime of
+ * the EAL.
+ *
+ * Lcore variables cannot and need not be freed.
+ *
+ * @b Access
+ *
+ * The value of any lcore variable for any lcore id may be accessed
+ * from any thread (including unregistered threads), but it should
+ * only be *frequently* read from or written to by the owner.
+ *
+ * Values of the same lcore variable but owned by two different lcore
+ * ids may be frequently read or written by the owners without risking
+ * false sharing.
+ *
+ * An appropriate synchronization mechanism (e.g., atomic loads and
+ * stores) should be employed to assure there are no data races
+ * between the owning thread and any non-owner threads accessing the
+ * same lcore variable instance.
+ *
+ * The value of the lcore variable for a particular lcore id is
+ * accessed using @ref RTE_LCORE_VAR_LCORE_VALUE.
+ *
+ * A common pattern is for an EAL thread or a registered non-EAL
+ * thread to access its own lcore variable value. For this purpose, a
+ * short-hand exists in the form of @ref RTE_LCORE_VAR_VALUE.
+ *
+ * Although the handle (as defined by @ref RTE_LCORE_VAR_HANDLE) is a
+ * pointer with the same type as the value, it may not be directly
+ * dereferenced and must be treated as an opaque identifier.
+ *
+ * Lcore variable handles and value pointers may be freely passed
+ * between different threads.
+ *
+ * @b Storage
+ *
+ * An lcore variable's values may be of a primitive type like @c int,
+ * but would more typically be a @c struct.
+ *
+ * The lcore variable handle introduces a per-variable (not
+ * per-value/per-lcore id) overhead of @c sizeof(void *) bytes, so
+ * there are some memory footprint gains to be made by organizing all
+ * per-lcore id data for a particular module as one lcore variable
+ * (e.g., as a struct).
+ *
+ * An application may choose to define an lcore variable handle, which
+ * it then goes on to never allocate.
+ *
+ * The size of an lcore variable's value must not exceed the DPDK
+ * build-time constant @c RTE_MAX_LCORE_VAR.
+ *
+ * Lcore variables are stored in a series of lcore buffers, which
+ * are allocated from the libc heap. Heap allocation failures are
+ * treated as fatal.
+ *
+ * Lcore variables should generally *not* be @ref __rte_cache_aligned
+ * and need *not* include a @ref RTE_CACHE_GUARD field, since these
+ * constructs are designed to avoid false sharing. In the case of an
+ * lcore variable instance, the thread most recently accessing nearby
+ * data structures should almost always be the lcore variable's
+ * owner. Adding padding will increase the effective memory working
+ * set size, potentially reducing performance.
+ *
+ * Lcore variable values take on an initial value of zero.
+ *
+ * @b Example
+ *
+ * Below is an example of the use of an lcore variable:
+ *
+ * @code{.c}
+ * struct foo_lcore_state {
+ * int a;
+ * long b;
+ * };
+ *
+ * static RTE_LCORE_VAR_HANDLE(struct foo_lcore_state, lcore_states);
+ *
+ * long foo_get_a_plus_b(void)
+ * {
+ * struct foo_lcore_state *state = RTE_LCORE_VAR_VALUE(lcore_states);
+ *
+ * return state->a + state->b;
+ * }
+ *
+ * RTE_INIT(rte_foo_init)
+ * {
+ * RTE_LCORE_VAR_ALLOC(lcore_states);
+ *
+ * struct foo_lcore_state *state;
+ * RTE_LCORE_VAR_FOREACH_VALUE(state, lcore_states) {
+ * (initialize 'state')
+ * }
+ *
+ * (other initialization)
+ * }
+ * @endcode
+ *
+ *
+ * @b Alternatives
+ *
+ * Lcore variables are designed to replace a pattern exemplified below:
+ * @code{.c}
+ * struct __rte_cache_aligned foo_lcore_state {
+ * int a;
+ * long b;
+ * RTE_CACHE_GUARD;
+ * };
+ *
+ * static struct foo_lcore_state lcore_states[RTE_MAX_LCORE];
+ * @endcode
+ *
+ * This scheme is simple and effective, but has one drawback: the data
+ * is organized so that objects related to all lcores for a particular
+ * module are kept close in memory. At a bare minimum, this requires
+ * sizing data structures (e.g., using `__rte_cache_aligned`) to a
+ * whole number of cache lines to avoid false sharing. With CPU
+ * hardware prefetching and memory loads resulting from speculative
+ * execution (functions which seemingly are getting more eager faster
+ * than they are getting more intelligent), one or more "guard" cache
+ * lines may be required to separate one lcore's data from another's.
+ *
+ * Lcore variables have the upside of working with, not against, the
+ * CPU's assumptions. For example, next-line prefetchers may well
+ * work the way their designers intended (i.e., to the benefit, not
+ * detriment, of system performance).
+ *
+ * Another alternative to @ref rte_lcore_var.h is the @ref
+ * rte_per_lcore.h API, which makes use of thread-local storage (TLS,
+ * e.g., GCC __thread or C11 _Thread_local). The main differences
+ * between using the various forms of TLS (e.g., @ref
+ * RTE_DEFINE_PER_LCORE or _Thread_local) and using lcore variables
+ * are:
+ *
+ * * The existence and non-existence of a thread-local variable
+ * instance follow those of a particular thread. The data cannot be
+ * accessed before the thread has been created, nor after it has
+ * exited. As a result, thread-local variables must be initialized in
+ * a "lazy" manner (e.g., at the point of thread creation). Lcore
+ * variables may be accessed immediately after having been
+ * allocated (which may be before any thread beyond the main
+ * thread is running).
+ * * A thread-local variable is duplicated across all threads in the
+ * process, including unregistered non-EAL threads (i.e.,
+ * "regular" threads). For DPDK applications heavily relying on
+ * multi-threading (in conjunction with DPDK's "one thread per core"
+ * pattern), either by having many concurrent threads or
+ * creating/destroying threads at a high rate, an excessive use of
+ * thread-local variables may cause inefficiencies (e.g.,
+ * increased thread creation overhead due to thread-local storage
+ * initialization or increased total RAM footprint). Lcore
+ * variables *only* exist for threads with an lcore id.
+ * * Whether data in thread-local storage may be shared between
+ * threads (i.e., whether a pointer to a thread-local variable can
+ * be passed to and successfully dereferenced by a non-owning
+ * thread) depends on the details of the TLS implementation. With
+ * GCC __thread and GCC _Thread_local, such data sharing is
+ * supported. In the C11 standard, the result of accessing another
+ * thread's _Thread_local object is implementation-defined. Lcore
+ * variable instances may be accessed reliably by any thread.
+ */
+
+#include <stddef.h>
+#include <stdalign.h>
+
+#include <rte_common.h>
+#include <rte_config.h>
+#include <rte_lcore.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Given the lcore variable type, produces the type of the lcore
+ * variable handle.
+ */
+#define RTE_LCORE_VAR_HANDLE_TYPE(type) \
+ type *
+
+/**
+ * Define an lcore variable handle.
+ *
+ * This macro defines a variable which is used as a handle to access
+ * the various instances of a per-lcore id variable.
+ *
+ * The aim with this macro is to make clear at the point of
+ * declaration that this is an lcore handle, rather than a regular
+ * pointer.
+ *
+ * Add @b static as a prefix in case the lcore variable is only to be
+ * accessed from a particular translation unit.
+ */
+#define RTE_LCORE_VAR_HANDLE(type, name) \
+ RTE_LCORE_VAR_HANDLE_TYPE(type) name
+
+/**
+ * Allocate space for an lcore variable, and initialize its handle.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, align) \
+ handle = rte_lcore_var_alloc(size, align)
+
+/**
+ * Allocate space for an lcore variable, and initialize its handle,
+ * with values aligned for any type of object.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_ALLOC_SIZE(handle, size) \
+ RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, 0)
+
+/**
+ * Allocate space for an lcore variable of the size and alignment requirements
+ * suggested by the handle pointer type, and initialize its handle.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_ALLOC(handle) \
+ RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, sizeof(*(handle)), \
+ alignof(typeof(*(handle))))
+
+/**
+ * Allocate an explicitly-sized, explicitly-aligned lcore variable by
+ * means of a @ref RTE_INIT constructor.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, align) \
+ RTE_INIT(rte_lcore_var_init_ ## name) \
+ { \
+ RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(name, size, align); \
+ }
+
+/**
+ * Allocate an explicitly-sized lcore variable by means of a @ref
+ * RTE_INIT constructor.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_INIT_SIZE(name, size) \
+ RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, 0)
+
+/**
+ * Allocate an lcore variable by means of a @ref RTE_INIT constructor.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_INIT(name) \
+ RTE_INIT(rte_lcore_var_init_ ## name) \
+ { \
+ RTE_LCORE_VAR_ALLOC(name); \
+ }
+
+/**
+ * Get void pointer to lcore variable instance with the specified
+ * lcore id.
+ *
+ * @param lcore_id
+ * The lcore id specifying which of the @c RTE_MAX_LCORE value
+ * instances should be accessed. The lcore id need not be valid
+ * (e.g., may be @ref LCORE_ID_ANY), but in such a case, the pointer
+ * is also not valid (and thus should not be dereferenced).
+ * @param handle
+ * The lcore variable handle.
+ */
+static inline void *
+rte_lcore_var_lcore_ptr(unsigned int lcore_id, void *handle)
+{
+ return RTE_PTR_ADD(handle, lcore_id * RTE_MAX_LCORE_VAR);
+}
+
+/**
+ * Get pointer to lcore variable instance with the specified lcore id.
+ *
+ * @param lcore_id
+ * The lcore id specifying which of the @c RTE_MAX_LCORE value
+ * instances should be accessed. The lcore id need not be valid
+ * (e.g., may be @ref LCORE_ID_ANY), but in such a case, the pointer
+ * is also not valid (and thus should not be dereferenced).
+ * @param handle
+ * The lcore variable handle.
+ */
+#define RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle) \
+ ((typeof(handle))rte_lcore_var_lcore_ptr(lcore_id, handle))
+
+/**
+ * Get pointer to lcore variable instance of the current thread.
+ *
+ * May only be used by EAL threads and registered non-EAL threads.
+ */
+#define RTE_LCORE_VAR_VALUE(handle) \
+ RTE_LCORE_VAR_LCORE_VALUE(rte_lcore_id(), handle)
+
+/**
+ * Iterate over each lcore id's value for an lcore variable.
+ *
+ * @param value
+ * A pointer successively set to point to lcore variable value
+ * corresponding to every lcore id (up to @c RTE_MAX_LCORE).
+ * @param handle
+ * The lcore variable handle.
+ */
+#define RTE_LCORE_VAR_FOREACH_VALUE(value, handle) \
+ for (unsigned int lcore_id = \
+ (((value) = RTE_LCORE_VAR_LCORE_VALUE(0, handle)), 0); \
+ lcore_id < RTE_MAX_LCORE; \
+ lcore_id++, (value) = RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle))
+
+/**
+ * Allocate space in the per-lcore id buffers for an lcore variable.
+ *
+ * The pointer returned is only an opaque identifier of the variable. To
+ * get an actual pointer to a particular instance of the variable use
+ * @ref RTE_LCORE_VAR_VALUE or @ref RTE_LCORE_VAR_LCORE_VALUE.
+ *
+ * The lcore variable values' memory is set to zero.
+ *
+ * The allocation is always successful, barring a fatal exhaustion of
+ * the per-lcore id buffer space.
+ *
+ * rte_lcore_var_alloc() is not multi-thread safe.
+ *
+ * @param size
+ * The size (in bytes) of the variable's per-lcore id value. Must be > 0.
+ * @param align
+ * If 0, the values will be suitably aligned for any kind of type
+ * (i.e., alignof(max_align_t)). Otherwise, the values will be aligned
+ * on a multiple of *align*, which must be a power of 2 and less than
+ * or equal to @c RTE_CACHE_LINE_SIZE.
+ * @return
+ * The variable's handle, stored in a void pointer value. The value
+ * is always non-NULL.
+ */
+__rte_experimental
+void *
+rte_lcore_var_alloc(size_t size, size_t align);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_LCORE_VAR_H_ */
diff --git a/lib/eal/version.map b/lib/eal/version.map
index e3ff412683..0c80bf7331 100644
--- a/lib/eal/version.map
+++ b/lib/eal/version.map
@@ -396,6 +396,8 @@ EXPERIMENTAL {
# added in 24.03
rte_vfio_get_device_info; # WINDOWS_NO_EXPORT
+
+ rte_lcore_var_alloc;
};
INTERNAL {
--
2.34.1
^ permalink raw reply [flat|nested] 323+ messages in thread
* [PATCH v4 0/7] Lcore variables
2024-09-12 8:44 ` [PATCH v3 1/7] eal: add static per-lcore memory allocation facility Mattias Rönnblom
@ 2024-09-16 10:52 ` Mattias Rönnblom
2024-09-16 10:52 ` [PATCH v4 1/7] eal: add static per-lcore memory allocation facility Mattias Rönnblom
` (6 more replies)
0 siblings, 7 replies; 323+ messages in thread
From: Mattias Rönnblom @ 2024-09-16 10:52 UTC (permalink / raw)
To: dev
Cc: hofors, Morten Brørup, Stephen Hemminger,
Konstantin Ananyev, David Marchand, Jerin Jacob,
Mattias Rönnblom
This patch set introduces a new API <rte_lcore_var.h> for static
per-lcore id data allocation.
Please refer to the <rte_lcore_var.h> API documentation for both a
rationale for this new API, and a comparison to the alternatives
available.
The adoption of this API would affect many different DPDK modules, but
the author updated only a few, mostly to serve as examples in this
RFC, and to iron out some, but surely not all, wrinkles in the API.
The question of how best to allocate static per-lcore memory has come
up several times on the dev mailing list, for example in the thread on
the "random: use per lcore state" RFC by Stephen Hemminger.
Lcore variables are surely not the answer to all your per-lcore-data
needs, since they only allow for more-or-less static allocation. In the
author's opinion, they do however provide a reasonably simple, clean,
and seemingly very performant solution to a real problem.
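For readers who want a feel for the mechanism before opening the patch,
the layout can be modelled in a few lines of standalone C. This is only
a sketch of the idea, not the code in eal_common_lcore_var.c: the
model_* functions and MODEL_* constants are made up for illustration
(MODEL_MAX_LCORE and MODEL_LCORE_STRIDE stand in for RTE_MAX_LCORE and
RTE_MAX_LCORE_VAR), and plain libc calls replace the EAL helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for RTE_MAX_LCORE and RTE_MAX_LCORE_VAR. */
#define MODEL_MAX_LCORE 4
#define MODEL_LCORE_STRIDE 1024
#define MODEL_CACHE_LINE 64

static unsigned char *model_buffer;
/* Start "full", so that the first allocation creates a buffer. */
static size_t model_offset = MODEL_LCORE_STRIDE;

/* Bump-allocate 'size' bytes at the same offset in every lcore's slice. */
static void *
model_lcore_var_alloc(size_t size, size_t align)
{
	unsigned char *handle;
	unsigned int i;

	assert(size > 0 && size <= MODEL_LCORE_STRIDE);
	/* align must be a non-zero power of 2 */
	assert(align != 0 && (align & (align - 1)) == 0);

	model_offset = (model_offset + align - 1) & ~(align - 1);

	if (model_offset + size > MODEL_LCORE_STRIDE) {
		/* Current buffer exhausted; older handles stay valid. */
		model_buffer = aligned_alloc(MODEL_CACHE_LINE,
				MODEL_LCORE_STRIDE * MODEL_MAX_LCORE);
		assert(model_buffer != NULL);
		model_offset = 0;
	}

	/* The handle is the address of lcore 0's value. */
	handle = model_buffer + model_offset;
	model_offset += size;

	for (i = 0; i < MODEL_MAX_LCORE; i++)
		memset(handle + i * MODEL_LCORE_STRIDE, 0, size);

	return handle;
}

/* The value for a given lcore id sits a fixed stride from the handle. */
static void *
model_lcore_var_ptr(unsigned int lcore_id, void *handle)
{
	return (unsigned char *)handle +
		(size_t)lcore_id * MODEL_LCORE_STRIDE;
}
```

Two variables allocated back to back end up next to each other within
each lcore's slice, which is the spatial locality the API is after.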
Mattias Rönnblom (7):
eal: add static per-lcore memory allocation facility
eal: add lcore variable functional tests
eal: add lcore variable performance test
random: keep PRNG state in lcore variable
power: keep per-lcore state in lcore variable
service: keep per-lcore state in lcore variable
eal: keep per-lcore power intrinsics state in lcore variable
MAINTAINERS | 6 +
app/test/meson.build | 2 +
app/test/test_lcore_var.c | 432 +++++++++++++++++++++++++
app/test/test_lcore_var_perf.c | 244 ++++++++++++++
config/rte_config.h | 1 +
doc/api/doxy-api-index.md | 1 +
doc/guides/rel_notes/release_24_11.rst | 14 +
lib/eal/common/eal_common_lcore_var.c | 78 +++++
lib/eal/common/meson.build | 1 +
lib/eal/common/rte_random.c | 28 +-
lib/eal/common/rte_service.c | 115 ++++---
lib/eal/include/meson.build | 1 +
lib/eal/include/rte_lcore_var.h | 385 ++++++++++++++++++++++
lib/eal/version.map | 2 +
lib/eal/x86/rte_power_intrinsics.c | 17 +-
lib/power/rte_power_pmd_mgmt.c | 34 +-
16 files changed, 1274 insertions(+), 87 deletions(-)
create mode 100644 app/test/test_lcore_var.c
create mode 100644 app/test/test_lcore_var_perf.c
create mode 100644 lib/eal/common/eal_common_lcore_var.c
create mode 100644 lib/eal/include/rte_lcore_var.h
--
2.34.1
^ permalink raw reply [flat|nested] 323+ messages in thread
* [PATCH v4 1/7] eal: add static per-lcore memory allocation facility
2024-09-16 10:52 ` [PATCH v4 0/7] Lcore variables Mattias Rönnblom
@ 2024-09-16 10:52 ` Mattias Rönnblom
2024-09-16 14:02 ` Konstantin Ananyev
2024-09-17 14:32 ` [PATCH v5 0/7] Lcore variables Mattias Rönnblom
2024-09-16 10:52 ` [PATCH v4 2/7] eal: add lcore variable functional tests Mattias Rönnblom
` (5 subsequent siblings)
6 siblings, 2 replies; 323+ messages in thread
From: Mattias Rönnblom @ 2024-09-16 10:52 UTC (permalink / raw)
To: dev
Cc: hofors, Morten Brørup, Stephen Hemminger,
Konstantin Ananyev, David Marchand, Jerin Jacob,
Mattias Rönnblom
Introduce DPDK per-lcore id variables, or lcore variables for short.
An lcore variable has one value for every current and future lcore
id-equipped thread.
The primary <rte_lcore_var.h> use case is for statically allocating
small, frequently-accessed data structures, for which one instance
should exist for each lcore.
Lcore variables are similar to thread-local storage (TLS, e.g., C11
_Thread_local), but decouple the values' lifetime from that of the
threads.
Lcore variables are also similar in functionality to the FreeBSD
kernel's DPCPU_*() family of macros and the associated build-time
machinery. DPCPU uses linker scripts, which effectively prevents the
reuse of its otherwise seemingly viable approach.
The currently-prevailing way to solve the same problem as lcore
variables is to keep a module's per-lcore data as an RTE_MAX_LCORE-sized
array of cache-aligned, RTE_CACHE_GUARDed structs. The benefit of
lcore variables over this approach is that data related to the same
lcore is now close (spatially, in memory), rather than data used by
the same module, which in turn avoids excessive use of padding that
pollutes caches with unused data.
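To put a rough number on the padding argument, the sketch below
compares a small per-lcore struct with the same data laid out in the
cache-aligned, guarded style. It is an illustration only: the demo_*
names are hypothetical, DEMO_CACHE_LINE is an assumed 64-byte line
size, and standard C11 alignas is used in place of the DPDK macros.

```c
#include <stdalign.h>
#include <stddef.h>

#define DEMO_CACHE_LINE 64 /* assumed cache line size */

/* What the module actually needs per lcore. */
struct demo_state {
	int a;
	long b;
};

/* Same data, cache-aligned and followed by one guard cache line. */
struct demo_state_padded {
	alignas(DEMO_CACHE_LINE) int a;
	long b;
	alignas(DEMO_CACHE_LINE) char guard[DEMO_CACHE_LINE];
};

/* Bytes per lcore that carry no data in the padded layout. */
static size_t
demo_padding_per_lcore(void)
{
	return sizeof(struct demo_state_padded) -
		sizeof(struct demo_state);
}
```

The padded struct occupies two full cache lines per lcore regardless of
how little data it holds, and that cost is paid once per module.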
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
--
PATCH v2:
* Add Windows support. (Morten Brørup)
* Fix lcore variables API index reference. (Morten Brørup)
* Various improvements of the API documentation. (Morten Brørup)
* Elimination of unused symbol in version.map. (Morten Brørup)
PATCH:
* Update MAINTAINERS and release notes.
* Stop covering included files in extern "C" {}.
RFC v6:
* Include <stdlib.h> to get aligned_alloc().
* Tweak documentation (grammar).
* Provide API-level guarantees that lcore variable values take on an
initial value of zero.
* Fix misplaced __rte_cache_aligned in the API doc example.
RFC v5:
* In Doxygen, consistently use @<cmd> (and not \<cmd>).
* The RTE_LCORE_VAR_GET() and SET() convenience access macros
covered an uncommon use case, where the lcore value is of a
primitive type, rather than a struct, and are thus eliminated
from the API. (Morten Brørup)
* In the wake of the GET()/SET() removal, rename RTE_LCORE_VAR_PTR()
to RTE_LCORE_VAR_VALUE().
* The underscores are removed from __rte_lcore_var_lcore_ptr() to
signal that this function is a part of the public API.
* Macro arguments are documented.
RFC v4:
* Replace large static array with libc heap-allocated memory. One
implication of this change is there no longer exists a fixed upper
bound for the total amount of memory used by lcore variables.
RTE_MAX_LCORE_VAR has changed meaning, and now represents the
maximum size of any individual lcore variable value.
* Fix issues in example. (Morten Brørup)
* Improve access macro type checking. (Morten Brørup)
* Refer to the lcore variable handle as "handle" and not "name" in
various macros.
* Document lack of thread safety in rte_lcore_var_alloc().
* Provide API-level assurance the lcore variable handle is
always non-NULL, to allow applications to use NULL to mean
"not yet allocated".
* Note zero-sized allocations are not allowed.
* Give API-level guarantee the lcore variable values are zeroed.
RFC v3:
* Replace use of GCC-specific alignof(<expression>) with alignof(<type>).
* Update example to reflect FOREACH macro name change (in RFC v2).
RFC v2:
* Use alignof to derive alignment requirements. (Morten Brørup)
* Change name of FOREACH to make it distinct from <rte_lcore.h>'s
*per-EAL-thread* RTE_LCORE_FOREACH(). (Morten Brørup)
* Allow user-specified alignment, but limit max to cache line size.
---
MAINTAINERS | 6 +
config/rte_config.h | 1 +
doc/api/doxy-api-index.md | 1 +
doc/guides/rel_notes/release_24_11.rst | 14 +
lib/eal/common/eal_common_lcore_var.c | 78 +++++
lib/eal/common/meson.build | 1 +
lib/eal/include/meson.build | 1 +
lib/eal/include/rte_lcore_var.h | 385 +++++++++++++++++++++++++
lib/eal/version.map | 2 +
9 files changed, 489 insertions(+)
create mode 100644 lib/eal/common/eal_common_lcore_var.c
create mode 100644 lib/eal/include/rte_lcore_var.h
diff --git a/MAINTAINERS b/MAINTAINERS
index c5a703b5c0..362d9a3f28 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -282,6 +282,12 @@ F: lib/eal/include/rte_random.h
F: lib/eal/common/rte_random.c
F: app/test/test_rand_perf.c
+Lcore Variables
+M: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
+F: lib/eal/include/rte_lcore_var.h
+F: lib/eal/common/eal_common_lcore_var.c
+F: app/test/test_lcore_var.c
+
ARM v7
M: Wathsala Vithanage <wathsala.vithanage@arm.com>
F: config/arm/
diff --git a/config/rte_config.h b/config/rte_config.h
index dd7bb0d35b..311692e498 100644
--- a/config/rte_config.h
+++ b/config/rte_config.h
@@ -41,6 +41,7 @@
/* EAL defines */
#define RTE_CACHE_GUARD_LINES 1
#define RTE_MAX_HEAPS 32
+#define RTE_MAX_LCORE_VAR 1048576
#define RTE_MAX_MEMSEG_LISTS 128
#define RTE_MAX_MEMSEG_PER_LIST 8192
#define RTE_MAX_MEM_MB_PER_LIST 32768
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index f9f0300126..ed577f14ee 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -99,6 +99,7 @@ The public API headers are grouped by topics:
[interrupts](@ref rte_interrupts.h),
[launch](@ref rte_launch.h),
[lcore](@ref rte_lcore.h),
+ [lcore variables](@ref rte_lcore_var.h),
[per-lcore](@ref rte_per_lcore.h),
[service cores](@ref rte_service.h),
[keepalive](@ref rte_keepalive.h),
diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
index 0ff70d9057..a3884f7491 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -55,6 +55,20 @@ New Features
Also, make sure to start the actual text at the margin.
=======================================================
+* **Added EAL per-lcore static memory allocation facility.**
+
+ Added EAL API <rte_lcore_var.h> for statically allocating small,
+ frequently-accessed data structures, for which one instance should
+ exist for each EAL thread and registered non-EAL thread.
+
+ With lcore variables, data is organized spatially on a per-lcore id
+ basis, rather than per library or PMD, avoiding the need for cache
+ aligning (or RTE_CACHE_GUARDing) data structures, which in turn
+ reduces CPU cache internal fragmentation, improving performance.
+
+ Lcore variables are similar to thread-local storage (TLS, e.g.,
+ C11 _Thread_local), but decoupling the values' life time from that
+ of the threads.
Removed Items
-------------
diff --git a/lib/eal/common/eal_common_lcore_var.c b/lib/eal/common/eal_common_lcore_var.c
new file mode 100644
index 0000000000..309822039b
--- /dev/null
+++ b/lib/eal/common/eal_common_lcore_var.c
@@ -0,0 +1,78 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Ericsson AB
+ */
+
+#include <inttypes.h>
+#include <stdlib.h>
+
+#ifdef RTE_EXEC_ENV_WINDOWS
+#include <malloc.h>
+#endif
+
+#include <rte_common.h>
+#include <rte_debug.h>
+#include <rte_log.h>
+
+#include <rte_lcore_var.h>
+
+#include "eal_private.h"
+
+#define LCORE_BUFFER_SIZE (RTE_MAX_LCORE_VAR * RTE_MAX_LCORE)
+
+static void *lcore_buffer;
+static size_t offset = RTE_MAX_LCORE_VAR;
+
+static void *
+lcore_var_alloc(size_t size, size_t align)
+{
+ void *handle;
+ void *value;
+
+ offset = RTE_ALIGN_CEIL(offset, align);
+
+ if (offset + size > RTE_MAX_LCORE_VAR) {
+#ifdef RTE_EXEC_ENV_WINDOWS
+ lcore_buffer = _aligned_malloc(LCORE_BUFFER_SIZE,
+ RTE_CACHE_LINE_SIZE);
+#else
+ lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE,
+ LCORE_BUFFER_SIZE);
+#endif
+ RTE_VERIFY(lcore_buffer != NULL);
+
+ offset = 0;
+ }
+
+ handle = RTE_PTR_ADD(lcore_buffer, offset);
+
+ offset += size;
+
+ RTE_LCORE_VAR_FOREACH_VALUE(value, handle)
+ memset(value, 0, size);
+
+ EAL_LOG(DEBUG, "Allocated %"PRIuPTR" bytes of per-lcore data with a "
+ "%"PRIuPTR"-byte alignment", size, align);
+
+ return handle;
+}
+
+void *
+rte_lcore_var_alloc(size_t size, size_t align)
+{
+ /* Having the per-lcore buffer size aligned on cache lines,
+ * as well as having the base pointer aligned on cache line
+ * size, assures that aligned offsets also translate to aligned
+ * pointers across all values.
+ */
+ RTE_BUILD_BUG_ON(RTE_MAX_LCORE_VAR % RTE_CACHE_LINE_SIZE != 0);
+ RTE_ASSERT(align <= RTE_CACHE_LINE_SIZE);
+ RTE_ASSERT(size <= RTE_MAX_LCORE_VAR);
+
+ /* '0' means asking for worst-case alignment requirements */
+ if (align == 0)
+ align = alignof(max_align_t);
+
+ RTE_ASSERT(rte_is_power_of_2(align));
+
+ return lcore_var_alloc(size, align);
+}
diff --git a/lib/eal/common/meson.build b/lib/eal/common/meson.build
index 22a626ba6f..d41403680b 100644
--- a/lib/eal/common/meson.build
+++ b/lib/eal/common/meson.build
@@ -18,6 +18,7 @@ sources += files(
'eal_common_interrupts.c',
'eal_common_launch.c',
'eal_common_lcore.c',
+ 'eal_common_lcore_var.c',
'eal_common_mcfg.c',
'eal_common_memalloc.c',
'eal_common_memory.c',
diff --git a/lib/eal/include/meson.build b/lib/eal/include/meson.build
index e94b056d46..9449253e23 100644
--- a/lib/eal/include/meson.build
+++ b/lib/eal/include/meson.build
@@ -27,6 +27,7 @@ headers += files(
'rte_keepalive.h',
'rte_launch.h',
'rte_lcore.h',
+ 'rte_lcore_var.h',
'rte_lock_annotations.h',
'rte_malloc.h',
'rte_mcslock.h',
diff --git a/lib/eal/include/rte_lcore_var.h b/lib/eal/include/rte_lcore_var.h
new file mode 100644
index 0000000000..ec3ab714a8
--- /dev/null
+++ b/lib/eal/include/rte_lcore_var.h
@@ -0,0 +1,385 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Ericsson AB
+ */
+
+#ifndef _RTE_LCORE_VAR_H_
+#define _RTE_LCORE_VAR_H_
+
+/**
+ * @file
+ *
+ * RTE Lcore variables
+ *
+ * This API provides a mechanism to create and access per-lcore id
+ * variables in a space- and cycle-efficient manner.
+ *
+ * A per-lcore id variable (or lcore variable for short) has one value
+ * for each EAL thread and registered non-EAL thread. There is one
+ * instance for each current and future lcore id-equipped thread, with
+ * a total of RTE_MAX_LCORE instances. The value of an lcore variable
+ * for a particular lcore id is independent from other values (for
+ * other lcore ids) within the same lcore variable.
+ *
+ * In order to access the values of an lcore variable, a handle is
+ * used. The type of the handle is a pointer to the value's type
+ * (e.g., for a @c uint32_t lcore variable, the handle is a
+ * <code>uint32_t *</code>). The handle type is used to inform the
+ * access macros of the type of the values. A handle may be passed
+ * between modules and threads just like any pointer, but its value
+ * must be treated as an opaque identifier. An allocated handle
+ * never has the value NULL.
+ *
+ * @b Creation
+ *
+ * An lcore variable is created in two steps:
+ * 1. Define an lcore variable handle by using @ref RTE_LCORE_VAR_HANDLE.
+ * 2. Allocate lcore variable storage and initialize the handle with
+ * a unique identifier by @ref RTE_LCORE_VAR_ALLOC or
+ * @ref RTE_LCORE_VAR_INIT. Allocation generally occurs at the time of
+ * module initialization, but may be done at any time.
+ *
+ * An lcore variable is not tied to the owning thread's lifetime. It's
+ * available for use by any thread immediately after having been
+ * allocated, and continues to be available throughout the lifetime of
+ * the EAL.
+ *
+ * Lcore variables cannot and need not be freed.
+ *
+ * @b Access
+ *
+ * The value of any lcore variable for any lcore id may be accessed
+ * from any thread (including unregistered threads), but it should
+ * only be *frequently* read from or written to by the owner.
+ *
+ * Values of the same lcore variable but owned by two different lcore
+ * ids may be frequently read or written by the owners without risking
+ * false sharing.
+ *
+ * An appropriate synchronization mechanism (e.g., atomic loads and
+ * stores) should be employed to assure there are no data races between
+ * the owning thread and any non-owner threads accessing the same
+ * lcore variable instance.
+ *
+ * The value of the lcore variable for a particular lcore id is
+ * accessed using @ref RTE_LCORE_VAR_LCORE_VALUE.
+ *
+ * A common pattern is for an EAL thread or a registered non-EAL
+ * thread to access its own lcore variable value. For this purpose, a
+ * short-hand exists in the form of @ref RTE_LCORE_VAR_VALUE.
+ *
+ * Although the handle (as defined by @ref RTE_LCORE_VAR_HANDLE) is a
+ * pointer with the same type as the value, it may not be directly
+ * dereferenced and must be treated as an opaque identifier.
+ *
+ * Lcore variable handles and value pointers may be freely passed
+ * between different threads.
+ *
+ * @b Storage
+ *
+ * An lcore variable's values may be of a primitive type like @c int,
+ * but would more typically be a @c struct.
+ *
+ * The lcore variable handle introduces a per-variable (not
+ * per-value/per-lcore id) overhead of @c sizeof(void *) bytes, so
+ * there are some memory footprint gains to be made by organizing all
+ * per-lcore id data for a particular module as one lcore variable
+ * (e.g., as a struct).
+ *
+ * An application may choose to define an lcore variable handle, which
+ * it then goes on to never allocate.
+ *
+ * The size of an lcore variable's value must be less than the DPDK
+ * build-time constant @c RTE_MAX_LCORE_VAR.
+ *
+ * Lcore variables are stored in a series of lcore buffers, which
+ * are allocated from the libc heap. Heap allocation failures are
+ * treated as fatal.
+ *
+ * Lcore variables should generally *not* be @ref __rte_cache_aligned
+ * and need *not* include a @ref RTE_CACHE_GUARD field, since these
+ * constructs are designed to avoid false sharing. In the
+ * case of an lcore variable instance, the thread most recently
+ * accessing nearby data structures should almost-always be the lcore
+ * variables' owner. Adding padding will increase the effective memory
+ * working set size, potentially reducing performance.
+ *
+ * Lcore variable values take on an initial value of zero.
+ *
+ * @b Example
+ *
+ * Below is an example of the use of an lcore variable:
+ *
+ * @code{.c}
+ * struct foo_lcore_state {
+ * int a;
+ * long b;
+ * };
+ *
+ * static RTE_LCORE_VAR_HANDLE(struct foo_lcore_state, lcore_states);
+ *
+ * long foo_get_a_plus_b(void)
+ * {
+ * struct foo_lcore_state *state = RTE_LCORE_VAR_VALUE(lcore_states);
+ *
+ * return state->a + state->b;
+ * }
+ *
+ * RTE_INIT(rte_foo_init)
+ * {
+ * RTE_LCORE_VAR_ALLOC(lcore_states);
+ *
+ * struct foo_lcore_state *state;
+ * RTE_LCORE_VAR_FOREACH_VALUE(state, lcore_states) {
+ * (initialize 'state')
+ * }
+ *
+ * (other initialization)
+ * }
+ * @endcode
+ *
+ *
+ * @b Alternatives
+ *
+ * Lcore variables are designed to replace a pattern exemplified below:
+ * @code{.c}
+ * struct __rte_cache_aligned foo_lcore_state {
+ * int a;
+ * long b;
+ * RTE_CACHE_GUARD;
+ * };
+ *
+ * static struct foo_lcore_state lcore_states[RTE_MAX_LCORE];
+ * @endcode
+ *
+ * This scheme is simple and effective, but has one drawback: the data
+ * is organized so that objects related to all lcores for a particular
+ * module are kept close in memory. At a bare minimum, this requires
+ * sizing data structures (e.g., using `__rte_cache_aligned`) to a
+ * whole number of cache lines to avoid false sharing. With CPU
+ * hardware prefetching and memory loads resulting from speculative
+ * execution (functions which seemingly are getting more eager faster
+ * than they are getting more intelligent), one or more "guard" cache
+ * lines may be required to separate one lcore's data from another's.
+ *
+ * Lcore variables have the upside of working with, not against, the
+ * CPU's assumptions and for example next-line prefetchers may well
+ * work the way their designers intended (i.e., to the benefit, not
+ * detriment, of system performance).
+ *
+ * Another alternative to @ref rte_lcore_var.h is the @ref
+ * rte_per_lcore.h API, which makes use of thread-local storage (TLS,
+ * e.g., GCC __thread or C11 _Thread_local). The main differences
+ * between using the various forms of TLS (e.g., @ref
+ * RTE_DEFINE_PER_LCORE or _Thread_local) and the use of lcore
+ * variables are:
+ *
+ * * The existence and non-existence of a thread-local variable
+ * instance follows that of a particular thread. The data cannot be
+ * accessed before the thread has been created, nor after it has
+ * exited. As a result, thread-local variables must be initialized in
+ * a "lazy" manner (e.g., at the point of thread creation). Lcore
+ * variables may be accessed immediately after having been
+ * allocated (which may be before any thread other than the main
+ * thread is running).
+ * * A thread-local variable is duplicated across all threads in the
+ * process, including unregistered non-EAL threads (i.e.,
+ * "regular" threads). For DPDK applications heavily relying on
+ * multi-threading (in conjunction with DPDK's "one thread per core"
+ * pattern), either by having many concurrent threads or
+ * creating/destroying threads at a high rate, an excessive use of
+ * thread-local variables may cause inefficiencies (e.g.,
+ * increased thread creation overhead due to thread-local storage
+ * initialization or increased total RAM footprint usage). Lcore
+ * variables *only* exist for threads with an lcore id.
+ * * Whether data in thread-local storage may be shared between threads
+ * (i.e., whether a pointer to a thread-local variable can be passed to
+ * and successfully dereferenced by a non-owning thread) depends on
+ * the details of the TLS implementation. With GCC __thread and
+ * GCC _Thread_local, such data sharing is supported. In the C11
+ * standard, the result of accessing another thread's
+ * _Thread_local object is implementation-defined. Lcore variable
+ * instances may be accessed reliably by any thread.
+ */
+
+#include <stddef.h>
+#include <stdalign.h>
+
+#include <rte_common.h>
+#include <rte_config.h>
+#include <rte_lcore.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Given the lcore variable type, produces the type of the lcore
+ * variable handle.
+ */
+#define RTE_LCORE_VAR_HANDLE_TYPE(type) \
+ type *
+
+/**
+ * Define an lcore variable handle.
+ *
+ * This macro defines a variable which is used as a handle to access
+ * the various instances of a per-lcore id variable.
+ *
+ * The aim with this macro is to make clear at the point of
+ * declaration that this is an lcore handle, rather than a regular
+ * pointer.
+ *
+ * Add @b static as a prefix in case the lcore variable is only to be
+ * accessed from a particular translation unit.
+ */
+#define RTE_LCORE_VAR_HANDLE(type, name) \
+ RTE_LCORE_VAR_HANDLE_TYPE(type) name
+
+/**
+ * Allocate space for an lcore variable, and initialize its handle.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, align) \
+ handle = rte_lcore_var_alloc(size, align)
+
+/**
+ * Allocate space for an lcore variable, and initialize its handle,
+ * with values aligned for any type of object.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_ALLOC_SIZE(handle, size) \
+ RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, 0)
+
+/**
+ * Allocate space for an lcore variable of the size and alignment requirements
+ * suggested by the handle pointer type, and initialize its handle.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_ALLOC(handle) \
+ RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, sizeof(*(handle)), \
+ alignof(typeof(*(handle))))
+
+/**
+ * Allocate an explicitly-sized, explicitly-aligned lcore variable by
+ * means of a @ref RTE_INIT constructor.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, align) \
+ RTE_INIT(rte_lcore_var_init_ ## name) \
+ { \
+ RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(name, size, align); \
+ }
+
+/**
+ * Allocate an explicitly-sized lcore variable by means of a @ref
+ * RTE_INIT constructor.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_INIT_SIZE(name, size) \
+ RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, 0)
+
+/**
+ * Allocate an lcore variable by means of a @ref RTE_INIT constructor.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_INIT(name) \
+ RTE_INIT(rte_lcore_var_init_ ## name) \
+ { \
+ RTE_LCORE_VAR_ALLOC(name); \
+ }
+
+/**
+ * Get void pointer to lcore variable instance with the specified
+ * lcore id.
+ *
+ * @param lcore_id
+ * The lcore id specifying which of the @c RTE_MAX_LCORE value
+ * instances should be accessed. The lcore id need not be valid
+ * (e.g., may be @ref LCORE_ID_ANY), but in such a case, the pointer
+ * is also not valid (and thus should not be dereferenced).
+ * @param handle
+ * The lcore variable handle.
+ */
+static inline void *
+rte_lcore_var_lcore_ptr(unsigned int lcore_id, void *handle)
+{
+ return RTE_PTR_ADD(handle, lcore_id * RTE_MAX_LCORE_VAR);
+}
+
+/**
+ * Get pointer to lcore variable instance with the specified lcore id.
+ *
+ * @param lcore_id
+ * The lcore id specifying which of the @c RTE_MAX_LCORE value
+ * instances should be accessed. The lcore id need not be valid
+ * (e.g., may be @ref LCORE_ID_ANY), but in such a case, the pointer
+ * is also not valid (and thus should not be dereferenced).
+ * @param handle
+ * The lcore variable handle.
+ */
+#define RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle) \
+ ((typeof(handle))rte_lcore_var_lcore_ptr(lcore_id, handle))
+
+/**
+ * Get pointer to lcore variable instance of the current thread.
+ *
+ * May only be used by EAL threads and registered non-EAL threads.
+ */
+#define RTE_LCORE_VAR_VALUE(handle) \
+ RTE_LCORE_VAR_LCORE_VALUE(rte_lcore_id(), handle)
+
+/**
+ * Iterate over each lcore id's value for an lcore variable.
+ *
+ * @param value
+ * A pointer successively set to point to lcore variable value
+ * corresponding to every lcore id (up to @c RTE_MAX_LCORE).
+ * @param handle
+ * The lcore variable handle.
+ */
+#define RTE_LCORE_VAR_FOREACH_VALUE(value, handle) \
+ for (unsigned int lcore_id = \
+ (((value) = RTE_LCORE_VAR_LCORE_VALUE(0, handle)), 0); \
+ lcore_id < RTE_MAX_LCORE; \
+ lcore_id++, (value) = RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle))
+
+/**
+ * Allocate space in the per-lcore id buffers for an lcore variable.
+ *
+ * The pointer returned is only an opaque identifier of the variable. To
+ * get an actual pointer to a particular instance of the variable use
+ * @ref RTE_LCORE_VAR_VALUE or @ref RTE_LCORE_VAR_LCORE_VALUE.
+ *
+ * The lcore variable values' memory is set to zero.
+ *
+ * The allocation is always successful, barring a fatal exhaustion of
+ * the per-lcore id buffer space.
+ *
+ * rte_lcore_var_alloc() is not multi-thread safe.
+ *
+ * @param size
+ * The size (in bytes) of the variable's per-lcore id value. Must be > 0.
+ * @param align
+ * If 0, the values will be suitably aligned for any kind of type
+ * (i.e., alignof(max_align_t)). Otherwise, the values will be aligned
+ * on a multiple of *align*, which must be a power of 2 and less than
+ * or equal to @c RTE_CACHE_LINE_SIZE.
+ * @return
+ * The variable's handle, stored in a void pointer value. The value
+ * is always non-NULL.
+ */
+__rte_experimental
+void *
+rte_lcore_var_alloc(size_t size, size_t align);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_LCORE_VAR_H_ */
diff --git a/lib/eal/version.map b/lib/eal/version.map
index e3ff412683..0c80bf7331 100644
--- a/lib/eal/version.map
+++ b/lib/eal/version.map
@@ -396,6 +396,8 @@ EXPERIMENTAL {
# added in 24.03
rte_vfio_get_device_info; # WINDOWS_NO_EXPORT
+
+ rte_lcore_var_alloc;
};
INTERNAL {
--
2.34.1
^ permalink raw reply [flat|nested] 323+ messages in thread
* RE: [PATCH v4 1/7] eal: add static per-lcore memory allocation facility
2024-09-16 10:52 ` [PATCH v4 1/7] eal: add static per-lcore memory allocation facility Mattias Rönnblom
@ 2024-09-16 14:02 ` Konstantin Ananyev
2024-09-16 17:39 ` Morten Brørup
2024-09-17 14:28 ` Mattias Rönnblom
2024-09-17 14:32 ` [PATCH v5 0/7] Lcore variables Mattias Rönnblom
1 sibling, 2 replies; 323+ messages in thread
From: Konstantin Ananyev @ 2024-09-16 14:02 UTC (permalink / raw)
To: Mattias Rönnblom, dev
Cc: hofors, Morten Brørup, Stephen Hemminger,
Konstantin Ananyev, David Marchand, Jerin Jacob
> Introduce DPDK per-lcore id variables, or lcore variables for short.
>
> An lcore variable has one value for every current and future lcore
> id-equipped thread.
>
> The primary <rte_lcore_var.h> use case is for statically allocating
> small, frequently-accessed data structures, for which one instance
> should exist for each lcore.
>
> Lcore variables are similar to thread-local storage (TLS, e.g., C11
> _Thread_local), but decoupling the values' life time with that of the
> threads.
>
> Lcore variables are also similar in terms of functionality provided by
> FreeBSD kernel's DPCPU_*() family of macros and the associated
> build-time machinery. DPCPU uses linker scripts, which effectively
> prevents the reuse of its, otherwise seemingly viable, approach.
>
> The currently-prevailing way to solve the same problem as lcore
> variables is to keep a module's per-lcore data as RTE_MAX_LCORE-sized
> array of cache-aligned, RTE_CACHE_GUARDed structs. The benefit of
> lcore variables over this approach is that data related to the same
> lcore now is close (spatially, in memory), rather than data used by
> the same module, which in turn avoid excessive use of padding,
> polluting caches with unused data.
>
> Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> Acked-by: Morten Brørup <mb@smartsharesystems.com>
LGTM in general; a few small questions (mostly nits), see below.
> --- /dev/null
> +++ b/lib/eal/common/eal_common_lcore_var.c
> @@ -0,0 +1,78 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2024 Ericsson AB
> + */
> +
> +#include <inttypes.h>
> +#include <stdlib.h>
> +
> +#ifdef RTE_EXEC_ENV_WINDOWS
> +#include <malloc.h>
> +#endif
> +
> +#include <rte_common.h>
> +#include <rte_debug.h>
> +#include <rte_log.h>
> +
> +#include <rte_lcore_var.h>
> +
> +#include "eal_private.h"
> +
> +#define LCORE_BUFFER_SIZE (RTE_MAX_LCORE_VAR * RTE_MAX_LCORE)
> +
> +static void *lcore_buffer;
> +static size_t offset = RTE_MAX_LCORE_VAR;
> +
> +static void *
> +lcore_var_alloc(size_t size, size_t align)
> +{
> + void *handle;
> + void *value;
> +
> + offset = RTE_ALIGN_CEIL(offset, align);
> +
> + if (offset + size > RTE_MAX_LCORE_VAR) {
> +#ifdef RTE_EXEC_ENV_WINDOWS
> + lcore_buffer = _aligned_malloc(LCORE_BUFFER_SIZE,
> + RTE_CACHE_LINE_SIZE);
> +#else
> + lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE,
> + LCORE_BUFFER_SIZE);
> +#endif
I don't remember whether this question has already come up:
For debugging and health-checking purposes, would it make sense to link all
lcore_buffer values into a linked list?
That way a user, developer, or tool could walk the list to check that a given
handle value really is a valid lcore_var, etc.
> + RTE_VERIFY(lcore_buffer != NULL);
> +
> + offset = 0;
> + }
> +
> + handle = RTE_PTR_ADD(lcore_buffer, offset);
> +
> + offset += size;
> +
> + RTE_LCORE_VAR_FOREACH_VALUE(value, handle)
> + memset(value, 0, size);
> +
> + EAL_LOG(DEBUG, "Allocated %"PRIuPTR" bytes of per-lcore data with a "
> + "%"PRIuPTR"-byte alignment", size, align);
> +
> + return handle;
> +}
> +
> +void *
> +rte_lcore_var_alloc(size_t size, size_t align)
> +{
> + /* Having the per-lcore buffer size aligned on cache lines, as
> + * well as having the base pointer aligned on cache line size,
> + * assures that aligned offsets also translate to aligned
> + * pointers across all values.
> + */
> + RTE_BUILD_BUG_ON(RTE_MAX_LCORE_VAR % RTE_CACHE_LINE_SIZE != 0);
> + RTE_ASSERT(align <= RTE_CACHE_LINE_SIZE);
> + RTE_ASSERT(size <= RTE_MAX_LCORE_VAR);
> +
> + /* '0' means asking for worst-case alignment requirements */
> + if (align == 0)
> + align = alignof(max_align_t);
> +
> + RTE_ASSERT(rte_is_power_of_2(align));
> +
> + return lcore_var_alloc(size, align);
> +}
....
> diff --git a/lib/eal/include/rte_lcore_var.h b/lib/eal/include/rte_lcore_var.h
> new file mode 100644
> index 0000000000..ec3ab714a8
> --- /dev/null
> +++ b/lib/eal/include/rte_lcore_var.h
...
> +/**
> + * Given the lcore variable type, produces the type of the lcore
> + * variable handle.
> + */
> +#define RTE_LCORE_VAR_HANDLE_TYPE(type) \
> + type *
> +
> +/**
> + * Define an lcore variable handle.
> + *
> + * This macro defines a variable which is used as a handle to access
> + * the various instances of a per-lcore id variable.
> + *
> + * The aim with this macro is to make clear at the point of
> + * declaration that this is an lcore handle, rather than a regular
> + * pointer.
> + *
> + * Add @b static as a prefix in case the lcore variable is only to be
> + * accessed from a particular translation unit.
> + */
> +#define RTE_LCORE_VAR_HANDLE(type, name) \
> + RTE_LCORE_VAR_HANDLE_TYPE(type) name
> +
> +/**
> + * Allocate space for an lcore variable, and initialize its handle.
> + *
> + * The values of the lcore variable are initialized to zero.
> + */
> +#define RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, align) \
> + handle = rte_lcore_var_alloc(size, align)
> +
> +/**
> + * Allocate space for an lcore variable, and initialize its handle,
> + * with values aligned for any type of object.
> + *
> + * The values of the lcore variable are initialized to zero.
> + */
> +#define RTE_LCORE_VAR_ALLOC_SIZE(handle, size) \
> + RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, 0)
> +
> +/**
> + * Allocate space for an lcore variable of the size and alignment requirements
> + * suggested by the handle pointer type, and initialize its handle.
> + *
> + * The values of the lcore variable are initialized to zero.
> + */
> +#define RTE_LCORE_VAR_ALLOC(handle) \
> + RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, sizeof(*(handle)), \
> + alignof(typeof(*(handle))))
> +
> +/**
> + * Allocate an explicitly-sized, explicitly-aligned lcore variable by
> + * means of a @ref RTE_INIT constructor.
> + *
> + * The values of the lcore variable are initialized to zero.
> + */
> +#define RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, align) \
> + RTE_INIT(rte_lcore_var_init_ ## name) \
> + { \
> + RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(name, size, align); \
> + }
> +
> +/**
> + * Allocate an explicitly-sized lcore variable by means of a @ref
> + * RTE_INIT constructor.
> + *
> + * The values of the lcore variable are initialized to zero.
> + */
> +#define RTE_LCORE_VAR_INIT_SIZE(name, size) \
> + RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, 0)
> +
> +/**
> + * Allocate an lcore variable by means of a @ref RTE_INIT constructor.
> + *
> + * The values of the lcore variable are initialized to zero.
> + */
> +#define RTE_LCORE_VAR_INIT(name) \
> + RTE_INIT(rte_lcore_var_init_ ## name) \
> + { \
> + RTE_LCORE_VAR_ALLOC(name); \
> + }
> +
> +/**
> + * Get void pointer to lcore variable instance with the specified
> + * lcore id.
> + *
> + * @param lcore_id
> + * The lcore id specifying which of the @c RTE_MAX_LCORE value
> + * instances should be accessed. The lcore id need not be valid
> + * (e.g., may be @ref LCORE_ID_ANY), but in such a case, the pointer
> + * is also not valid (and thus should not be dereferenced).
> + * @param handle
> + * The lcore variable handle.
> + */
> +static inline void *
> +rte_lcore_var_lcore_ptr(unsigned int lcore_id, void *handle)
> +{
> + return RTE_PTR_ADD(handle, lcore_id * RTE_MAX_LCORE_VAR);
> +}
> +
> +/**
> + * Get pointer to lcore variable instance with the specified lcore id.
> + *
> + * @param lcore_id
> + * The lcore id specifying which of the @c RTE_MAX_LCORE value
> + * instances should be accessed. The lcore id need not be valid
> + * (e.g., may be @ref LCORE_ID_ANY), but in such a case, the pointer
> + * is also not valid (and thus should not be dereferenced).
> + * @param handle
> + * The lcore variable handle.
> + */
> +#define RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle) \
> + ((typeof(handle))rte_lcore_var_lcore_ptr(lcore_id, handle))
> +
> +/**
> + * Get pointer to lcore variable instance of the current thread.
> + *
> + * May only be used by EAL threads and registered non-EAL threads.
> + */
> +#define RTE_LCORE_VAR_VALUE(handle) \
> + RTE_LCORE_VAR_LCORE_VALUE(rte_lcore_id(), handle)
Would it make sense to check that rte_lcore_id() != LCORE_ID_ANY?
After all, if people do not want this extra check, they can use
RTE_LCORE_VAR_LCORE_VALUE(rte_lcore_id(), handle)
explicitly.
> +
> +/**
> + * Iterate over each lcore id's value for an lcore variable.
> + *
> + * @param value
> + * A pointer successively set to point to lcore variable value
> + * corresponding to every lcore id (up to @c RTE_MAX_LCORE).
> + * @param handle
> + * The lcore variable handle.
> + */
> +#define RTE_LCORE_VAR_FOREACH_VALUE(value, handle) \
> + for (unsigned int lcore_id = \
> + (((value) = RTE_LCORE_VAR_LCORE_VALUE(0, handle)), 0); \
> + lcore_id < RTE_MAX_LCORE; \
> + lcore_id++, (value) = RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle))
It might be a bit better (and safer) to make lcore_id a macro parameter, i.e.:
#define RTE_LCORE_VAR_FOREACH_VALUE(value, handle, lcore_id) \
for ((lcore_id) = ...
* RE: [PATCH v4 1/7] eal: add static per-lcore memory allocation facility
2024-09-16 14:02 ` Konstantin Ananyev
@ 2024-09-16 17:39 ` Morten Brørup
2024-09-16 23:19 ` Konstantin Ananyev
2024-09-17 14:28 ` Mattias Rönnblom
1 sibling, 1 reply; 323+ messages in thread
From: Morten Brørup @ 2024-09-16 17:39 UTC (permalink / raw)
To: Konstantin Ananyev, Mattias Rönnblom, dev
Cc: hofors, Stephen Hemminger, Konstantin Ananyev, David Marchand,
Jerin Jacob
> From: Konstantin Ananyev [mailto:konstantin.ananyev@huawei.com]
> Sent: Monday, 16 September 2024 16.02
>
> > Introduce DPDK per-lcore id variables, or lcore variables for short.
> >
> > An lcore variable has one value for every current and future lcore
> > id-equipped thread.
> >
> > The primary <rte_lcore_var.h> use case is for statically allocating
> > small, frequently-accessed data structures, for which one instance
> > should exist for each lcore.
> >
> > Lcore variables are similar to thread-local storage (TLS, e.g., C11
> > _Thread_local), but decoupling the values' life time with that of the
> > threads.
> >
> > Lcore variables are also similar in terms of functionality provided by
> > FreeBSD kernel's DPCPU_*() family of macros and the associated
> > build-time machinery. DPCPU uses linker scripts, which effectively
> > prevents the reuse of its, otherwise seemingly viable, approach.
> >
> > The currently-prevailing way to solve the same problem as lcore
> > variables is to keep a module's per-lcore data as RTE_MAX_LCORE-sized
> > array of cache-aligned, RTE_CACHE_GUARDed structs. The benefit of
> > lcore variables over this approach is that data related to the same
> > lcore now is close (spatially, in memory), rather than data used by
> > the same module, which in turn avoid excessive use of padding,
> > polluting caches with unused data.
> >
> > Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> > Acked-by: Morten Brørup <mb@smartsharesystems.com>
>
> LGTM in general, few small questions (mostly nits), see below.
>
> > --- /dev/null
> > +++ b/lib/eal/common/eal_common_lcore_var.c
> > @@ -0,0 +1,78 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(c) 2024 Ericsson AB
> > + */
> > +
> > +#include <inttypes.h>
> > +#include <stdlib.h>
> > +
> > +#ifdef RTE_EXEC_ENV_WINDOWS
> > +#include <malloc.h>
> > +#endif
> > +
> > +#include <rte_common.h>
> > +#include <rte_debug.h>
> > +#include <rte_log.h>
> > +
> > +#include <rte_lcore_var.h>
> > +
> > +#include "eal_private.h"
> > +
> > +#define LCORE_BUFFER_SIZE (RTE_MAX_LCORE_VAR * RTE_MAX_LCORE)
> > +
> > +static void *lcore_buffer;
> > +static size_t offset = RTE_MAX_LCORE_VAR;
> > +
> > +static void *
> > +lcore_var_alloc(size_t size, size_t align)
> > +{
> > + void *handle;
> > + void *value;
> > +
> > + offset = RTE_ALIGN_CEIL(offset, align);
> > +
> > + if (offset + size > RTE_MAX_LCORE_VAR) {
> > +#ifdef RTE_EXEC_ENV_WINDOWS
> > + lcore_buffer = _aligned_malloc(LCORE_BUFFER_SIZE,
> > + RTE_CACHE_LINE_SIZE);
> > +#else
> > + lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE,
> > + LCORE_BUFFER_SIZE);
> > +#endif
>
> Don't remember did that question already arise or not:
> For debugging and health-checking purposes - would it make sense to link all
> lcore_buffer values into a linked list?
> So user/developer/some tool can walk over it to check that provided handle value
> is really a valid lcore_var, etc.
Nice idea.
Such a list, along with an accompanying dump function, can be added later.
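A minimal sketch of what such a list and a validity check could look like. Everything here is hypothetical and not part of the patch: the names `struct lcore_var_buffer`, `sketch_buffer_alloc()` and `sketch_handle_is_valid()` are made up for illustration, `SKETCH_MAX_LCORE_VAR` and `SKETCH_MAX_LCORE` stand in for `RTE_MAX_LCORE_VAR` and `RTE_MAX_LCORE`, and plain `malloc()` replaces the aligned allocation to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-ins for RTE_MAX_LCORE_VAR and RTE_MAX_LCORE (values are arbitrary). */
#define SKETCH_MAX_LCORE_VAR 1024
#define SKETCH_MAX_LCORE 4
#define SKETCH_BUFFER_SIZE (SKETCH_MAX_LCORE_VAR * SKETCH_MAX_LCORE)

/* Hypothetical bookkeeping node, one per allocated lcore buffer. */
struct lcore_var_buffer {
	struct lcore_var_buffer *next;
	void *base;
};

static struct lcore_var_buffer *buffer_list;

/* Allocate a new lcore buffer and record it at the head of the list. */
static void *
sketch_buffer_alloc(void)
{
	struct lcore_var_buffer *node = malloc(sizeof(*node));

	assert(node != NULL);
	/* Plain malloc() here; the patch uses an aligned allocation. */
	node->base = malloc(SKETCH_BUFFER_SIZE);
	assert(node->base != NULL);

	node->next = buffer_list;
	buffer_list = node;

	return node->base;
}

/*
 * A handle is plausible if it points into the first per-lcore slice
 * (the lcore id 0 values) of one of the recorded buffers.
 */
static bool
sketch_handle_is_valid(const void *handle)
{
	uintptr_t addr = (uintptr_t)handle;
	const struct lcore_var_buffer *node;

	for (node = buffer_list; node != NULL; node = node->next) {
		uintptr_t base = (uintptr_t)node->base;

		if (addr >= base && addr < base + SKETCH_MAX_LCORE_VAR)
			return true;
	}

	return false;
}
```

A dump function could walk the same list. Note that the check only tells whether a pointer falls inside a known buffer, not whether it is the start of an allocated variable.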
>
> > + RTE_VERIFY(lcore_buffer != NULL);
> > +
> > + offset = 0;
> > + }
> > +
> > + handle = RTE_PTR_ADD(lcore_buffer, offset);
> > +
> > + offset += size;
> > +
> > + RTE_LCORE_VAR_FOREACH_VALUE(value, handle)
> > + memset(value, 0, size);
> > +
> > + EAL_LOG(DEBUG, "Allocated %"PRIuPTR" bytes of per-lcore data with a "
> > + "%"PRIuPTR"-byte alignment", size, align);
> > +
> > + return handle;
> > +}
> > +
> > +void *
> > +rte_lcore_var_alloc(size_t size, size_t align)
> > +{
> > + /* Having the per-lcore buffer size aligned on cache lines, as
> > + * well as having the base pointer aligned on cache line size,
> > + * assures that aligned offsets also translate to aligned
> > + * pointers across all values.
> > + */
> > + RTE_BUILD_BUG_ON(RTE_MAX_LCORE_VAR % RTE_CACHE_LINE_SIZE != 0);
> > + RTE_ASSERT(align <= RTE_CACHE_LINE_SIZE);
> > + RTE_ASSERT(size <= RTE_MAX_LCORE_VAR);
> > +
> > + /* '0' means asking for worst-case alignment requirements */
> > + if (align == 0)
> > + align = alignof(max_align_t);
> > +
> > + RTE_ASSERT(rte_is_power_of_2(align));
> > +
> > + return lcore_var_alloc(size, align);
> > +}
>
> ....
>
> > diff --git a/lib/eal/include/rte_lcore_var.h
> b/lib/eal/include/rte_lcore_var.h
> > new file mode 100644
> > index 0000000000..ec3ab714a8
> > --- /dev/null
> > +++ b/lib/eal/include/rte_lcore_var.h
>
> ...
>
> > +/**
> > + * Given the lcore variable type, produces the type of the lcore
> > + * variable handle.
> > + */
> > +#define RTE_LCORE_VAR_HANDLE_TYPE(type) \
> > + type *
> > +
> > +/**
> > + * Define an lcore variable handle.
> > + *
> > + * This macro defines a variable which is used as a handle to access
> > + * the various instances of a per-lcore id variable.
> > + *
> > + * The aim with this macro is to make clear at the point of
> > + * declaration that this is an lcore handle, rather than a regular
> > + * pointer.
> > + *
> > + * Add @b static as a prefix in case the lcore variable is only to be
> > + * accessed from a particular translation unit.
> > + */
> > +#define RTE_LCORE_VAR_HANDLE(type, name) \
> > + RTE_LCORE_VAR_HANDLE_TYPE(type) name
> > +
> > +/**
> > + * Allocate space for an lcore variable, and initialize its handle.
> > + *
> > + * The values of the lcore variable are initialized to zero.
> > + */
> > +#define RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, align) \
> > + handle = rte_lcore_var_alloc(size, align)
> > +
> > +/**
> > + * Allocate space for an lcore variable, and initialize its handle,
> > + * with values aligned for any type of object.
> > + *
> > + * The values of the lcore variable are initialized to zero.
> > + */
> > +#define RTE_LCORE_VAR_ALLOC_SIZE(handle, size) \
> > + RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, 0)
> > +
> > +/**
> > + * Allocate space for an lcore variable of the size and alignment
> requirements
> > + * suggested by the handle pointer type, and initialize its handle.
> > + *
> > + * The values of the lcore variable are initialized to zero.
> > + */
> > +#define RTE_LCORE_VAR_ALLOC(handle) \
> > + RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, sizeof(*(handle)), \
> > + alignof(typeof(*(handle))))
> > +
> > +/**
> > + * Allocate an explicitly-sized, explicitly-aligned lcore variable by
> > + * means of a @ref RTE_INIT constructor.
> > + *
> > + * The values of the lcore variable are initialized to zero.
> > + */
> > +#define RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, align) \
> > + RTE_INIT(rte_lcore_var_init_ ## name) \
> > + { \
> > + RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(name, size, align); \
> > + }
> > +
> > +/**
> > + * Allocate an explicitly-sized lcore variable by means of a @ref
> > + * RTE_INIT constructor.
> > + *
> > + * The values of the lcore variable are initialized to zero.
> > + */
> > +#define RTE_LCORE_VAR_INIT_SIZE(name, size) \
> > + RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, 0)
> > +
> > +/**
> > + * Allocate an lcore variable by means of a @ref RTE_INIT
> constructor.
> > + *
> > + * The values of the lcore variable are initialized to zero.
> > + */
> > +#define RTE_LCORE_VAR_INIT(name) \
> > + RTE_INIT(rte_lcore_var_init_ ## name) \
> > + { \
> > + RTE_LCORE_VAR_ALLOC(name); \
> > + }
> > +
> > +/**
> > + * Get void pointer to lcore variable instance with the specified
> > + * lcore id.
> > + *
> > + * @param lcore_id
> > + * The lcore id specifying which of the @c RTE_MAX_LCORE value
> > + * instances should be accessed. The lcore id need not be valid
> > + * (e.g., may be @ref LCORE_ID_ANY), but in such a case, the
> pointer
> > + * is also not valid (and thus should not be dereferenced).
> > + * @param handle
> > + * The lcore variable handle.
> > + */
> > +static inline void *
> > +rte_lcore_var_lcore_ptr(unsigned int lcore_id, void *handle)
> > +{
> > + return RTE_PTR_ADD(handle, lcore_id * RTE_MAX_LCORE_VAR);
> > +}
> > +
> > +/**
> > + * Get pointer to lcore variable instance with the specified lcore
> id.
> > + *
> > + * @param lcore_id
> > + * The lcore id specifying which of the @c RTE_MAX_LCORE value
> > + * instances should be accessed. The lcore id need not be valid
> > + * (e.g., may be @ref LCORE_ID_ANY), but in such a case, the
> pointer
> > + * is also not valid (and thus should not be dereferenced).
> > + * @param handle
> > + * The lcore variable handle.
> > + */
> > +#define RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle) \
> > + ((typeof(handle))rte_lcore_var_lcore_ptr(lcore_id, handle))
> > +
> > +/**
> > + * Get pointer to lcore variable instance of the current thread.
> > + *
> > + * May only be used by EAL threads and registered non-EAL threads.
> > + */
> > +#define RTE_LCORE_VAR_VALUE(handle) \
> > + RTE_LCORE_VAR_LCORE_VALUE(rte_lcore_id(), handle)
>
> Would it make sense to check that rte_lcore_id() != LCORE_ID_ANY?
> After all if people do not want this extra check, they can probably use
> RTE_LCORE_VAR_LCORE_VALUE(rte_lcore_id(), handle)
> explicitly.
Not generally. I prefer keeping it brief.
We could add a _SAFE variant with this extra check, like LIST_FOREACH has LIST_FOREACH_SAFE (although for a different purpose).
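A minimal sketch of how such a _SAFE variant might look, under stated assumptions: `SKETCH_LCORE_VAR_VALUE_SAFE`, `sketch_lcore_ptr_safe()` and the other `sketch_`/`SKETCH_` names are hypothetical, `sketch_lcore_id()` stands in for rte_lcore_id(), and returning NULL for a thread without an lcore id is just one possible policy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for RTE_MAX_LCORE, RTE_MAX_LCORE_VAR and LCORE_ID_ANY. */
#define SKETCH_MAX_LCORE 4
#define SKETCH_MAX_LCORE_VAR 1024
#define SKETCH_LCORE_ID_ANY UINT32_MAX

/* Stand-in for rte_lcore_id(); a thread without an lcore id sees "ANY". */
static unsigned int sketch_current_lcore_id = SKETCH_LCORE_ID_ANY;

static inline unsigned int
sketch_lcore_id(void)
{
	return sketch_current_lcore_id;
}

/* Unchecked fast-path lookup; the assert is compiled out with NDEBUG. */
static inline void *
sketch_lcore_ptr(unsigned int lcore_id, void *handle)
{
	assert(lcore_id < SKETCH_MAX_LCORE);
	return (char *)handle + (size_t)lcore_id * SKETCH_MAX_LCORE_VAR;
}

/* Hypothetical checked lookup: NULL instead of an out-of-bounds pointer. */
static inline void *
sketch_lcore_ptr_safe(unsigned int lcore_id, void *handle)
{
	if (lcore_id >= SKETCH_MAX_LCORE)
		return NULL;

	return sketch_lcore_ptr(lcore_id, handle);
}

/* Checked counterpart of the per-thread value lookup. */
#define SKETCH_LCORE_VAR_VALUE_SAFE(handle) \
	((__typeof__(handle))sketch_lcore_ptr_safe(sketch_lcore_id(), handle))
```

The extra cost is one compare-and-branch per lookup, and the caller must then handle a NULL result.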
Come to think of it: In the name of brevity, consider renaming RTE_LCORE_VAR_VALUE to RTE_LCORE_VAR. (And RTE_LCORE_VAR_FOREACH_VALUE to RTE_LCORE_VAR_FOREACH.) We want to see these everywhere in the code.
>
> > +
> > +/**
> > + * Iterate over each lcore id's value for an lcore variable.
> > + *
> > + * @param value
> > + * A pointer successively set to point to lcore variable value
> > + * corresponding to every lcore id (up to @c RTE_MAX_LCORE).
> > + * @param handle
> > + * The lcore variable handle.
> > + */
> > +#define RTE_LCORE_VAR_FOREACH_VALUE(value, handle) \
> > + for (unsigned int lcore_id = \
> > + (((value) = RTE_LCORE_VAR_LCORE_VALUE(0, handle)), 0); \
> > + lcore_id < RTE_MAX_LCORE; \
> > + lcore_id++, (value) = RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle))
>
> Might be a bit better (and safer) to make lcore_id a macro parameter?
> I.E.:
> define RTE_LCORE_VAR_FOREACH_VALUE(value, handle, lcore_id) \
> for ((lcore_id) = ...
The same thought has struck me, so I checked the scope of lcore_id.
The scope of lcore_id remains limited to the for loop, i.e. it is available inside the for loop, but not after it.
IMO this suffices, and lcore_id doesn't need to be a macro parameter.
Maybe renaming lcore_id to _lcore_id would be an improvement, if lcore_id is already defined and used for other purposes within the for loop.
* RE: [PATCH v4 1/7] eal: add static per-lcore memory allocation facility
2024-09-16 17:39 ` Morten Brørup
@ 2024-09-16 23:19 ` Konstantin Ananyev
2024-09-17 7:12 ` Morten Brørup
0 siblings, 1 reply; 323+ messages in thread
From: Konstantin Ananyev @ 2024-09-16 23:19 UTC (permalink / raw)
To: Morten Brørup, Mattias Rönnblom, dev
Cc: hofors, Stephen Hemminger, Konstantin Ananyev, David Marchand,
Jerin Jacob
> > > +/**
> > > + * Get pointer to lcore variable instance of the current thread.
> > > + *
> > > + * May only be used by EAL threads and registered non-EAL threads.
> > > + */
> > > +#define RTE_LCORE_VAR_VALUE(handle) \
> > > + RTE_LCORE_VAR_LCORE_VALUE(rte_lcore_id(), handle)
> >
> > Would it make sense to check that rte_lcore_id() != LCORE_ID_ANY?
> > After all if people do not want this extra check, they can probably use
> > RTE_LCORE_VAR_LCORE_VALUE(rte_lcore_id(), handle)
> > explicitly.
>
> Not generally. I prefer keeping it brief.
> We could add a _SAFE variant with this extra check, like LIST_FOREACH has LIST_FOREACH_SAFE (although for a different purpose).
>
> Come to think of it: In the name of brevity, consider renaming RTE_LCORE_VAR_VALUE to RTE_LCORE_VAR. (And
> RTE_LCORE_VAR_FOREACH_VALUE to RTE_LCORE_VAR_FOREACH.) We want to see these everywhere in the code.
Well, it is not about brevity...
I just feel uncomfortable that our own public macro doesn't check the value
returned by rte_lcore_id() and thus introduces a possible out-of-bounds memory access.
> >
> > > +
> > > +/**
> > > + * Iterate over each lcore id's value for an lcore variable.
> > > + *
> > > + * @param value
> > > + * A pointer successively set to point to lcore variable value
> > > + * corresponding to every lcore id (up to @c RTE_MAX_LCORE).
> > > + * @param handle
> > > + * The lcore variable handle.
> > > + */
> > > +#define RTE_LCORE_VAR_FOREACH_VALUE(value, handle) \
> > > + for (unsigned int lcore_id = \
> > > + (((value) = RTE_LCORE_VAR_LCORE_VALUE(0, handle)), 0); \
> > > + lcore_id < RTE_MAX_LCORE; \
> > > + lcore_id++, (value) = RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle))
> >
> > Might be a bit better (and safer) to make lcore_id a macro parameter?
> > I.E.:
> > define RTE_LCORE_VAR_FOREACH_VALUE(value, handle, lcore_id) \
> > for ((lcore_id) = ...
>
> The same thought have struck me, so I checked the scope of lcore_id.
> The scope of lcore_id remains limited to the for loop, i.e. it is available inside the for loop, but not after it.
A variable with the same name (and type) can be defined by the user before the loop,
with the intention of using it inside the loop,
just as happens here (in patch #2):
+ unsigned int lcore_id;
.....
+ /* take the opportunity to test the foreach macro */
+ int *v;
+ lcore_id = 0;
+ RTE_LCORE_VAR_FOREACH_VALUE(v, test_int) {
+ TEST_ASSERT_EQUAL(states[lcore_id].new_value, *v,
+ "Unexpected value on lcore %d during "
+ "iteration", lcore_id);
+ lcore_id++;
+ }
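The effect can be reproduced in isolation. The sketch below is an assumption-laden stand-in, not the patch's code: `SKETCH_FOREACH_HIDDEN` mimics the v4 macro by declaring its own lcore_id in the for-init clause, it iterates over a plain array instead of an lcore variable, and `sketch_shadow_demo()` is a made-up helper. Inside the loop body, lcore_id refers to the macro's hidden variable, so the caller's counter is never touched and the extra increment makes the loop skip every other element.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for RTE_MAX_LCORE. */
#define SKETCH_MAX_LCORE 4

/* Mimics the v4 macro: the iterator is declared inside the macro itself. */
#define SKETCH_FOREACH_HIDDEN(value, array) \
	for (unsigned int lcore_id = (((value) = &(array)[0]), 0); \
	     lcore_id < SKETCH_MAX_LCORE; \
	     lcore_id++, (value) = &(array)[lcore_id])

/*
 * Returns the number of loop-body executions and stores the final
 * value of the caller-visible counter in *outer_after.
 */
static unsigned int
sketch_shadow_demo(unsigned int *outer_after)
{
	int values[SKETCH_MAX_LCORE] = { 0 };
	unsigned int lcore_id = 0; /* the caller's own counter */
	unsigned int iterations = 0;
	int *v;

	assert(outer_after != NULL);

	SKETCH_FOREACH_HIDDEN(v, values) {
		*v = 1;
		lcore_id++; /* increments the macro's hidden variable */
		iterations++;
	}

	*outer_after = lcore_id; /* the outer counter was shadowed */

	return iterations;
}
```

With four elements, the body runs only twice and the outer counter stays at zero, so a test written this way would silently verify only half of the values.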
+
> IMO this suffices, and lcore_id doesn't need to be a macro parameter.
> Maybe renaming lcore_id to _lcore_id would be an improvement, if lcore_id is already defined and used for other purposes within
> the for loop.
* RE: [PATCH v4 1/7] eal: add static per-lcore memory allocation facility
2024-09-16 23:19 ` Konstantin Ananyev
@ 2024-09-17 7:12 ` Morten Brørup
2024-09-17 8:09 ` Konstantin Ananyev
0 siblings, 1 reply; 323+ messages in thread
From: Morten Brørup @ 2024-09-17 7:12 UTC (permalink / raw)
To: Konstantin Ananyev, Mattias Rönnblom, dev
Cc: hofors, Stephen Hemminger, Konstantin Ananyev, David Marchand,
Jerin Jacob
> From: Konstantin Ananyev [mailto:konstantin.ananyev@huawei.com]
> Sent: Tuesday, 17 September 2024 01.20
>
> > > > +/**
> > > > + * Get pointer to lcore variable instance of the current thread.
> > > > + *
> > > > + * May only be used by EAL threads and registered non-EAL threads.
> > > > + */
> > > > +#define RTE_LCORE_VAR_VALUE(handle) \
> > > > + RTE_LCORE_VAR_LCORE_VALUE(rte_lcore_id(), handle)
> > >
> > > Would it make sense to check that rte_lcore_id() != LCORE_ID_ANY?
> > > After all if people do not want this extra check, they can probably use
> > > RTE_LCORE_VAR_LCORE_VALUE(rte_lcore_id(), handle)
> > > explicitly.
> >
> > Not generally. I prefer keeping it brief.
> > We could add a _SAFE variant with this extra check, like LIST_FOREACH has LIST_FOREACH_SAFE (although for a different purpose).
> >
> > Come to think of it: In the name of brevity, consider renaming RTE_LCORE_VAR_VALUE to RTE_LCORE_VAR. (And
> > RTE_LCORE_VAR_FOREACH_VALUE to RTE_LCORE_VAR_FOREACH.) We want to see these everywhere in the code.
>
> Well, it is not about brevity...
> I just feel uncomfortable that our own public macro doesn't check value
> returned by rte_lcore_id() and introduce a possible out-of-bound memory
> access.
For performance reasons, we generally don't check parameter validity in fast path functions/macros; lots of code in DPDK uses ptr->array[rte_lcore_id()] without checking rte_lcore_id() validity.
We shouldn't do it here either.
There's a secondary benefit:
RTE_LCORE_VAR_VALUE() returns a pointer, so this macro can always be used.
In particular, the pointer can be initialized together with other variables at the start of a function:
struct mystruct * const state = RTE_LCORE_VAR_VALUE(state_handle);
The out-of-bounds memory access only occurs if the pointer is dereferenced.
>
>
> > >
> > > > +
> > > > +/**
> > > > + * Iterate over each lcore id's value for an lcore variable.
> > > > + *
> > > > + * @param value
> > > > + * A pointer successively set to point to lcore variable value
> > > > + * corresponding to every lcore id (up to @c RTE_MAX_LCORE).
> > > > + * @param handle
> > > > + * The lcore variable handle.
> > > > + */
> > > > +#define RTE_LCORE_VAR_FOREACH_VALUE(value, handle) \
> > > > + for (unsigned int lcore_id = \
> > > > + (((value) = RTE_LCORE_VAR_LCORE_VALUE(0, handle)), 0); \
> > > > + lcore_id < RTE_MAX_LCORE; \
> > > > + lcore_id++, (value) = RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle))
> > >
> > > Might be a bit better (and safer) to make lcore_id a macro parameter?
> > > I.E.:
> > > define RTE_LCORE_VAR_FOREACH_VALUE(value, handle, lcore_id) \
> > > for ((lcore_id) = ...
> >
> > The same thought have struck me, so I checked the scope of lcore_id.
> > The scope of lcore_id remains limited to the for loop, i.e. it is available
> inside the for loop, but not after it.
>
> Variable with the same name (and type) can be defined by used before the loop,
> With the intention to use it inside the loop.
> Just like it happens here (in patch #2):
> + unsigned int lcore_id;
> .....
> + /* take the opportunity to test the foreach macro */
> + int *v;
> + lcore_id = 0;
> + RTE_LCORE_VAR_FOREACH_VALUE(v, test_int) {
> + TEST_ASSERT_EQUAL(states[lcore_id].new_value, *v,
> + "Unexpected value on lcore %d during "
> + "iteration", lcore_id);
> + lcore_id++;
> + }
> +
>
You convinced me here, Konstantin.
Adding the iterator (lcore_id) as a macro parameter reduces the risk of bugs, and has no real disadvantages.
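A minimal sketch of a foreach macro with the iterator as a parameter. The names are hypothetical (`SKETCH_LCORE_VAR_FOREACH`, `SKETCH_LCORE_VALUE`, `sketch_lcore_ptr()`, `sketch_sum()`), the parameter order is an arbitrary choice, and `SKETCH_MAX_LCORE` / `SKETCH_MAX_LCORE_VAR` stand in for the real constants; `__typeof__` is a compiler extension (GCC/clang).

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for RTE_MAX_LCORE and RTE_MAX_LCORE_VAR. */
#define SKETCH_MAX_LCORE 4
#define SKETCH_MAX_LCORE_VAR 64

/* Same address arithmetic as rte_lcore_var_lcore_ptr() in the patch. */
static inline void *
sketch_lcore_ptr(unsigned int lcore_id, void *handle)
{
	return (char *)handle + (size_t)lcore_id * SKETCH_MAX_LCORE_VAR;
}

#define SKETCH_LCORE_VALUE(lcore_id, handle) \
	((__typeof__(handle))sketch_lcore_ptr(lcore_id, handle))

/* The caller supplies the iterator, so nothing is declared behind its back. */
#define SKETCH_LCORE_VAR_FOREACH(lcore_id, value, handle) \
	for ((lcore_id) = 0, (value) = SKETCH_LCORE_VALUE(0, handle); \
	     (lcore_id) < SKETCH_MAX_LCORE; \
	     (lcore_id)++, (value) = SKETCH_LCORE_VALUE(lcore_id, handle))

/* Example use: sum the per-lcore values of an int lcore variable. */
static long
sketch_sum(int *handle)
{
	unsigned int lcore_id;
	int *value;
	long sum = 0;

	assert(handle != NULL);

	SKETCH_LCORE_VAR_FOREACH(lcore_id, value, handle)
		sum += *value;

	return sum;
}
```

Since the iterator is an ordinary variable of the caller, it is visible at the call site and can be used in the loop body, e.g. for indexing a parallel array.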
>
> > IMO this suffices, and lcore_id doesn't need to be a macro parameter.
> > Maybe renaming lcore_id to _lcore_id would be an improvement, if lcore_id is
> already defined and used for other purposes within
> > the for loop.
PS:
We discussed the _VALUE postfix previously, Mattias, and I agreed to it. But now that I have become more familiar with the code, I think the _VALUE postfix should be dropped.
I'm usually in favor of long variable/function/macro names, arguing that they improve code readability.
But I don't think the _VALUE postfix really improves readability.
Especially when RTE_LCORE_VAR() has become widely used, and everyone is familiar with it, a long name (RTE_LCORE_VAR_VALUE()) will be more annoying than helpful.
* RE: [PATCH v4 1/7] eal: add static per-lcore memory allocation facility
2024-09-17 7:12 ` Morten Brørup
@ 2024-09-17 8:09 ` Konstantin Ananyev
0 siblings, 0 replies; 323+ messages in thread
From: Konstantin Ananyev @ 2024-09-17 8:09 UTC (permalink / raw)
To: Morten Brørup, Mattias Rönnblom, dev
Cc: hofors, Stephen Hemminger, Konstantin Ananyev, David Marchand,
Jerin Jacob
> > From: Konstantin Ananyev [mailto:konstantin.ananyev@huawei.com]
> > Sent: Tuesday, 17 September 2024 01.20
> >
> > > > > +/**
> > > > > + * Get pointer to lcore variable instance of the current thread.
> > > > > + *
> > > > > + * May only be used by EAL threads and registered non-EAL threads.
> > > > > + */
> > > > > +#define RTE_LCORE_VAR_VALUE(handle) \
> > > > > + RTE_LCORE_VAR_LCORE_VALUE(rte_lcore_id(), handle)
> > > >
> > > > Would it make sense to check that rte_lcore_id() != LCORE_ID_ANY?
> > > > After all if people do not want this extra check, they can probably use
> > > > RTE_LCORE_VAR_LCORE_VALUE(rte_lcore_id(), handle)
> > > > explicitly.
> > >
> > > Not generally. I prefer keeping it brief.
> > > We could add a _SAFE variant with this extra check, like LIST_FOREACH has
> > LIST_FOREACH_SAFE (although for a different purpose).
> > >
> > > Come to think of it: In the name of brevity, consider renaming
> > RTE_LCORE_VAR_VALUE to RTE_LCORE_VAR. (And
> > > RTE_LCORE_VAR_FOREACH_VALUE to RTE_LCORE_VAR_FOREACH.) We want to see these
> > everywhere in the code.
> >
> > Well, it is not about brevity...
> > I just feel uncomfortable that our own public macro doesn't check value
> > returned by rte_lcore_id() and introduce a possible out-of-bound memory
> > access.
>
> For performance reasons, we generally don't check parameter validity in fast path functions/macros; lots of code in DPDK uses
> ptr->array[rte_lcore_id()] without checking rte_lcore_id() validity.
Yes, there are plenty of such places inside DPDK...
OK, I'll leave it for the author to decide; after all, there is a clear comment
in front of it restricting that macro to EAL threads and registered non-EAL threads.
I hope users will read it before using it ;)
> We shouldn't do it here either.
>
> There's a secondary benefit:
> RTE_LCORE_VAR_VALUE() returns a pointer, so this macro can always be used.
> Especially, the pointer can be initialized with other variables at the start of a function:
> struct mystruct * const state = RTE_LCORE_VAR_VALUE(state_handle);
> The out-of-bound memory access will occur if dereferencing the pointer.
>
* Re: [PATCH v4 1/7] eal: add static per-lcore memory allocation facility
2024-09-16 14:02 ` Konstantin Ananyev
2024-09-16 17:39 ` Morten Brørup
@ 2024-09-17 14:28 ` Mattias Rönnblom
2024-09-17 16:11 ` Konstantin Ananyev
2024-09-17 16:29 ` Konstantin Ananyev
1 sibling, 2 replies; 323+ messages in thread
From: Mattias Rönnblom @ 2024-09-17 14:28 UTC (permalink / raw)
To: Konstantin Ananyev, Mattias Rönnblom, dev
Cc: Morten Brørup, Stephen Hemminger, Konstantin Ananyev,
David Marchand, Jerin Jacob
On 2024-09-16 16:02, Konstantin Ananyev wrote:
>
>
>> Introduce DPDK per-lcore id variables, or lcore variables for short.
>>
>> An lcore variable has one value for every current and future lcore
>> id-equipped thread.
>>
>> The primary <rte_lcore_var.h> use case is for statically allocating
>> small, frequently-accessed data structures, for which one instance
>> should exist for each lcore.
>>
>> Lcore variables are similar to thread-local storage (TLS, e.g., C11
>> _Thread_local), but decoupling the values' life time with that of the
>> threads.
>>
>> Lcore variables are also similar in terms of functionality provided by
>> FreeBSD kernel's DPCPU_*() family of macros and the associated
>> build-time machinery. DPCPU uses linker scripts, which effectively
>> prevents the reuse of its, otherwise seemingly viable, approach.
>>
>> The currently-prevailing way to solve the same problem as lcore
>> variables is to keep a module's per-lcore data as RTE_MAX_LCORE-sized
>> array of cache-aligned, RTE_CACHE_GUARDed structs. The benefit of
>> lcore variables over this approach is that data related to the same
>> lcore now is close (spatially, in memory), rather than data used by
>> the same module, which in turn avoid excessive use of padding,
>> polluting caches with unused data.
>>
>> Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
>> Acked-by: Morten Brørup <mb@smartsharesystems.com>
>
> LGTM in general, few small questions (mostly nits), see below.
>
>> --- /dev/null
>> +++ b/lib/eal/common/eal_common_lcore_var.c
>> @@ -0,0 +1,78 @@
>> +/* SPDX-License-Identifier: BSD-3-Clause
>> + * Copyright(c) 2024 Ericsson AB
>> + */
>> +
>> +#include <inttypes.h>
>> +#include <stdlib.h>
>> +
>> +#ifdef RTE_EXEC_ENV_WINDOWS
>> +#include <malloc.h>
>> +#endif
>> +
>> +#include <rte_common.h>
>> +#include <rte_debug.h>
>> +#include <rte_log.h>
>> +
>> +#include <rte_lcore_var.h>
>> +
>> +#include "eal_private.h"
>> +
>> +#define LCORE_BUFFER_SIZE (RTE_MAX_LCORE_VAR * RTE_MAX_LCORE)
>> +
>> +static void *lcore_buffer;
>> +static size_t offset = RTE_MAX_LCORE_VAR;
>> +
>> +static void *
>> +lcore_var_alloc(size_t size, size_t align)
>> +{
>> + void *handle;
>> + void *value;
>> +
>> + offset = RTE_ALIGN_CEIL(offset, align);
>> +
>> + if (offset + size > RTE_MAX_LCORE_VAR) {
>> +#ifdef RTE_EXEC_ENV_WINDOWS
>> + lcore_buffer = _aligned_malloc(LCORE_BUFFER_SIZE,
>> + RTE_CACHE_LINE_SIZE);
>> +#else
>> + lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE,
>> + LCORE_BUFFER_SIZE);
>> +#endif
>
> I don't remember whether that question already arose or not:
> For debugging and health-checking purposes - would it make sense to link all
> lcore_buffer values into a linked list?
> So user/developer/some tool can walk over it to check that provided handle value
> is really a valid lcore_var, etc.
>
At least you could add some basic statistics, like the total size
allocated by lcore variables, and the number of variables.
One could also add tracing.
>> + RTE_VERIFY(lcore_buffer != NULL);
>> +
>> + offset = 0;
>> + }
>> +
>> + handle = RTE_PTR_ADD(lcore_buffer, offset);
>> +
>> + offset += size;
>> +
>> + RTE_LCORE_VAR_FOREACH_VALUE(value, handle)
>> + memset(value, 0, size);
>> +
>> + EAL_LOG(DEBUG, "Allocated %"PRIuPTR" bytes of per-lcore data with a "
>> + "%"PRIuPTR"-byte alignment", size, align);
>> +
>> + return handle;
>> +}
>> +
>> +void *
>> +rte_lcore_var_alloc(size_t size, size_t align)
>> +{
>> + /* Having the per-lcore buffer size aligned on cache lines, as
>> + * well as having the base pointer aligned on cache line size,
>> + * assures that aligned offsets also translate to aligned
>> + * pointers across all values.
>> + */
>> + RTE_BUILD_BUG_ON(RTE_MAX_LCORE_VAR % RTE_CACHE_LINE_SIZE != 0);
>> + RTE_ASSERT(align <= RTE_CACHE_LINE_SIZE);
>> + RTE_ASSERT(size <= RTE_MAX_LCORE_VAR);
>> +
>> + /* '0' means asking for worst-case alignment requirements */
>> + if (align == 0)
>> + align = alignof(max_align_t);
>> +
>> + RTE_ASSERT(rte_is_power_of_2(align));
>> +
>> + return lcore_var_alloc(size, align);
>> +}
>
> ....
>
>> diff --git a/lib/eal/include/rte_lcore_var.h b/lib/eal/include/rte_lcore_var.h
>> new file mode 100644
>> index 0000000000..ec3ab714a8
>> --- /dev/null
>> +++ b/lib/eal/include/rte_lcore_var.h
>
> ...
>
>> +/**
>> + * Given the lcore variable type, produces the type of the lcore
>> + * variable handle.
>> + */
>> +#define RTE_LCORE_VAR_HANDLE_TYPE(type) \
>> + type *
>> +
>> +/**
>> + * Define an lcore variable handle.
>> + *
>> + * This macro defines a variable which is used as a handle to access
>> + * the various instances of a per-lcore id variable.
>> + *
>> + * The aim with this macro is to make clear at the point of
>> + * declaration that this is an lcore handle, rather than a regular
>> + * pointer.
>> + *
>> + * Add @b static as a prefix in case the lcore variable is only to be
>> + * accessed from a particular translation unit.
>> + */
>> +#define RTE_LCORE_VAR_HANDLE(type, name) \
>> + RTE_LCORE_VAR_HANDLE_TYPE(type) name
>> +
>> +/**
>> + * Allocate space for an lcore variable, and initialize its handle.
>> + *
>> + * The values of the lcore variable are initialized to zero.
>> + */
>> +#define RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, align) \
>> + handle = rte_lcore_var_alloc(size, align)
>> +
>> +/**
>> + * Allocate space for an lcore variable, and initialize its handle,
>> + * with values aligned for any type of object.
>> + *
>> + * The values of the lcore variable are initialized to zero.
>> + */
>> +#define RTE_LCORE_VAR_ALLOC_SIZE(handle, size) \
>> + RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, 0)
>> +
>> +/**
>> + * Allocate space for an lcore variable of the size and alignment requirements
>> + * suggested by the handle pointer type, and initialize its handle.
>> + *
>> + * The values of the lcore variable are initialized to zero.
>> + */
>> +#define RTE_LCORE_VAR_ALLOC(handle) \
>> + RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, sizeof(*(handle)), \
>> + alignof(typeof(*(handle))))
>> +
>> +/**
>> + * Allocate an explicitly-sized, explicitly-aligned lcore variable by
>> + * means of a @ref RTE_INIT constructor.
>> + *
>> + * The values of the lcore variable are initialized to zero.
>> + */
>> +#define RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, align) \
>> + RTE_INIT(rte_lcore_var_init_ ## name) \
>> + { \
>> + RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(name, size, align); \
>> + }
>> +
>> +/**
>> + * Allocate an explicitly-sized lcore variable by means of a @ref
>> + * RTE_INIT constructor.
>> + *
>> + * The values of the lcore variable are initialized to zero.
>> + */
>> +#define RTE_LCORE_VAR_INIT_SIZE(name, size) \
>> + RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, 0)
>> +
>> +/**
>> + * Allocate an lcore variable by means of a @ref RTE_INIT constructor.
>> + *
>> + * The values of the lcore variable are initialized to zero.
>> + */
>> +#define RTE_LCORE_VAR_INIT(name) \
>> + RTE_INIT(rte_lcore_var_init_ ## name) \
>> + { \
>> + RTE_LCORE_VAR_ALLOC(name); \
>> + }
>> +
>> +/**
>> + * Get void pointer to lcore variable instance with the specified
>> + * lcore id.
>> + *
>> + * @param lcore_id
>> + * The lcore id specifying which of the @c RTE_MAX_LCORE value
>> + * instances should be accessed. The lcore id need not be valid
>> + * (e.g., may be @ref LCORE_ID_ANY), but in such a case, the pointer
>> + * is also not valid (and thus should not be dereferenced).
>> + * @param handle
>> + * The lcore variable handle.
>> + */
>> +static inline void *
>> +rte_lcore_var_lcore_ptr(unsigned int lcore_id, void *handle)
>> +{
>> + return RTE_PTR_ADD(handle, lcore_id * RTE_MAX_LCORE_VAR);
>> +}
>> +
>> +/**
>> + * Get pointer to lcore variable instance with the specified lcore id.
>> + *
>> + * @param lcore_id
>> + * The lcore id specifying which of the @c RTE_MAX_LCORE value
>> + * instances should be accessed. The lcore id need not be valid
>> + * (e.g., may be @ref LCORE_ID_ANY), but in such a case, the pointer
>> + * is also not valid (and thus should not be dereferenced).
>> + * @param handle
>> + * The lcore variable handle.
>> + */
>> +#define RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle) \
>> + ((typeof(handle))rte_lcore_var_lcore_ptr(lcore_id, handle))
>> +
>> +/**
>> + * Get pointer to lcore variable instance of the current thread.
>> + *
>> + * May only be used by EAL threads and registered non-EAL threads.
>> + */
>> +#define RTE_LCORE_VAR_VALUE(handle) \
>> + RTE_LCORE_VAR_LCORE_VALUE(rte_lcore_id(), handle)
>
> Would it make sense to check that rte_lcore_id() != LCORE_ID_ANY?
> After all if people do not want this extra check, they can probably use
> RTE_LCORE_VAR_LCORE_VALUE(rte_lcore_id(), handle)
> explicitly.
>
It would make sense, if it was an RTE_ASSERT(). Otherwise, I don't think
so. Attempting to gracefully handle API violations is bad practice, imo.
>> +
>> +/**
>> + * Iterate over each lcore id's value for an lcore variable.
>> + *
>> + * @param value
>> + * A pointer successively set to point to lcore variable value
>> + * corresponding to every lcore id (up to @c RTE_MAX_LCORE).
>> + * @param handle
>> + * The lcore variable handle.
>> + */
>> +#define RTE_LCORE_VAR_FOREACH_VALUE(value, handle) \
>> + for (unsigned int lcore_id = \
>> + (((value) = RTE_LCORE_VAR_LCORE_VALUE(0, handle)), 0); \
>> + lcore_id < RTE_MAX_LCORE; \
>> + lcore_id++, (value) = RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle))
>
> Might be a bit better (and safer) to make lcore_id a macro parameter?
> I.E.:
> define RTE_LCORE_VAR_FOREACH_VALUE(value, handle, lcore_id) \
> for ((lcore_id) = ...
>
Why?
^ permalink raw reply [flat|nested] 323+ messages in thread
* RE: [PATCH v4 1/7] eal: add static per-lcore memory allocation facility
2024-09-17 14:28 ` Mattias Rönnblom
@ 2024-09-17 16:11 ` Konstantin Ananyev
2024-09-18 7:00 ` Mattias Rönnblom
2024-09-17 16:29 ` Konstantin Ananyev
1 sibling, 1 reply; 323+ messages in thread
From: Konstantin Ananyev @ 2024-09-17 16:11 UTC (permalink / raw)
To: Mattias Rönnblom, Mattias Rönnblom, dev
Cc: Morten Brørup, Stephen Hemminger, Konstantin Ananyev,
David Marchand, Jerin Jacob
> >> +
> >> +/**
> >> + * Get pointer to lcore variable instance with the specified lcore id.
> >> + *
> >> + * @param lcore_id
> >> + * The lcore id specifying which of the @c RTE_MAX_LCORE value
> >> + * instances should be accessed. The lcore id need not be valid
> >> + * (e.g., may be @ref LCORE_ID_ANY), but in such a case, the pointer
> >> + * is also not valid (and thus should not be dereferenced).
> >> + * @param handle
> >> + * The lcore variable handle.
> >> + */
> >> +#define RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle) \
> >> + ((typeof(handle))rte_lcore_var_lcore_ptr(lcore_id, handle))
> >> +
> >> +/**
> >> + * Get pointer to lcore variable instance of the current thread.
> >> + *
> >> + * May only be used by EAL threads and registered non-EAL threads.
> >> + */
> >> +#define RTE_LCORE_VAR_VALUE(handle) \
> >> + RTE_LCORE_VAR_LCORE_VALUE(rte_lcore_id(), handle)
> >
> > Would it make sense to check that rte_lcore_id() != LCORE_ID_ANY?
> > After all if people do not want this extra check, they can probably use
> > RTE_LCORE_VAR_LCORE_VALUE(rte_lcore_id(), handle)
> > explicitly.
> >
>
> It would make sense, if it was an RTE_ASSERT(). Otherwise, I don't think
> so. Attempting to gracefully handle API violations is bad practice, imo.
Ok, RTE_ASSERT() might be a good compromise.
As I said in another mail for that thread, I wouldn't insist here.
>
> >> +
> >> +/**
> >> + * Iterate over each lcore id's value for an lcore variable.
> >> + *
> >> + * @param value
> >> + * A pointer successively set to point to lcore variable value
> >> + * corresponding to every lcore id (up to @c RTE_MAX_LCORE).
> >> + * @param handle
> >> + * The lcore variable handle.
> >> + */
> >> +#define RTE_LCORE_VAR_FOREACH_VALUE(value, handle) \
> >> + for (unsigned int lcore_id = \
> >> + (((value) = RTE_LCORE_VAR_LCORE_VALUE(0, handle)), 0); \
> >> + lcore_id < RTE_MAX_LCORE; \
> >> + lcore_id++, (value) = RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle))
> >
> > Might be a bit better (and safer) to make lcore_id a macro parameter?
> > I.E.:
> > define RTE_LCORE_VAR_FOREACH_VALUE(value, handle, lcore_id) \
> > for ((lcore_id) = ...
> >
>
> Why?
A variable with the same name (and type) can be defined by the user before the loop,
with the intention to use it inside the loop.
Just like it happens here (in patch #2):
+ unsigned int lcore_id;
.....
+ /* take the opportunity to test the foreach macro */
+ int *v;
+ lcore_id = 0;
+ RTE_LCORE_VAR_FOREACH_VALUE(v, test_int) {
+ TEST_ASSERT_EQUAL(states[lcore_id].new_value, *v,
+ "Unexpected value on lcore %d during "
+ "iteration", lcore_id);
+ lcore_id++;
+ }
+
^ permalink raw reply [flat|nested] 323+ messages in thread
* Re: [PATCH v4 1/7] eal: add static per-lcore memory allocation facility
2024-09-17 16:11 ` Konstantin Ananyev
@ 2024-09-18 7:00 ` Mattias Rönnblom
0 siblings, 0 replies; 323+ messages in thread
From: Mattias Rönnblom @ 2024-09-18 7:00 UTC (permalink / raw)
To: Konstantin Ananyev, Mattias Rönnblom, dev
Cc: Morten Brørup, Stephen Hemminger, Konstantin Ananyev,
David Marchand, Jerin Jacob
On 2024-09-17 18:11, Konstantin Ananyev wrote:
>>>> +
>>>> +/**
>>>> + * Get pointer to lcore variable instance with the specified lcore id.
>>>> + *
>>>> + * @param lcore_id
>>>> + * The lcore id specifying which of the @c RTE_MAX_LCORE value
>>>> + * instances should be accessed. The lcore id need not be valid
>>>> + * (e.g., may be @ref LCORE_ID_ANY), but in such a case, the pointer
>>>> + * is also not valid (and thus should not be dereferenced).
>>>> + * @param handle
>>>> + * The lcore variable handle.
>>>> + */
>>>> +#define RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle) \
>>>> + ((typeof(handle))rte_lcore_var_lcore_ptr(lcore_id, handle))
>>>> +
>>>> +/**
>>>> + * Get pointer to lcore variable instance of the current thread.
>>>> + *
>>>> + * May only be used by EAL threads and registered non-EAL threads.
>>>> + */
>>>> +#define RTE_LCORE_VAR_VALUE(handle) \
>>>> + RTE_LCORE_VAR_LCORE_VALUE(rte_lcore_id(), handle)
>>>
>>> Would it make sense to check that rte_lcore_id() != LCORE_ID_ANY?
>>> After all if people do not want this extra check, they can probably use
>>> RTE_LCORE_VAR_LCORE_VALUE(rte_lcore_id(), handle)
>>> explicitly.
>>>
>>
>> It would make sense, if it was an RTE_ASSERT(). Otherwise, I don't think
>> so. Attempting to gracefully handle API violations is bad practice, imo.
>
> Ok, RTE_ASSERT() might be a good compromise.
> As I said in another mail for that thread, I wouldn't insist here.
>
After having a closer look at this issue, I'm not so sure any more.
Such an assertion would disallow the use of the macros to retrieve a
potentially-invalid pointer, which is then never used, in case it is
invalid.
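
To illustrate the concern, here is a minimal, self-contained sketch of that usage pattern. It does not use the DPDK headers: DEMO_MAX_LCORE, DEMO_STRIDE, DEMO_LCORE_ID_ANY, demo_value_ptr() and the other demo_* names are hypothetical stand-ins, assumed only for the purpose of this example. The pointer is computed unconditionally and dereferenced only when the lcore id is valid, which is the pattern an assertion in the accessor would rule out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_MAX_LCORE 4U             /* stand-in for RTE_MAX_LCORE */
#define DEMO_STRIDE 64U               /* stand-in for RTE_MAX_LCORE_VAR */
#define DEMO_LCORE_ID_ANY UINT32_MAX  /* stand-in for LCORE_ID_ANY */

struct demo_counter {
	uint64_t hits;
};

/* Mirrors the shape of rte_lcore_var_lcore_ptr(): integer address
 * arithmetic only, no dereference, and no check of lcore_id.
 */
static inline void *
demo_value_ptr(unsigned int lcore_id, void *handle)
{
	return (void *)((uintptr_t)handle + (uintptr_t)lcore_id * DEMO_STRIDE);
}

/* Fetch the pointer unconditionally, but use it only for a valid id.
 * An assertion inside demo_value_ptr() would reject this pattern.
 */
static void
demo_count_hit(unsigned int lcore_id, void *handle, uint64_t *shared_hits)
{
	struct demo_counter *counter = demo_value_ptr(lcore_id, handle);

	if (lcore_id != DEMO_LCORE_ID_ANY)
		counter->hits++;
	else
		(*shared_hits)++; /* unregistered thread: use a fallback */
}

/* Allocate zeroed backing storage covering all lcore ids. */
static void *
demo_alloc(void)
{
	void *handle = calloc(DEMO_MAX_LCORE, DEMO_STRIDE);

	assert(handle != NULL);
	return handle;
}
```

A caller would obtain a handle from demo_alloc() and pass whatever lcore id the current thread has to demo_count_hit(); only the branch taken decides whether the computed pointer is ever touched.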
>>
>>>> +
>>>> +/**
>>>> + * Iterate over each lcore id's value for an lcore variable.
>>>> + *
>>>> + * @param value
>>>> + * A pointer successively set to point to lcore variable value
>>>> + * corresponding to every lcore id (up to @c RTE_MAX_LCORE).
>>>> + * @param handle
>>>> + * The lcore variable handle.
>>>> + */
>>>> +#define RTE_LCORE_VAR_FOREACH_VALUE(value, handle) \
>>>> + for (unsigned int lcore_id = \
>>>> + (((value) = RTE_LCORE_VAR_LCORE_VALUE(0, handle)), 0); \
>>>> + lcore_id < RTE_MAX_LCORE; \
>>>> + lcore_id++, (value) = RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle))
>>>
>>> Might be a bit better (and safer) to make lcore_id a macro parameter?
>>> I.E.:
>>> define RTE_LCORE_VAR_FOREACH_VALUE(value, handle, lcore_id) \
>>> for ((lcore_id) = ...
>>>
>>
>> Why?
>
> A variable with the same name (and type) can be defined by the user before the loop,
> with the intention to use it inside the loop.
> Just like it happens here (in patch #2):
> + unsigned int lcore_id;
> .....
> + /* take the opportunity to test the foreach macro */
> + int *v;
> + lcore_id = 0;
> + RTE_LCORE_VAR_FOREACH_VALUE(v, test_int) {
> + TEST_ASSERT_EQUAL(states[lcore_id].new_value, *v,
> + "Unexpected value on lcore %d during "
> + "iteration", lcore_id);
> + lcore_id++;
> + }
> +
>
>
Indeed. I'll change it. I suppose you could also have issues if you
nested the macro, although those could be solved by using something like
__COUNTER__ to create a unique name.
Supplying the variable name does defeat part of the purpose of the
RTE_LCORE_VAR_FOREACH_VALUE.
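
The shadowing issue can be reproduced without DPDK. The sketch below is only an illustration under stated assumptions: DEMO_MAX_LCORE, DEMO_FOREACH_HIDDEN, DEMO_FOREACH_EXPLICIT and the demo_* functions are hypothetical stand-ins that iterate over a plain array instead of an lcore variable. The first macro declares its own 'lcore_id', as the v4 macro does; the second takes the loop variable from the caller, as suggested in the review.

```c
#include <assert.h>

#define DEMO_MAX_LCORE 4 /* stand-in for RTE_MAX_LCORE, kept even */

/* v4-style: the macro declares its own 'lcore_id' loop variable. */
#define DEMO_FOREACH_HIDDEN(value, base) \
	for (unsigned int lcore_id = (((value) = &(base)[0]), 0); \
	     lcore_id < DEMO_MAX_LCORE; \
	     lcore_id++, (value) = &(base)[lcore_id])

/* Suggested alternative: the caller supplies the loop variable. */
#define DEMO_FOREACH_EXPLICIT(value, base, lcore_id) \
	for ((lcore_id) = (((value) = &(base)[0]), 0); \
	     (lcore_id) < DEMO_MAX_LCORE; \
	     (lcore_id)++, (value) = &(base)[(lcore_id)])

static int demo_values[DEMO_MAX_LCORE];

/* Counts the values visited when the loop body increments 'lcore_id'
 * itself, and reports what the outer variable holds afterwards.
 */
static unsigned int
demo_visits_hidden(unsigned int *outer_after)
{
	unsigned int lcore_id = 0; /* shadowed inside the loop */
	unsigned int visits = 0;
	int *v;

	DEMO_FOREACH_HIDDEN(v, demo_values) {
		(void)*v;
		visits++;
		lcore_id++; /* modifies the macro's variable, not the outer one */
	}

	*outer_after = lcore_id;
	return visits;
}

/* Same walk with the caller-owned loop variable. */
static unsigned int
demo_visits_explicit(void)
{
	unsigned int lcore_id;
	unsigned int visits = 0;
	int *v;

	DEMO_FOREACH_EXPLICIT(v, demo_values, lcore_id) {
		(void)*v;
		visits++;
	}

	return visits;
}
```

In this sketch the hidden-variable loop visits only every other element, because the increment in the body and the increment in the macro both act on the macro's variable, while the outer 'lcore_id' is never modified. With the explicit variant there is a single, visible loop variable, at the cost of one more macro argument.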
>
>
^ permalink raw reply [flat|nested] 323+ messages in thread
* RE: [PATCH v4 1/7] eal: add static per-lcore memory allocation facility
2024-09-17 14:28 ` Mattias Rönnblom
2024-09-17 16:11 ` Konstantin Ananyev
@ 2024-09-17 16:29 ` Konstantin Ananyev
2024-09-18 7:50 ` Mattias Rönnblom
1 sibling, 1 reply; 323+ messages in thread
From: Konstantin Ananyev @ 2024-09-17 16:29 UTC (permalink / raw)
To: Mattias Rönnblom, Mattias Rönnblom, dev
Cc: Morten Brørup, Stephen Hemminger, Konstantin Ananyev,
David Marchand, Jerin Jacob
> >> +#define LCORE_BUFFER_SIZE (RTE_MAX_LCORE_VAR * RTE_MAX_LCORE)
> >> +
> >> +static void *lcore_buffer;
> >> +static size_t offset = RTE_MAX_LCORE_VAR;
> >> +
> >> +static void *
> >> +lcore_var_alloc(size_t size, size_t align)
> >> +{
> >> + void *handle;
> >> + void *value;
> >> +
> >> + offset = RTE_ALIGN_CEIL(offset, align);
> >> +
> >> + if (offset + size > RTE_MAX_LCORE_VAR) {
> >> +#ifdef RTE_EXEC_ENV_WINDOWS
> >> + lcore_buffer = _aligned_malloc(LCORE_BUFFER_SIZE,
> >> + RTE_CACHE_LINE_SIZE);
> >> +#else
> >> + lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE,
> >> + LCORE_BUFFER_SIZE);
> >> +#endif
> >
> > I don't remember whether that question already arose or not:
> > For debugging and health-checking purposes - would it make sense to link all
> > lcore_buffer values into a linked list?
> > So user/developer/some tool can walk over it to check that provided handle value
> > is really a valid lcore_var, etc.
> >
>
> At least you could add some basic statistics, like the total size
> allocated by lcore variables, and the number of variables.
My thought was more about easing debugging/health-checking,
but yes, some stats can also be collected.
> One could also add tracing.
>
> >> + RTE_VERIFY(lcore_buffer != NULL);
> >> +
> >> + offset = 0;
> >> + }
> >> +
^ permalink raw reply [flat|nested] 323+ messages in thread
* Re: [PATCH v4 1/7] eal: add static per-lcore memory allocation facility
2024-09-17 16:29 ` Konstantin Ananyev
@ 2024-09-18 7:50 ` Mattias Rönnblom
0 siblings, 0 replies; 323+ messages in thread
From: Mattias Rönnblom @ 2024-09-18 7:50 UTC (permalink / raw)
To: Konstantin Ananyev, Mattias Rönnblom, dev
Cc: Morten Brørup, Stephen Hemminger, Konstantin Ananyev,
David Marchand, Jerin Jacob
On 2024-09-17 18:29, Konstantin Ananyev wrote:
>
>>>> +#define LCORE_BUFFER_SIZE (RTE_MAX_LCORE_VAR * RTE_MAX_LCORE)
>>>> +
>>>> +static void *lcore_buffer;
>>>> +static size_t offset = RTE_MAX_LCORE_VAR;
>>>> +
>>>> +static void *
>>>> +lcore_var_alloc(size_t size, size_t align)
>>>> +{
>>>> + void *handle;
>>>> + void *value;
>>>> +
>>>> + offset = RTE_ALIGN_CEIL(offset, align);
>>>> +
>>>> + if (offset + size > RTE_MAX_LCORE_VAR) {
>>>> +#ifdef RTE_EXEC_ENV_WINDOWS
>>>> + lcore_buffer = _aligned_malloc(LCORE_BUFFER_SIZE,
>>>> + RTE_CACHE_LINE_SIZE);
>>>> +#else
>>>> + lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE,
>>>> + LCORE_BUFFER_SIZE);
>>>> +#endif
>>>
>>> I don't remember whether that question already arose or not:
>>> For debugging and health-checking purposes - would it make sense to link all
>>> lcore_buffer values into a linked list?
>>> So user/developer/some tool can walk over it to check that provided handle value
>>> is really a valid lcore_var, etc.
>>>
>>
>> At least you could add some basic statistics, like the total size
>> allocated by lcore variables, and the number of variables.
>
> My thought was more about easing debugging/health-checking,
> but yes, some stats can also be collected.
>
Statistics could be used for debugging and maybe some kind of
rudimentary sanity check.
Maintaining per-variable state is not necessarily something you want to
do, at least not close (spatially) to the lcore variable values.
In summary, I'm yet to form an opinion on what, if anything, we should have
here to help debugging. To avoid bloat, I would suggest deferring this
to a point where we have more experience with lcore variables.
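
For reference, a rough idea of what "basic statistics" could look like is sketched below. This is not part of the patch and not a proposal for the actual implementation: it is a simplified, self-contained model of the bump allocation in lcore_var_alloc(), with hypothetical names (sketch_alloc(), struct sketch_stats, the SKETCH_* constants) and small sizes chosen for the example. The counters live next to the allocator state, away from the lcore variable values themselves. It assumes a C11 libc providing aligned_alloc().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_MAX_LCORE 4       /* stand-in for RTE_MAX_LCORE */
#define SKETCH_MAX_LCORE_VAR 256 /* stand-in for RTE_MAX_LCORE_VAR */
#define SKETCH_CACHE_LINE 64     /* stand-in for RTE_CACHE_LINE_SIZE */

/* Hypothetical counters, kept apart from the values themselves. */
struct sketch_stats {
	size_t num_vars;        /* number of lcore variables allocated */
	size_t bytes_per_lcore; /* sum of requested sizes, padding excluded */
	size_t num_buffers;     /* number of backing buffers allocated */
};

static struct sketch_stats stats;
static unsigned char *buffer;
static size_t offset = SKETCH_MAX_LCORE_VAR; /* forces the first buffer */

static size_t
sketch_align_ceil(size_t value, size_t align)
{
	return (value + align - 1) & ~(align - 1);
}

/* Same bump-allocation scheme as in the patch, plus bookkeeping. */
static void *
sketch_alloc(size_t size, size_t align)
{
	unsigned char *handle;
	unsigned int i;

	assert(size > 0 && size <= SKETCH_MAX_LCORE_VAR);
	assert(align != 0 && (align & (align - 1)) == 0);
	assert(align <= SKETCH_CACHE_LINE);

	offset = sketch_align_ceil(offset, align);

	if (offset + size > SKETCH_MAX_LCORE_VAR) {
		/* Current buffer exhausted: start a new one. */
		buffer = aligned_alloc(SKETCH_CACHE_LINE,
				SKETCH_MAX_LCORE_VAR * SKETCH_MAX_LCORE);
		if (buffer == NULL)
			abort();
		stats.num_buffers++;
		offset = 0;
	}

	handle = buffer + offset;
	offset += size;

	/* Zero the value belonging to every lcore id. */
	for (i = 0; i < SKETCH_MAX_LCORE; i++)
		memset(handle + i * SKETCH_MAX_LCORE_VAR, 0, size);

	stats.num_vars++;
	stats.bytes_per_lcore += size;

	return handle;
}
```

Three counters like these cost nothing on the access path, since they are only touched at allocation time. Validating an arbitrary handle would need more state (for example a list of buffers), which is the kind of addition that may be better judged once there is more experience with the API.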
>> One could also add tracing.
>>
>>>> + RTE_VERIFY(lcore_buffer != NULL);
>>>> +
>>>> + offset = 0;
>>>> + }
>>>> +
^ permalink raw reply [flat|nested] 323+ messages in thread
* [PATCH v5 0/7] Lcore variables
2024-09-16 10:52 ` [PATCH v4 1/7] eal: add static per-lcore memory allocation facility Mattias Rönnblom
2024-09-16 14:02 ` Konstantin Ananyev
@ 2024-09-17 14:32 ` Mattias Rönnblom
2024-09-17 14:32 ` [PATCH v5 1/7] eal: add static per-lcore memory allocation facility Mattias Rönnblom
` (6 more replies)
1 sibling, 7 replies; 323+ messages in thread
From: Mattias Rönnblom @ 2024-09-17 14:32 UTC (permalink / raw)
To: dev
Cc: hofors, Morten Brørup, Stephen Hemminger,
Konstantin Ananyev, David Marchand, Jerin Jacob,
Mattias Rönnblom
This patch set introduces a new API <rte_lcore_var.h> for static
per-lcore id data allocation.
Please refer to the <rte_lcore_var.h> API documentation for both a
rationale for this new API, and a comparison to the alternatives
available.
The adoption of this API would affect many different DPDK modules, but
the author updated only a few, mostly to serve as examples in this
RFC, and to iron out some, but surely not all, wrinkles in the API.
The question on how to best allocate static per-lcore memory has been
up several times on the dev mailing list, for example in the thread on
"random: use per lcore state" RFC by Stephen Hemminger.
Lcore variables are surely not the answer to all your per-lcore-data
needs, since they only allow for more-or-less static allocation. In the
author's opinion, they do however provide a reasonably simple, clean,
and seemingly very performant solution to a real problem.
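
As a rough illustration of the space argument, the following self-contained sketch compares the per-lcore footprint of two modules under the padded-array pattern with the footprint when their values are packed back to back, as a bump allocator would place them. All names (DEMO_CACHE_LINE, the demo_* structs and functions) are hypothetical and chosen for this example only, and the sketch leaves out RTE_CACHE_GUARD, which would make the padded case larger still.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_CACHE_LINE 64 /* stand-in for RTE_CACHE_LINE_SIZE */

/* Padded-array pattern: each module cache-aligns its per-lcore struct
 * and would keep an RTE_MAX_LCORE-sized array of them.
 */
struct demo_mod_a_padded {
	alignas(DEMO_CACHE_LINE) uint32_t hits;
};

struct demo_mod_b_padded {
	alignas(DEMO_CACHE_LINE) uint64_t bytes;
};

static size_t
demo_align_ceil(size_t value, size_t align)
{
	return (value + align - 1) & ~(align - 1);
}

/* Bytes used per lcore id when every module pads to a cache line. */
static size_t
demo_padded_bytes_per_lcore(void)
{
	return sizeof(struct demo_mod_a_padded) +
		sizeof(struct demo_mod_b_padded);
}

/* Bytes used per lcore id when the same two values are placed back to
 * back, each aligned only to its natural alignment.
 */
static size_t
demo_packed_bytes_per_lcore(void)
{
	size_t offset = 0;

	offset = demo_align_ceil(offset, alignof(uint32_t)) + sizeof(uint32_t);
	offset = demo_align_ceil(offset, alignof(uint64_t)) + sizeof(uint64_t);

	return offset;
}
```

With the values above, the padded layout spends two full cache lines per lcore id on 12 bytes of payload, while the packed layout fits both values well within a single line.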
Mattias Rönnblom (7):
eal: add static per-lcore memory allocation facility
eal: add lcore variable functional tests
eal: add lcore variable performance test
random: keep PRNG state in lcore variable
power: keep per-lcore state in lcore variable
service: keep per-lcore state in lcore variable
eal: keep per-lcore power intrinsics state in lcore variable
MAINTAINERS | 6 +
app/test/meson.build | 2 +
app/test/test_lcore_var.c | 432 ++++++++++++++++++
app/test/test_lcore_var_perf.c | 257 +++++++++++
config/rte_config.h | 1 +
doc/api/doxy-api-index.md | 1 +
.../prog_guide/env_abstraction_layer.rst | 45 +-
doc/guides/rel_notes/release_24_11.rst | 14 +
lib/eal/common/eal_common_lcore_var.c | 78 ++++
lib/eal/common/meson.build | 1 +
lib/eal/common/rte_random.c | 28 +-
lib/eal/common/rte_service.c | 115 ++---
lib/eal/include/meson.build | 1 +
lib/eal/include/rte_lcore_var.h | 385 ++++++++++++++++
lib/eal/version.map | 2 +
lib/eal/x86/rte_power_intrinsics.c | 17 +-
lib/power/rte_power_pmd_mgmt.c | 34 +-
17 files changed, 1326 insertions(+), 93 deletions(-)
create mode 100644 app/test/test_lcore_var.c
create mode 100644 app/test/test_lcore_var_perf.c
create mode 100644 lib/eal/common/eal_common_lcore_var.c
create mode 100644 lib/eal/include/rte_lcore_var.h
--
2.34.1
^ permalink raw reply [flat|nested] 323+ messages in thread
* [PATCH v5 1/7] eal: add static per-lcore memory allocation facility
2024-09-17 14:32 ` [PATCH v5 0/7] Lcore variables Mattias Rönnblom
@ 2024-09-17 14:32 ` Mattias Rönnblom
2024-09-18 8:00 ` [PATCH v6 0/7] Lcore variables Mattias Rönnblom
2024-09-17 14:32 ` [PATCH v5 2/7] eal: add lcore variable functional tests Mattias Rönnblom
` (5 subsequent siblings)
6 siblings, 1 reply; 323+ messages in thread
From: Mattias Rönnblom @ 2024-09-17 14:32 UTC (permalink / raw)
To: dev
Cc: hofors, Morten Brørup, Stephen Hemminger,
Konstantin Ananyev, David Marchand, Jerin Jacob,
Mattias Rönnblom
Introduce DPDK per-lcore id variables, or lcore variables for short.
An lcore variable has one value for every current and future lcore
id-equipped thread.
The primary <rte_lcore_var.h> use case is for statically allocating
small, frequently-accessed data structures, for which one instance
should exist for each lcore.
Lcore variables are similar to thread-local storage (TLS, e.g., C11
_Thread_local), but decoupling the values' life time from that of the
threads.
Lcore variables are also similar in terms of functionality provided by
FreeBSD kernel's DPCPU_*() family of macros and the associated
build-time machinery. DPCPU uses linker scripts, which effectively
prevents the reuse of its, otherwise seemingly viable, approach.
The currently-prevailing way to solve the same problem as lcore
variables is to keep a module's per-lcore data as RTE_MAX_LCORE-sized
array of cache-aligned, RTE_CACHE_GUARDed structs. The benefit of
lcore variables over this approach is that data related to the same
lcore now is close (spatially, in memory), rather than data used by
the same module, which in turn avoids excessive use of padding,
polluting caches with unused data.
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
--
PATCH v5:
* Update EAL programming guide.
PATCH v2:
* Add Windows support. (Morten Brørup)
* Fix lcore variables API index reference. (Morten Brørup)
* Various improvements of the API documentation. (Morten Brørup)
* Elimination of unused symbol in version.map. (Morten Brørup)
PATCH:
* Update MAINTAINERS and release notes.
* Stop covering included files in extern "C" {}.
RFC v6:
* Include <stdlib.h> to get aligned_alloc().
* Tweak documentation (grammar).
* Provide API-level guarantees that lcore variable values take on an
initial value of zero.
* Fix misplaced __rte_cache_aligned in the API doc example.
RFC v5:
* In Doxygen, consistently use @<cmd> (and not \<cmd>).
* The RTE_LCORE_VAR_GET() and SET() convenience access macros
covered an uncommon use case, where the lcore value is of a
primitive type, rather than a struct, and are thus eliminated
from the API. (Morten Brørup)
* In the wake of the GET()/SET() removal, rename RTE_LCORE_VAR_PTR()
to RTE_LCORE_VAR_VALUE().
* The underscores are removed from __rte_lcore_var_lcore_ptr() to
signal that this function is a part of the public API.
* Macro arguments are documented.
RFC v4:
* Replace large static array with libc heap-allocated memory. One
implication of this change is there no longer exists a fixed upper
bound for the total amount of memory used by lcore variables.
RTE_MAX_LCORE_VAR has changed meaning, and now represents the
maximum size of any individual lcore variable value.
* Fix issues in example. (Morten Brørup)
* Improve access macro type checking. (Morten Brørup)
* Refer to the lcore variable handle as "handle" and not "name" in
various macros.
* Document lack of thread safety in rte_lcore_var_alloc().
* Provide API-level assurance the lcore variable handle is
always non-NULL, to allow applications to use NULL to mean
"not yet allocated".
* Note zero-sized allocations are not allowed.
* Give API-level guarantee the lcore variable values are zeroed.
RFC v3:
* Replace use of GCC-specific alignof(<expression>) with alignof(<type>).
* Update example to reflect FOREACH macro name change (in RFC v2).
RFC v2:
* Use alignof to derive alignment requirements. (Morten Brørup)
* Change name of FOREACH to make it distinct from <rte_lcore.h>'s
*per-EAL-thread* RTE_LCORE_FOREACH(). (Morten Brørup)
* Allow user-specified alignment, but limit max to cache line size.
---
MAINTAINERS | 6 +
config/rte_config.h | 1 +
doc/api/doxy-api-index.md | 1 +
.../prog_guide/env_abstraction_layer.rst | 45 +-
doc/guides/rel_notes/release_24_11.rst | 14 +
lib/eal/common/eal_common_lcore_var.c | 78 ++++
lib/eal/common/meson.build | 1 +
lib/eal/include/meson.build | 1 +
lib/eal/include/rte_lcore_var.h | 385 ++++++++++++++++++
lib/eal/version.map | 2 +
10 files changed, 528 insertions(+), 6 deletions(-)
create mode 100644 lib/eal/common/eal_common_lcore_var.c
create mode 100644 lib/eal/include/rte_lcore_var.h
diff --git a/MAINTAINERS b/MAINTAINERS
index c5a703b5c0..362d9a3f28 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -282,6 +282,12 @@ F: lib/eal/include/rte_random.h
F: lib/eal/common/rte_random.c
F: app/test/test_rand_perf.c
+Lcore Variables
+M: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
+F: lib/eal/include/rte_lcore_var.h
+F: lib/eal/common/eal_common_lcore_var.c
+F: app/test/test_lcore_var.c
+
ARM v7
M: Wathsala Vithanage <wathsala.vithanage@arm.com>
F: config/arm/
diff --git a/config/rte_config.h b/config/rte_config.h
index dd7bb0d35b..311692e498 100644
--- a/config/rte_config.h
+++ b/config/rte_config.h
@@ -41,6 +41,7 @@
/* EAL defines */
#define RTE_CACHE_GUARD_LINES 1
#define RTE_MAX_HEAPS 32
+#define RTE_MAX_LCORE_VAR 1048576
#define RTE_MAX_MEMSEG_LISTS 128
#define RTE_MAX_MEMSEG_PER_LIST 8192
#define RTE_MAX_MEM_MB_PER_LIST 32768
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index f9f0300126..ed577f14ee 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -99,6 +99,7 @@ The public API headers are grouped by topics:
[interrupts](@ref rte_interrupts.h),
[launch](@ref rte_launch.h),
[lcore](@ref rte_lcore.h),
+ [lcore variables](@ref rte_lcore_var.h),
[per-lcore](@ref rte_per_lcore.h),
[service cores](@ref rte_service.h),
[keepalive](@ref rte_keepalive.h),
diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst b/doc/guides/prog_guide/env_abstraction_layer.rst
index 9559c12a98..12b49672a6 100644
--- a/doc/guides/prog_guide/env_abstraction_layer.rst
+++ b/doc/guides/prog_guide/env_abstraction_layer.rst
@@ -433,12 +433,45 @@ with them once they're registered.
Per-lcore and Shared Variables
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-.. note::
-
- lcore refers to a logical execution unit of the processor, sometimes called a hardware *thread*.
-
-Shared variables are the default behavior.
-Per-lcore variables are implemented using *Thread Local Storage* (TLS) to provide per-thread local storage.
+By default, static variables, blocks allocated on the DPDK heap, and
+other types of memory are shared by all DPDK threads.
+
+An application, a DPDK library or a PMD may opt to keep per-thread
+state.
+
+Per-thread data may be maintained using either *lcore variables*
+(``rte_lcore_var.h``), *thread-local storage (TLS)*
+(``rte_per_lcore.h``), or a static array of ``RTE_MAX_LCORE``
+elements, indexed by ``rte_lcore_id()``. These methods allow for
+per-lcore data to be a largely module-internal affair, and not
+directly visible in its API. Another possibility is to deal
+explicitly with per-thread aspects in the API (e.g., the ports of the
+Eventdev API).
+
+Lcore variables are suitable for small objects statically allocated at
+the time of module or application initialization. An lcore variable
+takes on one value for each lcore id-equipped thread (i.e., for EAL
+threads and registered non-EAL threads, in total ``RTE_MAX_LCORE``
+instances). The lifetime of lcore variables is detached from that of
+the owning threads, and may thus be initialized prior to the owner
+having been created.
+
+Variables with thread-local storage are allocated at the time of
+thread creation, and exist until the thread terminates, for every
+thread in the process. Only very small objects should be allocated in
+TLS, since large TLS objects significantly slow down thread creation
+and may needlessly increase memory footprint for applications that make
+extensive use of unregistered threads.
+
+A common but now largely obsolete DPDK pattern is to use a static
+array sized according to the maximum number of lcore id-equipped
+threads (i.e., with ``RTE_MAX_LCORE`` elements). To avoid *false
+sharing*, each element must be both cache-aligned and include a
+``RTE_CACHE_GUARD``. Such extensive use of padding causes internal
+fragmentation (i.e., unused space) and lower cache hit rates.
+
+For more discussions on per-lcore state, see the ``rte_lcore_var.h``
+API documentation.
Logs
~~~~
diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
index 0ff70d9057..a3884f7491 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -55,6 +55,20 @@ New Features
Also, make sure to start the actual text at the margin.
=======================================================
+* **Added EAL per-lcore static memory allocation facility.**
+
+ Added EAL API <rte_lcore_var.h> for statically allocating small,
+ frequently-accessed data structures, for which one instance should
+ exist for each EAL thread and registered non-EAL thread.
+
+ With lcore variables, data is organized spatially on a per-lcore id
+ basis, rather than per library or PMD, avoiding the need for cache
+ aligning (or RTE_CACHE_GUARDing) data structures, which in turn
+ reduces CPU cache internal fragmentation, improving performance.
+
+ Lcore variables are similar to thread-local storage (TLS, e.g.,
+ C11 _Thread_local), but decoupling the values' life time from that
+ of the threads.
Removed Items
-------------
diff --git a/lib/eal/common/eal_common_lcore_var.c b/lib/eal/common/eal_common_lcore_var.c
new file mode 100644
index 0000000000..309822039b
--- /dev/null
+++ b/lib/eal/common/eal_common_lcore_var.c
@@ -0,0 +1,78 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Ericsson AB
+ */
+
+#include <inttypes.h>
+#include <stdlib.h>
+
+#ifdef RTE_EXEC_ENV_WINDOWS
+#include <malloc.h>
+#endif
+
+#include <rte_common.h>
+#include <rte_debug.h>
+#include <rte_log.h>
+
+#include <rte_lcore_var.h>
+
+#include "eal_private.h"
+
+#define LCORE_BUFFER_SIZE (RTE_MAX_LCORE_VAR * RTE_MAX_LCORE)
+
+static void *lcore_buffer;
+static size_t offset = RTE_MAX_LCORE_VAR;
+
+static void *
+lcore_var_alloc(size_t size, size_t align)
+{
+ void *handle;
+ void *value;
+
+ offset = RTE_ALIGN_CEIL(offset, align);
+
+ if (offset + size > RTE_MAX_LCORE_VAR) {
+#ifdef RTE_EXEC_ENV_WINDOWS
+ lcore_buffer = _aligned_malloc(LCORE_BUFFER_SIZE,
+ RTE_CACHE_LINE_SIZE);
+#else
+ lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE,
+ LCORE_BUFFER_SIZE);
+#endif
+ RTE_VERIFY(lcore_buffer != NULL);
+
+ offset = 0;
+ }
+
+ handle = RTE_PTR_ADD(lcore_buffer, offset);
+
+ offset += size;
+
+ RTE_LCORE_VAR_FOREACH_VALUE(value, handle)
+ memset(value, 0, size);
+
+ EAL_LOG(DEBUG, "Allocated %"PRIuPTR" bytes of per-lcore data with a "
+ "%"PRIuPTR"-byte alignment", size, align);
+
+ return handle;
+}
+
+void *
+rte_lcore_var_alloc(size_t size, size_t align)
+{
+ /* Having the per-lcore buffer size aligned on cache lines,
+ * as well as having the base pointer aligned on cache line
+ * size, assures that aligned offsets also translate to aligned
+ * pointers across all values.
+ */
+ RTE_BUILD_BUG_ON(RTE_MAX_LCORE_VAR % RTE_CACHE_LINE_SIZE != 0);
+ RTE_ASSERT(align <= RTE_CACHE_LINE_SIZE);
+ RTE_ASSERT(size <= RTE_MAX_LCORE_VAR);
+
+ /* '0' means asking for worst-case alignment requirements */
+ if (align == 0)
+ align = alignof(max_align_t);
+
+ RTE_ASSERT(rte_is_power_of_2(align));
+
+ return lcore_var_alloc(size, align);
+}
diff --git a/lib/eal/common/meson.build b/lib/eal/common/meson.build
index 22a626ba6f..d41403680b 100644
--- a/lib/eal/common/meson.build
+++ b/lib/eal/common/meson.build
@@ -18,6 +18,7 @@ sources += files(
'eal_common_interrupts.c',
'eal_common_launch.c',
'eal_common_lcore.c',
+ 'eal_common_lcore_var.c',
'eal_common_mcfg.c',
'eal_common_memalloc.c',
'eal_common_memory.c',
diff --git a/lib/eal/include/meson.build b/lib/eal/include/meson.build
index e94b056d46..9449253e23 100644
--- a/lib/eal/include/meson.build
+++ b/lib/eal/include/meson.build
@@ -27,6 +27,7 @@ headers += files(
'rte_keepalive.h',
'rte_launch.h',
'rte_lcore.h',
+ 'rte_lcore_var.h',
'rte_lock_annotations.h',
'rte_malloc.h',
'rte_mcslock.h',
diff --git a/lib/eal/include/rte_lcore_var.h b/lib/eal/include/rte_lcore_var.h
new file mode 100644
index 0000000000..ec3ab714a8
--- /dev/null
+++ b/lib/eal/include/rte_lcore_var.h
@@ -0,0 +1,385 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Ericsson AB
+ */
+
+#ifndef _RTE_LCORE_VAR_H_
+#define _RTE_LCORE_VAR_H_
+
+/**
+ * @file
+ *
+ * RTE Lcore variables
+ *
+ * This API provides a mechanism to create and access per-lcore id
+ * variables in a space- and cycle-efficient manner.
+ *
+ * A per-lcore id variable (or lcore variable for short) has one value
+ * for each EAL thread and registered non-EAL thread. There is one
+ * instance for each current and future lcore id-equipped thread, with
+ * a total of RTE_MAX_LCORE instances. The value of an lcore variable
+ * for a particular lcore id is independent from other values (for
+ * other lcore ids) within the same lcore variable.
+ *
+ * In order to access the values of an lcore variable, a handle is
+ * used. The type of the handle is a pointer to the value's type
+ * (e.g., for a @c uint32_t lcore variable, the handle is a
+ * <code>uint32_t *</code>). The handle type is used to inform the
+ * access macros of the type of the values. A handle may be passed
+ * between modules and threads just like any pointer, but its value
+ * must be treated as an opaque identifier. An allocated handle
+ * never has the value NULL.
+ *
+ * @b Creation
+ *
+ * An lcore variable is created in two steps:
+ * 1. Define an lcore variable handle by using @ref RTE_LCORE_VAR_HANDLE.
+ * 2. Allocate lcore variable storage and initialize the handle with
+ * a unique identifier by @ref RTE_LCORE_VAR_ALLOC or
+ * @ref RTE_LCORE_VAR_INIT. Allocation generally occurs at the time of
+ * module initialization, but may be done at any time.
+ *
+ * An lcore variable is not tied to the owning thread's lifetime. It's
+ * available for use by any thread immediately after having been
+ * allocated, and continues to be available throughout the lifetime of
+ * the EAL.
+ *
+ * Lcore variables cannot and need not be freed.
+ *
+ * @b Access
+ *
+ * The value of any lcore variable for any lcore id may be accessed
+ * from any thread (including unregistered threads), but it should
+ * only be *frequently* read from or written to by the owner.
+ *
+ * Values of the same lcore variable but owned by two different lcore
+ * ids may be frequently read or written by the owners without risking
+ * false sharing.
+ *
+ * An appropriate synchronization mechanism (e.g., atomic loads and
+ * stores) should be employed to assure there are no data races between
+ * the owning thread and any non-owner threads accessing the same
+ * lcore variable instance.
+ *
+ * The value of the lcore variable for a particular lcore id is
+ * accessed using @ref RTE_LCORE_VAR_LCORE_VALUE.
+ *
+ * A common pattern is for an EAL thread or a registered non-EAL
+ * thread to access its own lcore variable value. For this purpose, a
+ * short-hand exists in the form of @ref RTE_LCORE_VAR_VALUE.
+ *
+ * Although the handle (as defined by @ref RTE_LCORE_VAR_HANDLE) is a
+ * pointer with the same type as the value, it may not be directly
+ * dereferenced and must be treated as an opaque identifier.
+ *
+ * Lcore variable handles and value pointers may be freely passed
+ * between different threads.
+ *
+ * @b Storage
+ *
+ * An lcore variable's values may be of a primitive type like @c int,
+ * but would more typically be a @c struct.
+ *
+ * The lcore variable handle introduces a per-variable (not
+ * per-value/per-lcore id) overhead of @c sizeof(void *) bytes, so
+ * there are some memory footprint gains to be made by organizing all
+ * per-lcore id data for a particular module as one lcore variable
+ * (e.g., as a struct).
+ *
+ * An application may choose to define an lcore variable handle, which
+ * it then goes on to never allocate.
+ *
+ * The size of an lcore variable's value must not exceed the DPDK
+ * build-time constant @c RTE_MAX_LCORE_VAR.
+ *
+ * Lcore variables are stored in a series of lcore buffers, which
+ * are allocated from the libc heap. Heap allocation failures are
+ * treated as fatal.
+ *
+ * Lcore variables should generally *not* be @ref __rte_cache_aligned
+ * and need *not* include a @ref RTE_CACHE_GUARD field, since these
+ * constructs are designed to avoid false sharing. In the
+ * case of an lcore variable instance, the thread most recently
+ * accessing nearby data structures should almost-always be the lcore
+ * variables' owner. Adding padding will increase the effective memory
+ * working set size, potentially reducing performance.
+ *
+ * Lcore variable values take on an initial value of zero.
+ *
+ * @b Example
+ *
+ * Below is an example of the use of an lcore variable:
+ *
+ * @code{.c}
+ * struct foo_lcore_state {
+ * int a;
+ * long b;
+ * };
+ *
+ * static RTE_LCORE_VAR_HANDLE(struct foo_lcore_state, lcore_states);
+ *
+ * long foo_get_a_plus_b(void)
+ * {
+ * struct foo_lcore_state *state = RTE_LCORE_VAR_VALUE(lcore_states);
+ *
+ * return state->a + state->b;
+ * }
+ *
+ * RTE_INIT(rte_foo_init)
+ * {
+ * RTE_LCORE_VAR_ALLOC(lcore_states);
+ *
+ * struct foo_lcore_state *state;
+ * RTE_LCORE_VAR_FOREACH_VALUE(state, lcore_states) {
+ * (initialize 'state')
+ * }
+ *
+ * (other initialization)
+ * }
+ * @endcode
+ *
+ *
+ * @b Alternatives
+ *
+ * Lcore variables are designed to replace a pattern exemplified below:
+ * @code{.c}
+ * struct __rte_cache_aligned foo_lcore_state {
+ * int a;
+ * long b;
+ * RTE_CACHE_GUARD;
+ * };
+ *
+ * static struct foo_lcore_state lcore_states[RTE_MAX_LCORE];
+ * @endcode
+ *
+ * This scheme is simple and effective, but has one drawback: the data
+ * is organized so that objects related to all lcores for a particular
+ * module are kept close in memory. At a bare minimum, this requires
+ * sizing data structures (e.g., using `__rte_cache_aligned`) to a
+ * whole number of cache lines to avoid false sharing. With CPU
+ * hardware prefetching and memory loads resulting from speculative
+ * execution (functions which seemingly are getting more eager faster
+ * than they are getting more intelligent), one or more "guard" cache
+ * lines may be required to separate one lcore's data from another's.
+ *
+ * Lcore variables have the upside of working with, not against, the
+ * CPU's assumptions and for example next-line prefetchers may well
+ * work the way their designers intended (i.e., to the benefit, not
+ * detriment, of system performance).
+ *
+ * Another alternative to @ref rte_lcore_var.h is the @ref
+ * rte_per_lcore.h API, which makes use of thread-local storage (TLS,
+ * e.g., GCC __thread or C11 _Thread_local). The main differences
+ * between using the various forms of TLS (e.g., @ref
+ * RTE_DEFINE_PER_LCORE or _Thread_local) and the use of lcore
+ * variables are:
+ *
+ * * The existence and non-existence of a thread-local variable
+ * instance follow that of a particular thread. The data cannot be
+ * accessed before the thread has been created, nor after it has
+ * exited. As a result, thread-local variables must be initialized in
+ * a "lazy" manner (e.g., at the point of thread creation). Lcore
+ * variables may be accessed immediately after having been
+ * allocated (which may be before any thread beyond the main
+ * thread is running).
+ * * A thread-local variable is duplicated across all threads in the
+ * process, including unregistered non-EAL threads (i.e.,
+ * "regular" threads). For DPDK applications heavily relying on
+ * multi-threading (in conjunction with DPDK's "one thread per core"
+ * pattern), either by having many concurrent threads or
+ * creating/destroying threads at a high rate, an excessive use of
+ * thread-local variables may cause inefficiencies (e.g.,
+ * increased thread creation overhead due to thread-local storage
+ * initialization or increased total RAM footprint usage). Lcore
+ * variables *only* exist for threads with an lcore id.
+ * * Whether data in thread-local storage may be shared between threads
+ * (i.e., whether a pointer to a thread-local variable can be passed to
+ * and successfully dereferenced by a non-owning thread) depends on
+ * the details of the TLS implementation. With GCC __thread and
+ * GCC _Thread_local, such data sharing is supported. In the C11
+ * standard, the result of accessing another thread's
+ * _Thread_local object is implementation-defined. Lcore variable
+ * instances may be accessed reliably by any thread.
+ */
+
+#include <stddef.h>
+#include <stdalign.h>
+
+#include <rte_common.h>
+#include <rte_config.h>
+#include <rte_lcore.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Given the lcore variable type, produces the type of the lcore
+ * variable handle.
+ */
+#define RTE_LCORE_VAR_HANDLE_TYPE(type) \
+ type *
+
+/**
+ * Define an lcore variable handle.
+ *
+ * This macro defines a variable which is used as a handle to access
+ * the various instances of a per-lcore id variable.
+ *
+ * The aim with this macro is to make clear at the point of
+ * declaration that this is an lcore handle, rather than a regular
+ * pointer.
+ *
+ * Add @b static as a prefix in case the lcore variable is only to be
+ * accessed from a particular translation unit.
+ */
+#define RTE_LCORE_VAR_HANDLE(type, name) \
+ RTE_LCORE_VAR_HANDLE_TYPE(type) name
+
+/**
+ * Allocate space for an lcore variable, and initialize its handle.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, align) \
+ handle = rte_lcore_var_alloc(size, align)
+
+/**
+ * Allocate space for an lcore variable, and initialize its handle,
+ * with values aligned for any type of object.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_ALLOC_SIZE(handle, size) \
+ RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, 0)
+
+/**
+ * Allocate space for an lcore variable of the size and alignment requirements
+ * suggested by the handle pointer type, and initialize its handle.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_ALLOC(handle) \
+ RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, sizeof(*(handle)), \
+ alignof(typeof(*(handle))))
+
+/**
+ * Allocate an explicitly-sized, explicitly-aligned lcore variable by
+ * means of a @ref RTE_INIT constructor.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, align) \
+ RTE_INIT(rte_lcore_var_init_ ## name) \
+ { \
+ RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(name, size, align); \
+ }
+
+/**
+ * Allocate an explicitly-sized lcore variable by means of a @ref
+ * RTE_INIT constructor.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_INIT_SIZE(name, size) \
+ RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, 0)
+
+/**
+ * Allocate an lcore variable by means of a @ref RTE_INIT constructor.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_INIT(name) \
+ RTE_INIT(rte_lcore_var_init_ ## name) \
+ { \
+ RTE_LCORE_VAR_ALLOC(name); \
+ }
+
+/**
+ * Get void pointer to lcore variable instance with the specified
+ * lcore id.
+ *
+ * @param lcore_id
+ * The lcore id specifying which of the @c RTE_MAX_LCORE value
+ * instances should be accessed. The lcore id need not be valid
+ * (e.g., may be @ref LCORE_ID_ANY), but in such a case, the pointer
+ * is also not valid (and thus should not be dereferenced).
+ * @param handle
+ * The lcore variable handle.
+ */
+static inline void *
+rte_lcore_var_lcore_ptr(unsigned int lcore_id, void *handle)
+{
+ return RTE_PTR_ADD(handle, lcore_id * RTE_MAX_LCORE_VAR);
+}
+
+/**
+ * Get pointer to lcore variable instance with the specified lcore id.
+ *
+ * @param lcore_id
+ * The lcore id specifying which of the @c RTE_MAX_LCORE value
+ * instances should be accessed. The lcore id need not be valid
+ * (e.g., may be @ref LCORE_ID_ANY), but in such a case, the pointer
+ * is also not valid (and thus should not be dereferenced).
+ * @param handle
+ * The lcore variable handle.
+ */
+#define RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle) \
+ ((typeof(handle))rte_lcore_var_lcore_ptr(lcore_id, handle))
+
+/**
+ * Get pointer to lcore variable instance of the current thread.
+ *
+ * May only be used by EAL threads and registered non-EAL threads.
+ */
+#define RTE_LCORE_VAR_VALUE(handle) \
+ RTE_LCORE_VAR_LCORE_VALUE(rte_lcore_id(), handle)
+
+/**
+ * Iterate over each lcore id's value for an lcore variable.
+ *
+ * @param value
+ * A pointer successively set to point to lcore variable value
+ * corresponding to every lcore id (up to @c RTE_MAX_LCORE).
+ * @param handle
+ * The lcore variable handle.
+ */
+#define RTE_LCORE_VAR_FOREACH_VALUE(value, handle) \
+ for (unsigned int lcore_id = \
+ (((value) = RTE_LCORE_VAR_LCORE_VALUE(0, handle)), 0); \
+ lcore_id < RTE_MAX_LCORE; \
+ lcore_id++, (value) = RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle))
+
+/**
+ * Allocate space in the per-lcore id buffers for an lcore variable.
+ *
+ * The pointer returned is only an opaque identifier of the variable. To
+ * get an actual pointer to a particular instance of the variable use
+ * @ref RTE_LCORE_VAR_VALUE or @ref RTE_LCORE_VAR_LCORE_VALUE.
+ *
+ * The lcore variable values' memory is set to zero.
+ *
+ * The allocation is always successful, barring a fatal exhaustion of
+ * the per-lcore id buffer space.
+ *
+ * rte_lcore_var_alloc() is not multi-thread safe.
+ *
+ * @param size
+ * The size (in bytes) of the variable's per-lcore id value. Must be > 0.
+ * @param align
+ * If 0, the values will be suitably aligned for any kind of type
+ * (i.e., alignof(max_align_t)). Otherwise, the values will be aligned
+ * on a multiple of *align*, which must be a power of 2 and less
+ * than or equal to @c RTE_CACHE_LINE_SIZE.
+ * @return
+ * The variable's handle, stored in a void pointer value. The value
+ * is always non-NULL.
+ */
+__rte_experimental
+void *
+rte_lcore_var_alloc(size_t size, size_t align);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_LCORE_VAR_H_ */
diff --git a/lib/eal/version.map b/lib/eal/version.map
index e3ff412683..0c80bf7331 100644
--- a/lib/eal/version.map
+++ b/lib/eal/version.map
@@ -396,6 +396,8 @@ EXPERIMENTAL {
# added in 24.03
rte_vfio_get_device_info; # WINDOWS_NO_EXPORT
+
+ rte_lcore_var_alloc;
};
INTERNAL {
--
2.34.1
* [PATCH v6 0/7] Lcore variables
2024-09-17 14:32 ` [PATCH v5 1/7] eal: add static per-lcore memory allocation facility Mattias Rönnblom
@ 2024-09-18 8:00 ` Mattias Rönnblom
2024-09-18 8:00 ` [PATCH v6 1/7] eal: add static per-lcore memory allocation facility Mattias Rönnblom
` (6 more replies)
0 siblings, 7 replies; 323+ messages in thread
From: Mattias Rönnblom @ 2024-09-18 8:00 UTC (permalink / raw)
To: dev
Cc: hofors, Morten Brørup, Stephen Hemminger,
Konstantin Ananyev, David Marchand, Jerin Jacob,
Mattias Rönnblom
This patch set introduces a new API <rte_lcore_var.h> for static
per-lcore id data allocation.
Please refer to the <rte_lcore_var.h> API documentation for both a
rationale for this new API, and a comparison to the alternatives
available.
The adoption of this API would affect many different DPDK modules, but
the author updated only a few, mostly to serve as examples in this
RFC, and to iron out some, but surely not all, wrinkles in the API.
The question on how to best allocate static per-lcore memory has been
up several times on the dev mailing list, for example in the thread on
"random: use per lcore state" RFC by Stephen Hemminger.
Lcore variables are surely not the answer to all your per-lcore-data
needs, since they only allow for more-or-less static allocation. In the
author's opinion, they do however provide a reasonably simple, clean,
and seemingly very performant solution to a real problem.
Mattias Rönnblom (7):
eal: add static per-lcore memory allocation facility
eal: add lcore variable functional tests
eal: add lcore variable performance test
random: keep PRNG state in lcore variable
power: keep per-lcore state in lcore variable
service: keep per-lcore state in lcore variable
eal: keep per-lcore power intrinsics state in lcore variable
MAINTAINERS | 6 +
app/test/meson.build | 2 +
app/test/test_lcore_var.c | 436 ++++++++++++++++++
app/test/test_lcore_var_perf.c | 257 +++++++++++
config/rte_config.h | 1 +
doc/api/doxy-api-index.md | 1 +
.../prog_guide/env_abstraction_layer.rst | 45 +-
doc/guides/rel_notes/release_24_11.rst | 14 +
lib/eal/common/eal_common_lcore_var.c | 79 ++++
lib/eal/common/meson.build | 1 +
lib/eal/common/rte_random.c | 28 +-
lib/eal/common/rte_service.c | 115 ++---
lib/eal/include/meson.build | 1 +
lib/eal/include/rte_lcore_var.h | 388 ++++++++++++++++
lib/eal/version.map | 2 +
lib/eal/x86/rte_power_intrinsics.c | 17 +-
lib/power/rte_power_pmd_mgmt.c | 35 +-
17 files changed, 1335 insertions(+), 93 deletions(-)
create mode 100644 app/test/test_lcore_var.c
create mode 100644 app/test/test_lcore_var_perf.c
create mode 100644 lib/eal/common/eal_common_lcore_var.c
create mode 100644 lib/eal/include/rte_lcore_var.h
--
2.34.1
* [PATCH v6 1/7] eal: add static per-lcore memory allocation facility
2024-09-18 8:00 ` [PATCH v6 0/7] Lcore variables Mattias Rönnblom
@ 2024-09-18 8:00 ` Mattias Rönnblom
2024-09-18 8:24 ` Konstantin Ananyev
2024-09-18 8:26 ` [PATCH v7 0/7] Lcore variables Mattias Rönnblom
2024-09-18 8:00 ` [PATCH v6 2/7] eal: add lcore variable functional tests Mattias Rönnblom
` (5 subsequent siblings)
6 siblings, 2 replies; 323+ messages in thread
From: Mattias Rönnblom @ 2024-09-18 8:00 UTC (permalink / raw)
To: dev
Cc: hofors, Morten Brørup, Stephen Hemminger,
Konstantin Ananyev, David Marchand, Jerin Jacob,
Mattias Rönnblom
Introduce DPDK per-lcore id variables, or lcore variables for short.
An lcore variable has one value for every current and future lcore
id-equipped thread.
The primary <rte_lcore_var.h> use case is for statically allocating
small, frequently-accessed data structures, for which one instance
should exist for each lcore.
Lcore variables are similar to thread-local storage (TLS, e.g., C11
_Thread_local), but decoupling the values' life time from that of the
threads.
Lcore variables are also similar, in terms of functionality, to the
FreeBSD kernel's DPCPU_*() family of macros and the associated
build-time machinery. DPCPU uses linker scripts, which effectively
prevents the reuse of its, otherwise seemingly viable, approach.
The currently-prevailing way to solve the same problem as lcore
variables is to keep a module's per-lcore data as an RTE_MAX_LCORE-sized
array of cache-aligned, RTE_CACHE_GUARDed structs. The benefit of
lcore variables over this approach is that data related to the same
lcore now is close (spatially, in memory), rather than data used by
the same module, which in turn avoids excessive use of padding,
polluting caches with unused data.
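The layout described above can be illustrated with a small stand-alone
sketch. It is not the code in this patch: the SKETCH_* constants and
sketch_*() functions are hypothetical stand-ins for RTE_MAX_LCORE,
RTE_MAX_LCORE_VAR, rte_lcore_var_alloc() and rte_lcore_var_lcore_ptr(),
and calloc() is used in place of an aligned allocation followed by
memset(). It only shows the bump allocation within one lcore's slice
and the fixed stride between lcore ids.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for RTE_MAX_LCORE and RTE_MAX_LCORE_VAR. */
#define SKETCH_MAX_LCORE 4
#define SKETCH_MAX_LCORE_VAR 256

static unsigned char *sketch_buffer;
/* Starting at the slice size forces a buffer allocation on first use. */
static size_t sketch_offset = SKETCH_MAX_LCORE_VAR;

/* Reserve 'size' bytes at the same offset in every lcore's slice and
 * return the handle, i.e., the address of the value for lcore id 0.
 */
static void *
sketch_alloc(size_t size, size_t align)
{
	void *handle;

	assert(size > 0 && size <= SKETCH_MAX_LCORE_VAR);
	assert(align != 0 && (align & (align - 1)) == 0);
	/* calloc() only guarantees max_align_t alignment of the base */
	assert(align <= _Alignof(max_align_t));

	sketch_offset = (sketch_offset + align - 1) & ~(align - 1);

	if (sketch_offset + size > SKETCH_MAX_LCORE_VAR) {
		/* Start a new zeroed buffer; older buffers stay in use. */
		sketch_buffer = calloc(SKETCH_MAX_LCORE, SKETCH_MAX_LCORE_VAR);
		assert(sketch_buffer != NULL);
		sketch_offset = 0;
	}

	handle = sketch_buffer + sketch_offset;
	sketch_offset += size;

	return handle;
}

/* The value for 'lcore_id' lives one slice stride further per lcore id. */
static void *
sketch_value(unsigned int lcore_id, void *handle)
{
	return (unsigned char *)handle +
		(size_t)lcore_id * SKETCH_MAX_LCORE_VAR;
}
```

In this sketch, two variables allocated back to back end up a few bytes
apart for any given lcore id, while the values of one variable for two
neighbouring lcore ids are a full SKETCH_MAX_LCORE_VAR bytes apart.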
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
--
PATCH v6:
* Have API user provide the loop variable in the FOREACH macro, to
avoid subtle bugs where the loop variable name clashes with some
other user-defined variable. (Konstantin Ananyev)
PATCH v5:
* Update EAL programming guide.
PATCH v2:
* Add Windows support. (Morten Brørup)
* Fix lcore variables API index reference. (Morten Brørup)
* Various improvements of the API documentation. (Morten Brørup)
* Elimination of unused symbol in version.map. (Morten Brørup)
PATCH:
* Update MAINTAINERS and release notes.
* Stop covering included files in extern "C" {}.
RFC v6:
* Include <stdlib.h> to get aligned_alloc().
* Tweak documentation (grammar).
* Provide API-level guarantees that lcore variable values take on an
initial value of zero.
* Fix misplaced __rte_cache_aligned in the API doc example.
RFC v5:
* In Doxygen, consistently use @<cmd> (and not \<cmd>).
* The RTE_LCORE_VAR_GET() and SET() convenience access macros
covered an uncommon use case, where the lcore value is of a
primitive type, rather than a struct, and are thus eliminated
from the API. (Morten Brørup)
* In the wake of the GET()/SET() removal, rename RTE_LCORE_VAR_PTR()
to RTE_LCORE_VAR_VALUE().
* The underscores are removed from __rte_lcore_var_lcore_ptr() to
signal that this function is a part of the public API.
* Macro arguments are documented.
RFC v4:
* Replace large static array with libc heap-allocated memory. One
implication of this change is there no longer exists a fixed upper
bound for the total amount of memory used by lcore variables.
RTE_MAX_LCORE_VAR has changed meaning, and now represents the
maximum size of any individual lcore variable value.
* Fix issues in example. (Morten Brørup)
* Improve access macro type checking. (Morten Brørup)
* Refer to the lcore variable handle as "handle" and not "name" in
various macros.
* Document lack of thread safety in rte_lcore_var_alloc().
* Provide API-level assurance the lcore variable handle is
always non-NULL, to allow applications to use NULL to mean
"not yet allocated".
* Note zero-sized allocations are not allowed.
* Give API-level guarantee the lcore variable values are zeroed.
RFC v3:
* Replace use of GCC-specific alignof(<expression>) with alignof(<type>).
* Update example to reflect FOREACH macro name change (in RFC v2).
RFC v2:
* Use alignof to derive alignment requirements. (Morten Brørup)
* Change name of FOREACH to make it distinct from <rte_lcore.h>'s
*per-EAL-thread* RTE_LCORE_FOREACH(). (Morten Brørup)
* Allow user-specified alignment, but limit max to cache line size.
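
The name clash addressed in PATCH v6 can be sketched as follows. The
SKETCH_* macros below are hypothetical and operate on a plain array;
they are not the RTE_LCORE_VAR_FOREACH_VALUE() definitions from either
revision, and only mirror the difference between a macro that declares
its own loop variable and one that takes it from the caller.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_LCORE 4

/* Hidden loop variable: the macro declares its own 'lcore_id'. */
#define SKETCH_FOREACH_HIDDEN(value, array) \
	for (unsigned int lcore_id = (((value) = &(array)[0]), 0); \
	     lcore_id < SKETCH_MAX_LCORE; \
	     lcore_id++, (value) = &(array)[lcore_id])

/* Caller-provided loop variable, in the spirit of the v6 change. */
#define SKETCH_FOREACH(lcore_id, value, array) \
	for ((lcore_id) = (((value) = &(array)[0]), 0); \
	     (lcore_id) < SKETCH_MAX_LCORE; \
	     (lcore_id)++, (value) = &(array)[(lcore_id)])

/* Intended to sum every value except the one at 'lcore_id'. Inside the
 * loop, 'lcore_id' silently refers to the macro's hidden variable, so
 * the comparison never succeeds and nothing is summed.
 */
static int
sum_others_hidden(int *values, unsigned int lcore_id)
{
	int *value;
	unsigned int index = 0;
	int sum = 0;

	assert(lcore_id < SKETCH_MAX_LCORE);

	SKETCH_FOREACH_HIDDEN(value, values) {
		if (index != lcore_id) /* shadowed: always equal */
			sum += *value;
		index++;
	}

	return sum;
}

/* With a caller-provided loop variable, no name is introduced behind
 * the caller's back, and 'skip' keeps its meaning.
 */
static int
sum_others(int *values, unsigned int skip)
{
	unsigned int lcore_id;
	int *value;
	int sum = 0;

	SKETCH_FOREACH(lcore_id, value, values)
		if (lcore_id != skip)
			sum += *value;

	return sum;
}
```

The first function compiles without complaint unless shadowing warnings
are enabled, which is what makes this kind of bug subtle.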
---
MAINTAINERS | 6 +
config/rte_config.h | 1 +
doc/api/doxy-api-index.md | 1 +
.../prog_guide/env_abstraction_layer.rst | 45 +-
doc/guides/rel_notes/release_24_11.rst | 14 +
lib/eal/common/eal_common_lcore_var.c | 79 ++++
lib/eal/common/meson.build | 1 +
lib/eal/include/meson.build | 1 +
lib/eal/include/rte_lcore_var.h | 388 ++++++++++++++++++
lib/eal/version.map | 2 +
10 files changed, 532 insertions(+), 6 deletions(-)
create mode 100644 lib/eal/common/eal_common_lcore_var.c
create mode 100644 lib/eal/include/rte_lcore_var.h
diff --git a/MAINTAINERS b/MAINTAINERS
index c5a703b5c0..362d9a3f28 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -282,6 +282,12 @@ F: lib/eal/include/rte_random.h
F: lib/eal/common/rte_random.c
F: app/test/test_rand_perf.c
+Lcore Variables
+M: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
+F: lib/eal/include/rte_lcore_var.h
+F: lib/eal/common/eal_common_lcore_var.c
+F: app/test/test_lcore_var.c
+
ARM v7
M: Wathsala Vithanage <wathsala.vithanage@arm.com>
F: config/arm/
diff --git a/config/rte_config.h b/config/rte_config.h
index dd7bb0d35b..311692e498 100644
--- a/config/rte_config.h
+++ b/config/rte_config.h
@@ -41,6 +41,7 @@
/* EAL defines */
#define RTE_CACHE_GUARD_LINES 1
#define RTE_MAX_HEAPS 32
+#define RTE_MAX_LCORE_VAR 1048576
#define RTE_MAX_MEMSEG_LISTS 128
#define RTE_MAX_MEMSEG_PER_LIST 8192
#define RTE_MAX_MEM_MB_PER_LIST 32768
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index f9f0300126..ed577f14ee 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -99,6 +99,7 @@ The public API headers are grouped by topics:
[interrupts](@ref rte_interrupts.h),
[launch](@ref rte_launch.h),
[lcore](@ref rte_lcore.h),
+ [lcore variables](@ref rte_lcore_var.h),
[per-lcore](@ref rte_per_lcore.h),
[service cores](@ref rte_service.h),
[keepalive](@ref rte_keepalive.h),
diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst b/doc/guides/prog_guide/env_abstraction_layer.rst
index 9559c12a98..12b49672a6 100644
--- a/doc/guides/prog_guide/env_abstraction_layer.rst
+++ b/doc/guides/prog_guide/env_abstraction_layer.rst
@@ -433,12 +433,45 @@ with them once they're registered.
Per-lcore and Shared Variables
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-.. note::
-
- lcore refers to a logical execution unit of the processor, sometimes called a hardware *thread*.
-
-Shared variables are the default behavior.
-Per-lcore variables are implemented using *Thread Local Storage* (TLS) to provide per-thread local storage.
+By default, static variables, blocks allocated on the DPDK heap, and
+other types of memory are shared by all DPDK threads.
+
+An application, a DPDK library, or a PMD may opt to keep per-thread
+state.
+
+Per-thread data may be maintained using *lcore variables*
+(``rte_lcore_var.h``), *thread-local storage (TLS)*
+(``rte_per_lcore.h``), or a static array of ``RTE_MAX_LCORE``
+elements, indexed by ``rte_lcore_id()``. These methods allow
+per-lcore data to be a largely module-internal affair, and not
+directly visible in its API. Another possibility is to deal
+explicitly with per-thread aspects in the API (e.g., the ports of the
+Eventdev API).
+
+Lcore variables are suitable for small objects statically allocated at
+the time of module or application initialization. An lcore variable
+takes on one value for each lcore id-equipped thread (i.e., for EAL
+threads and registered non-EAL threads, in total ``RTE_MAX_LCORE``
+instances). The lifetime of lcore variables is detached from that of
+the owning threads, and they may thus be initialized prior to the owner
+having been created.
+
+Variables with thread-local storage are allocated at the time of
+thread creation, and exist until the thread terminates, for every
+thread in the process. Only very small objects should be allocated in
+TLS, since large TLS objects significantly slow down thread creation
+and may needlessly increase memory footprint for applications that make
+extensive use of unregistered threads.
+
+A common but now largely obsolete DPDK pattern is to use a static
+array sized according to the maximum number of lcore id-equipped
+threads (i.e., with ``RTE_MAX_LCORE`` elements). To avoid *false
+sharing*, each element must both be cache-aligned and include a
+``RTE_CACHE_GUARD``. Such extensive use of padding causes internal
+fragmentation (i.e., unused space) and lower cache hit rates.
+
+For more discussions on per-lcore state, see the ``rte_lcore_var.h``
+API documentation.
Logs
~~~~
diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
index 0ff70d9057..a3884f7491 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -55,6 +55,20 @@ New Features
Also, make sure to start the actual text at the margin.
=======================================================
+* **Added EAL per-lcore static memory allocation facility.**
+
+ Added EAL API <rte_lcore_var.h> for statically allocating small,
+ frequently-accessed data structures, for which one instance should
+ exist for each EAL thread and registered non-EAL thread.
+
+ With lcore variables, data is organized spatially on a per-lcore id
+ basis, rather than per library or PMD, avoiding the need for cache
+ aligning (or RTE_CACHE_GUARDing) data structures, which in turn
+ reduces CPU cache internal fragmentation, improving performance.
+
+ Lcore variables are similar to thread-local storage (TLS, e.g.,
+ C11 _Thread_local), but decoupling the values' life time from that
+ of the threads.
Removed Items
-------------
diff --git a/lib/eal/common/eal_common_lcore_var.c b/lib/eal/common/eal_common_lcore_var.c
new file mode 100644
index 0000000000..6b7690795e
--- /dev/null
+++ b/lib/eal/common/eal_common_lcore_var.c
@@ -0,0 +1,79 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Ericsson AB
+ */
+
+#include <inttypes.h>
+#include <stdlib.h>
+
+#ifdef RTE_EXEC_ENV_WINDOWS
+#include <malloc.h>
+#endif
+
+#include <rte_common.h>
+#include <rte_debug.h>
+#include <rte_log.h>
+
+#include <rte_lcore_var.h>
+
+#include "eal_private.h"
+
+#define LCORE_BUFFER_SIZE (RTE_MAX_LCORE_VAR * RTE_MAX_LCORE)
+
+static void *lcore_buffer;
+static size_t offset = RTE_MAX_LCORE_VAR;
+
+static void *
+lcore_var_alloc(size_t size, size_t align)
+{
+ void *handle;
+ unsigned int lcore_id;
+ void *value;
+
+ offset = RTE_ALIGN_CEIL(offset, align);
+
+ if (offset + size > RTE_MAX_LCORE_VAR) {
+#ifdef RTE_EXEC_ENV_WINDOWS
+ lcore_buffer = _aligned_malloc(LCORE_BUFFER_SIZE,
+ RTE_CACHE_LINE_SIZE);
+#else
+ lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE,
+ LCORE_BUFFER_SIZE);
+#endif
+ RTE_VERIFY(lcore_buffer != NULL);
+
+ offset = 0;
+ }
+
+ handle = RTE_PTR_ADD(lcore_buffer, offset);
+
+ offset += size;
+
+ RTE_LCORE_VAR_FOREACH_VALUE(lcore_id, value, handle)
+ memset(value, 0, size);
+
+ EAL_LOG(DEBUG, "Allocated %"PRIuPTR" bytes of per-lcore data with a "
+ "%"PRIuPTR"-byte alignment", size, align);
+
+ return handle;
+}
+
+void *
+rte_lcore_var_alloc(size_t size, size_t align)
+{
+ /* Having the per-lcore buffer size aligned on cache lines,
+ * as well as having the base pointer aligned on cache line
+ * size, assures that aligned offsets also translate to aligned
+ * pointers across all values.
+ */
+ RTE_BUILD_BUG_ON(RTE_MAX_LCORE_VAR % RTE_CACHE_LINE_SIZE != 0);
+ RTE_ASSERT(align <= RTE_CACHE_LINE_SIZE);
+ RTE_ASSERT(size <= RTE_MAX_LCORE_VAR);
+
+ /* '0' means asking for worst-case alignment requirements */
+ if (align == 0)
+ align = alignof(max_align_t);
+
+ RTE_ASSERT(rte_is_power_of_2(align));
+
+ return lcore_var_alloc(size, align);
+}
diff --git a/lib/eal/common/meson.build b/lib/eal/common/meson.build
index 22a626ba6f..d41403680b 100644
--- a/lib/eal/common/meson.build
+++ b/lib/eal/common/meson.build
@@ -18,6 +18,7 @@ sources += files(
'eal_common_interrupts.c',
'eal_common_launch.c',
'eal_common_lcore.c',
+ 'eal_common_lcore_var.c',
'eal_common_mcfg.c',
'eal_common_memalloc.c',
'eal_common_memory.c',
diff --git a/lib/eal/include/meson.build b/lib/eal/include/meson.build
index e94b056d46..9449253e23 100644
--- a/lib/eal/include/meson.build
+++ b/lib/eal/include/meson.build
@@ -27,6 +27,7 @@ headers += files(
'rte_keepalive.h',
'rte_launch.h',
'rte_lcore.h',
+ 'rte_lcore_var.h',
'rte_lock_annotations.h',
'rte_malloc.h',
'rte_mcslock.h',
diff --git a/lib/eal/include/rte_lcore_var.h b/lib/eal/include/rte_lcore_var.h
new file mode 100644
index 0000000000..e8db1391fe
--- /dev/null
+++ b/lib/eal/include/rte_lcore_var.h
@@ -0,0 +1,388 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Ericsson AB
+ */
+
+#ifndef _RTE_LCORE_VAR_H_
+#define _RTE_LCORE_VAR_H_
+
+/**
+ * @file
+ *
+ * RTE Lcore variables
+ *
+ * This API provides a mechanism to create and access per-lcore id
+ * variables in a space- and cycle-efficient manner.
+ *
+ * A per-lcore id variable (or lcore variable for short) has one value
+ * for each EAL thread and registered non-EAL thread. There is one
+ * instance for each current and future lcore id-equipped thread, with
+ * a total of RTE_MAX_LCORE instances. The value of an lcore variable
+ * for a particular lcore id is independent from other values (for
+ * other lcore ids) within the same lcore variable.
+ *
+ * In order to access the values of an lcore variable, a handle is
+ * used. The type of the handle is a pointer to the value's type
+ * (e.g., for an @c uint32_t lcore variable, the handle is a
+ * <code>uint32_t *</code>). The handle type is used to inform the
+ * access macros of the type of the values. A handle may be passed
+ * between modules and threads just like any pointer, but its value
+ * must be treated as an opaque identifier. An allocated handle
+ * never has the value NULL.
+ *
+ * @b Creation
+ *
+ * An lcore variable is created in two steps:
+ * 1. Define an lcore variable handle by using @ref RTE_LCORE_VAR_HANDLE.
+ * 2. Allocate lcore variable storage and initialize the handle with
+ * a unique identifier by @ref RTE_LCORE_VAR_ALLOC or
+ * @ref RTE_LCORE_VAR_INIT. Allocation generally occurs at the time of
+ * module initialization, but may be done at any time.
+ *
+ * An lcore variable is not tied to the owning thread's lifetime. It's
+ * available for use by any thread immediately after having been
+ * allocated, and continues to be available throughout the lifetime of
+ * the EAL.
+ *
+ * Lcore variables cannot and need not be freed.
+ *
+ * @b Access
+ *
+ * The value of any lcore variable for any lcore id may be accessed
+ * from any thread (including unregistered threads), but it should
+ * only be *frequently* read from or written to by the owner.
+ *
+ * Values of the same lcore variable but owned by two different lcore
+ * ids may be frequently read or written by the owners without risking
+ * false sharing.
+ *
+ * An appropriate synchronization mechanism (e.g., atomic loads and
+ * stores) should be employed to assure there are no data races between
+ * the owning thread and any non-owner threads accessing the same
+ * lcore variable instance.
+ *
+ * The value of the lcore variable for a particular lcore id is
+ * accessed using @ref RTE_LCORE_VAR_LCORE_VALUE.
+ *
+ * A common pattern is for an EAL thread or a registered non-EAL
+ * thread to access its own lcore variable value. For this purpose, a
+ * short-hand exists in the form of @ref RTE_LCORE_VAR_VALUE.
+ *
+ * Although the handle (as defined by @ref RTE_LCORE_VAR_HANDLE) is a
+ * pointer with the same type as the value, it may not be directly
+ * dereferenced and must be treated as an opaque identifier.
+ *
+ * Lcore variable handles and value pointers may be freely passed
+ * between different threads.
+ *
+ * @b Storage
+ *
+ * An lcore variable's values may be of a primitive type like @c int,
+ * but would more typically be a @c struct.
+ *
+ * The lcore variable handle introduces a per-variable (not
+ * per-value/per-lcore id) overhead of @c sizeof(void *) bytes, so
+ * there are some memory footprint gains to be made by organizing all
+ * per-lcore id data for a particular module as one lcore variable
+ * (e.g., as a struct).
+ *
+ * An application may choose to define an lcore variable handle, which
+ * it then goes on to never allocate.
+ *
+ * The size of an lcore variable's value must not exceed the DPDK
+ * build-time constant @c RTE_MAX_LCORE_VAR.
+ *
+ * Lcore variables are stored in a series of lcore buffers, which
+ * are allocated from the libc heap. Heap allocation failures are
+ * treated as fatal.
+ *
+ * Lcore variables should generally *not* be @ref __rte_cache_aligned
+ * and need *not* include a @ref RTE_CACHE_GUARD field, since these
+ * constructs are designed to avoid false sharing. In the
+ * case of an lcore variable instance, the thread most recently
+ * accessing nearby data structures should almost-always be the lcore
+ * variables' owner. Adding padding will increase the effective memory
+ * working set size, potentially reducing performance.
+ *
+ * Lcore variable values take on an initial value of zero.
+ *
+ * @b Example
+ *
+ * Below is an example of the use of an lcore variable:
+ *
+ * @code{.c}
+ * struct foo_lcore_state {
+ * int a;
+ * long b;
+ * };
+ *
+ * static RTE_LCORE_VAR_HANDLE(struct foo_lcore_state, lcore_states);
+ *
+ * long foo_get_a_plus_b(void)
+ * {
+ * struct foo_lcore_state *state = RTE_LCORE_VAR_VALUE(lcore_states);
+ *
+ * return state->a + state->b;
+ * }
+ *
+ * RTE_INIT(rte_foo_init)
+ * {
+ * RTE_LCORE_VAR_ALLOC(lcore_states);
+ *
+ * unsigned int lcore_id;
+ * struct foo_lcore_state *state;
+ * RTE_LCORE_VAR_FOREACH_VALUE(lcore_id, state, lcore_states) {
+ * (initialize 'state')
+ * }
+ *
+ * (other initialization)
+ * }
+ * @endcode
+ *
+ *
+ * @b Alternatives
+ *
+ * Lcore variables are designed to replace a pattern exemplified below:
+ * @code{.c}
+ * struct __rte_cache_aligned foo_lcore_state {
+ * int a;
+ * long b;
+ * RTE_CACHE_GUARD;
+ * };
+ *
+ * static struct foo_lcore_state lcore_states[RTE_MAX_LCORE];
+ * @endcode
+ *
+ * This scheme is simple and effective, but has one drawback: the data
+ * is organized so that objects related to all lcores for a particular
+ * module are kept close in memory. At a bare minimum, this requires
+ * sizing data structures (e.g., using `__rte_cache_aligned`) to a
+ * whole number of cache lines to avoid false sharing. With CPU
+ * hardware prefetching and memory loads resulting from speculative
+ * execution (functions which seemingly are getting more eager faster
+ * than they are getting more intelligent), one or more "guard" cache
+ * lines may be required to separate one lcore's data from another's.
+ *
+ * Lcore variables have the upside of working with, not against, the
+ * CPU's assumptions and for example next-line prefetchers may well
+ * work the way their designers intended (i.e., to the benefit, not
+ * detriment, of system performance).
+ *
+ * Another alternative to @ref rte_lcore_var.h is the @ref
+ * rte_per_lcore.h API, which makes use of thread-local storage (TLS,
+ * e.g., GCC __thread or C11 _Thread_local). The main differences
+ * between using the various forms of TLS (e.g., @ref
+ * RTE_DEFINE_PER_LCORE or _Thread_local) and the use of lcore
+ * variables are:
+ *
+ * * The existence and non-existence of a thread-local variable
+ * instance follows that of a particular thread. The data cannot be
+ * accessed before the thread has been created, nor after it has
+ * exited. As a result, thread-local variables must be initialized in
+ * a "lazy" manner (e.g., at the point of thread creation). Lcore
+ * variables may be accessed immediately after having been
+ * allocated (which may be before any thread beyond the main
+ * thread is running).
+ * * A thread-local variable is duplicated across all threads in the
+ * process, including unregistered non-EAL threads (i.e.,
+ * "regular" threads). For DPDK applications heavily relying on
+ * multi-threading (in conjunction with DPDK's "one thread per core"
+ * pattern), either by having many concurrent threads or
+ * creating/destroying threads at a high rate, an excessive use of
+ * thread-local variables may cause inefficiencies (e.g.,
+ * increased thread creation overhead due to thread-local storage
+ * initialization or increased total RAM footprint). Lcore
+ * variables *only* exist for threads with an lcore id.
+ * * Whether data in thread-local storage may be shared between threads
+ * (i.e., can a pointer to a thread-local variable be passed to
+ * and successfully dereferenced by a non-owning thread) depends on
+ * the details of the TLS implementation. With GCC __thread and
+ * GCC _Thread_local, such data sharing is supported. In the C11
+ * standard, the result of accessing another thread's
+ * _Thread_local object is implementation-defined. Lcore variable
+ * instances may be accessed reliably by any thread.
+ */
+
+#include <stddef.h>
+#include <stdalign.h>
+
+#include <rte_common.h>
+#include <rte_config.h>
+#include <rte_lcore.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Given the lcore variable type, produces the type of the lcore
+ * variable handle.
+ */
+#define RTE_LCORE_VAR_HANDLE_TYPE(type) \
+ type *
+
+/**
+ * Define an lcore variable handle.
+ *
+ * This macro defines a variable which is used as a handle to access
+ * the various instances of a per-lcore id variable.
+ *
+ * The aim with this macro is to make clear at the point of
+ * declaration that this is an lcore handle, rather than a regular
+ * pointer.
+ *
+ * Add @b static as a prefix in case the lcore variable is only to be
+ * accessed from a particular translation unit.
+ */
+#define RTE_LCORE_VAR_HANDLE(type, name) \
+ RTE_LCORE_VAR_HANDLE_TYPE(type) name
+
+/**
+ * Allocate space for an lcore variable, and initialize its handle.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, align) \
+ handle = rte_lcore_var_alloc(size, align)
+
+/**
+ * Allocate space for an lcore variable, and initialize its handle,
+ * with values aligned for any type of object.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_ALLOC_SIZE(handle, size) \
+ RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, 0)
+
+/**
+ * Allocate space for an lcore variable of the size and alignment requirements
+ * suggested by the handle pointer type, and initialize its handle.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_ALLOC(handle) \
+ RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, sizeof(*(handle)), \
+ alignof(typeof(*(handle))))
+
+/**
+ * Allocate an explicitly-sized, explicitly-aligned lcore variable by
+ * means of a @ref RTE_INIT constructor.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, align) \
+ RTE_INIT(rte_lcore_var_init_ ## name) \
+ { \
+ RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(name, size, align); \
+ }
+
+/**
+ * Allocate an explicitly-sized lcore variable by means of a @ref
+ * RTE_INIT constructor.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_INIT_SIZE(name, size) \
+ RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, 0)
+
+/**
+ * Allocate an lcore variable by means of a @ref RTE_INIT constructor.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_INIT(name) \
+ RTE_INIT(rte_lcore_var_init_ ## name) \
+ { \
+ RTE_LCORE_VAR_ALLOC(name); \
+ }
+
+/**
+ * Get void pointer to lcore variable instance with the specified
+ * lcore id.
+ *
+ * @param lcore_id
+ * The lcore id specifying which of the @c RTE_MAX_LCORE value
+ * instances should be accessed. The lcore id need not be valid
+ * (e.g., may be @ref LCORE_ID_ANY), but in such a case, the pointer
+ * is also not valid (and thus should not be dereferenced).
+ * @param handle
+ * The lcore variable handle.
+ */
+static inline void *
+rte_lcore_var_lcore_ptr(unsigned int lcore_id, void *handle)
+{
+ return RTE_PTR_ADD(handle, lcore_id * RTE_MAX_LCORE_VAR);
+}
+
+/**
+ * Get pointer to lcore variable instance with the specified lcore id.
+ *
+ * @param lcore_id
+ * The lcore id specifying which of the @c RTE_MAX_LCORE value
+ * instances should be accessed. The lcore id need not be valid
+ * (e.g., may be @ref LCORE_ID_ANY), but in such a case, the pointer
+ * is also not valid (and thus should not be dereferenced).
+ * @param handle
+ * The lcore variable handle.
+ */
+#define RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle) \
+ ((typeof(handle))rte_lcore_var_lcore_ptr(lcore_id, handle))
+
+/**
+ * Get pointer to lcore variable instance of the current thread.
+ *
+ * May only be used by EAL threads and registered non-EAL threads.
+ */
+#define RTE_LCORE_VAR_VALUE(handle) \
+ RTE_LCORE_VAR_LCORE_VALUE(rte_lcore_id(), handle)
+
+/**
+ * Iterate over each lcore id's value for an lcore variable.
+ *
+ * @param lcore_id
+ * An <code>unsigned int</code> variable successively set to the
+ * lcore id of every valid lcore id (up to @c RTE_MAX_LCORE).
+ * @param value
+ * A pointer variable successively set to point to lcore variable
+ * value instance of the current lcore id being processed.
+ * @param handle
+ * The lcore variable handle.
+ */
+#define RTE_LCORE_VAR_FOREACH_VALUE(lcore_id, value, handle) \
+ for (lcore_id = (((value) = RTE_LCORE_VAR_LCORE_VALUE(0, handle)), 0); \
+ lcore_id < RTE_MAX_LCORE; \
+ lcore_id++, (value) = RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle))
+
+/**
+ * Allocate space in the per-lcore id buffers for an lcore variable.
+ *
+ * The pointer returned is only an opaque identifier of the variable. To
+ * get an actual pointer to a particular instance of the variable use
+ * @ref RTE_LCORE_VAR_VALUE or @ref RTE_LCORE_VAR_LCORE_VALUE.
+ *
+ * The lcore variable values' memory is set to zero.
+ *
+ * The allocation is always successful, barring a fatal exhaustion of
+ * the per-lcore id buffer space.
+ *
+ * rte_lcore_var_alloc() is not multi-thread safe.
+ *
+ * @param size
+ * The size (in bytes) of the variable's per-lcore id value. Must be > 0.
+ * @param align
+ * If 0, the values will be suitably aligned for any kind of type
+ * (i.e., alignof(max_align_t)). Otherwise, the values will be aligned
+ * on a multiple of *align*, which must be a power of 2 and less than
+ * or equal to @c RTE_CACHE_LINE_SIZE.
+ * @return
+ * The variable's handle, stored in a void pointer value. The value
+ * is always non-NULL.
+ */
+__rte_experimental
+void *
+rte_lcore_var_alloc(size_t size, size_t align);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_LCORE_VAR_H_ */
diff --git a/lib/eal/version.map b/lib/eal/version.map
index e3ff412683..0c80bf7331 100644
--- a/lib/eal/version.map
+++ b/lib/eal/version.map
@@ -396,6 +396,8 @@ EXPERIMENTAL {
# added in 24.03
rte_vfio_get_device_info; # WINDOWS_NO_EXPORT
+
+ rte_lcore_var_alloc;
};
INTERNAL {
--
2.34.1
^ permalink raw reply [flat|nested] 323+ messages in thread
* RE: [PATCH v6 1/7] eal: add static per-lcore memory allocation facility
2024-09-18 8:00 ` [PATCH v6 1/7] eal: add static per-lcore memory allocation facility Mattias Rönnblom
@ 2024-09-18 8:24 ` Konstantin Ananyev
2024-09-18 8:25 ` Mattias Rönnblom
2024-09-18 8:26 ` [PATCH v7 0/7] Lcore variables Mattias Rönnblom
1 sibling, 1 reply; 323+ messages in thread
From: Konstantin Ananyev @ 2024-09-18 8:24 UTC (permalink / raw)
To: Mattias Rönnblom, dev
Cc: hofors, Morten Brørup, Stephen Hemminger,
Konstantin Ananyev, David Marchand, Jerin Jacob
> +/**
> + * Iterate over each lcore id's value for an lcore variable.
> + *
> + * @param lcore_id
> + * An <code>unsigned int</code> variable successively set to the
> + * lcore id of every valid lcore id (up to @c RTE_MAX_LCORE).
> + * @param value
> + * A pointer variable successively set to point to lcore variable
> + * value instance of the current lcore id being processed.
> + * @param handle
> + * The lcore variable handle.
> + */
> +#define RTE_LCORE_VAR_FOREACH_VALUE(lcore_id, value, handle) \
> + for (lcore_id = (((value) = RTE_LCORE_VAR_LCORE_VALUE(0, handle)), 0); \
> + lcore_id < RTE_MAX_LCORE; \
> + lcore_id++, (value) = RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle))
> +
I think we need a '()' around references to lcore_id:
for ((lcore_id) = (((value) = RTE_LCORE_VAR_LCORE_VALUE(0, handle)), 0); \
(lcore_id) < RTE_MAX_LCORE; \
(lcore_id)++, (value) = RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle))
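The reason the parentheses matter can be shown with a small standalone sketch. FOREACH_ID, MAX_IDS and count_ids() are hypothetical stand-ins made up for this illustration, not DPDK names. The loop variable is reached through a pointer, so the macro argument is the expression '*idp' rather than a plain identifier:

```c
#include <assert.h>

#define MAX_IDS 4U /* hypothetical stand-in for RTE_MAX_LCORE */

/* Parenthesized: '(id)++' increments the object designated by 'id'. */
#define FOREACH_ID(id) \
	for ((id) = 0; (id) < MAX_IDS; (id)++)

/* Count the iterations, with the loop variable accessed via a pointer. */
unsigned int
count_ids(unsigned int *idp)
{
	unsigned int n = 0;

	/* Without the parentheses, 'id++' would expand to '*idp++', which
	 * parses as '*(idp++)' and advances the pointer, not the counter.
	 */
	FOREACH_ID(*idp)
		n++;

	return n;
}
```

With the parenthesized form, the loop runs MAX_IDS times and leaves the pointed-to variable equal to MAX_IDS.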
^ permalink raw reply [flat|nested] 323+ messages in thread
* Re: [PATCH v6 1/7] eal: add static per-lcore memory allocation facility
2024-09-18 8:24 ` Konstantin Ananyev
@ 2024-09-18 8:25 ` Mattias Rönnblom
0 siblings, 0 replies; 323+ messages in thread
From: Mattias Rönnblom @ 2024-09-18 8:25 UTC (permalink / raw)
To: Konstantin Ananyev, Mattias Rönnblom, dev
Cc: Morten Brørup, Stephen Hemminger, Konstantin Ananyev,
David Marchand, Jerin Jacob
On 2024-09-18 10:24, Konstantin Ananyev wrote:
>> +/**
>> + * Iterate over each lcore id's value for an lcore variable.
>> + *
>> + * @param lcore_id
>> + * An <code>unsigned int</code> variable successively set to the
>> + * lcore id of every valid lcore id (up to @c RTE_MAX_LCORE).
>> + * @param value
>> + * A pointer variable successively set to point to lcore variable
>> + * value instance of the current lcore id being processed.
>> + * @param handle
>> + * The lcore variable handle.
>> + */
>> +#define RTE_LCORE_VAR_FOREACH_VALUE(lcore_id, value, handle) \
>> + for (lcore_id = (((value) = RTE_LCORE_VAR_LCORE_VALUE(0, handle)), 0); \
>> + lcore_id < RTE_MAX_LCORE; \
>> + lcore_id++, (value) = RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle))
>> +
>
> I think we need a '()' around references to lcore_id:
> for ((lcore_id) = (((value) = RTE_LCORE_VAR_LCORE_VALUE(0, handle)), 0); \
> (lcore_id) < RTE_MAX_LCORE; \
> (lcore_id)++, (value) = RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle))
Yes, of course. Thanks.
^ permalink raw reply [flat|nested] 323+ messages in thread
* [PATCH v7 0/7] Lcore variables
2024-09-18 8:00 ` [PATCH v6 1/7] eal: add static per-lcore memory allocation facility Mattias Rönnblom
2024-09-18 8:24 ` Konstantin Ananyev
@ 2024-09-18 8:26 ` Mattias Rönnblom
2024-09-18 8:26 ` [PATCH v7 1/7] eal: add static per-lcore memory allocation facility Mattias Rönnblom
` (8 more replies)
1 sibling, 9 replies; 323+ messages in thread
From: Mattias Rönnblom @ 2024-09-18 8:26 UTC (permalink / raw)
To: dev
Cc: hofors, Morten Brørup, Stephen Hemminger,
Konstantin Ananyev, David Marchand, Jerin Jacob,
Mattias Rönnblom
This patch set introduces a new API <rte_lcore_var.h> for static
per-lcore id data allocation.
Please refer to the <rte_lcore_var.h> API documentation for both a
rationale for this new API, and a comparison to the alternatives
available.
The adoption of this API would affect many different DPDK modules, but
the author updated only a few, mostly to serve as examples in this
RFC, and to iron out some, but surely not all, wrinkles in the API.
The question of how to best allocate static per-lcore memory has come
up several times on the dev mailing list, for example in the thread on
"random: use per lcore state" RFC by Stephen Hemminger.
Lcore variables are surely not the answer to all your per-lcore-data
needs, since they only allow for more-or-less static allocation. In the
author's opinion, they do however provide a reasonably simple, clean,
and seemingly very performant solution to a real problem.
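As a rough illustration of the addressing scheme (a simplified sketch, not the code in this patch set): a handle points into the block belonging to lcore id 0, and the value for lcore id N is found at a fixed stride from it. All sketch_* and SKETCH_* names below are hypothetical, and the constants are kept tiny on purpose. Unlike the real allocator, the sketch uses a single buffer and aborts when it is full.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_MAX_LCORE 4      /* hypothetical, stands in for RTE_MAX_LCORE */
#define SKETCH_MAX_LCORE_VAR 64 /* hypothetical, stands in for RTE_MAX_LCORE_VAR */

static char *sketch_buffer;  /* SKETCH_MAX_LCORE blocks, one per lcore id */
static size_t sketch_offset; /* next free offset within each block */

/* Reserve 'size' bytes at the next 'align'-aligned offset (power of 2). */
void *
sketch_alloc(size_t size, size_t align)
{
	void *handle;

	if (sketch_buffer == NULL) {
		/* calloc() gives zeroed values, as the API promises */
		sketch_buffer = calloc(SKETCH_MAX_LCORE, SKETCH_MAX_LCORE_VAR);
		if (sketch_buffer == NULL)
			abort();
	}

	sketch_offset = (sketch_offset + align - 1) & ~(align - 1);
	if (sketch_offset + size > SKETCH_MAX_LCORE_VAR)
		abort(); /* the sketch never allocates a second buffer */

	handle = sketch_buffer + sketch_offset;
	sketch_offset += size;

	return handle;
}

/* The value for 'lcore_id' lives at a fixed stride from the handle. */
void *
sketch_value(unsigned int lcore_id, void *handle)
{
	return (char *)handle + (size_t)lcore_id * SKETCH_MAX_LCORE_VAR;
}
```

Two variables allocated back to back end up next to each other within each lcore id's block, while values of the same variable for different lcore ids are a whole block apart.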
Mattias Rönnblom (7):
eal: add static per-lcore memory allocation facility
eal: add lcore variable functional tests
eal: add lcore variable performance test
random: keep PRNG state in lcore variable
power: keep per-lcore state in lcore variable
service: keep per-lcore state in lcore variable
eal: keep per-lcore power intrinsics state in lcore variable
MAINTAINERS | 6 +
app/test/meson.build | 2 +
app/test/test_lcore_var.c | 436 ++++++++++++++++++
app/test/test_lcore_var_perf.c | 257 +++++++++++
config/rte_config.h | 1 +
doc/api/doxy-api-index.md | 1 +
.../prog_guide/env_abstraction_layer.rst | 45 +-
doc/guides/rel_notes/release_24_11.rst | 14 +
lib/eal/common/eal_common_lcore_var.c | 79 ++++
lib/eal/common/meson.build | 1 +
lib/eal/common/rte_random.c | 28 +-
lib/eal/common/rte_service.c | 117 ++---
lib/eal/include/meson.build | 1 +
lib/eal/include/rte_lcore_var.h | 390 ++++++++++++++++
lib/eal/version.map | 2 +
lib/eal/x86/rte_power_intrinsics.c | 17 +-
lib/power/rte_power_pmd_mgmt.c | 35 +-
17 files changed, 1339 insertions(+), 93 deletions(-)
create mode 100644 app/test/test_lcore_var.c
create mode 100644 app/test/test_lcore_var_perf.c
create mode 100644 lib/eal/common/eal_common_lcore_var.c
create mode 100644 lib/eal/include/rte_lcore_var.h
--
2.34.1
^ permalink raw reply [flat|nested] 323+ messages in thread
* [PATCH v7 1/7] eal: add static per-lcore memory allocation facility
2024-09-18 8:26 ` [PATCH v7 0/7] Lcore variables Mattias Rönnblom
@ 2024-09-18 8:26 ` Mattias Rönnblom
2024-09-18 9:23 ` Konstantin Ananyev
` (2 more replies)
2024-09-18 8:26 ` [PATCH v7 2/7] eal: add lcore variable functional tests Mattias Rönnblom
` (7 subsequent siblings)
8 siblings, 3 replies; 323+ messages in thread
From: Mattias Rönnblom @ 2024-09-18 8:26 UTC (permalink / raw)
To: dev
Cc: hofors, Morten Brørup, Stephen Hemminger,
Konstantin Ananyev, David Marchand, Jerin Jacob,
Mattias Rönnblom
Introduce DPDK per-lcore id variables, or lcore variables for short.
An lcore variable has one value for every current and future lcore
id-equipped thread.
The primary <rte_lcore_var.h> use case is for statically allocating
small, frequently-accessed data structures, for which one instance
should exist for each lcore.
Lcore variables are similar to thread-local storage (TLS, e.g., C11
_Thread_local), but decouple the values' lifetime from that of the
threads.
Lcore variables are also similar in functionality to the FreeBSD
kernel's DPCPU_*() family of macros and the associated
build-time machinery. DPCPU uses linker scripts, which effectively
prevents the reuse of its, otherwise seemingly viable, approach.
The currently-prevailing way to solve the same problem as lcore
variables is to keep a module's per-lcore data as an RTE_MAX_LCORE-sized
array of cache-aligned, RTE_CACHE_GUARDed structs. The benefit of
lcore variables over this approach is that data related to the same
lcore now is close (spatially, in memory), rather than data used by
the same module, which in turn avoids excessive use of padding,
which pollutes caches with unused data.
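To put a number on the padding cost, here is a small standalone sketch. It assumes a 64-byte cache line and one guard line; SKETCH_CACHE_LINE and the struct and function names are hypothetical, and C11 _Alignas stands in for the DPDK alignment macros:

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_CACHE_LINE 64 /* assumed cache line size */

/* Plain per-lcore state, as it could be kept in an lcore variable. */
struct plain_state {
	int a;
	long b;
};

/* The same state, cache-aligned and followed by one guard line. */
struct padded_state {
	_Alignas(SKETCH_CACHE_LINE) int a;
	long b;
	_Alignas(SKETCH_CACHE_LINE) char guard[SKETCH_CACHE_LINE];
};

size_t
plain_size(void)
{
	return sizeof(struct plain_state);
}

size_t
padded_size(void)
{
	return sizeof(struct padded_state);
}
```

Under these assumptions the padded struct occupies two full cache lines per lcore id, while the payload itself needs at most 16 bytes on common ABIs.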
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
--
PATCH v7:
* Add () to the FOREACH lcore id macro parameter, to allow an arbitrary
expression, not just a simple variable name, to be passed.
(Konstantin Ananyev)
PATCH v6:
* Have API user provide the loop variable in the FOREACH macro, to
avoid subtle bugs where the loop variable name clashes with some
other user-defined variable. (Konstantin Ananyev)
PATCH v5:
* Update EAL programming guide.
PATCH v2:
* Add Windows support. (Morten Brørup)
* Fix lcore variables API index reference. (Morten Brørup)
* Various improvements of the API documentation. (Morten Brørup)
* Elimination of unused symbol in version.map. (Morten Brørup)
PATCH:
* Update MAINTAINERS and release notes.
* Stop covering included files in extern "C" {}.
RFC v6:
* Include <stdlib.h> to get aligned_alloc().
* Tweak documentation (grammar).
* Provide API-level guarantees that lcore variable values take on an
initial value of zero.
* Fix misplaced __rte_cache_aligned in the API doc example.
RFC v5:
* In Doxygen, consistently use @<cmd> (and not \<cmd>).
* The RTE_LCORE_VAR_GET() and SET() convenience access macros
covered an uncommon use case, where the lcore value is of a
primitive type, rather than a struct, and are thus eliminated
from the API. (Morten Brørup)
* In the wake of the GET()/SET() removal, rename RTE_LCORE_VAR_PTR()
to RTE_LCORE_VAR_VALUE().
* The underscores are removed from __rte_lcore_var_lcore_ptr() to
signal that this function is a part of the public API.
* Macro arguments are documented.
RFC v4:
* Replace large static array with libc heap-allocated memory. One
implication of this change is there no longer exists a fixed upper
bound for the total amount of memory used by lcore variables.
RTE_MAX_LCORE_VAR has changed meaning, and now represents the
maximum size of any individual lcore variable value.
* Fix issues in example. (Morten Brørup)
* Improve access macro type checking. (Morten Brørup)
* Refer to the lcore variable handle as "handle" and not "name" in
various macros.
* Document lack of thread safety in rte_lcore_var_alloc().
* Provide API-level assurance the lcore variable handle is
always non-NULL, to allow applications to use NULL to mean
"not yet allocated".
* Note zero-sized allocations are not allowed.
* Give API-level guarantee the lcore variable values are zeroed.
RFC v3:
* Replace use of GCC-specific alignof(<expression>) with alignof(<type>).
* Update example to reflect FOREACH macro name change (in RFC v2).
RFC v2:
* Use alignof to derive alignment requirements. (Morten Brørup)
* Change name of FOREACH to make it distinct from <rte_lcore.h>'s
*per-EAL-thread* RTE_LCORE_FOREACH(). (Morten Brørup)
* Allow user-specified alignment, but limit max to cache line size.
---
MAINTAINERS | 6 +
config/rte_config.h | 1 +
doc/api/doxy-api-index.md | 1 +
.../prog_guide/env_abstraction_layer.rst | 45 +-
doc/guides/rel_notes/release_24_11.rst | 14 +
lib/eal/common/eal_common_lcore_var.c | 79 ++++
lib/eal/common/meson.build | 1 +
lib/eal/include/meson.build | 1 +
lib/eal/include/rte_lcore_var.h | 390 ++++++++++++++++++
lib/eal/version.map | 2 +
10 files changed, 534 insertions(+), 6 deletions(-)
create mode 100644 lib/eal/common/eal_common_lcore_var.c
create mode 100644 lib/eal/include/rte_lcore_var.h
diff --git a/MAINTAINERS b/MAINTAINERS
index c5a703b5c0..362d9a3f28 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -282,6 +282,12 @@ F: lib/eal/include/rte_random.h
F: lib/eal/common/rte_random.c
F: app/test/test_rand_perf.c
+Lcore Variables
+M: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
+F: lib/eal/include/rte_lcore_var.h
+F: lib/eal/common/eal_common_lcore_var.c
+F: app/test/test_lcore_var.c
+
ARM v7
M: Wathsala Vithanage <wathsala.vithanage@arm.com>
F: config/arm/
diff --git a/config/rte_config.h b/config/rte_config.h
index dd7bb0d35b..311692e498 100644
--- a/config/rte_config.h
+++ b/config/rte_config.h
@@ -41,6 +41,7 @@
/* EAL defines */
#define RTE_CACHE_GUARD_LINES 1
#define RTE_MAX_HEAPS 32
+#define RTE_MAX_LCORE_VAR 1048576
#define RTE_MAX_MEMSEG_LISTS 128
#define RTE_MAX_MEMSEG_PER_LIST 8192
#define RTE_MAX_MEM_MB_PER_LIST 32768
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index f9f0300126..ed577f14ee 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -99,6 +99,7 @@ The public API headers are grouped by topics:
[interrupts](@ref rte_interrupts.h),
[launch](@ref rte_launch.h),
[lcore](@ref rte_lcore.h),
+ [lcore variables](@ref rte_lcore_var.h),
[per-lcore](@ref rte_per_lcore.h),
[service cores](@ref rte_service.h),
[keepalive](@ref rte_keepalive.h),
diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst b/doc/guides/prog_guide/env_abstraction_layer.rst
index 9559c12a98..12b49672a6 100644
--- a/doc/guides/prog_guide/env_abstraction_layer.rst
+++ b/doc/guides/prog_guide/env_abstraction_layer.rst
@@ -433,12 +433,45 @@ with them once they're registered.
Per-lcore and Shared Variables
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-.. note::
-
- lcore refers to a logical execution unit of the processor, sometimes called a hardware *thread*.
-
-Shared variables are the default behavior.
-Per-lcore variables are implemented using *Thread Local Storage* (TLS) to provide per-thread local storage.
+By default, static variables, blocks allocated on the DPDK heap, and
+other types of memory are shared by all DPDK threads.
+
+An application, a DPDK library or PMD may opt to keep per-thread
+state.
+
+Per-thread data may be maintained using either *lcore variables*
+(``rte_lcore_var.h``), *thread-local storage (TLS)*
+(``rte_per_lcore.h``), or a static array of ``RTE_MAX_LCORE``
+elements, indexed by ``rte_lcore_id()``. These methods allow for
+per-lcore data to be a largely module-internal affair, and not
+directly visible in its API. Another possibility is to deal
+explicitly with per-thread aspects in the API (e.g., the ports of the
+Eventdev API).
+
+Lcore variables are suitable for small objects statically allocated at
+the time of module or application initialization. An lcore variable
+takes on one value for each lcore id-equipped thread (i.e., for EAL
+threads and registered non-EAL threads, in total ``RTE_MAX_LCORE``
+instances). The lifetime of an lcore variable is detached from that of
+the owning threads, and it may thus be initialized prior to the owner
+having been created.
+
+Variables with thread-local storage are allocated at the time of
+thread creation, and exist until the thread terminates, for every
+thread in the process. Only very small objects should be allocated in
+TLS, since large TLS objects significantly slow down thread creation
+and may needlessly increase memory footprint for applications that make
+extensive use of unregistered threads.
+
+A common but now largely obsolete DPDK pattern is to use a static
+array sized according to the maximum number of lcore id-equipped
+threads (i.e., with ``RTE_MAX_LCORE`` elements). To avoid *false
+sharing*, each element must both be cache-aligned and include a
+``RTE_CACHE_GUARD``. Such extensive use of padding causes internal
+fragmentation (i.e., unused space) and lower cache hit rates.
+
+For more discussions on per-lcore state, see the ``rte_lcore_var.h``
+API documentation.
Logs
~~~~
diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
index 0ff70d9057..a3884f7491 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -55,6 +55,20 @@ New Features
Also, make sure to start the actual text at the margin.
=======================================================
+* **Added EAL per-lcore static memory allocation facility.**
+
+ Added EAL API <rte_lcore_var.h> for statically allocating small,
+ frequently-accessed data structures, for which one instance should
+ exist for each EAL thread and registered non-EAL thread.
+
+ With lcore variables, data is organized spatially on a per-lcore id
+ basis, rather than per library or PMD, avoiding the need for cache
+ aligning (or RTE_CACHE_GUARDing) data structures, which in turn
+ reduces CPU cache internal fragmentation, improving performance.
+
+ Lcore variables are similar to thread-local storage (TLS, e.g.,
+ C11 _Thread_local), but decouple the values' lifetime from that
+ of the threads.
Removed Items
-------------
diff --git a/lib/eal/common/eal_common_lcore_var.c b/lib/eal/common/eal_common_lcore_var.c
new file mode 100644
index 0000000000..6b7690795e
--- /dev/null
+++ b/lib/eal/common/eal_common_lcore_var.c
@@ -0,0 +1,79 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Ericsson AB
+ */
+
+#include <inttypes.h>
+#include <stdlib.h>
+
+#ifdef RTE_EXEC_ENV_WINDOWS
+#include <malloc.h>
+#endif
+
+#include <rte_common.h>
+#include <rte_debug.h>
+#include <rte_log.h>
+
+#include <rte_lcore_var.h>
+
+#include "eal_private.h"
+
+#define LCORE_BUFFER_SIZE (RTE_MAX_LCORE_VAR * RTE_MAX_LCORE)
+
+static void *lcore_buffer;
+static size_t offset = RTE_MAX_LCORE_VAR;
+
+static void *
+lcore_var_alloc(size_t size, size_t align)
+{
+ void *handle;
+ unsigned int lcore_id;
+ void *value;
+
+ offset = RTE_ALIGN_CEIL(offset, align);
+
+ if (offset + size > RTE_MAX_LCORE_VAR) {
+#ifdef RTE_EXEC_ENV_WINDOWS
+ lcore_buffer = _aligned_malloc(LCORE_BUFFER_SIZE,
+ RTE_CACHE_LINE_SIZE);
+#else
+ lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE,
+ LCORE_BUFFER_SIZE);
+#endif
+ RTE_VERIFY(lcore_buffer != NULL);
+
+ offset = 0;
+ }
+
+ handle = RTE_PTR_ADD(lcore_buffer, offset);
+
+ offset += size;
+
+ RTE_LCORE_VAR_FOREACH_VALUE(lcore_id, value, handle)
+ memset(value, 0, size);
+
+ EAL_LOG(DEBUG, "Allocated %"PRIuPTR" bytes of per-lcore data with a "
+ "%"PRIuPTR"-byte alignment", size, align);
+
+ return handle;
+}
+
+void *
+rte_lcore_var_alloc(size_t size, size_t align)
+{
+ /* Having the per-lcore buffer size aligned on cache lines, as
+ * well as having the base pointer aligned on cache line size,
+ * assures that aligned offsets also translate to aligned
+ * pointers across all values.
+ */
+ RTE_BUILD_BUG_ON(RTE_MAX_LCORE_VAR % RTE_CACHE_LINE_SIZE != 0);
+ RTE_ASSERT(align <= RTE_CACHE_LINE_SIZE);
+ RTE_ASSERT(size <= RTE_MAX_LCORE_VAR);
+
+ /* '0' means asking for worst-case alignment requirements */
+ if (align == 0)
+ align = alignof(max_align_t);
+
+ RTE_ASSERT(rte_is_power_of_2(align));
+
+ return lcore_var_alloc(size, align);
+}
diff --git a/lib/eal/common/meson.build b/lib/eal/common/meson.build
index 22a626ba6f..d41403680b 100644
--- a/lib/eal/common/meson.build
+++ b/lib/eal/common/meson.build
@@ -18,6 +18,7 @@ sources += files(
'eal_common_interrupts.c',
'eal_common_launch.c',
'eal_common_lcore.c',
+ 'eal_common_lcore_var.c',
'eal_common_mcfg.c',
'eal_common_memalloc.c',
'eal_common_memory.c',
diff --git a/lib/eal/include/meson.build b/lib/eal/include/meson.build
index e94b056d46..9449253e23 100644
--- a/lib/eal/include/meson.build
+++ b/lib/eal/include/meson.build
@@ -27,6 +27,7 @@ headers += files(
'rte_keepalive.h',
'rte_launch.h',
'rte_lcore.h',
+ 'rte_lcore_var.h',
'rte_lock_annotations.h',
'rte_malloc.h',
'rte_mcslock.h',
diff --git a/lib/eal/include/rte_lcore_var.h b/lib/eal/include/rte_lcore_var.h
new file mode 100644
index 0000000000..894100d1e4
--- /dev/null
+++ b/lib/eal/include/rte_lcore_var.h
@@ -0,0 +1,390 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Ericsson AB
+ */
+
+#ifndef _RTE_LCORE_VAR_H_
+#define _RTE_LCORE_VAR_H_
+
+/**
+ * @file
+ *
+ * RTE Lcore variables
+ *
+ * This API provides a mechanism to create and access per-lcore id
+ * variables in a space- and cycle-efficient manner.
+ *
+ * A per-lcore id variable (or lcore variable for short) has one value
+ * for each EAL thread and registered non-EAL thread. There is one
+ * instance for each current and future lcore id-equipped thread, with
+ * a total of RTE_MAX_LCORE instances. The value of an lcore variable
+ * for a particular lcore id is independent from other values (for
+ * other lcore ids) within the same lcore variable.
+ *
+ * In order to access the values of an lcore variable, a handle is
+ * used. The type of the handle is a pointer to the value's type
+ * (e.g., for an @c uint32_t lcore variable, the handle is a
+ * <code>uint32_t *</code>). The handle type is used to inform the
+ * access macros of the type of the values. A handle may be passed
+ * between modules and threads just like any pointer, but its value
+ * must be treated as an opaque identifier. An allocated handle
+ * never has the value NULL.
+ *
+ * @b Creation
+ *
+ * An lcore variable is created in two steps:
+ * 1. Define an lcore variable handle by using @ref RTE_LCORE_VAR_HANDLE.
+ * 2. Allocate lcore variable storage and initialize the handle with
+ * a unique identifier by @ref RTE_LCORE_VAR_ALLOC or
+ * @ref RTE_LCORE_VAR_INIT. Allocation generally occurs at the time
+ * of module initialization, but may be done at any time.
+ *
+ * An lcore variable is not tied to the owning thread's lifetime. It's
+ * available for use by any thread immediately after having been
+ * allocated, and continues to be available throughout the lifetime of
+ * the EAL.
+ *
+ * Lcore variables cannot and need not be freed.
+ *
+ * @b Access
+ *
+ * The value of any lcore variable for any lcore id may be accessed
+ * from any thread (including unregistered threads), but it should
+ * only be *frequently* read from or written to by the owner.
+ *
+ * Values of the same lcore variable but owned by two different lcore
+ * ids may be frequently read or written by the owners without risking
+ * false sharing.
+ *
+ * An appropriate synchronization mechanism (e.g., atomic loads and
+ * stores) should be employed to ensure there are no data races between
+ * the owning thread and any non-owner threads accessing the same
+ * lcore variable instance.
+ *
+ * The value of the lcore variable for a particular lcore id is
+ * accessed using @ref RTE_LCORE_VAR_LCORE_VALUE.
+ *
+ * A common pattern is for an EAL thread or a registered non-EAL
+ * thread to access its own lcore variable value. For this purpose, a
+ * short-hand exists in the form of @ref RTE_LCORE_VAR_VALUE.
+ *
+ * Although the handle (as defined by @ref RTE_LCORE_VAR_HANDLE) is a
+ * pointer with the same type as the value, it may not be directly
+ * dereferenced and must be treated as an opaque identifier.
+ *
+ * Lcore variable handles and value pointers may be freely passed
+ * between different threads.
+ *
+ * @b Storage
+ *
+ * An lcore variable's values may be of a primitive type like @c int,
+ * but would more typically be a @c struct.
+ *
+ * The lcore variable handle introduces a per-variable (not
+ * per-value/per-lcore id) overhead of @c sizeof(void *) bytes, so
+ * there are some memory footprint gains to be made by organizing all
+ * per-lcore id data for a particular module as one lcore variable
+ * (e.g., as a struct).
+ *
+ * An application may choose to define an lcore variable handle, which
+ * it then goes on to never allocate.
+ *
+ * The size of an lcore variable's value must not exceed the DPDK
+ * build-time constant @c RTE_MAX_LCORE_VAR.
+ *
+ * Lcore variables are stored in a series of lcore buffers, which
+ * are allocated from the libc heap. Heap allocation failures are
+ * treated as fatal.
+ *
+ * Lcore variables should generally *not* be @ref __rte_cache_aligned
+ * and need *not* include a @ref RTE_CACHE_GUARD field, since these
+ * constructs are designed to avoid false sharing. In the
+ * case of an lcore variable instance, the thread most recently
+ * accessing nearby data structures should almost-always be the lcore
+ * variables' owner. Adding padding will increase the effective memory
+ * working set size, potentially reducing performance.
+ *
+ * Lcore variable values take on an initial value of zero.
+ *
+ * @b Example
+ *
+ * Below is an example of the use of an lcore variable:
+ *
+ * @code{.c}
+ * struct foo_lcore_state {
+ * int a;
+ * long b;
+ * };
+ *
+ * static RTE_LCORE_VAR_HANDLE(struct foo_lcore_state, lcore_states);
+ *
+ * long foo_get_a_plus_b(void)
+ * {
+ * struct foo_lcore_state *state = RTE_LCORE_VAR_VALUE(lcore_states);
+ *
+ * return state->a + state->b;
+ * }
+ *
+ * RTE_INIT(rte_foo_init)
+ * {
+ * RTE_LCORE_VAR_ALLOC(lcore_states);
+ *
+ * unsigned int lcore_id;
+ * struct foo_lcore_state *state;
+ * RTE_LCORE_VAR_FOREACH_VALUE(lcore_id, state, lcore_states) {
+ * (initialize 'state')
+ * }
+ *
+ * (other initialization)
+ * }
+ * @endcode
+ *
+ *
+ * @b Alternatives
+ *
+ * Lcore variables are designed to replace a pattern exemplified below:
+ * @code{.c}
+ * struct __rte_cache_aligned foo_lcore_state {
+ * int a;
+ * long b;
+ * RTE_CACHE_GUARD;
+ * };
+ *
+ * static struct foo_lcore_state lcore_states[RTE_MAX_LCORE];
+ * @endcode
+ *
+ * This scheme is simple and effective, but has one drawback: the data
+ * is organized so that objects related to all lcores for a particular
+ * module are kept close in memory. At a bare minimum, this requires
+ * sizing data structures (e.g., using `__rte_cache_aligned`) to a
+ * whole number of cache lines to avoid false sharing. With CPU
+ * hardware prefetching and memory loads resulting from speculative
+ * execution (functions which seemingly are getting more eager faster
+ * than they are getting more intelligent), one or more "guard" cache
+ * lines may be required to separate one lcore's data from another's.
+ *
+ * Lcore variables have the upside of working with, not against, the
+ * CPU's assumptions; for example, next-line prefetchers may well
+ * work the way their designers intended (i.e., to the benefit, not
+ * detriment, of system performance).
+ *
+ * Another alternative to @ref rte_lcore_var.h is the @ref
+ * rte_per_lcore.h API, which makes use of thread-local storage (TLS,
+ * e.g., GCC __thread or C11 _Thread_local). The main differences
+ * between using the various forms of TLS (e.g., @ref
+ * RTE_DEFINE_PER_LCORE or _Thread_local) and the use of lcore
+ * variables are:
+ *
+ * * The existence and non-existence of a thread-local variable
+ * instance follow that of a particular thread. The data cannot be
+ * accessed before the thread has been created, nor after it has
+ * exited. As a result, thread-local variables must be initialized in
+ * a "lazy" manner (e.g., at the point of thread creation). Lcore
+ * variables may be accessed immediately after having been
+ * allocated (which may be before any thread beyond the main
+ * thread is running).
+ * * A thread-local variable is duplicated across all threads in the
+ * process, including unregistered non-EAL threads (i.e.,
+ * "regular" threads). For DPDK applications heavily relying on
+ * multi-threading (in conjunction with DPDK's "one thread per core"
+ * pattern), either by having many concurrent threads or
+ * creating/destroying threads at a high rate, an excessive use of
+ * thread-local variables may cause inefficiencies (e.g.,
+ * increased thread creation overhead due to thread-local storage
+ * initialization or increased total RAM footprint). Lcore
+ * variables *only* exist for threads with an lcore id.
+ * * Whether data in thread-local storage may be shared between threads
+ * (i.e., whether a pointer to a thread-local variable can be passed
+ * to and successfully dereferenced by a non-owning thread) depends on
+ * the details of the TLS implementation. With GCC __thread and
+ * GCC _Thread_local, such data sharing is supported. In the C11
+ * standard, the result of accessing another thread's
+ * _Thread_local object is implementation-defined. Lcore variable
+ * instances may be accessed reliably by any thread.
+ */
+
+#include <stddef.h>
+#include <stdalign.h>
+
+#include <rte_common.h>
+#include <rte_config.h>
+#include <rte_lcore.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Given the lcore variable type, produces the type of the lcore
+ * variable handle.
+ */
+#define RTE_LCORE_VAR_HANDLE_TYPE(type) \
+ type *
+
+/**
+ * Define an lcore variable handle.
+ *
+ * This macro defines a variable which is used as a handle to access
+ * the various instances of a per-lcore id variable.
+ *
+ * The aim with this macro is to make clear at the point of
+ * declaration that this is an lcore handle, rather than a regular
+ * pointer.
+ *
+ * Add @b static as a prefix in case the lcore variable is only to be
+ * accessed from a particular translation unit.
+ */
+#define RTE_LCORE_VAR_HANDLE(type, name) \
+ RTE_LCORE_VAR_HANDLE_TYPE(type) name
+
+/**
+ * Allocate space for an lcore variable, and initialize its handle.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, align) \
+ handle = rte_lcore_var_alloc(size, align)
+
+/**
+ * Allocate space for an lcore variable, and initialize its handle,
+ * with values aligned for any type of object.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_ALLOC_SIZE(handle, size) \
+ RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, 0)
+
+/**
+ * Allocate space for an lcore variable of the size and alignment requirements
+ * suggested by the handle pointer type, and initialize its handle.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_ALLOC(handle) \
+ RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, sizeof(*(handle)), \
+ alignof(typeof(*(handle))))
+
+/**
+ * Allocate an explicitly-sized, explicitly-aligned lcore variable by
+ * means of a @ref RTE_INIT constructor.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, align) \
+ RTE_INIT(rte_lcore_var_init_ ## name) \
+ { \
+ RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(name, size, align); \
+ }
+
+/**
+ * Allocate an explicitly-sized lcore variable by means of a @ref
+ * RTE_INIT constructor.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_INIT_SIZE(name, size) \
+ RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, 0)
+
+/**
+ * Allocate an lcore variable by means of a @ref RTE_INIT constructor.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_INIT(name) \
+ RTE_INIT(rte_lcore_var_init_ ## name) \
+ { \
+ RTE_LCORE_VAR_ALLOC(name); \
+ }
+
+/**
+ * Get void pointer to lcore variable instance with the specified
+ * lcore id.
+ *
+ * @param lcore_id
+ * The lcore id specifying which of the @c RTE_MAX_LCORE value
+ * instances should be accessed. The lcore id need not be valid
+ * (e.g., may be @ref LCORE_ID_ANY), but in such a case, the pointer
+ * is also not valid (and thus should not be dereferenced).
+ * @param handle
+ * The lcore variable handle.
+ */
+static inline void *
+rte_lcore_var_lcore_ptr(unsigned int lcore_id, void *handle)
+{
+ return RTE_PTR_ADD(handle, lcore_id * RTE_MAX_LCORE_VAR);
+}
+
+/**
+ * Get pointer to lcore variable instance with the specified lcore id.
+ *
+ * @param lcore_id
+ * The lcore id specifying which of the @c RTE_MAX_LCORE value
+ * instances should be accessed. The lcore id need not be valid
+ * (e.g., may be @ref LCORE_ID_ANY), but in such a case, the pointer
+ * is also not valid (and thus should not be dereferenced).
+ * @param handle
+ * The lcore variable handle.
+ */
+#define RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle) \
+ ((typeof(handle))rte_lcore_var_lcore_ptr(lcore_id, handle))
+
+/**
+ * Get pointer to lcore variable instance of the current thread.
+ *
+ * May only be used by EAL threads and registered non-EAL threads.
+ */
+#define RTE_LCORE_VAR_VALUE(handle) \
+ RTE_LCORE_VAR_LCORE_VALUE(rte_lcore_id(), handle)
+
+/**
+ * Iterate over each lcore id's value for an lcore variable.
+ *
+ * @param lcore_id
+ * An <code>unsigned int</code> variable successively set to
+ * every valid lcore id (up to @c RTE_MAX_LCORE).
+ * @param value
+ * A pointer variable successively set to point to the lcore variable
+ * value instance of the current lcore id being processed.
+ * @param handle
+ * The lcore variable handle.
+ */
+#define RTE_LCORE_VAR_FOREACH_VALUE(lcore_id, value, handle) \
+ for ((lcore_id) = \
+ (((value) = RTE_LCORE_VAR_LCORE_VALUE(0, handle)), 0); \
+ (lcore_id) < RTE_MAX_LCORE; \
+ (lcore_id)++, (value) = RTE_LCORE_VAR_LCORE_VALUE(lcore_id, \
+ handle))
+
+/**
+ * Allocate space in the per-lcore id buffers for an lcore variable.
+ *
+ * The pointer returned is only an opaque identifier of the variable. To
+ * get an actual pointer to a particular instance of the variable use
+ * @ref RTE_LCORE_VAR_VALUE or @ref RTE_LCORE_VAR_LCORE_VALUE.
+ *
+ * The lcore variable values' memory is set to zero.
+ *
+ * The allocation is always successful, barring a fatal exhaustion of
+ * the per-lcore id buffer space.
+ *
+ * rte_lcore_var_alloc() is not multi-thread safe.
+ *
+ * @param size
+ * The size (in bytes) of the variable's per-lcore id value. Must be > 0.
+ * @param align
+ * If 0, the values will be suitably aligned for any kind of type
+ * (i.e., alignof(max_align_t)). Otherwise, the values will be aligned
+ * on a multiple of *align*, which must be a power of 2 and less than
+ * or equal to @c RTE_CACHE_LINE_SIZE.
+ * @return
+ * The variable's handle, stored in a void pointer value. The value
+ * is always non-NULL.
+ */
+__rte_experimental
+void *
+rte_lcore_var_alloc(size_t size, size_t align);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_LCORE_VAR_H_ */
diff --git a/lib/eal/version.map b/lib/eal/version.map
index e3ff412683..0c80bf7331 100644
--- a/lib/eal/version.map
+++ b/lib/eal/version.map
@@ -396,6 +396,8 @@ EXPERIMENTAL {
# added in 24.03
rte_vfio_get_device_info; # WINDOWS_NO_EXPORT
+
+ rte_lcore_var_alloc;
};
INTERNAL {
--
2.34.1
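For readers who want to experiment with the addressing scheme outside of DPDK, the following is a minimal stand-alone sketch, not the patch's code. It mimics the bump allocation in lcore_var_alloc() and the fixed-stride lookup in rte_lcore_var_lcore_ptr() using only libc. The names MAX_LCORE, MAX_LCORE_VAR, var_alloc(), var_ptr() and demo_sum() are hypothetical stand-ins, and calloc() replaces the cache-line-aligned allocation, so alignments above alignof(max_align_t) are not honored here.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_LCORE 4        /* stand-in for RTE_MAX_LCORE */
#define MAX_LCORE_VAR 128  /* stand-in for RTE_MAX_LCORE_VAR (bytes per lcore id) */

static unsigned char *buffer;          /* MAX_LCORE consecutive per-lcore areas */
static size_t offset = MAX_LCORE_VAR;  /* starts "full" to force the first allocation */

/* Round v up to the next multiple of align (align must be a power of two). */
static size_t align_ceil(size_t v, size_t align)
{
	return (v + align - 1) & ~(align - 1);
}

/* Reserve size bytes in every per-lcore area; return the address for lcore id 0. */
static void *var_alloc(size_t size, size_t align)
{
	assert(size > 0 && size <= MAX_LCORE_VAR);

	offset = align_ceil(offset, align);
	if (offset + size > MAX_LCORE_VAR) {
		/* Start a new zeroed buffer; earlier buffers are never freed. */
		buffer = calloc(MAX_LCORE, MAX_LCORE_VAR);
		if (buffer == NULL)
			abort();
		offset = 0;
	}

	void *handle = buffer + offset;
	offset += size;
	return handle;
}

/* The value for lcore_id lives a fixed stride away from the handle. */
static void *var_ptr(void *handle, unsigned int lcore_id)
{
	return (unsigned char *)handle + (size_t)lcore_id * MAX_LCORE_VAR;
}

/* Example: one counter per lcore id; returns the sum over all lcore ids. */
long demo_sum(void)
{
	long *counter = var_alloc(sizeof(long), alignof(long));
	long sum = 0;

	for (unsigned int i = 0; i < MAX_LCORE; i++) {
		long *value = var_ptr(counter, i);
		*value += (long)i + 1;  /* values start out as zero */
		sum += *value;
	}
	return sum;
}
```

Since all variables for a given lcore id share one MAX_LCORE_VAR-sized area, two variables allocated back to back end up adjacent in memory for each lcore id, which is the locality property the commit message argues for.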
^ permalink raw reply [flat|nested] 323+ messages in thread
* RE: [PATCH v7 1/7] eal: add static per-lcore memory allocation facility
2024-09-18 8:26 ` [PATCH v7 1/7] eal: add static per-lcore memory allocation facility Mattias Rönnblom
@ 2024-09-18 9:23 ` Konstantin Ananyev
2024-10-09 22:15 ` Morten Brørup
2024-10-10 14:13 ` [PATCH v8 0/7] Lcore variables Mattias Rönnblom
2 siblings, 0 replies; 323+ messages in thread
From: Konstantin Ananyev @ 2024-09-18 9:23 UTC (permalink / raw)
To: Mattias Rönnblom, dev
Cc: hofors, Morten Brørup, Stephen Hemminger,
Konstantin Ananyev, David Marchand, Jerin Jacob
> Introduce DPDK per-lcore id variables, or lcore variables for short.
>
> An lcore variable has one value for every current and future lcore
> id-equipped thread.
>
> The primary <rte_lcore_var.h> use case is for statically allocating
> small, frequently-accessed data structures, for which one instance
> should exist for each lcore.
>
> Lcore variables are similar to thread-local storage (TLS, e.g., C11
> _Thread_local), but decoupling the values' life time with that of the
> threads.
>
> Lcore variables are also similar in terms of functionality provided by
> FreeBSD kernel's DPCPU_*() family of macros and the associated
> build-time machinery. DPCPU uses linker scripts, which effectively
> prevents the reuse of its, otherwise seemingly viable, approach.
>
> The currently-prevailing way to solve the same problem as lcore
> variables is to keep a module's per-lcore data as RTE_MAX_LCORE-sized
> array of cache-aligned, RTE_CACHE_GUARDed structs. The benefit of
> lcore variables over this approach is that data related to the same
> lcore now is close (spatially, in memory), rather than data used by
> the same module, which in turn avoid excessive use of padding,
> polluting caches with unused data.
>
> Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> Acked-by: Morten Brørup <mb@smartsharesystems.com>
>
> --
Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
> 2.34.1
>
^ permalink raw reply [flat|nested] 323+ messages in thread
* RE: [PATCH v7 1/7] eal: add static per-lcore memory allocation facility
2024-09-18 8:26 ` [PATCH v7 1/7] eal: add static per-lcore memory allocation facility Mattias Rönnblom
2024-09-18 9:23 ` Konstantin Ananyev
@ 2024-10-09 22:15 ` Morten Brørup
2024-10-10 10:40 ` Mattias Rönnblom
2024-10-10 14:13 ` [PATCH v8 0/7] Lcore variables Mattias Rönnblom
2 siblings, 1 reply; 323+ messages in thread
From: Morten Brørup @ 2024-10-09 22:15 UTC (permalink / raw)
To: Mattias Rönnblom, dev, Tyler Retzlaff
Cc: hofors, Stephen Hemminger, Konstantin Ananyev, David Marchand,
Jerin Jacob
> From: Mattias Rönnblom [mailto:mattias.ronnblom@ericsson.com]
> Sent: Wednesday, 18 September 2024 10.26
>
> Introduce DPDK per-lcore id variables, or lcore variables for short.
>
> An lcore variable has one value for every current and future lcore
> id-equipped thread.
>
> The primary <rte_lcore_var.h> use case is for statically allocating
> small, frequently-accessed data structures, for which one instance
> should exist for each lcore.
>
> Lcore variables are similar to thread-local storage (TLS, e.g., C11
> _Thread_local), but decoupling the values' life time with that of the
> threads.
>
> Lcore variables are also similar in terms of functionality provided by
> FreeBSD kernel's DPCPU_*() family of macros and the associated
> build-time machinery. DPCPU uses linker scripts, which effectively
> prevents the reuse of its, otherwise seemingly viable, approach.
>
> The currently-prevailing way to solve the same problem as lcore
> variables is to keep a module's per-lcore data as RTE_MAX_LCORE-sized
> array of cache-aligned, RTE_CACHE_GUARDed structs. The benefit of
> lcore variables over this approach is that data related to the same
> lcore now is close (spatially, in memory), rather than data used by
> the same module, which in turn avoid excessive use of padding,
> polluting caches with unused data.
>
> Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> Acked-by: Morten Brørup <mb@smartsharesystems.com>
> --- /dev/null
> +++ b/lib/eal/common/eal_common_lcore_var.c
> @@ -0,0 +1,79 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2024 Ericsson AB
> + */
> +
> +#include <inttypes.h>
> +#include <stdlib.h>
> +
> +#ifdef RTE_EXEC_ENV_WINDOWS
> +#include <malloc.h>
> +#endif
From what I can read on the internet, max_align_t is missing in stddef.h in MSVC [1], so try adding this to fix the Windows CI compilation failure:
#ifdef RTE_TOOLCHAIN_MSVC
#include <cstddef>
#endif
[1]: https://learn.microsoft.com/en-ie/answers/questions/1726147/why-max-align-t-not-defined-in-stddef-h-in-windows
^ permalink raw reply [flat|nested] 323+ messages in thread
* Re: [PATCH v7 1/7] eal: add static per-lcore memory allocation facility
2024-10-09 22:15 ` Morten Brørup
@ 2024-10-10 10:40 ` Mattias Rönnblom
2024-10-10 11:47 ` Morten Brørup
0 siblings, 1 reply; 323+ messages in thread
From: Mattias Rönnblom @ 2024-10-10 10:40 UTC (permalink / raw)
To: Morten Brørup, Mattias Rönnblom, dev, Tyler Retzlaff
Cc: Stephen Hemminger, Konstantin Ananyev, David Marchand, Jerin Jacob
On 2024-10-10 00:15, Morten Brørup wrote:
>> From: Mattias Rönnblom [mailto:mattias.ronnblom@ericsson.com]
>> Sent: Wednesday, 18 September 2024 10.26
>>
>> Introduce DPDK per-lcore id variables, or lcore variables for short.
>>
>> An lcore variable has one value for every current and future lcore
>> id-equipped thread.
>>
>> The primary <rte_lcore_var.h> use case is for statically allocating
>> small, frequently-accessed data structures, for which one instance
>> should exist for each lcore.
>>
>> Lcore variables are similar to thread-local storage (TLS, e.g., C11
>> _Thread_local), but decoupling the values' life time with that of the
>> threads.
>>
>> Lcore variables are also similar in terms of functionality provided by
>> FreeBSD kernel's DPCPU_*() family of macros and the associated
>> build-time machinery. DPCPU uses linker scripts, which effectively
>> prevents the reuse of its, otherwise seemingly viable, approach.
>>
>> The currently-prevailing way to solve the same problem as lcore
>> variables is to keep a module's per-lcore data as RTE_MAX_LCORE-sized
>> array of cache-aligned, RTE_CACHE_GUARDed structs. The benefit of
>> lcore variables over this approach is that data related to the same
>> lcore now is close (spatially, in memory), rather than data used by
>> the same module, which in turn avoid excessive use of padding,
>> polluting caches with unused data.
>>
>> Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
>> Acked-by: Morten Brørup <mb@smartsharesystems.com>
>
>> --- /dev/null
>> +++ b/lib/eal/common/eal_common_lcore_var.c
>> @@ -0,0 +1,79 @@
>> +/* SPDX-License-Identifier: BSD-3-Clause
>> + * Copyright(c) 2024 Ericsson AB
>> + */
>> +
>> +#include <inttypes.h>
>> +#include <stdlib.h>
>> +
>> +#ifdef RTE_EXEC_ENV_WINDOWS
>> +#include <malloc.h>
>> +#endif
>
> From what I can read on the internet, max_align_t is missing in stddef.h in MSVC [1], so try adding this to fix the Windows CI compilation failure:
>
> #ifdef RTE_TOOLCHAIN_MSVC
> #include <cstddef>
> #endif
Please excuse my MSVC ignorance, but will this work in C? Looks like C++.
>
> [1]: https://learn.microsoft.com/en-ie/answers/questions/1726147/why-max-align-t-not-defined-in-stddef-h-in-windows
>
^ permalink raw reply [flat|nested] 323+ messages in thread
* RE: [PATCH v7 1/7] eal: add static per-lcore memory allocation facility
2024-10-10 10:40 ` Mattias Rönnblom
@ 2024-10-10 11:47 ` Morten Brørup
2024-10-10 13:12 ` Morten Brørup
2024-10-10 13:40 ` Mattias Rönnblom
0 siblings, 2 replies; 323+ messages in thread
From: Morten Brørup @ 2024-10-10 11:47 UTC (permalink / raw)
To: Mattias Rönnblom, Mattias Rönnblom, dev, Tyler Retzlaff
Cc: Stephen Hemminger, Konstantin Ananyev, David Marchand, Jerin Jacob
> From: Mattias Rönnblom [mailto:hofors@lysator.liu.se]
> Sent: Thursday, 10 October 2024 12.40
>
> On 2024-10-10 00:15, Morten Brørup wrote:
> >> From: Mattias Rönnblom [mailto:mattias.ronnblom@ericsson.com]
> >> Sent: Wednesday, 18 September 2024 10.26
> >>
> >> Introduce DPDK per-lcore id variables, or lcore variables for short.
> >>
> >> An lcore variable has one value for every current and future lcore
> >> id-equipped thread.
> >>
> >> The primary <rte_lcore_var.h> use case is for statically allocating
> >> small, frequently-accessed data structures, for which one instance
> >> should exist for each lcore.
> >>
> >> Lcore variables are similar to thread-local storage (TLS, e.g., C11
> >> _Thread_local), but decoupling the values' life time with that of
> the
> >> threads.
> >>
> >> Lcore variables are also similar in terms of functionality provided
> by
> >> FreeBSD kernel's DPCPU_*() family of macros and the associated
> >> build-time machinery. DPCPU uses linker scripts, which effectively
> >> prevents the reuse of its, otherwise seemingly viable, approach.
> >>
> >> The currently-prevailing way to solve the same problem as lcore
> >> variables is to keep a module's per-lcore data as RTE_MAX_LCORE-
> sized
> >> array of cache-aligned, RTE_CACHE_GUARDed structs. The benefit of
> >> lcore variables over this approach is that data related to the same
> >> lcore now is close (spatially, in memory), rather than data used by
> >> the same module, which in turn avoid excessive use of padding,
> >> polluting caches with unused data.
> >>
> >> Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> >> Acked-by: Morten Brørup <mb@smartsharesystems.com>
> >
> >> --- /dev/null
> >> +++ b/lib/eal/common/eal_common_lcore_var.c
> >> @@ -0,0 +1,79 @@
> >> +/* SPDX-License-Identifier: BSD-3-Clause
> >> + * Copyright(c) 2024 Ericsson AB
> >> + */
> >> +
> >> +#include <inttypes.h>
> >> +#include <stdlib.h>
> >> +
> >> +#ifdef RTE_EXEC_ENV_WINDOWS
> >> +#include <malloc.h>
> >> +#endif
> >
> > From what I can read on the internet, max_align_t is missing in
> stddef.h in MSVC [1], so try adding this to fix the Windows CI
> compilation failure:
> >
> > #ifdef RTE_TOOLCHAIN_MSVC
> > #include <cstddef>
> > #endif
>
> Please excuse my MSVC ignorance, but will this work in C? Looks like
> C++.
I have no clue. Just parroting what Microsoft says on the internet.
You can try it out and see if the CI accepts it.
>
> >
> > [1]: https://learn.microsoft.com/en-ie/answers/questions/1726147/why-
> max-align-t-not-defined-in-stddef-h-in-windows
> >
I would like to see this series go into 24.11, and then it needs to work for MSVC too.
@Tyler, any better suggestions for fixing the missing max_align_t in stddef.h?
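On the C versus C++ question: <cstddef> is a C++ standard library header, so including it from a C source file such as eal_common_lcore_var.c is not expected to compile. One possible workaround, sketched below purely as an assumption and not as what the series ends up doing, is to avoid max_align_t altogether and derive a worst-case alignment from a union of wide scalar types. The names lcore_var_max_align, LCORE_VAR_MAX_ALIGN and lcore_var_default_align() are hypothetical, and the sketch assumes the toolchain provides <stdalign.h>, which rte_lcore_var.h already includes.

```c
#include <stdalign.h>
#include <stddef.h>

/* Hypothetical stand-in for max_align_t: a union of wide scalar types. */
union lcore_var_max_align {
	long long ll;
	long double ld;
	void *p;
	void (*fp)(void);
};

/* Worst-case alignment derived from the union above. */
#define LCORE_VAR_MAX_ALIGN alignof(union lcore_var_max_align)

/* Map the "0 means worst-case alignment" convention to a concrete value. */
size_t lcore_var_default_align(size_t align)
{
	return align == 0 ? LCORE_VAR_MAX_ALIGN : align;
}
```

Whether the resulting value matches what the platform's allocator guarantees would still need to be checked in the Windows CI.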
^ permalink raw reply [flat|nested] 323+ messages in thread