From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 78D7C45966; Thu, 12 Sep 2024 04:33:20 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 0CC3F4027D; Thu, 12 Sep 2024 04:33:20 +0200 (CEST) Received: from szxga02-in.huawei.com (szxga02-in.huawei.com [45.249.212.188]) by mails.dpdk.org (Postfix) with ESMTP id AFA2D40267 for ; Thu, 12 Sep 2024 04:33:15 +0200 (CEST) Received: from mail.maildlp.com (unknown [172.19.162.254]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4X41dd4G5wzfbnk; Thu, 12 Sep 2024 10:31:01 +0800 (CST) Received: from dggpeml500024.china.huawei.com (unknown [7.185.36.10]) by mail.maildlp.com (Postfix) with ESMTPS id 75E1F1800FF; Thu, 12 Sep 2024 10:33:12 +0800 (CST) Received: from [10.67.121.161] (10.67.121.161) by dggpeml500024.china.huawei.com (7.185.36.10) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.1.2507.39; Thu, 12 Sep 2024 10:33:12 +0800 Message-ID: Date: Thu, 12 Sep 2024 10:33:11 +0800 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH v2 1/6] eal: add static per-lcore memory allocation facility To: =?UTF-8?Q?Mattias_R=C3=B6nnblom?= , CC: , =?UTF-8?Q?Morten_Br=C3=B8rup?= , Stephen Hemminger , Konstantin Ananyev , David Marchand References: <20240910070344.699183-2-mattias.ronnblom@ericsson.com> <20240911170430.701685-1-mattias.ronnblom@ericsson.com> <20240911170430.701685-2-mattias.ronnblom@ericsson.com> Content-Language: en-US From: fengchengwen In-Reply-To: <20240911170430.701685-2-mattias.ronnblom@ericsson.com> Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 8bit X-Originating-IP: [10.67.121.161] X-ClientProxiedBy: dggems704-chm.china.huawei.com (10.3.19.181) To dggpeml500024.china.huawei.com (7.185.36.10) X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org On 2024/9/12 1:04, Mattias Rönnblom wrote: > Introduce DPDK per-lcore id variables, or lcore variables for short. > > An lcore variable has one value for every current and future lcore > id-equipped thread. > > The primary use case is for statically allocating > small, frequently-accessed data structures, for which one instance > should exist for each lcore. > > Lcore variables are similar to thread-local storage (TLS, e.g., C11 > _Thread_local), but decoupling the values' life time with that of the > threads. > > Lcore variables are also similar in terms of functionality provided by > FreeBSD kernel's DPCPU_*() family of macros and the associated > build-time machinery. DPCPU uses linker scripts, which effectively > prevents the reuse of its, otherwise seemingly viable, approach. > > The currently-prevailing way to solve the same problem as lcore > variables is to keep a module's per-lcore data as RTE_MAX_LCORE-sized > array of cache-aligned, RTE_CACHE_GUARDed structs. The benefit of > lcore variables over this approach is that data related to the same > lcore now is close (spatially, in memory), rather than data used by > the same module, which in turn avoid excessive use of padding, > polluting caches with unused data. > > Signed-off-by: Mattias Rönnblom > Acked-by: Morten Brørup > > -- > > PATCH v2: > * Add Windows support. (Morten Brørup) > * Fix lcore variables API index reference. (Morten Brørup) > * Various improvements of the API documentation. (Morten Brørup) > * Elimination of unused symbol in version.map. (Morten Brørup) these history could move to the cover letter. > > PATCH: > * Update MAINTAINERS and release notes. > * Stop covering included files in extern "C" {}. > > RFC v6: > * Include to get aligned_alloc(). > * Tweak documentation (grammar). > * Provide API-level guarantees that lcore variable values take on an > initial value of zero. > * Fix misplaced __rte_cache_aligned in the API doc example. > > RFC v5: > * In Doxygen, consistenly use @ (and not \). > * The RTE_LCORE_VAR_GET() and SET() convience access macros > covered an uncommon use case, where the lcore value is of a > primitive type, rather than a struct, and is thus eliminated > from the API. (Morten Brørup) > * In the wake up GET()/SET() removeal, rename RTE_LCORE_VAR_PTR() > RTE_LCORE_VAR_VALUE(). > * The underscores are removed from __rte_lcore_var_lcore_ptr() to > signal that this function is a part of the public API. > * Macro arguments are documented. > > RFV v4: > * Replace large static array with libc heap-allocated memory. One > implication of this change is there no longer exists a fixed upper > bound for the total amount of memory used by lcore variables. > RTE_MAX_LCORE_VAR has changed meaning, and now represent the > maximum size of any individual lcore variable value. > * Fix issues in example. (Morten Brørup) > * Improve access macro type checking. (Morten Brørup) > * Refer to the lcore variable handle as "handle" and not "name" in > various macros. > * Document lack of thread safety in rte_lcore_var_alloc(). > * Provide API-level assurance the lcore variable handle is > always non-NULL, to all applications to use NULL to mean > "not yet allocated". > * Note zero-sized allocations are not allowed. > * Give API-level guarantee the lcore variable values are zeroed. > > RFC v3: > * Replace use of GCC-specific alignof() with alignof(). > * Update example to reflect FOREACH macro name change (in RFC v2). > > RFC v2: > * Use alignof to derive alignment requirements. (Morten Brørup) > * Change name of FOREACH to make it distinct from 's > *per-EAL-thread* RTE_LCORE_FOREACH(). (Morten Brørup) > * Allow user-specified alignment, but limit max to cache line size. > --- > MAINTAINERS | 6 + > config/rte_config.h | 1 + > doc/api/doxy-api-index.md | 1 + > doc/guides/rel_notes/release_24_11.rst | 14 + > lib/eal/common/eal_common_lcore_var.c | 78 +++++ > lib/eal/common/meson.build | 1 + > lib/eal/include/meson.build | 1 + > lib/eal/include/rte_lcore_var.h | 385 +++++++++++++++++++++++++ > lib/eal/version.map | 2 + > 9 files changed, 489 insertions(+) > create mode 100644 lib/eal/common/eal_common_lcore_var.c > create mode 100644 lib/eal/include/rte_lcore_var.h > > diff --git a/MAINTAINERS b/MAINTAINERS > index c5a703b5c0..362d9a3f28 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -282,6 +282,12 @@ F: lib/eal/include/rte_random.h > F: lib/eal/common/rte_random.c > F: app/test/test_rand_perf.c > > +Lcore Variables > +M: Mattias Rönnblom > +F: lib/eal/include/rte_lcore_var.h > +F: lib/eal/common/eal_common_lcore_var.c > +F: app/test/test_lcore_var.c > + > ARM v7 > M: Wathsala Vithanage > F: config/arm/ > diff --git a/config/rte_config.h b/config/rte_config.h > index dd7bb0d35b..311692e498 100644 > --- a/config/rte_config.h > +++ b/config/rte_config.h > @@ -41,6 +41,7 @@ > /* EAL defines */ > #define RTE_CACHE_GUARD_LINES 1 > #define RTE_MAX_HEAPS 32 > +#define RTE_MAX_LCORE_VAR 1048576 > #define RTE_MAX_MEMSEG_LISTS 128 > #define RTE_MAX_MEMSEG_PER_LIST 8192 > #define RTE_MAX_MEM_MB_PER_LIST 32768 > diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md > index f9f0300126..ed577f14ee 100644 > --- a/doc/api/doxy-api-index.md > +++ b/doc/api/doxy-api-index.md > @@ -99,6 +99,7 @@ The public API headers are grouped by topics: > [interrupts](@ref rte_interrupts.h), > [launch](@ref rte_launch.h), > [lcore](@ref rte_lcore.h), > + [lcore variables](@ref rte_lcore_var.h), > [per-lcore](@ref rte_per_lcore.h), > [service cores](@ref rte_service.h), > [keepalive](@ref rte_keepalive.h), > diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst > index 0ff70d9057..a3884f7491 100644 > --- a/doc/guides/rel_notes/release_24_11.rst > +++ b/doc/guides/rel_notes/release_24_11.rst > @@ -55,6 +55,20 @@ New Features > Also, make sure to start the actual text at the margin. > ======================================================= > > +* **Added EAL per-lcore static memory allocation facility.** > + > + Added EAL API for statically allocating small, > + frequently-accessed data structures, for which one instance should > + exist for each EAL thread and registered non-EAL thread. > + > + With lcore variables, data is organized spatially on a per-lcore id > + basis, rather than per library or PMD, avoiding the need for cache > + aligning (or RTE_CACHE_GUARDing) data structures, which in turn > + reduces CPU cache internal fragmentation, improving performance. > + > + Lcore variables are similar to thread-local storage (TLS, e.g., > + C11 _Thread_local), but decoupling the values' life time from that > + of the threads. > > Removed Items > ------------- > diff --git a/lib/eal/common/eal_common_lcore_var.c b/lib/eal/common/eal_common_lcore_var.c > new file mode 100644 > index 0000000000..309822039b > --- /dev/null > +++ b/lib/eal/common/eal_common_lcore_var.c > @@ -0,0 +1,78 @@ > +/* SPDX-License-Identifier: BSD-3-Clause > + * Copyright(c) 2024 Ericsson AB > + */ > + > +#include > +#include > + > +#ifdef RTE_EXEC_ENV_WINDOWS > +#include > +#endif > + > +#include > +#include > +#include > + > +#include > + > +#include "eal_private.h" > + > +#define LCORE_BUFFER_SIZE (RTE_MAX_LCORE_VAR * RTE_MAX_LCORE) > + > +static void *lcore_buffer; > +static size_t offset = RTE_MAX_LCORE_VAR; > + > +static void * > +lcore_var_alloc(size_t size, size_t align) > +{ > + void *handle; > + void *value; > + > + offset = RTE_ALIGN_CEIL(offset, align); > + > + if (offset + size > RTE_MAX_LCORE_VAR) { > +#ifdef RTE_EXEC_ENV_WINDOWS > + lcore_buffer = _aligned_malloc(LCORE_BUFFER_SIZE, > + RTE_CACHE_LINE_SIZE); > +#else > + lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE, > + LCORE_BUFFER_SIZE); > +#endif > + RTE_VERIFY(lcore_buffer != NULL); > + > + offset = 0; > + } > + > + handle = RTE_PTR_ADD(lcore_buffer, offset); > + > + offset += size; > + > + RTE_LCORE_VAR_FOREACH_VALUE(value, handle) > + memset(value, 0, size); > + > + EAL_LOG(DEBUG, "Allocated %"PRIuPTR" bytes of per-lcore data with a " > + "%"PRIuPTR"-byte alignment", size, align); Currrent the data was malloc by libc function, I think it's mainly for such INIT macro which will be init before main. But it will introduce following problem: 1\ it can't benefit from huge-pages. this patch may reserved many 1MBs for each lcore, if we could place it in huge-pages it will reduce the TLB miss rate, especially it freq access data. 2\ it can't across multi-process. many of current lcore-data also don't support multi-process, but I think it worth do that, and it will help us to some service recovery when sub-process failed and reboot. ...