From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 1F414459BF; Tue, 17 Sep 2024 16:28:38 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id DBE8A402D8; Tue, 17 Sep 2024 16:28:37 +0200 (CEST) Received: from mail.lysator.liu.se (mail.lysator.liu.se [130.236.254.3]) by mails.dpdk.org (Postfix) with ESMTP id 5F47340261 for ; Tue, 17 Sep 2024 16:28:36 +0200 (CEST) Received: from mail.lysator.liu.se (localhost [127.0.0.1]) by mail.lysator.liu.se (Postfix) with ESMTP id 18A5E1A913 for ; Tue, 17 Sep 2024 16:28:36 +0200 (CEST) Received: by mail.lysator.liu.se (Postfix, from userid 1004) id 0BC4C1A86D; Tue, 17 Sep 2024 16:28:36 +0200 (CEST) X-Spam-Checker-Version: SpamAssassin 4.0.0 (2022-12-13) on hermod.lysator.liu.se X-Spam-Level: X-Spam-Status: No, score=-1.2 required=5.0 tests=ALL_TRUSTED,AWL, T_SCC_BODY_TEXT_LINE autolearn=disabled version=4.0.0 X-Spam-Score: -1.2 Received: from [192.168.1.86] (h-62-63-215-114.A163.priv.bahnhof.se [62.63.215.114]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mail.lysator.liu.se (Postfix) with ESMTPSA id 462F31A991; Tue, 17 Sep 2024 16:28:32 +0200 (CEST) Message-ID: Date: Tue, 17 Sep 2024 16:28:32 +0200 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH v4 1/7] eal: add static per-lcore memory allocation facility To: Konstantin Ananyev , =?UTF-8?Q?Mattias_R=C3=B6nnblom?= , "dev@dpdk.org" Cc: =?UTF-8?Q?Morten_Br=C3=B8rup?= , Stephen Hemminger , Konstantin Ananyev , David Marchand , Jerin Jacob References: <20240912084429.703405-2-mattias.ronnblom@ericsson.com> <20240916105210.721315-1-mattias.ronnblom@ericsson.com> <20240916105210.721315-2-mattias.ronnblom@ericsson.com> <23d1b3c5828546128b9ed21e98ac1ae1@huawei.com> Content-Language: en-US From: =?UTF-8?Q?Mattias_R=C3=B6nnblom?= In-Reply-To: <23d1b3c5828546128b9ed21e98ac1ae1@huawei.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Virus-Scanned: ClamAV using ClamSMTP X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org On 2024-09-16 16:02, Konstantin Ananyev wrote: > > >> Introduce DPDK per-lcore id variables, or lcore variables for short. >> >> An lcore variable has one value for every current and future lcore >> id-equipped thread. >> >> The primary use case is for statically allocating >> small, frequently-accessed data structures, for which one instance >> should exist for each lcore. >> >> Lcore variables are similar to thread-local storage (TLS, e.g., C11 >> _Thread_local), but decoupling the values' life time with that of the >> threads. >> >> Lcore variables are also similar in terms of functionality provided by >> FreeBSD kernel's DPCPU_*() family of macros and the associated >> build-time machinery. DPCPU uses linker scripts, which effectively >> prevents the reuse of its, otherwise seemingly viable, approach. >> >> The currently-prevailing way to solve the same problem as lcore >> variables is to keep a module's per-lcore data as RTE_MAX_LCORE-sized >> array of cache-aligned, RTE_CACHE_GUARDed structs. The benefit of >> lcore variables over this approach is that data related to the same >> lcore now is close (spatially, in memory), rather than data used by >> the same module, which in turn avoid excessive use of padding, >> polluting caches with unused data. >> >> Signed-off-by: Mattias Rönnblom >> Acked-by: Morten Brørup > > LGTM in general, few small questions (mostly nits), see below. > >> --- /dev/null >> +++ b/lib/eal/common/eal_common_lcore_var.c >> @@ -0,0 +1,78 @@ >> +/* SPDX-License-Identifier: BSD-3-Clause >> + * Copyright(c) 2024 Ericsson AB >> + */ >> + >> +#include >> +#include >> + >> +#ifdef RTE_EXEC_ENV_WINDOWS >> +#include >> +#endif >> + >> +#include >> +#include >> +#include >> + >> +#include >> + >> +#include "eal_private.h" >> + >> +#define LCORE_BUFFER_SIZE (RTE_MAX_LCORE_VAR * RTE_MAX_LCORE) >> + >> +static void *lcore_buffer; >> +static size_t offset = RTE_MAX_LCORE_VAR; >> + >> +static void * >> +lcore_var_alloc(size_t size, size_t align) >> +{ >> + void *handle; >> + void *value; >> + >> + offset = RTE_ALIGN_CEIL(offset, align); >> + >> + if (offset + size > RTE_MAX_LCORE_VAR) { >> +#ifdef RTE_EXEC_ENV_WINDOWS >> + lcore_buffer = _aligned_malloc(LCORE_BUFFER_SIZE, >> + RTE_CACHE_LINE_SIZE); >> +#else >> + lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE, >> + LCORE_BUFFER_SIZE); >> +#endif > > Don't remember did that question already arise or not: > For debugging and health-checking purposes - would it make sense to link all > lcore_buffer values into a linked list? > So user/developer/some tool can walk over it to check that provided handle value > is really a valid lcore_var, etc. > At least you could add some basic statistics, like the total size allocated my lcore variables, and the number of variables. One could also add tracing. >> + RTE_VERIFY(lcore_buffer != NULL); >> + >> + offset = 0; >> + } >> + >> + handle = RTE_PTR_ADD(lcore_buffer, offset); >> + >> + offset += size; >> + >> + RTE_LCORE_VAR_FOREACH_VALUE(value, handle) >> + memset(value, 0, size); >> + >> + EAL_LOG(DEBUG, "Allocated %"PRIuPTR" bytes of per-lcore data with a " >> + "%"PRIuPTR"-byte alignment", size, align); >> + >> + return handle; >> +} >> + >> +void * >> +rte_lcore_var_alloc(size_t size, size_t align) >> +{ >> + /* Having the per-lcore buffer size aligned on cache lines >> + * assures as well as having the base pointer aligned on cache >> + * size assures that aligned offsets also translate to alipgned >> + * pointers across all values. >> + */ >> + RTE_BUILD_BUG_ON(RTE_MAX_LCORE_VAR % RTE_CACHE_LINE_SIZE != 0); >> + RTE_ASSERT(align <= RTE_CACHE_LINE_SIZE); >> + RTE_ASSERT(size <= RTE_MAX_LCORE_VAR); >> + >> + /* '0' means asking for worst-case alignment requirements */ >> + if (align == 0) >> + align = alignof(max_align_t); >> + >> + RTE_ASSERT(rte_is_power_of_2(align)); >> + >> + return lcore_var_alloc(size, align); >> +} > > .... > >> diff --git a/lib/eal/include/rte_lcore_var.h b/lib/eal/include/rte_lcore_var.h >> new file mode 100644 >> index 0000000000..ec3ab714a8 >> --- /dev/null >> +++ b/lib/eal/include/rte_lcore_var.h > > ... > >> +/** >> + * Given the lcore variable type, produces the type of the lcore >> + * variable handle. >> + */ >> +#define RTE_LCORE_VAR_HANDLE_TYPE(type) \ >> + type * >> + >> +/** >> + * Define an lcore variable handle. >> + * >> + * This macro defines a variable which is used as a handle to access >> + * the various instances of a per-lcore id variable. >> + * >> + * The aim with this macro is to make clear at the point of >> + * declaration that this is an lcore handle, rather than a regular >> + * pointer. >> + * >> + * Add @b static as a prefix in case the lcore variable is only to be >> + * accessed from a particular translation unit. >> + */ >> +#define RTE_LCORE_VAR_HANDLE(type, name) \ >> + RTE_LCORE_VAR_HANDLE_TYPE(type) name >> + >> +/** >> + * Allocate space for an lcore variable, and initialize its handle. >> + * >> + * The values of the lcore variable are initialized to zero. >> + */ >> +#define RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, align) \ >> + handle = rte_lcore_var_alloc(size, align) >> + >> +/** >> + * Allocate space for an lcore variable, and initialize its handle, >> + * with values aligned for any type of object. >> + * >> + * The values of the lcore variable are initialized to zero. >> + */ >> +#define RTE_LCORE_VAR_ALLOC_SIZE(handle, size) \ >> + RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, 0) >> + >> +/** >> + * Allocate space for an lcore variable of the size and alignment requirements >> + * suggested by the handle pointer type, and initialize its handle. >> + * >> + * The values of the lcore variable are initialized to zero. >> + */ >> +#define RTE_LCORE_VAR_ALLOC(handle) \ >> + RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, sizeof(*(handle)), \ >> + alignof(typeof(*(handle)))) >> + >> +/** >> + * Allocate an explicitly-sized, explicitly-aligned lcore variable by >> + * means of a @ref RTE_INIT constructor. >> + * >> + * The values of the lcore variable are initialized to zero. >> + */ >> +#define RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, align) \ >> + RTE_INIT(rte_lcore_var_init_ ## name) \ >> + { \ >> + RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(name, size, align); \ >> + } >> + >> +/** >> + * Allocate an explicitly-sized lcore variable by means of a @ref >> + * RTE_INIT constructor. >> + * >> + * The values of the lcore variable are initialized to zero. >> + */ >> +#define RTE_LCORE_VAR_INIT_SIZE(name, size) \ >> + RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, 0) >> + >> +/** >> + * Allocate an lcore variable by means of a @ref RTE_INIT constructor. >> + * >> + * The values of the lcore variable are initialized to zero. >> + */ >> +#define RTE_LCORE_VAR_INIT(name) \ >> + RTE_INIT(rte_lcore_var_init_ ## name) \ >> + { \ >> + RTE_LCORE_VAR_ALLOC(name); \ >> + } >> + >> +/** >> + * Get void pointer to lcore variable instance with the specified >> + * lcore id. >> + * >> + * @param lcore_id >> + * The lcore id specifying which of the @c RTE_MAX_LCORE value >> + * instances should be accessed. The lcore id need not be valid >> + * (e.g., may be @ref LCORE_ID_ANY), but in such a case, the pointer >> + * is also not valid (and thus should not be dereferenced). >> + * @param handle >> + * The lcore variable handle. >> + */ >> +static inline void * >> +rte_lcore_var_lcore_ptr(unsigned int lcore_id, void *handle) >> +{ >> + return RTE_PTR_ADD(handle, lcore_id * RTE_MAX_LCORE_VAR); >> +} >> + >> +/** >> + * Get pointer to lcore variable instance with the specified lcore id. >> + * >> + * @param lcore_id >> + * The lcore id specifying which of the @c RTE_MAX_LCORE value >> + * instances should be accessed. The lcore id need not be valid >> + * (e.g., may be @ref LCORE_ID_ANY), but in such a case, the pointer >> + * is also not valid (and thus should not be dereferenced). >> + * @param handle >> + * The lcore variable handle. >> + */ >> +#define RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle) \ >> + ((typeof(handle))rte_lcore_var_lcore_ptr(lcore_id, handle)) >> + >> +/** >> + * Get pointer to lcore variable instance of the current thread. >> + * >> + * May only be used by EAL threads and registered non-EAL threads. >> + */ >> +#define RTE_LCORE_VAR_VALUE(handle) \ >> + RTE_LCORE_VAR_LCORE_VALUE(rte_lcore_id(), handle) > > Would it make sense to check that rte_lcore_id() != LCORE_ID_ANY? > After all if people do not want this extra check, they can probably use > RTE_LCORE_VAR_LCORE_VALUE(rte_lcore_id(), handle) > explicitly. > It would make sense, if it was an RTE_ASSERT(). Otherwise, I don't think so. Attempting to gracefully handle API violations is bad practice, imo. >> + >> +/** >> + * Iterate over each lcore id's value for an lcore variable. >> + * >> + * @param value >> + * A pointer successively set to point to lcore variable value >> + * corresponding to every lcore id (up to @c RTE_MAX_LCORE). >> + * @param handle >> + * The lcore variable handle. >> + */ >> +#define RTE_LCORE_VAR_FOREACH_VALUE(value, handle) \ >> + for (unsigned int lcore_id = \ >> + (((value) = RTE_LCORE_VAR_LCORE_VALUE(0, handle)), 0); \ >> + lcore_id < RTE_MAX_LCORE; \ >> + lcore_id++, (value) = RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle)) > > Might be a bit better (and safer) to make lcore_id a macro parameter? > I.E.: > define RTE_LCORE_VAR_FOREACH_VALUE(value, handle, lcore_id) \ > for ((lcore_id) = ... > Why?