From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 8845C45B09; Thu, 10 Oct 2024 23:24:23 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 5467C4060F; Thu, 10 Oct 2024 23:24:23 +0200 (CEST) Received: from fhigh-a3-smtp.messagingengine.com (fhigh-a3-smtp.messagingengine.com [103.168.172.154]) by mails.dpdk.org (Postfix) with ESMTP id 491FE402E3 for ; Thu, 10 Oct 2024 23:24:22 +0200 (CEST) Received: from phl-compute-06.internal (phl-compute-06.phl.internal [10.202.2.46]) by mailfhigh.phl.internal (Postfix) with ESMTP id BE5ED1140100; Thu, 10 Oct 2024 17:24:21 -0400 (EDT) Received: from phl-mailfrontend-01 ([10.202.2.162]) by phl-compute-06.internal (MEProxy); Thu, 10 Oct 2024 17:24:21 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=monjalon.net; h= cc:cc:content-transfer-encoding:content-type:content-type:date :date:from:from:in-reply-to:in-reply-to:message-id:mime-version :references:reply-to:subject:subject:to:to; s=fm2; t=1728595461; x=1728681861; bh=RKYdMqZ0dxyD1S8Kyxd9vE2juaCE2yI7HOjFMhDPcG0=; b= ixHKJxhcoECRABR9UWMRINiTGwsIzfjzfwQF9/pISVbXxMb9OdjWJYf4u/Kl7Sci H0l7eVhB9Z1kgTyMZhA87uw5GxGvsDPHi7Xy4eNh1Vy8Jj0aR6Lyp03FoBqjNte3 i6YzxW80oyK8znduvRJhoYTKtKBleill1oz+ab6fKnvFehVBYqeYpbWLki2fZAdT q802TgKI7NHyf4t95rhTUmoFES3qFqsjKO7idjXbBUyL7L2kZ2bYX0PZmLivCiCk +ejxn9+TMfexxwl/xWYUGoSAyB9w0cyxebe3zIt8XvLeIfQgfuQJs3CCtx4d9Hs2 pujhMH26TPsBmtrGWL7yFA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-transfer-encoding :content-type:content-type:date:date:feedback-id:feedback-id :from:from:in-reply-to:in-reply-to:message-id:mime-version :references:reply-to:subject:subject:to:to:x-me-proxy:x-me-proxy :x-me-sender:x-me-sender:x-sasl-enc; s=fm2; t=1728595461; x= 1728681861; bh=RKYdMqZ0dxyD1S8Kyxd9vE2juaCE2yI7HOjFMhDPcG0=; b=E rEObTinNx7gNQPllm6IKZf3ogS/uUNw7it6BOx4bgg5v54BY4NofoBoA3OCKQu8K /jiPciwzmp7qEt21dWuNqK8Cym/DDfd8xSAeiSc4012EsYva89WctRtXnFYIV+fX KiO8XyFmGr86mxQO2HshdOuQGXCwjcZaxK2zZi+7A2RPdDZFrLc+l9aYQwPsB2dC CToiRcpdPjNqn40kt30ZplgGfnPZKONwpiN6mOcz2zVLJ0/hYW/9nMMNxR4HBsxl bN1ndxD52v391w9BQwHJfgNDAw4Sfs79SkeyBrAoJSFQxyEAwlRBdtt4YzIDsLUX i7AxP7k5n9WeX8LGEu5dQ== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeeftddrvdefiedgfeehucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdggtfgfnhhsuhgsshgtrhhisggvpdfu rfetoffkrfgpnffqhgenuceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnh htshculddquddttddmnecujfgurhephffvvefufffkjghfggfgtgesthhqredttddtjeen ucfhrhhomhepvfhhohhmrghsucfoohhnjhgrlhhonhcuoehthhhomhgrshesmhhonhhjrg hlohhnrdhnvghtqeenucggtffrrghtthgvrhhnpeegtddtleejjeegffekkeektdejvedt heevtdekiedvueeuvdeiuddvleevjeeujeenucevlhhushhtvghrufhiiigvpedtnecurf grrhgrmhepmhgrihhlfhhrohhmpehthhhomhgrshesmhhonhhjrghlohhnrdhnvghtpdhn sggprhgtphhtthhopeduvddpmhhouggvpehsmhhtphhouhhtpdhrtghpthhtohepmhgrth htihgrshdrrhhonhhnsghlohhmsegvrhhitghsshhonhdrtghomhdprhgtphhtthhopegu vghvseguphgukhdrohhrghdprhgtphhtthhopehhohhfohhrsheslhihshgrthhorhdrlh hiuhdrshgvpdhrtghpthhtohepmhgssehsmhgrrhhtshhhrghrvghshihsthgvmhhsrdgt ohhmpdhrtghpthhtohepshhtvghphhgvnhesnhgvthifohhrkhhplhhumhgsvghrrdhorh hgpdhrtghpthhtohepkhhonhhsthgrnhhtihhnrdhvrdgrnhgrnhihvghvseihrghnuggv gidrrhhupdhrtghpthhtohepuggrvhhiugdrmhgrrhgthhgrnhgusehrvgguhhgrthdrtg homhdprhgtphhtthhopehjvghrihhnjhesmhgrrhhvvghllhdrtghomhdprhgtphhtthho pehluhhkrgdrjhgrnhhkohhvihgtsegvrhhitghsshhonhdrtghomh X-ME-Proxy: Feedback-ID: i47234305:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Thu, 10 Oct 2024 17:24:19 -0400 (EDT) From: Thomas Monjalon To: Mattias =?UTF-8?B?UsO2bm5ibG9t?= Cc: dev@dpdk.org, hofors@lysator.liu.se, Morten =?UTF-8?B?QnLDuHJ1cA==?= , Stephen Hemminger , Konstantin Ananyev , David Marchand , Jerin Jacob , Luka Jankovic , Mattias =?UTF-8?B?UsO2bm5ibG9t?= , Konstantin Ananyev , Chengwen Feng Subject: Re: [PATCH v9 1/7] eal: add static per-lcore memory allocation facility Date: Thu, 10 Oct 2024 23:24:17 +0200 Message-ID: <1829355.yIU609i1g2@thomas> In-Reply-To: <20241010142205.813134-2-mattias.ronnblom@ericsson.com> References: <20241010141349.813088-2-mattias.ronnblom@ericsson.com> <20241010142205.813134-1-mattias.ronnblom@ericsson.com> <20241010142205.813134-2-mattias.ronnblom@ericsson.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Hello, This new feature looks to bring something interesting to DPDK. There was a good amount of discussion and review, and there is a real effort of documentation. However, some choices done in this implementation were not explained or advertised enough in the documentation, in my opinion. I think the first thing to add is an explanation of the memory layout. Maybe that a SVG drawing would help to show how it is stored. We also need to explain why it is not using rte_malloc. Also please could you re-read the doc and comments in detail? I think some words are missing and there are typos. While at it, please allow for easy update of the text by starting each sentence on a new line. Breaking lines logically is better for future patches. One more advice: avoid very long sentences. Do you have benchmarks results of the modules using such variables (power, random, service)? It would be interesting to compare time efficiency and memory usage before/after, with different number of threads. Adding more detailed comments below. 10/10/2024 16:21, Mattias R=C3=B6nnblom: > Introduce DPDK per-lcore id variables, or lcore variables for short. >=20 > An lcore variable has one value for every current and future lcore > id-equipped thread. I find it difficult to read "lcore id-equipped thread". Can we just say "DPDK thread"? > The primary use case is for statically allocating > small, frequently-accessed data structures, for which one instance > should exist for each lcore. >=20 > Lcore variables are similar to thread-local storage (TLS, e.g., C11 > _Thread_local), but decoupling the values' life time with that of the > threads. In which situation we need values of a dead thread? [...] > +An application, a DPDK library or PMD may keep opt to keep per-thread > +state. I don't understand this sentence. > + > +Per-thread data may be maintained using either *lcore variables* > +(``rte_lcore_var.h``), *thread-local storage (TLS)* > +(``rte_per_lcore.h``), or a static array of ``RTE_MAX_LCORE`` > +elements, index by ``rte_lcore_id()``. These methods allows for index*ed* > +per-lcore data to be a largely module-internal affair, and not > +directly visible in its API. Another possibility is to have deal *to* deal ? > +explicitly with per-thread aspects in the API (e.g., the ports of the > +Eventdev API). > + > +Lcore varibles are suitable for small object statically allocated at vari*a*bles > +the time of module or application initialization. An lcore variable > +take on one value for each lcore id-equipped thread (i.e., for EAL > +threads and registered non-EAL threads, in total ``RTE_MAX_LCORE`` > +instances). The lifetime of lcore variables are detached from that of > +the owning threads, and may thus be initialized prior to the owner > +having been created. > + > +Variables with thread-local storage are allocated at the time of > +thread creation, and exists until the thread terminates, for every > +thread in the process. Only very small object should be allocated in > +TLS, since large TLS objects significantly slows down thread creation > +and may needlessly increase memory footprint for application that make > +extensive use of unregistered threads. I don't understand the relation with non-DPDK threads. > + > +A common but now largely obsolete DPDK pattern is to use a static > +array sized according to the maximum number of lcore id-equipped > +threads (i.e., with ``RTE_MAX_LCORE`` elements). To avoid *false > +sharing*, each element must both cache-aligned, and include a must *be* include*s* > +``RTE_CACHE_GUARD``. Such extensive use of padding cause internal cause*s* > +fragmentation (i.e., unused space) and lower cache hit rates. > + > +For more discussions on per-lcore state, see the ``rte_lcore_var.h`` > +API documentation. [...] > +#define LCORE_BUFFER_SIZE (RTE_MAX_LCORE_VAR * RTE_MAX_LCORE) With #define RTE_MAX_LCORE_VAR 1048576, LCORE_BUFFER_SIZE can be 100MB, right? > + > +static void *lcore_buffer; It is the last buffer for all lcores. The name suggests it is one single buffer per lcore. What about "last_buffer" or "current_buffer"? > +static size_t offset =3D RTE_MAX_LCORE_VAR; A comment may be useful for this value: it triggers the first alloc? > + > +static void * > +lcore_var_alloc(size_t size, size_t align) > +{ > + void *handle; > + unsigned int lcore_id; > + void *value; > + > + offset =3D RTE_ALIGN_CEIL(offset, align); > + > + if (offset + size > RTE_MAX_LCORE_VAR) { > +#ifdef RTE_EXEC_ENV_WINDOWS > + lcore_buffer =3D _aligned_malloc(LCORE_BUFFER_SIZE, > + RTE_CACHE_LINE_SIZE); > +#else > + lcore_buffer =3D aligned_alloc(RTE_CACHE_LINE_SIZE, > + LCORE_BUFFER_SIZE); > +#endif > + RTE_VERIFY(lcore_buffer !=3D NULL); Please no panic in a lib. You can return NULL. > + > + offset =3D 0; > + } > + > + handle =3D RTE_PTR_ADD(lcore_buffer, offset); > + > + offset +=3D size; > + > + RTE_LCORE_VAR_FOREACH_VALUE(lcore_id, value, handle) > + memset(value, 0, size); > + > + EAL_LOG(DEBUG, "Allocated %"PRIuPTR" bytes of per-lcore data with a " > + "%"PRIuPTR"-byte alignment", size, align); > + > + return handle; > +} [...] > +#ifndef _RTE_LCORE_VAR_H_ > +#define _RTE_LCORE_VAR_H_ Really we don't need the first and last underscores, but it's a detail. > + > +/** > + * @file > + * > + * RTE Lcore variables Please don't say "RTE", it is just a prefix. You can replace it with "DPDK" if you really want to be specific. > + * > + * This API provides a mechanism to create and access per-lcore id > + * variables in a space- and cycle-efficient manner. > + * > + * A per-lcore id variable (or lcore variable for short) has one value > + * for each EAL thread and registered non-EAL thread. There is one > + * instance for each current and future lcore id-equipped thread, with > + * a total of RTE_MAX_LCORE instances. The value of an lcore variable > + * for a particular lcore id is independent from other values (for > + * other lcore ids) within the same lcore variable. > + * > + * In order to access the values of an lcore variable, a handle is > + * used. The type of the handle is a pointer to the value's type > + * (e.g., for an @c uint32_t lcore variable, the handle is a > + * uint32_t *. The handle type is used to inform the > + * access macros the type of the values. A handle may be passed > + * between modules and threads just like any pointer, but its value > + * must be treated as a an opaque identifier. An allocated handle > + * never has the value NULL. Most of the explanations here would be better hosted in the prog guide. The Doxygen API is better suited for short and direct explanations. > + * > + * @b Creation > + * > + * An lcore variable is created in two steps: > + * 1. Define an lcore variable handle by using @ref RTE_LCORE_VAR_HANDL= E. > + * 2. Allocate lcore variable storage and initialize the handle with > + * a unique identifier by @ref RTE_LCORE_VAR_ALLOC or > + * @ref RTE_LCORE_VAR_INIT. Allocation generally occurs the time of *at* the time > + * module initialization, but may be done at any time. You mean it does not depend on EAL initialization? > + * > + * An lcore variable is not tied to the owning thread's lifetime. It's > + * available for use by any thread immediately after having been > + * allocated, and continues to be available throughout the lifetime of > + * the EAL. > + * > + * Lcore variables cannot and need not be freed. I'm curious about that. If EAL is closed, and the application continues its life, then we want all this memory to be cleaned as well. Do you know rte_eal_cleanup()? > + * > + * @b Access > + * > + * The value of any lcore variable for any lcore id may be accessed > + * from any thread (including unregistered threads), but it should > + * only be *frequently* read from or written to by the owner. Would be interesting to explain why. > + * > + * Values of the same lcore variable but owned by two different lcore > + * ids may be frequently read or written by the owners without risking > + * false sharing. Again you could explain why if you explained the storage layout. What is the minimum object size to avoid false sharing? > + * > + * An appropriate synchronization mechanism (e.g., atomic loads and > + * stores) should employed to assure there are no data races between should *be* > + * the owning thread and any non-owner threads accessing the same > + * lcore variable instance. > + * > + * The value of the lcore variable for a particular lcore id is > + * accessed using @ref RTE_LCORE_VAR_LCORE_VALUE. > + * > + * A common pattern is for an EAL thread or a registered non-EAL > + * thread to access its own lcore variable value. For this purpose, a > + * short-hand exists in the form of @ref RTE_LCORE_VAR_VALUE. shorthand without hyphen? > + * > + * Although the handle (as defined by @ref RTE_LCORE_VAR_HANDLE) is a > + * pointer with the same type as the value, it may not be directly > + * dereferenced and must be treated as an opaque identifier. > + * > + * Lcore variable handles and value pointers may be freely passed > + * between different threads. > + * > + * @b Storage > + * > + * An lcore variable's values may by of a primitive type like @c int, > + * but would more typically be a @c struct. > + * > + * The lcore variable handle introduces a per-variable (not > + * per-value/per-lcore id) overhead of @c sizeof(void *) bytes, so > + * there are some memory footprint gains to be made by organizing all > + * per-lcore id data for a particular module as one lcore variable > + * (e.g., as a struct). > + * > + * An application may choose to define an lcore variable handle, which > + * it then it goes on to never allocate. I don't understand this sentence. > + * > + * The size of an lcore variable's value must be less than the DPDK size of variable, not size of value > + * build-time constant @c RTE_MAX_LCORE_VAR. > + * > + * The lcore variable are stored in a series of lcore buffers, which variable*s* > + * are allocated from the libc heap. Heap allocation failures are > + * treated as fatal. Why not handling as an error, so the app has a chance to cleanup before cra= sh? > + * > + * Lcore variables should generally *not* be @ref __rte_cache_aligned > + * and need *not* include a @ref RTE_CACHE_GUARD field, since the use > + * of these constructs are designed to avoid false sharing. In the > + * case of an lcore variable instance, the thread most recently > + * accessing nearby data structures should almost-always be the lcore > + * variables' owner. Adding padding will increase the effective memory > + * working set size, potentially reducing performance. > + * > + * Lcore variable values take on an initial value of zero. > + * > + * @b Example [...] > + * @b Alternatives > + * > + * Lcore variables are designed to replace a pattern exemplified below: Would be better in the introduction (in the prog guide). > + * @code{.c} > + * struct __rte_cache_aligned foo_lcore_state { > + * int a; > + * long b; > + * RTE_CACHE_GUARD; > + * }; > + * > + * static struct foo_lcore_state lcore_states[RTE_MAX_LCORE]; > + * @endcode [...] > +/** > + * Define an lcore variable handle. > + * > + * This macro defines a variable which is used as a handle to access > + * the various instances of a per-lcore id variable. > + * > + * The aim with this macro is to make clear at the point of This long sentence may be shortened. > + * declaration that this is an lcore handle, rather than a regular > + * pointer. > + * > + * Add @b static as a prefix in case the lcore variable is only to be > + * accessed from a particular translation unit. > + */ > +#define RTE_LCORE_VAR_HANDLE(type, name) \ > + RTE_LCORE_VAR_HANDLE_TYPE(type) name > + > +/** > + * Allocate space for an lcore variable, and initialize its handle. > + * > + * The values of the lcore variable are initialized to zero. The lcore variables are initialized to zero, not the values. Don't you mention 0 in align? > + */ > +#define RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, align) \ > + handle =3D rte_lcore_var_alloc(size, align) > + > +/** > + * Allocate space for an lcore variable, and initialize its handle, > + * with values aligned for any type of object. > + * > + * The values of the lcore variable are initialized to zero. > + */ > +#define RTE_LCORE_VAR_ALLOC_SIZE(handle, size) \ > + RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, 0) > + > +/** > + * Allocate space for an lcore variable of the size and alignment requir= ements > + * suggested by the handle pointer type, and initialize its handle. > + * > + * The values of the lcore variable are initialized to zero. > + */ > +#define RTE_LCORE_VAR_ALLOC(handle) \ > + RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, sizeof(*(handle)), \ > + alignof(typeof(*(handle)))) > + > +/** > + * Allocate an explicitly-sized, explicitly-aligned lcore variable by > + * means of a @ref RTE_INIT constructor. > + * > + * The values of the lcore variable are initialized to zero. > + */ > +#define RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, align) \ > + RTE_INIT(rte_lcore_var_init_ ## name) \ > + { \ > + RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(name, size, align); \ > + } > + > +/** > + * Allocate an explicitly-sized lcore variable by means of a @ref > + * RTE_INIT constructor. > + * > + * The values of the lcore variable are initialized to zero. > + */ > +#define RTE_LCORE_VAR_INIT_SIZE(name, size) \ > + RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, 0) > + > +/** > + * Allocate an lcore variable by means of a @ref RTE_INIT constructor. > + * > + * The values of the lcore variable are initialized to zero. > + */ > +#define RTE_LCORE_VAR_INIT(name) \ > + RTE_INIT(rte_lcore_var_init_ ## name) \ > + { \ > + RTE_LCORE_VAR_ALLOC(name); \ > + } I don't get the need for RTE_INIT macros. It does not cover RTE_INIT_PRIO and anyway another RTE_INIT is probably already there in the module. > + > +/** > + * Get void pointer to lcore variable instance with the specified > + * lcore id. > + * > + * @param lcore_id > + * The lcore id specifying which of the @c RTE_MAX_LCORE value > + * instances should be accessed. The lcore id need not be valid > + * (e.g., may be @ref LCORE_ID_ANY), but in such a case, the pointer > + * is also not valid (and thus should not be dereferenced). > + * @param handle > + * The lcore variable handle. handle pointer > + */ > +static inline void * > +rte_lcore_var_lcore_ptr(unsigned int lcore_id, void *handle) What a long name! What about rte_lcore_var() ? > +{ > + return RTE_PTR_ADD(handle, lcore_id * RTE_MAX_LCORE_VAR); > +} > + > +/** > + * Get pointer to lcore variable instance with the specified lcore id. Same description as the function above. > + * > + * @param lcore_id > + * The lcore id specifying which of the @c RTE_MAX_LCORE value > + * instances should be accessed. The lcore id need not be valid > + * (e.g., may be @ref LCORE_ID_ANY), but in such a case, the pointer > + * is also not valid (and thus should not be dereferenced). > + * @param handle > + * The lcore variable handle. > + */ > +#define RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle) \ > + ((typeof(handle))rte_lcore_var_lcore_ptr(lcore_id, handle)) > + > +/** > + * Get pointer to lcore variable instance of the current thread. > + * > + * May only be used by EAL threads and registered non-EAL threads. > + */ > +#define RTE_LCORE_VAR_VALUE(handle) \ RTE_LCORE_VAR_LOCAL? > + RTE_LCORE_VAR_LCORE_VALUE(rte_lcore_id(), handle) > + > +/** > + * Iterate over each lcore id's value for an lcore variable. > + * > + * @param lcore_id > + * An unsigned int variable successively set to the > + * lcore id of every valid lcore id (up to @c RTE_MAX_LCORE). > + * @param value > + * A pointer variable successively set to point to lcore variable > + * value instance of the current lcore id being processed. > + * @param handle > + * The lcore variable handle. > + */ > +#define RTE_LCORE_VAR_FOREACH_VALUE(lcore_id, value, handle) \ RTE_LCORE_VAR_FOREACH? > + for ((lcore_id) =3D \ > + (((value) =3D RTE_LCORE_VAR_LCORE_VALUE(0, handle)), 0); \ > + (lcore_id) < RTE_MAX_LCORE; \ > + (lcore_id)++, (value) =3D RTE_LCORE_VAR_LCORE_VALUE(lcore_id, \ > + handle)) > + > +/** > + * Allocate space in the per-lcore id buffers for an lcore variable. > + * > + * The pointer returned is only an opaque identifer of the variable. To > + * get an actual pointer to a particular instance of the variable use > + * @ref RTE_LCORE_VAR_VALUE or @ref RTE_LCORE_VAR_LCORE_VALUE. > + * > + * The lcore variable values' memory is set to zero. > + * > + * The allocation is always successful, barring a fatal exhaustion of > + * the per-lcore id buffer space. > + * > + * rte_lcore_var_alloc() is not multi-thread safe. > + * > + * @param size > + * The size (in bytes) of the variable's per-lcore id value. Must be >= 0. > + * @param align > + * If 0, the values will be suitably aligned for any kind of type > + * (i.e., alignof(max_align_t)). Otherwise, the values will be aligned > + * on a multiple of *align*, which must be a power of 2 and equal or > + * less than @c RTE_CACHE_LINE_SIZE. > + * @return > + * The variable's handle, stored in a void pointer value. The value > + * is always non-NULL. > + */ > +__rte_experimental > +void * > +rte_lcore_var_alloc(size_t size, size_t align); [...] > --- a/lib/eal/version.map > +++ b/lib/eal/version.map > @@ -396,6 +396,8 @@ EXPERIMENTAL { > =20 > # added in 24.03 > rte_vfio_get_device_info; # WINDOWS_NO_EXPORT > + # added in 24.11 > + rte_lcore_var_alloc;