Subject: RE: [RFC v3 1/6] eal: add static per-lcore memory allocation facility
Date: Tue, 20 Feb 2024 14:37:04 +0100
From: Morten Brørup
To: "Bruce Richardson", Mattias Rönnblom
Cc: Mattias Rönnblom, , "Stephen Hemminger"
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35E9F236@smartserver.smartshare.dk>
References: <20240219094036.485727-2-mattias.ronnblom@ericsson.com>
 <20240220084908.488252-1-mattias.ronnblom@ericsson.com>
 <20240220084908.488252-2-mattias.ronnblom@ericsson.com>
 <68c8f01c-d404-4b63-adca-13b560c95df8@lysator.liu.se>
List-Id: DPDK patches and discussions

> From: Bruce Richardson
> [mailto:bruce.richardson@intel.com]
> Sent: Tuesday, 20 February 2024 12.39
>
> On Tue, Feb 20, 2024 at 11:47:14AM +0100, Mattias Rönnblom wrote:
> > On 2024-02-20 10:11, Bruce Richardson wrote:
> > > On Tue, Feb 20, 2024 at 09:49:03AM +0100, Mattias Rönnblom wrote:
> > > > Introduce DPDK per-lcore id variables, or lcore variables for short.
> > > >
> > > > An lcore variable has one value for every current and future
> > > > lcore id-equipped thread.
> > > >
> > > > The primary use case is for statically allocating small chunks of
> > > > often-used data, which is related logically, but where there are
> > > > performance benefits to reap from having updates being local to an
> > > > lcore.
> > > >
> > > > Lcore variables are similar to thread-local storage (TLS, e.g., C11
> > > > _Thread_local), but decouple the values' lifetime from that of the
> > > > threads.
>
> >
> > > > +/*
> > > > + * Avoid using offset zero, since it would result in a NULL-value
> > > > + * "handle" (offset) pointer, which in principle and per the API
> > > > + * definition shouldn't be an issue, but may confuse some tools and
> > > > + * users.
> > > > + */
> > > > +#define INITIAL_OFFSET 1
> > > > +
> > > > +char rte_lcore_var[RTE_MAX_LCORE][RTE_MAX_LCORE_VAR] __rte_cache_aligned;
> > > > +
> > >
> > > While I like the idea of improved handling for per-core variables, my
> > > main concern with this set is this definition here, which adds yet
> > > another dependency on the compile-time defined RTE_MAX_LCORE value.
> >
> > Lcore variables replace one RTE_MAX_LCORE-dependent pattern with another.
> >
> > You could even argue the dependency on RTE_MAX_LCORE is reduced with
> > lcore variables, if you look at where/in how many places in the code
> > base this macro is being used.
> > Centralizing per-lcore data management may also provide some
> > opportunity in the future for extending the API to cope with some more
> > dynamic RTE_MAX_LCORE variant. Not without ABI breakage of course, but
> > we are never going to change anything related to RTE_MAX_LCORE without
> > breaking the ABI, since this constant is everywhere, including compiled
> > into the application itself.
>
> Yep, that is true if it's widely used.
>
> > > I believe we already have an issue with this #define where it's
> > > impossible to come up with a single value that works for all, or
> > > nearly all, cases. The current default is still 128, yet DPDK needs
> > > to support systems where the number of cores is well into the
> > > hundreds, requiring workarounds of core mappings or customized builds
> > > of DPDK. Upping the value fixes those issues at the cost of a memory
> > > footprint explosion for smaller systems.
> >
> > I agree this is an issue.
> >
> > RTE_MAX_LCORE also needs to be sized to accommodate not only all cores
> > used, but the sum of all EAL threads and registered non-EAL threads.
> >
> > So, there is no reliable way to discover what RTE_MAX_LCORE should be
> > on a particular piece of hardware, since the actual number of lcore ids
> > needed is up to the application.
> >
> > Why is the default set so low? Linux has NR_CPUS, which serves the same
> > purpose, and which is set to 4096 by default, if I recall correctly.
> > Shouldn't we at least be able to increase it to 256?

I recall a recent techboard meeting where the default was discussed. It was agreed so low because it suffices for the vast majority of hardware out there, and applications for bigger platforms can be expected to build DPDK with a different configuration themselves. And as Bruce also mentions, it's a tradeoff against memory consumption.

>
> The default is so low because of the mempool caches.
> These are an array of
> buffer pointers with 512 (IIRC) entries per core, up to RTE_MAX_LCORE.

The decision was made under time pressure, based on narrow guesstimates rather than a broader memory consumption analysis.

If we really cared about default memory consumption, we should reduce the default RTE_MAX_QUEUES_PER_PORT from 1024 too. It has quite an effect.

Hard data about which build-time configuration parameters have the biggest effect on memory consumption would be extremely useful for tweaking the parameters for resource-limited hardware.

It's a mix of static and dynamic allocation, so it's not obvious which scalable data structures consume the most memory.

> > > I'm therefore nervous about putting more dependencies on this value,
> > > when I feel we should be moving away from its use, to allow more
> > > runtime configurability of cores.
> >
> > What more specifically do you have in mind?
>
> I don't think having a dynamically scaling RTE_MAX_LCORE is feasible,
> but what I would like to see is a runtime specified value. For example,
> you could run DPDK with the EAL parameter "--max-lcores=1024" for large
> systems or "--max-lcores=32" for small ones. That would then be used at
> init time to scale all internal data structures appropriately.

I agree 100% that a better long-term solution should be on the general road map. Memory is a precious resource, but few seem to care about it.

A mix could provide an easy migration path: keep RTE_MAX_LCORE as the hard upper limit (and default value) for a runtime-specified maximum ("rte_max_lcores").

With this, the goal would be for modules with very small data sets to continue using RTE_MAX_LCORE fixed-size arrays, and for modules with larger data sets to migrate to rte_max_lcores dynamically sized arrays.

I am opposed to blocking a new patch series only because it adds another RTE_MAX_LCORE sized array.
We already have plenty of those. The new array can be migrated to a dynamically sized one at a later time, just like the other modules with RTE_MAX_LCORE sized arrays.

Perhaps "fixing" an existing module would free up more memory than fixing this one. Let's spend development resources where they have the biggest impact.