From: "Morten Brørup" <mb@smartsharesystems.com>
To: "Bruce Richardson" <bruce.richardson@intel.com>
Cc: "David Marchand" <david.marchand@redhat.com>, <dev@dpdk.org>
Subject: RE: [PATCH v2 0/3] allow easier use of high lcore-ids
Date: Mon, 7 Apr 2025 14:25:46 +0200 [thread overview]
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35E9FB9E@smartserver.smartshare.dk> (raw)
In-Reply-To: <Z_O9Uq_Mg86yjKzi@bricha3-mobl1.ger.corp.intel.com>
> From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> Sent: Monday, 7 April 2025 13.56
>
> On Mon, Apr 07, 2025 at 01:32:59PM +0200, Morten Brørup wrote:
> > > From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> > > Sent: Monday, 7 April 2025 12.41
> > >
> > > On Mon, Apr 07, 2025 at 12:15:13PM +0200, Morten Brørup wrote:
> > > > > From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> Sent:
> > > > > Monday, 7 April 2025 11.49
> > > > >
> > > > > On Mon, Apr 07, 2025 at 09:04:05AM +0200, David Marchand wrote:
> > > > > > Hello Bruce,
> > > > > >
> > > > > > On Tue, Apr 1, 2025 at 4:08 PM Bruce Richardson
> > > > > > <bruce.richardson@intel.com> wrote:
> > > > > > >
> > > > > > > On Mon, Mar 24, 2025 at 05:30:26PM +0000, Bruce Richardson
> > > wrote:
> > > > > > > > Traditionally, DPDK has had a direct mapping of internal
> > > lcore-
> > > > > ids, to
> > > > > > > > the actual core numbers in use. With higher core count
> > > servers
> > > > > becoming
> > > > > > > > more prevalent the issue becomes one of increasing memory
> > > > > footprint when
> > > > > > > > using such a scheme, due to the need to have all arrays
> > > > > dimensioned for
> > > > > > > > all cores on the system, whether or not those cores are
> in
> > > use by
> > > > > the
> > > > > > > > app.
> > > > > > > >
> > > > > > > > Therefore, the decision was made in the past to not
> expand
> > > the
> > > > > > > > build-time RTE_MAX_LCORE value beyond 128. Instead, it
> was
> > > > > recommended
> > > > > > > > that users use the "--lcores" EAL parameter to take the
> high-
> > > > > numbered
> > > > > > > > cores they wish to use and map them to lcore-ids within
> the 0
> > > -
> > > > > 128
> > > > > > > > range. While this works, this is a little clunky as it
> means
> > > that
> > > > > > > > instead of just passing, for example, "-l 130-139", the
> user
> > > must
> > > > > > > > instead pass "--lcores 0@130,1@131,2@132,3@133,...."
> > > > > > > >
> > > > > > > > This patchset attempts to simplify the situation by
> adding a
> > > new
> > > > > flag to
> > > > > > > > do this mapping automatically. To use cores 130-139 and
> map
> > > them
> > > > > to ids
> > > > > > > > 0-9 internally, the EAL args now become: "-l 130-139 --
> map-
> > > lcore-
> > > > > ids",
> > > > > > > > or using the shorter "-M" version of the flag: "-Ml 130-
> 139".
> > > > > > > >
> > > > > > > > Adding this new parameter required some rework of the
> > > existing
> > > > > arg
> > > > > > > > parsing code, because in current DPDK the args are parsed
> and
> > > > > checked in
> > > > > > > > the order they appear on the commandline. This means that
> > > using
> > > > > the
> > > > > > > > example above, the core parameter 130-139 will be
> rejected
> > > > > immediately
> > > > > > > > before the "map-lcore-ids" parameter is seen. To work
> around
> > > > > this, the
> > > > > > > > core (and service core) parameters are not parsed when
> seen,
> > > > > instead
> > > > > > > > they are only saved off and parsed after all arguments
> are
> > > > > parsed. The
> > > > > > > > "-l" and "-c" parameters are converted into "--lcores"
> > > arguments,
> > > > > so all
> > > > > > > > assigning of lcore ids is done there in all cases.
> > > > > > > >
> > > > > > > > RFC->v2: * converted printf to DEBUG log * added "-M" as
> > > shorter
> > > > > > > > version of flag * added documentation * renamed internal
> API
> > > that
> > > > > > > > was changed to avoid any potential
> > > > > hidden
> > > > > > > > runtime issues.
> > > > > > > >
> > > > > > > > Bruce Richardson (3): eal: centralize core parameter
> parsing
> > > eal:
> > > > > > > > convert core masks and lists to core sets eal: allow
> > > automatic
> > > > > > > > mapping of high lcore ids
> > > > > > > >
> > > > > > > Ping for review.
> > > > > > >
> > > > > > > At a high level, does this feature seem useful to users?
> > > > > >
> > > > > > This seems useful, though I am not I would touch the existing
> > > > > options.
> > > > > > I would have gone with a simple -L option (taking the same
> kind
> > > of
> > > > > > input than -l but with new behavior), and not combine a flag
> with
> > > > > > existing options.
> > > > > >
> > > > >
> > > > > That would be an easier patchset to do up. However, it would
> then
> > > mean
> > > > > that we have no less than 4 different ways to specify the cores
> to
> > > use:
> > > > > "- c", "-l", "-L", "--lcores" - and therefore need 4 different
> sets
> > > of
> > > > > parsing options for them, and more checks to ensure we have
> only
> > > one of
> > > > > the 4 specified in any run. That's why I chose the modifier
> option,
> > > and
> > > > > to try and consolidate the core setup a bit.
> > > > >
> > > > > However, if having a completely new option is preferred, I am
> happy
> > > > > enough to do up a different patchset for that.
> > > > >
> > > > > > I scanned through the series, not much to say. Maybe add a
> unit
> > > test
> > > > > > for new cmdline option.
> > > > > >
> > > > > Sure. Once it's decided what approach (if any) to take, I'll do
> up
> > > a
> > > > > new patchset, taking into account any relevant feedback on this
> > > set.
> > > > >
> > > > > /Bruce
> > > >
> > > > Changing the EAL parameter parser to a "two pass parser" makes
> sense.
> > > I
> > > > think checking for existence of more than one lcore specification
> > > options
> > > > should suffice; we don't need to accept multiple lcore
> specification
> > > > options and check for conflicts.
> > > >
> > > > When remapping, do we need to support gaps in the "lcore"
> (logical
> > > cores)
> > > > array, e.g. for secondary processes, or can it be continuous from
> 0
> > > to
> > > > the number of specified lcores?
> > > >
> > > > And are new EAL parameters for this really beneficial? Doesn't
> e.g.
> > > "-l
> > > > 0-9@130-139,100@140" suffice?
> > > >
> > > Actually, I believe "0-9@130-139"[1] is not the same as
> > > "0@130,1@131,2@132,...". The latter affinities one thread to one
> core
> > > ten
> > > times over, while the former affinitizes 10 threads to 10 cores -
> > > leaving
> > > those threads free to move about within the 10 cores specified.
> >
> > Interesting. The documentation [GSG] isn't clear to me about this; a
> example there could help clarify.
> >
> > [GSG]:
> https://doc.dpdk.org/guides/linux_gsg/linux_eal_parameters.html#lcore-
> related-options
> >
>
> Yep, agreed.
>
> > If users are manually passing lcore parameters to the EAL, then I see
> why some sort of remapping shorthand is useful.
> > IMO, if the mappings are relatively exotic, it should be acceptable
> requiring an external script to build a long list of mapping parameters
> and then invoke the application with those script-generated EAL
> parameters.
> > This would reduce the scope to support relatively simple, common
> mappings.
> >
> > Could we expand the --lcores syntax to support common mappings?
> >
> > E.g. "0-9@130+" to do what I thought.
> > The lack of "()" treats the entries individually (not as a group),
> and the "+" indicates auto-increment.
> >
> > A more advanced example:
> > "0-9@(130-131)+", meaning lcore 0 gets cpus 130-131, lcore 1 gets
> cpus 132-133, etc.
> >
>
> My issues with the above syntax idea is:
> * I think it's overly complicating the lcores parameter adding in the
> support for the "+" symbol - especially in the advanced case example
> you
> provide. I worry about code maintainability here.
> * More significantly for me, I think it's also getting things backwards
> in
> that it is focusing more on the lcore ids visible to the app, rather
> than
> the physical cores to be used. For the example above of 0-9@130+,
> what I
> would expect is that the user is mainly thinking about the cores he
> wants
> to use 130-139, which are then to be mapped to lower ids. If we had
> the
> syntax reversed where the physical cores were first, I'd say it would
> make more sense, e.g. 130-139@0+
I 100 % agree on the syntax being backwards.
A good reason for introducing a new parameter, rather than expanding on "--lcores".
We should consider deprecating the old (backwards) syntax, so users don’t get confused about one EAL parameter being "backwards" of the other.
> * finally, as I found out last month, there are systems on which the
> cores
> are spread across numa-nodes on odd/even boundaries, so to have an
> app
> running on socket 0, you need to give the core ids as a list i.e.
> 0,2,4,6, and cannot use ranges. [This also reenforces the point above
> too, where again it's the internal ids we need to generalize, not the
> physical cpus]
For this, we could use "/2" (like in subnets), or "+2" as increment parameter.
>
> My thinking on the matter, and I'm happy to be corrected if I'm wrong
> here,
> is that end-users are unlikely to significantly care what the actual
> lcore
> ids are internally in the program, so long as they work and are unique.
Generally yes.
Note that we should allow the same physical CPU core to be assigned to multiple lcores...
If a CPU core is shared between multiple worker lcores, then it could be problematic, but with a mix of lcore roles, this might be handy for some applications. E.g. a virtual machine with only one CPU core could use that single CPU core as both main and worker lcore, or as main and service lcore.
Sharing an lcore between threads requires the developer to take special care of e.g. spinlocks, but the EAL parameter parser should not prohibit it. It might log a notice, though.
> What does matter is the physical cpus on which the code is to run.
> Therefore, my approach to this is to find the simplest possible
> solution
> whereby the user can just provide a list of cores and tell EAL to just
> map
> them to reasonable values. For the "reasonable" values, I would imagine
> that for the vast majority of cases starting "0" is what is wanted. For
> anything beyond that, we already have the existing --lcores syntax to
> be
> used.
Agree with all of the above. :-)
>
> My 2c.
>
> /Bruce
next prev parent reply other threads:[~2025-04-07 12:25 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-03-13 11:38 [RFC PATCH " Bruce Richardson
2025-03-13 11:38 ` [RFC PATCH 1/3] eal: centralize core parameter parsing Bruce Richardson
2025-03-13 11:38 ` [RFC PATCH 2/3] eal: convert core masks and lists to core sets Bruce Richardson
2025-03-13 11:38 ` [RFC PATCH 3/3] eal: allow automatic mapping of high lcore ids Bruce Richardson
2025-03-24 17:30 ` [PATCH v2 0/3] allow easier use of high lcore-ids Bruce Richardson
2025-03-24 17:30 ` [PATCH v2 1/3] eal: centralize core parameter parsing Bruce Richardson
2025-04-07 6:58 ` David Marchand
2025-03-24 17:30 ` [PATCH v2 2/3] eal: convert core masks and lists to core sets Bruce Richardson
2025-04-07 6:59 ` David Marchand
2025-03-24 17:30 ` [PATCH v2 3/3] eal: allow automatic mapping of high lcore ids Bruce Richardson
2025-04-01 14:06 ` [PATCH v2 0/3] allow easier use of high lcore-ids Bruce Richardson
2025-04-07 7:04 ` David Marchand
2025-04-07 9:48 ` Bruce Richardson
2025-04-07 10:15 ` Morten Brørup
2025-04-07 10:40 ` Bruce Richardson
2025-04-07 11:32 ` Morten Brørup
2025-04-07 11:56 ` Bruce Richardson
2025-04-07 12:25 ` Morten Brørup [this message]
2025-04-07 12:41 ` Bruce Richardson
2025-04-07 13:18 ` Morten Brørup
2025-04-07 13:24 ` Bruce Richardson
2025-04-07 15:14 ` Stephen Hemminger
2025-04-07 15:38 ` Bruce Richardson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=98CBD80474FA8B44BF855DF32C47DC35E9FB9E@smartserver.smartshare.dk \
--to=mb@smartsharesystems.com \
--cc=bruce.richardson@intel.com \
--cc=david.marchand@redhat.com \
--cc=dev@dpdk.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).