DPDK patches and discussions
 help / color / mirror / Atom feed
From: "Morten Brørup" <mb@smartsharesystems.com>
To: "Bruce Richardson" <bruce.richardson@intel.com>
Cc: "David Marchand" <david.marchand@redhat.com>, <dev@dpdk.org>
Subject: RE: [PATCH v2 0/3] allow easier use of high lcore-ids
Date: Mon, 7 Apr 2025 15:18:36 +0200	[thread overview]
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35E9FBA0@smartserver.smartshare.dk> (raw)
In-Reply-To: <Z_PIBLQIT3eu6Ab3@bricha3-mobl1.ger.corp.intel.com>

> From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> Sent: Monday, 7 April 2025 14.42
> 
> On Mon, Apr 07, 2025 at 02:25:46PM +0200, Morten Brørup wrote:
> > > From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> > > Sent: Monday, 7 April 2025 13.56
> > >
> > > On Mon, Apr 07, 2025 at 01:32:59PM +0200, Morten Brørup wrote:
> > > > > From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> > > > > Sent: Monday, 7 April 2025 12.41
> > > > >
> > > > > On Mon, Apr 07, 2025 at 12:15:13PM +0200, Morten Brørup wrote:
> > > > > > > From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> > > Sent:
> > > > > > > Monday, 7 April 2025 11.49
> > > > > > >
> > > > > > > On Mon, Apr 07, 2025 at 09:04:05AM +0200, David Marchand
> wrote:
> > > > > > > > Hello Bruce,
> > > > > > > >
> > > > > > > > On Tue, Apr 1, 2025 at 4:08 PM Bruce Richardson
> > > > > > > > <bruce.richardson@intel.com> wrote:
> > > > > > > > >
> > > > > > > > > On Mon, Mar 24, 2025 at 05:30:26PM +0000, Bruce
> Richardson
> > > > > wrote:
> > > > > > > > > > Traditionally, DPDK has had a direct mapping of
> internal
> > > > > lcore-
> > > > > > > ids, to
> > > > > > > > > > the actual core numbers in use. With higher core
> count
> > > > > servers
> > > > > > > becoming
> > > > > > > > > > more prevalent the issue becomes one of increasing
> memory
> > > > > > > footprint when
> > > > > > > > > > using such a scheme, due to the need to have all
> arrays
> > > > > > > dimensioned for
> > > > > > > > > > all cores on the system, whether or not those cores
> are
> > > in
> > > > > use by
> > > > > > > the
> > > > > > > > > > app.
> > > > > > > > > >
> > > > > > > > > > Therefore, the decision was made in the past to not
> > > expand
> > > > > the
> > > > > > > > > > build-time RTE_MAX_LCORE value beyond 128. Instead,
> it
> > > was
> > > > > > > recommended
> > > > > > > > > > that users use the "--lcores" EAL parameter to take
> the
> > > high-
> > > > > > > numbered
> > > > > > > > > > cores they wish to use and map them to lcore-ids
> within
> > > the 0
> > > > > -
> > > > > > > 128
> > > > > > > > > > range. While this works, this is a little clunky as
> it
> > > means
> > > > > that
> > > > > > > > > > instead of just passing, for example, "-l 130-139",
> the
> > > user
> > > > > must
> > > > > > > > > > instead pass "--lcores 0@130,1@131,2@132,3@133,...."
> > > > > > > > > >
> > > > > > > > > > This patchset attempts to simplify the situation by
> > > adding a
> > > > > new
> > > > > > > flag to
> > > > > > > > > > do this mapping automatically. To use cores 130-139
> and
> > > map
> > > > > them
> > > > > > > to ids
> > > > > > > > > > 0-9 internally, the EAL args now become: "-l 130-139
> --
> > > map-
> > > > > lcore-
> > > > > > > ids",
> > > > > > > > > > or using the shorter "-M" version of the flag: "-Ml
> 130-
> > > 139".
> > > > > > > > > >
> > > > > > > > > > Adding this new parameter required some rework of the
> > > > > existing
> > > > > > > arg
> > > > > > > > > > parsing code, because in current DPDK the args are
> parsed
> > > and
> > > > > > > checked in
> > > > > > > > > > the order they appear on the commandline. This means
> that
> > > > > using
> > > > > > > the
> > > > > > > > > > example above, the core parameter 130-139 will be
> > > rejected
> > > > > > > immediately
> > > > > > > > > > before the "map-lcore-ids" parameter is seen. To work
> > > around
> > > > > > > this, the
> > > > > > > > > > core (and service core) parameters are not parsed
> when
> > > seen,
> > > > > > > instead
> > > > > > > > > > they are only saved off and parsed after all
> arguments
> > > are
> > > > > > > parsed. The
> > > > > > > > > > "-l" and "-c" parameters are converted into "--
> lcores"
> > > > > arguments,
> > > > > > > so all
> > > > > > > > > > assigning of lcore ids is done there in all cases.
> > > > > > > > > >
> > > > > > > > > > RFC->v2: * converted printf to DEBUG log * added "-M"
> as
> > > > > shorter
> > > > > > > > > > version of flag * added documentation * renamed
> internal
> > > API
> > > > > that
> > > > > > > > > > was changed to avoid any potential
> > > > > > > hidden
> > > > > > > > > >   runtime issues.
> > > > > > > > > >
> > > > > > > > > > Bruce Richardson (3): eal: centralize core parameter
> > > parsing
> > > > > eal:
> > > > > > > > > > convert core masks and lists to core sets eal: allow
> > > > > automatic
> > > > > > > > > > mapping of high lcore ids
> > > > > > > > > >
> > > > > > > > > Ping for review.
> > > > > > > > >
> > > > > > > > > At a high level, does this feature seem useful to
> users?
> > > > > > > >
> > > > > > > > This seems useful, though I am not I would touch the
> existing
> > > > > > > options.
> > > > > > > > I would have gone with a simple -L option (taking the
> same
> > > kind
> > > > > of
> > > > > > > > input than -l but with new behavior), and not combine a
> flag
> > > with
> > > > > > > > existing options.
> > > > > > > >
> > > > > > >
> > > > > > > That would be an easier patchset to do up. However, it
> would
> > > then
> > > > > mean
> > > > > > > that we have no less than 4 different ways to specify the
> cores
> > > to
> > > > > use:
> > > > > > > "- c", "-l", "-L", "--lcores" - and therefore need 4
> different
> > > sets
> > > > > of
> > > > > > > parsing options for them, and more checks to ensure we have
> > > only
> > > > > one of
> > > > > > > the 4 specified in any run. That's why I chose the modifier
> > > option,
> > > > > and
> > > > > > > to try and consolidate the core setup a bit.
> > > > > > >
> > > > > > > However, if having a completely new option is preferred, I
> am
> > > happy
> > > > > > > enough to do up a different patchset for that.
> > > > > > >
> > > > > > > > I scanned through the series, not much to say.  Maybe add
> a
> > > unit
> > > > > test
> > > > > > > > for new cmdline option.
> > > > > > > >
> > > > > > > Sure. Once it's decided what approach (if any) to take,
> I'll do
> > > up
> > > > > a
> > > > > > > new patchset, taking into account any relevant feedback on
> this
> > > > > set.
> > > > > > >
> > > > > > > /Bruce
> > > > > >
> > > > > > Changing the EAL parameter parser to a "two pass parser"
> makes
> > > sense.
> > > > > I
> > > > > > think checking for existence of more than one lcore
> specification
> > > > > options
> > > > > > should suffice; we don't need to accept multiple lcore
> > > specification
> > > > > > options and check for conflicts.
> > > > > >
> > > > > > When remapping, do we need to support gaps in the "lcore"
> > > (logical
> > > > > cores)
> > > > > > array, e.g. for secondary processes, or can it be continuous
> from
> > > 0
> > > > > to
> > > > > > the number of specified lcores?
> > > > > >
> > > > > > And are new EAL parameters for this really beneficial?
> Doesn't
> > > e.g.
> > > > > "-l
> > > > > > 0-9@130-139,100@140" suffice?
> > > > > >
> > > > > Actually, I believe "0-9@130-139"[1]  is not the same as
> > > > > "0@130,1@131,2@132,...". The latter affinities one thread to
> one
> > > core
> > > > > ten
> > > > > times over, while the former affinitizes 10 threads to 10 cores
> -
> > > > > leaving
> > > > > those threads free to move about within the 10 cores specified.
> > > >
> > > > Interesting. The documentation [GSG] isn't clear to me about
> this; a
> > > example there could help clarify.
> > > >
> > > > [GSG]:
> > >
> https://doc.dpdk.org/guides/linux_gsg/linux_eal_parameters.html#lcore-
> > > related-options
> > > >
> > >
> > > Yep, agreed.
> > >
> > > > If users are manually passing lcore parameters to the EAL, then I
> see
> > > why some sort of remapping shorthand is useful.
> > > > IMO, if the mappings are relatively exotic, it should be
> acceptable
> > > requiring an external script to build a long list of mapping
> parameters
> > > and then invoke the application with those script-generated EAL
> > > parameters.
> > > > This would reduce the scope to support relatively simple, common
> > > mappings.
> > > >
> > > > Could we expand the --lcores syntax to support common mappings?
> > > >
> > > > E.g. "0-9@130+" to do what I thought.
> > > > The lack of "()" treats the entries individually (not as a
> group),
> > > and the "+" indicates auto-increment.
> > > >
> > > > A more advanced example:
> > > > "0-9@(130-131)+", meaning lcore 0 gets cpus 130-131, lcore 1 gets
> > > cpus 132-133, etc.
> > > >
> > >
> > > My issues with the above syntax idea is:
> > > * I think it's overly complicating the lcores parameter adding in
> the
> > >   support for the "+" symbol - especially in the advanced case
> example
> > > you
> > >   provide. I worry about code maintainability here.
> > > * More significantly for me, I think it's also getting things
> backwards
> > > in
> > >   that it is focusing more on the lcore ids visible to the app,
> rather
> > > than
> > >   the physical cores to be used. For the example above of 0-9@130+,
> > > what I
> > >   would expect is that the user is mainly thinking about the cores
> he
> > > wants
> > >   to use 130-139, which are then to be mapped to lower ids. If we
> had
> > > the
> > >   syntax reversed where the physical cores were first, I'd say it
> would
> > >   make more sense, e.g. 130-139@0+
> >
> > I 100 % agree on the syntax being backwards.
> > A good reason for introducing a new parameter, rather than expanding
> on "--lcores".
> >
> 
> Yes and no.
> I agree with not expanding on --lcores, but I also don't think any new
> parameter added should attempt to reproduce all of what lcores does. I
> would leave --lcores as-is, as the power-tool for lcore config e.g.
> what
> you talk about below for mapping multiple lcores to the same physical
> cpu.
> 
> > We should consider deprecating the old (backwards) syntax, so users
> don’t get confused about one EAL parameter being "backwards" of the
> other.
> >
> 
> I disagree with this. I would be ok with deprecating the old "-c"
> coremask
> syntax - I think the time is long past when we should be dealing with
> masks. However, removing "-l" and "--lcores" flag is, to me anyway, too
> big
> and jarring a change for end users for us to consider.
> 
> > > * finally, as I found out last month, there are systems on which
> the
> > > cores
> > >   are spread across numa-nodes on odd/even boundaries, so to have
> an
> > > app
> > >   running on socket 0, you need to give the core ids as a list i.e.
> > >   0,2,4,6, and cannot use ranges. [This also reenforces the point
> above
> > >   too, where again it's the internal ids we need to generalize, not
> the
> > >   physical cpus]
> >
> > For this, we could use "/2" (like in subnets), or "+2" as increment
> parameter.
> >
> > >
> > > My thinking on the matter, and I'm happy to be corrected if I'm
> wrong
> > > here,
> > > is that end-users are unlikely to significantly care what the
> actual
> > > lcore
> > > ids are internally in the program, so long as they work and are
> unique.
> >
> > Generally yes.
> > Note that we should allow the same physical CPU core to be assigned
> to multiple lcores...
> > If a CPU core is shared between multiple worker lcores, then it could
> be problematic, but with a mix of lcore roles, this might be handy for
> some applications. E.g. a virtual machine with only one CPU core could
> use that single CPU core as both main and worker lcore, or as main and
> service lcore.
> > Sharing an lcore between threads requires the developer to take
> special care of e.g. spinlocks, but the EAL parameter parser should not
> prohibit it. It might log a notice, though.
> >
> 
> This is already taken care of via --lcores, so I don't see the need to
> reimplement it using a new flag also. Any new flags we add should be
> kept
> deliberately simple.
> 
> > > What does matter is the physical cpus on which the code is to run.
> > > Therefore, my approach to this is to find the simplest possible
> > > solution
> > > whereby the user can just provide a list of cores and tell EAL to
> just
> > > map
> > > them to reasonable values. For the "reasonable" values, I would
> imagine
> > > that for the vast majority of cases starting "0" is what is wanted.
> For
> > > anything beyond that, we already have the existing --lcores syntax
> to
> > > be
> > > used.
> >
> > Agree with all of the above. :-)
> >
> 
> Great. That still leaves us with the problem of what exactly to do as
> the
> best solution. Here are some alternatives that I see:
> 
> 1. Add a modifier flag for -l and -c parameters to auto-remap the lcore
> ids
>    to zero, so user is just specifying physical CPU id's.
> 2. Add a new flag for specifying physical cpu ids, which auto-remaps
> the
>    cores specified to zero.
>    2a. To simplify our code and user experience we could at the same
> time:
>        * deprecate "-c" flag for coremasks
>        * make "-l" and "--lcores" the same flag just in long and short
>          versions. This should not break anything as "-l" parameter is
>          just as subset of what "--lcores" provides.
>        * that would leave us with effectively two core flag paths:
>           - -l/--lcores, behaviour as now, full explicit control
>           - -L/--lcores-remapped, takes simplified core list (only ","
> and
>             "-" supported as with "-l" now), and maps them to zero-
> based.

+1 for 2a.

> 
> Third options? Any other feedback?

If the internal lcore ids are effectively hidden by passing physical CPU ids, any related options should take physical CPU ids too. I.e. the options for choosing main lcore and service lcores also need variants with other names.
The service cores are passed as a core mask; this should be deprecated (like "-c"). And superseded by a parameter taking a core list.


  reply	other threads:[~2025-04-07 13:18 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-03-13 11:38 [RFC PATCH " Bruce Richardson
2025-03-13 11:38 ` [RFC PATCH 1/3] eal: centralize core parameter parsing Bruce Richardson
2025-03-13 11:38 ` [RFC PATCH 2/3] eal: convert core masks and lists to core sets Bruce Richardson
2025-03-13 11:38 ` [RFC PATCH 3/3] eal: allow automatic mapping of high lcore ids Bruce Richardson
2025-03-24 17:30 ` [PATCH v2 0/3] allow easier use of high lcore-ids Bruce Richardson
2025-03-24 17:30   ` [PATCH v2 1/3] eal: centralize core parameter parsing Bruce Richardson
2025-04-07  6:58     ` David Marchand
2025-03-24 17:30   ` [PATCH v2 2/3] eal: convert core masks and lists to core sets Bruce Richardson
2025-04-07  6:59     ` David Marchand
2025-03-24 17:30   ` [PATCH v2 3/3] eal: allow automatic mapping of high lcore ids Bruce Richardson
2025-04-01 14:06   ` [PATCH v2 0/3] allow easier use of high lcore-ids Bruce Richardson
2025-04-07  7:04     ` David Marchand
2025-04-07  9:48       ` Bruce Richardson
2025-04-07 10:15         ` Morten Brørup
2025-04-07 10:40           ` Bruce Richardson
2025-04-07 11:32             ` Morten Brørup
2025-04-07 11:56               ` Bruce Richardson
2025-04-07 12:25                 ` Morten Brørup
2025-04-07 12:41                   ` Bruce Richardson
2025-04-07 13:18                     ` Morten Brørup [this message]
2025-04-07 13:24                       ` Bruce Richardson
2025-04-07 15:14           ` Stephen Hemminger
2025-04-07 15:38             ` Bruce Richardson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=98CBD80474FA8B44BF855DF32C47DC35E9FBA0@smartserver.smartshare.dk \
    --to=mb@smartsharesystems.com \
    --cc=bruce.richardson@intel.com \
    --cc=david.marchand@redhat.com \
    --cc=dev@dpdk.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).