DPDK patches and discussions
From: "Varghese, Vipin" <Vipin.Varghese@amd.com>
To: "Honnappa Nagarahalli" <Honnappa.Nagarahalli@arm.com>,
	"Mattias Rönnblom" <hofors@lysator.liu.se>
Cc: "Yigit, Ferruh" <Ferruh.Yigit@amd.com>,
	"dev@dpdk.org" <dev@dpdk.org>, nd <nd@arm.com>
Subject: RE: [RFC 0/2] introduce LLC aware functions
Date: Thu, 12 Sep 2024 01:33:49 +0000
Message-ID: <PH7PR12MB8596D0906FA40A55FAE516D282642@PH7PR12MB8596.namprd12.prod.outlook.com>
In-Reply-To: <716375DE-0C2F-4983-934A-144D7DE342C6@arm.com>

[Public]

Snipped

> >>>>
> >>>> <snipped>
> >>>>
> >>>>>> <snipped>
> >>>>>>
> >>>>>> Thank you Mattias for the comments and question, please let me
> >>>>>> try to explain the same below
> >>>>>>
> >>>>>>> We shouldn't have a separate CPU/cache hierarchy API instead?
> >>>>>>
> >>>>>> Based on the intention to bring in CPU lcores which share same L3
> >>>>>> (for better cache hits and less noisy neighbor) current API
> >>>>>> focuses on using
> >>>>>>
> >>>>>> Last Level Cache. But if the suggestion is `there are SoC where
> >>>>>> L2 cache are also shared, and the new API should be provisioned`,
> >>>>>> I am also
> >>>>>>
> >>>>>> comfortable with the thought.
> >>>>>>
> >>>>>
> >>>>> Rather than some AMD special case API hacked into <rte_lcore.h>, I
> >>>>> think we are better off with no DPDK API at all for this kind of
> functionality.
> >>>>
> >>>> Hi Mattias, as shared in the earlier email thread, this is not a
> >>>> AMD special
> >>> case at all. Let me try to explain this one more time. One of
> >>> techniques used to increase cores cost effective way to go for tiles of
> compute complexes.
> >>>> This introduces a bunch of cores in sharing same Last Level Cache
> >>>> (namely
> >>> L2, L3 or even L4) depending upon cache topology architecture.
> >>>>
> >>>> The API suggested in RFC is to help end users to selectively use
> >>>> cores under
> >>> same Last Level Cache Hierarchy as advertised by OS (irrespective of
> >>> the BIOS settings used). This is useful in both bare-metal and container
> environment.
> >>>>
> >>>
> >>> I'm pretty familiar with AMD CPUs and the use of tiles (including
> >>> the challenges these kinds of non-uniformities pose for work scheduling).
> >>>
> >>> To maximize performance, caring about core<->LLC relationship may
> >>> well not be enough, and more HT/core/cache/memory topology
> >>> information is required. That's what I meant by special case. A
> >>> proper API should allow access to information about which lcores are
> >>> SMT siblings, cores on the same L2, and cores on the same L3, to
> >>> name a few things. Probably you want to fit NUMA into the same API
> >>> as well, although that is available already in <rte_lcore.h>.
> >> Thank you Mattias for the information, as shared by in the reply with
> Anatoly we want expose a new API `rte_get_next_lcore_ex` which intakes a
> extra argument `u32 flags`.
> >> The flags can be RTE_GET_LCORE_L1 (SMT), RTE_GET_LCORE_L2,
> RTE_GET_LCORE_L3, RTE_GET_LCORE_BOOST_ENABLED,
> RTE_GET_LCORE_BOOST_DISABLED.
> >
> > Wouldn't using that API be pretty awkward to use?
The current DPDK API is `rte_get_next_lcore`, which is used in the DPDK examples and in customer solutions.
Based on the comments from others, we agreed to rename the new API from `rte_get_next_lcore_llc` to `rte_get_next_lcore_extnd`.

Can you please help us understand what is awkward about it?

> >
> > I mean, what you have is a topology, with nodes of different types and with
> different properties, and you want to present it to the user.
Let me be clear: what we want is for DPDK to give customers a unified API that works across multiple platforms.
Example: let a vendor have two products, A and B. CPU-A has all cores within the same sub-NUMA domain, and CPU-B has its cores split into two sub-NUMA domains based on a split LLC.
When `rte_get_next_lcore_extnd` is invoked for `LLC` on
1. CPU-A: it returns all cores, as there is no split
2. CPU-B: it returns only the cores from the specific sub-NUMA domain that is partitioned by L3
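
To make the expected behaviour on CPU-A and CPU-B concrete, here is a minimal self-contained sketch. It is not the RFC code and has no DPDK dependency: `struct topo`, `next_lcore_same_llc`, `count_same_llc` and the two topology tables are hypothetical stand-ins, and the 8-lcore layouts are assumed only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a platform: llc_id[i] is the L3 domain of lcore i. */
struct topo {
	const int *llc_id;
	int n_lcores;
};

/*
 * Return the next lcore after 'prev' that shares an LLC with 'ref',
 * or t->n_lcores when there are no more (similar to how
 * rte_get_next_lcore returns RTE_MAX_LCORE at the end when not wrapping).
 * Pass prev = -1 to start the iteration.
 */
static int
next_lcore_same_llc(const struct topo *t, int ref, int prev)
{
	assert(t != NULL && ref >= 0 && ref < t->n_lcores);
	for (int i = prev + 1; i < t->n_lcores; i++)
		if (t->llc_id[i] == t->llc_id[ref])
			return i;
	return t->n_lcores;
}

/* Count the lcores visited when iterating over the LLC domain of 'ref'. */
static int
count_same_llc(const struct topo *t, int ref)
{
	int count = 0;

	for (int i = next_lcore_same_llc(t, ref, -1); i < t->n_lcores;
	     i = next_lcore_same_llc(t, ref, i))
		count++;
	return count;
}

/* Assumed layouts: CPU-A has one LLC for all 8 lcores, CPU-B has two LLC domains of 4. */
static const int cpu_a_llc[8] = {0, 0, 0, 0, 0, 0, 0, 0};
static const int cpu_b_llc[8] = {0, 0, 0, 0, 1, 1, 1, 1};
static const struct topo cpu_a = { cpu_a_llc, 8 };
static const struct topo cpu_b = { cpu_b_llc, 8 };
```

With these tables, iterating from lcore 5 visits all 8 lcores on CPU-A but only lcores 4 to 7 on CPU-B, so the same application loop works unchanged on both products.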

> >
> > In a sense, it's similar to XCM and DOM versus SAX. The above is SAX-style,
> and what I have in mind is something DOM-like.
> >
> > What use case do you have in mind? What's on top of my list is a scenario
> where a DPDK app gets a bunch of cores (e.g., -l <cores>) and tries to figure
> out how best make use of them.
Exactly.

> > It's not going to "skip" (ignore, leave unused)
> SMT siblings, or skip non-boosted cores, it would just try to be clever in
> regards to which cores to use for what purpose.
Let me try to share my idea on SMT siblings. When `rte_get_next_lcore_extnd` is invoked with the `L1 | SMT` flag and an `lcore`, the API first checks whether the given lcore is part of the enabled core list.
If yes, it programmatically identifies the sibling thread, using either `sysfs` or the `hwloc` library (I shared the version concern on distros and will recheck), and returns it.
If there is no sibling thread available under DPDK, it will fetch the next lcore (probably lcore + 1).
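
A rough sketch of the `sysfs` variant of that lookup is below, only to illustrate the flow. It assumes the sibling string has already been read from the Linux file `/sys/devices/system/cpu/cpuN/topology/thread_siblings_list`. The names `parse_cpulist`, `smt_sibling_or_next` and `MAX_CPUS` are hypothetical, and returning -1 for an lcore that is not enabled is my assumption, since that case is not settled yet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define MAX_CPUS 256	/* hypothetical bound for this sketch */

/*
 * Parse a Linux cpulist string such as "3,67" or "0-1" (the format used by
 * thread_siblings_list) into a membership array.
 * Returns false on malformed or out-of-range input.
 */
static bool
parse_cpulist(const char *s, bool member[MAX_CPUS])
{
	for (int i = 0; i < MAX_CPUS; i++)
		member[i] = false;
	while (*s != '\0' && *s != '\n') {
		char *end;
		long lo = strtol(s, &end, 10);

		if (end == s)
			return false;
		long hi = lo;
		if (*end == '-') {
			s = end + 1;
			hi = strtol(s, &end, 10);
			if (end == s)
				return false;
		}
		if (lo < 0 || hi < lo || hi >= MAX_CPUS)
			return false;
		for (long c = lo; c <= hi; c++)
			member[c] = true;
		/* Skip the separator; anything else fails on the next pass. */
		s = (*end == ',') ? end + 1 : end;
	}
	return true;
}

/*
 * Answer an "SMT sibling" query for 'lcore':
 *  - -1 if 'lcore' is not in the enabled list (assumption),
 *  - the first enabled sibling other than 'lcore' itself,
 *  - otherwise the next enabled lcore after 'lcore', or -1 if none.
 */
static int
smt_sibling_or_next(int lcore, const char *siblings_list,
		    const bool enabled[MAX_CPUS])
{
	bool sib[MAX_CPUS];

	if (lcore < 0 || lcore >= MAX_CPUS || !enabled[lcore])
		return -1;
	if (parse_cpulist(siblings_list, sib))
		for (int i = 0; i < MAX_CPUS; i++)
			if (i != lcore && sib[i] && enabled[i])
				return i;
	for (int i = lcore + 1; i < MAX_CPUS; i++)
		if (enabled[i])
			return i;
	return -1;
}
```

For example, with lcores 3, 4 and 67 enabled and a sibling string of "3,67", a query on lcore 3 yields 67. If 67 is not handed to DPDK, the same query falls back to lcore 4.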

> >
> >> This is AMD EPYC SoC agnostic and trying to address for all generic cases.
> >> Please do let us know if we (Ferruh & myself) can sync up via call?
> >
> > Sure, I can do that.

Let me sync with Ferruh and get a time slot for internal sync.

> >
> Can this be opened to the rest of the community? This is a common problem
> that needs to be solved for multiple architectures. I would be interested in
> attending.
Thank you Mattias. We did bring this up at the DPDK Bangkok summit 2024. As per the suggestion from Thomas and Jerrin, we have brought the RFC up for discussion.
For DPDK Montreal 2024, Keesang and Ferruh (most likely) are travelling to the summit and will present this as a talk to get things moving.

>
> >>>
<snipped>
