<Snipped>

Thank you, Anatoly, for the response. Let me try to share my understanding.

I recently looked into how Intel's Sub-NUMA Clustering would work within
DPDK, and found that I actually didn't have to do anything, because the
SNC "clusters" present themselves as NUMA nodes, which DPDK already
supports natively.

Yes, this is correct. In the Intel Xeon Platinum BIOS one can set `Cluster per NUMA` to `1, 2 or 4`.

This divides the tiles into Sub-NUMA partitions, each having separate lcores, memory controllers, PCIe and accelerators.


Does AMD's implementation of chiplets not report themselves as separate
NUMA nodes?

In the AMD EPYC SoC, this is different. There are 2 BIOS settings, namely

1. NPS: `Numa Per Socket`, which allows the IO tile (memory, PCIe and accelerator) to be partitioned as NUMA 0, 1, 2 or 4.

2. L3 as NUMA: `L3 cache of CPU tiles as individual NUMA`. This allows each CPU tile to appear as an independent NUMA node.


The above settings are possible because the CPU tiles are independent of the IO tile, which makes 4 combinations available for use.

These are covered in the tuning guide for the SoC, "12. How to get best performance on AMD platform", in the Data Plane Development Kit 24.07.0 documentation (dpdk.org).


Because if it does, I don't really think any changes are
required because NUMA nodes would give you the same thing, would it not?

I have a different view on this. For an end user:

1. The lcores and their NUMA nodes can be identified using `usertools/cpu-layout.py`.

2. But it is the core mask in the EAL arguments that makes the threads available for use in a process.

3. There is no API that distinguishes the L3 NUMA domain. The function `rte_socket_id` on tiled CPUs such as the AMD SoC will return the physical socket.
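
To illustrate what the missing information looks like, here is a minimal sketch (not part of DPDK or of `usertools/cpu-layout.py`) that groups CPUs by the L3 cache they share. It assumes a Linux host exposing the standard sysfs cache entries (`cache/index*/level` and `shared_cpu_list`); the function names are hypothetical and chosen only for this example.

```python
import glob
import os
import re


def parse_cpu_list(text):
    """Expand a kernel CPU list string such as "0-3,8" into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def group_by_l3(shared_lists):
    """Return one CPU list per distinct L3 domain, ordered by lowest CPU id.

    shared_lists maps a CPU id to the shared_cpu_list string of its L3 cache.
    """
    domains = {tuple(parse_cpu_list(text)) for text in shared_lists.values()}
    return [list(d) for d in sorted(domains)]


def read_l3_shared_lists(root="/sys/devices/system/cpu"):
    """Read the L3 shared_cpu_list of every CPU from Linux sysfs (assumed layout)."""
    result = {}
    for cpu_dir in glob.glob(os.path.join(root, "cpu[0-9]*")):
        cpu = int(re.search(r"cpu(\d+)$", cpu_dir).group(1))
        for index_dir in glob.glob(os.path.join(cpu_dir, "cache", "index*")):
            with open(os.path.join(index_dir, "level")) as f:
                if f.read().strip() != "3":
                    continue  # only level-3 caches are of interest here
            with open(os.path.join(index_dir, "shared_cpu_list")) as f:
                result[cpu] = f.read()
    return result


if __name__ == "__main__":
    for i, cpus in enumerate(group_by_l3(read_l3_shared_lists())):
        print(f"L3 domain {i}: {cpus}")
```

This grouping is available from the OS regardless of whether `L3 as NUMA` is enabled, but an application currently has to derive it outside of the EAL.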


Example: AMD EPYC Genoa has a total of 13 tiles: 12 CPU tiles and 1 IO tile.

1. Setting NPS to 4 will divide the memory, PCIe and accelerators into 4 domains, while all the CPUs will appear as a single NUMA node, with each of the 12 tiles having an independent L3 cache.

2. Setting `L3 as NUMA` allows each tile to appear as a separate L3 cluster.


Hence, adding an API that allows selecting the available lcores based on split L3 is essential, irrespective of the BIOS setting.
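
As a rough illustration of the kind of selection I have in mind, here is a small sketch. It is not an existing or proposed DPDK API; the helper name, its arguments, and the layout of 8 cores per CPU tile are all assumptions made only for this example.

```python
def select_lcores_same_l3(l3_domains, available, count):
    """Pick `count` lcores from `available` that all share one L3 domain.

    l3_domains: list of CPU id lists, one per L3 domain (hypothetical input).
    available:  CPU ids the process was started with (e.g. from the core mask).
    Returns the chosen CPU ids, or None if no single domain has enough of them.
    """
    usable = set(available)
    for domain in l3_domains:
        candidates = [cpu for cpu in domain if cpu in usable]
        if len(candidates) >= count:
            return candidates[:count]
    return None


if __name__ == "__main__":
    # Assumed layout for illustration: 12 CPU tiles with 8 cores each.
    domains = [list(range(tile * 8, tile * 8 + 8)) for tile in range(12)]
    # Process started with cores 4-19, which span three tiles.
    print(select_lcores_same_l3(domains, range(4, 20), 6))
```

With this assumed layout, a request for 6 lcores skips the first tile (only cores 4-7 are available there) and returns cores 8-13 from the second tile, so all selected threads share one L3 cache whatever the NPS setting.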



--
Thanks,
Anatoly