DPDK patches and discussions
* [RFC v3 0/3] Introduce Topology NUMA grouping for lcores
@ 2024-10-30  5:41 Vipin Varghese
  2024-10-30  5:41 ` [RFC v3 1/3] eal/lcore: add topology based functions Vipin Varghese
                   ` (2 more replies)
  0 siblings, 3 replies; 28+ messages in thread
From: Vipin Varghese @ 2024-10-30  5:41 UTC (permalink / raw)
  To: dev, roretzla, bruce.richardson, john.mcnamara, dmitry.kozliuk, jerinj
  Cc: ruifeng.wang, mattias.ronnblom, anatoly.burakov, stephen,
	ferruh.yigit, honnappa.nagarahalli, wathsala.vithanage,
	konstantin.ananyev

One of the ways to increase core density in a given physical package is
to use easily replicated tiles. These tiles are either core complexes, or
core complexes with IO (memory and PCIe). This results in the possibility
of two types of NUMA topology:
 - CPU topology & IO topology
 - CPU+IO topology

For platforms like:
 - AMD EPYC SoCs, core complexes are in separate CPU NUMA domains and IO
   is in a different NUMA domain (except for zen1 Naples).
 - Intel 4th Gen Xeon (SPR) & above, where CPU+IO NUMA partitioning is
   achieved via the BIOS option `SNC` set to 1, 2 or 4.
 - Ampere One, which allows CPU NUMA partitioning via the BIOS option `SLC`.
 - other platforms, which have 2 or 4 cores sharing the same L2 cache.

Grouping DPDK logical cores that share the same cache or IO domain helps
applications exploit cache and IO locality. To leverage that locality, an
application needs to use lcores sharing the same topology. This approach
ensures more consistent latencies by minimizing the dispersion of lcores
across different tiles.

Using lcores in the same NUMA domain shows improvement for applications
 - using pipeline staging,
 - with each lcore processing part of the payload,
 - with eventual hits in either L2 or L3.
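
For instance, a minimal sketch using the proposed API (stage_fn and
launch_stages_on_shared_l3 are hypothetical placeholders, not part of
this series) that launches pipeline stages on the worker lcores sharing
one L3 domain:

  #include <rte_launch.h>
  #include <rte_lcore.h>

  /* application-defined pipeline stage (placeholder) */
  static int stage_fn(void *arg);

  static void
  launch_stages_on_shared_l3(void)
  {
      unsigned int lcore;

      /* worker lcores sharing the L3 domain of the first enabled lcore */
      RTE_LCORE_FOREACH_WORKER_DOMAIN(lcore, RTE_LCORE_DOMAIN_L3)
          rte_eal_remote_launch(stage_fn, NULL, lcore);

      rte_eal_mp_wait_lcore();
  }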

Library dependency: hwloc

Topology Flags:
---------------
 - RTE_LCORE_DOMAIN_L1: group lcores sharing the same L1 cache
 - RTE_LCORE_DOMAIN_SMT: alias for RTE_LCORE_DOMAIN_L1
 - RTE_LCORE_DOMAIN_L2: group lcores sharing the same L2 cache
 - RTE_LCORE_DOMAIN_L3: group lcores sharing the same L3 cache
 - RTE_LCORE_DOMAIN_IO: group lcores sharing the same IO (NUMA) domain

< Function: Purpose >
---------------------
 - rte_get_domain_count: get the domain count for a given topology flag
 - rte_lcore_count_from_domain: get the count of valid lcores in a domain
 - rte_get_lcore_in_domain: get a valid lcore id by its index within a domain
 - rte_get_next_lcore_from_domain: get the next valid lcore within a domain
 - rte_get_next_lcore_from_next_domain: get the next valid lcore from the
   next domain
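
A minimal sketch of walking the IO domains with these functions
(dump_io_domains is a hypothetical helper; this mirrors the helloworld
change in patch 3/3):

  #include <stdio.h>
  #include <rte_lcore.h>

  static void
  dump_io_domains(void)
  {
      unsigned int dom_count = rte_get_domain_count(RTE_LCORE_DOMAIN_IO);

      for (unsigned int d = 0; d < dom_count; d++) {
          unsigned int cnt = rte_lcore_count_from_domain(RTE_LCORE_DOMAIN_IO, d);

          if (cnt)
              printf("IO domain %u: %u lcores, first is lcore %u\n",
                     d, cnt, rte_get_lcore_in_domain(RTE_LCORE_DOMAIN_IO, d, 0));
      }
  }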

Note:
 1. Topology refers to the NUMA grouping.
 2. Domains are the sub-groups within a specific topology.

Topology example: L1, L2, L3, IO
Domain example: IO-A, IO-B

< MACRO: Purpose >
------------------
 - RTE_LCORE_FOREACH_DOMAIN: iterate lcores from all domains
 - RTE_LCORE_FOREACH_WORKER_DOMAIN: iterate worker lcores from all domains
 - RTE_LCORE_FORN_NEXT_DOMAIN: iterate domains, selecting the n'th lcore of each
 - RTE_LCORE_FORN_WORKER_NEXT_DOMAIN: iterate domains, selecting the n'th
   worker lcore of each.
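
For example, a sketch (show_l3_leaders is a hypothetical helper) that
picks the n'th = 0 lcore of each L3 domain, e.g. one polling lcore per
cache tile:

  #include <stdio.h>
  #include <rte_lcore.h>

  static void
  show_l3_leaders(void)
  {
      unsigned int lcore;

      /* visit the lcore at index 0 of every L3 domain */
      RTE_LCORE_FORN_NEXT_DOMAIN(lcore, RTE_LCORE_DOMAIN_L3, 0)
          printf("lcore %u is the first lcore of its L3 domain\n", lcore);
  }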

Future work (after merge):
--------------------------
 - dma-perf per IO NUMA
 - eventdev per L3 NUMA
 - pipeline per SMT|L3 NUMA
 - distributor per L3 for Port-Queue
 - l2fwd-power per SMT
 - testpmd option for IO NUMA per port

Platforms tested on:
--------------------
 - INTEL(R) XEON(R) PLATINUM 8562Y+ (supports IO NUMA 1 & 2)
 - AMD EPYC 8534P (supports IO NUMA 1 & 2)
 - AMD EPYC 9554 (supports IO NUMA 1, 2, 4)

Logs:
-----
1. INTEL(R) XEON(R) PLATINUM 8562Y+:
 - SNC=1
        Domain (IO): at index (0) there are 48 core, with (0) at index 0
 - SNC=2
        Domain (IO): at index (0) there are 24 core, with (0) at index 0
        Domain (IO): at index (1) there are 24 core, with (12) at index 0

2. AMD EPYC 8534P:
 - NPS=1:
        Domain (IO): at index (0) there are 128 core, with (0) at index 0
 - NPS=2:
        Domain (IO): at index (0) there are 64 core, with (0) at index 0
        Domain (IO): at index (1) there are 64 core, with (32) at index 0


Signed-off-by: Vipin Varghese <vipin.varghese@amd.com>

Vipin Varghese (3):
  eal/lcore: add topology based functions
  test/lcore: enable tests for topology
  examples: add lcore topology API calls

 app/test/test_lcores.c            | 189 ++++++++++
 config/meson.build                |  18 +
 examples/helloworld/main.c        | 142 +++++++-
 examples/l2fwd/main.c             |  56 ++-
 examples/skeleton/basicfwd.c      |  22 ++
 lib/eal/common/eal_common_lcore.c | 580 ++++++++++++++++++++++++++++++
 lib/eal/common/eal_private.h      |  48 +++
 lib/eal/freebsd/eal.c             |  10 +
 lib/eal/include/rte_lcore.h       | 168 +++++++++
 lib/eal/linux/eal.c               |  11 +
 lib/eal/meson.build               |   4 +
 lib/eal/version.map               |   9 +
 lib/eal/windows/eal.c             |  12 +
 13 files changed, 1259 insertions(+), 10 deletions(-)

-- 
2.34.1



* [RFC v3 1/3] eal/lcore: add topology based functions
  2024-10-30  5:41 [RFC v3 0/3] Introduce Topology NUMA grouping for lcores Vipin Varghese
@ 2024-10-30  5:41 ` Vipin Varghese
  2024-10-30 15:43   ` Stephen Hemminger
                     ` (4 more replies)
  2024-10-30  5:41 ` [RFC v3 2/3] test/lcore: enable tests for topology Vipin Varghese
  2024-10-30  5:41 ` [RFC v3 3/3] examples: add lcore topology API calls Vipin Varghese
  2 siblings, 5 replies; 28+ messages in thread
From: Vipin Varghese @ 2024-10-30  5:41 UTC (permalink / raw)
  To: dev, roretzla, bruce.richardson, john.mcnamara, dmitry.kozliuk, jerinj
  Cc: ruifeng.wang, mattias.ronnblom, anatoly.burakov, stephen,
	ferruh.yigit, honnappa.nagarahalli, wathsala.vithanage,
	konstantin.ananyev

Introduce topology-aware lcore mapping into the lcore API.
With higher core density, more and more cores are categorized
into various chiplets based on IO (memory and PCIe) and
Last Level Cache (mainly L3).

Using the hwloc library, the DPDK-available lcores can be grouped
into various domains, namely L1, L2, L3 and IO. This patch
introduces functions and macros that help to identify such
groups.

Internal API:
 - rte_eal_topology_init
 - rte_eal_topology_release
 - get_domain_lcore_mapping

External Experimental API:
 - rte_get_domain_count
 - rte_lcore_count_from_domain
 - rte_get_lcore_in_domain
 - rte_get_next_lcore_from_domain
 - rte_get_next_lcore_from_next_domain
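
A sketch of walking the caller's own L3 domain with the new API
(walk_own_l3_domain is a hypothetical helper; this mirrors
send_lcore_hello() in patch 3/3):

  #include <rte_lcore.h>

  static void
  walk_own_l3_domain(void)
  {
      unsigned int self = rte_lcore_id();
      unsigned int next = rte_get_next_lcore_from_domain(self, false, true,
                              RTE_LCORE_DOMAIN_L3);

      /* every lcore visited shares self's L3; wrap == true loops back */
      while (next != RTE_MAX_LCORE && next != self)
          next = rte_get_next_lcore_from_domain(next, false, true,
                              RTE_LCORE_DOMAIN_L3);
  }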

v2 changes: focuses on rte_lcore api for getting topology
 - use hwloc instead of sysfs exploration - Mattias Rönnblom
 - L1, L2 and IO domain mapping - Ferruh, Vipin
 - new API marked experimental - Stephen Hemminger

Signed-off-by: Vipin Varghese <vipin.varghese@amd.com>
---
 config/meson.build                |  18 +
 lib/eal/common/eal_common_lcore.c | 580 ++++++++++++++++++++++++++++++
 lib/eal/common/eal_private.h      |  48 +++
 lib/eal/freebsd/eal.c             |  10 +
 lib/eal/include/rte_lcore.h       | 168 +++++++++
 lib/eal/linux/eal.c               |  11 +
 lib/eal/meson.build               |   4 +
 lib/eal/version.map               |   9 +
 lib/eal/windows/eal.c             |  12 +
 9 files changed, 860 insertions(+)

diff --git a/config/meson.build b/config/meson.build
index 8dae811378..a48822dcb1 100644
--- a/config/meson.build
+++ b/config/meson.build
@@ -240,6 +240,24 @@ if find_libnuma
     endif
 endif
 
+has_libhwloc = false
+find_libhwloc = true
+
+if meson.is_cross_build() and not meson.get_cross_property('hwloc', true)
+    # don't look for libhwloc if explicitly disabled in cross build
+    find_libhwloc = false
+endif
+
+if find_libhwloc
+    hwloc_dep = cc.find_library('hwloc', required: false)
+    if hwloc_dep.found() and cc.has_header('hwloc.h')
+        dpdk_conf.set10('RTE_HAS_LIBHWLOC', true)
+        has_libhwloc = true
+        add_project_link_arguments('-lhwloc', language: 'c')
+        dpdk_extra_ldflags += '-lhwloc'
+    endif
+endif
+
 has_libfdt = false
 fdt_dep = cc.find_library('fdt', required: false)
 if fdt_dep.found() and cc.has_header('fdt.h')
diff --git a/lib/eal/common/eal_common_lcore.c b/lib/eal/common/eal_common_lcore.c
index 2ff9252c52..1ab9ead1d1 100644
--- a/lib/eal/common/eal_common_lcore.c
+++ b/lib/eal/common/eal_common_lcore.c
@@ -112,6 +112,302 @@ unsigned int rte_get_next_lcore(unsigned int i, int skip_main, int wrap)
 	return i;
 }
 
+#ifdef RTE_EAL_HWLOC_TOPOLOGY_PROBE
+static struct core_domain_mapping *
+get_domain_lcore_mapping(unsigned int domain_sel, unsigned int domain_indx)
+{
+	struct core_domain_mapping *ptr =
+		(domain_sel & RTE_LCORE_DOMAIN_IO) ? topo_cnfg.io[domain_indx] :
+		(domain_sel & RTE_LCORE_DOMAIN_L3) ? topo_cnfg.l3[domain_indx] :
+		(domain_sel & RTE_LCORE_DOMAIN_L2) ? topo_cnfg.l2[domain_indx] :
+		(domain_sel & RTE_LCORE_DOMAIN_L1) ? topo_cnfg.l1[domain_indx] : NULL;
+
+	return ptr;
+}
+#endif
+
+unsigned int rte_get_domain_count(unsigned int domain_sel __rte_unused)
+{
+	unsigned int domain_cnt = 0;
+
+#ifdef RTE_EAL_HWLOC_TOPOLOGY_PROBE
+	if (domain_sel & RTE_LCORE_DOMAIN_ALL) {
+		domain_cnt =
+			(domain_sel & RTE_LCORE_DOMAIN_IO) ? topo_cnfg.io_count :
+			(domain_sel & RTE_LCORE_DOMAIN_L3) ? topo_cnfg.l3_count :
+			(domain_sel & RTE_LCORE_DOMAIN_L2) ? topo_cnfg.l2_count :
+			(domain_sel & RTE_LCORE_DOMAIN_L1) ? topo_cnfg.l1_count : 0;
+	}
+#endif
+
+	return domain_cnt;
+}
+
+unsigned int
+rte_lcore_count_from_domain(unsigned int domain_sel __rte_unused,
+unsigned int domain_indx __rte_unused)
+{
+	unsigned int core_cnt = 0;
+
+#ifdef RTE_EAL_HWLOC_TOPOLOGY_PROBE
+	unsigned int domain_cnt = 0;
+
+	if ((domain_sel & RTE_LCORE_DOMAIN_ALL) == 0)
+		return core_cnt;
+
+	domain_cnt = rte_get_domain_count(domain_sel);
+	if ((domain_indx != RTE_LCORE_DOMAIN_LCORES_ALL) && (domain_indx >= domain_cnt))
+		return core_cnt;
+
+	core_cnt = (domain_sel & RTE_LCORE_DOMAIN_IO) ? topo_cnfg.io_core_count :
+			(domain_sel & RTE_LCORE_DOMAIN_L3) ? topo_cnfg.l3_core_count :
+			(domain_sel & RTE_LCORE_DOMAIN_L2) ? topo_cnfg.l2_core_count :
+			(domain_sel & RTE_LCORE_DOMAIN_L1) ? topo_cnfg.l1_core_count : 0;
+
+	if ((domain_indx != RTE_LCORE_DOMAIN_LCORES_ALL) && (core_cnt)) {
+		struct core_domain_mapping *ptr = get_domain_lcore_mapping(domain_sel, domain_indx);
+		core_cnt = ptr->core_count;
+	}
+#endif
+
+	return core_cnt;
+}
+
+unsigned int
+rte_get_lcore_in_domain(unsigned int domain_sel __rte_unused,
+unsigned int domain_indx __rte_unused, unsigned int lcore_pos __rte_unused)
+{
+	uint16_t sel_core = RTE_MAX_LCORE;
+
+#ifdef RTE_EAL_HWLOC_TOPOLOGY_PROBE
+	unsigned int domain_cnt = 0;
+	unsigned int core_cnt = 0;
+
+	if (domain_sel & RTE_LCORE_DOMAIN_ALL) {
+		domain_cnt = rte_get_domain_count(domain_sel);
+		if (domain_cnt == 0)
+			return sel_core;
+
+		core_cnt = rte_lcore_count_from_domain(domain_sel, RTE_LCORE_DOMAIN_LCORES_ALL);
+		if (core_cnt == 0)
+			return sel_core;
+
+		struct core_domain_mapping *ptr = get_domain_lcore_mapping(domain_sel, domain_indx);
+		if ((ptr) && (ptr->core_count)) {
+			if (lcore_pos < ptr->core_count)
+				sel_core = ptr->cores[lcore_pos];
+		}
+	}
+#endif
+
+	return sel_core;
+}
+
+unsigned int
+rte_get_next_lcore_from_domain(unsigned int indx __rte_unused,
+int skip_main __rte_unused, int wrap __rte_unused, uint32_t flag __rte_unused)
+{
+	if (indx >= RTE_MAX_LCORE) {
+		indx = rte_get_next_lcore(-1, skip_main, wrap);
+		return indx;
+	}
+	uint16_t usr_lcore = indx % RTE_MAX_LCORE;
+	uint16_t sel_domain_core = RTE_MAX_LCORE;
+
+	EAL_LOG(DEBUG, "lcore (%u), skip main lcore (%d), wrap (%d), flag (%u)",
+		usr_lcore, skip_main, wrap, flag);
+
+	/* check the input lcore indx */
+	if (!rte_lcore_is_enabled(indx)) {
+		EAL_LOG(ERR, "User input lcore (%u) is not enabled!!!", indx);
+		return sel_domain_core;
+	}
+
+	if ((rte_lcore_count() == 1)) {
+		EAL_LOG(DEBUG, "1 lcore only in dpdk process!!!");
+		sel_domain_core = wrap ? indx : sel_domain_core;
+		return sel_domain_core;
+	}
+
+#ifdef RTE_EAL_HWLOC_TOPOLOGY_PROBE
+	uint16_t main_lcore = rte_get_main_lcore();
+	uint16_t sel_domain = 0xffff;
+	uint16_t sel_domain_core_index = 0xffff;
+	uint16_t sel_domain_core_count = 0;
+
+	struct core_domain_mapping *ptr = NULL;
+	uint16_t domain_count = 0;
+	uint16_t domain_core_count = 0;
+	uint16_t *domain_core_list = NULL;
+
+
+	domain_count = rte_get_domain_count(flag);
+	if (domain_count == 0) {
+		EAL_LOG(ERR, "No cores found for the flag (%u)!!!", flag);
+		return sel_domain_core;
+	}
+
+	/* identify the lcore to get the domain to start from */
+	for (int i = 0; (i < domain_count) && (sel_domain_core_index == 0xffff); i++) {
+		ptr = get_domain_lcore_mapping(flag, i);
+
+		domain_core_count = ptr->core_count;
+		domain_core_list = ptr->cores;
+
+		for (int j = 0; j < domain_core_count; j++) {
+			if (usr_lcore == domain_core_list[j]) {
+				sel_domain_core_index = j;
+				sel_domain_core_count = domain_core_count;
+				sel_domain = i;
+				break;
+			}
+		}
+	}
+
+	if (sel_domain_core_count == 1) {
+		EAL_LOG(DEBUG, "there is no more lcore in the domain!!!");
+		return sel_domain_core;
+	}
+
+	EAL_LOG(DEBUG, "selected: domain (%u), core: count %u, index %u, core: current %u",
+		sel_domain, sel_domain_core_count, sel_domain_core_index,
+		domain_core_list[sel_domain_core_index]);
+
+	/* get next lcore from the selected domain */
+	/* next lcore is always `sel_domain_core_index + 1`, but needs boundary check */
+	bool lcore_found = false;
+	uint16_t next_domain_lcore_index = sel_domain_core_index + 1;
+	while (false == lcore_found) {
+
+		if (next_domain_lcore_index >= sel_domain_core_count) {
+			if (wrap) {
+				next_domain_lcore_index = 0;
+				continue;
+			}
+			break;
+		}
+
+		/* check if main lcore skip */
+		if ((domain_core_list[next_domain_lcore_index] == main_lcore) && (skip_main)) {
+			next_domain_lcore_index += 1;
+			continue;
+		}
+
+		lcore_found = true;
+	}
+	if (true == lcore_found)
+		sel_domain_core = domain_core_list[next_domain_lcore_index];
+#endif
+
+	EAL_LOG(DEBUG, "Selected core (%u)", sel_domain_core);
+	return sel_domain_core;
+}
+
+unsigned int
+rte_get_next_lcore_from_next_domain(unsigned int indx __rte_unused,
+int skip_main __rte_unused, int wrap __rte_unused,
+uint32_t flag __rte_unused, int cores_to_skip __rte_unused)
+{
+	uint16_t sel_domain_core = RTE_MAX_LCORE;
+	uint16_t usr_lcore = indx % RTE_MAX_LCORE;
+
+	if (indx >= RTE_MAX_LCORE) {
+		indx = rte_get_next_lcore(-1, skip_main, wrap);
+		return indx;
+	}
+
+	EAL_LOG(DEBUG, "lcore (%u), skip main lcore (%d), wrap (%d), flag (%u)",
+		usr_lcore, skip_main, wrap, flag);
+
+	/* check the input lcore indx */
+	if (!rte_lcore_is_enabled(indx)) {
+		EAL_LOG(ERR, "User input lcore (%u) is not enabled!!!", indx);
+		return sel_domain_core;
+	}
+
+#ifdef RTE_EAL_HWLOC_TOPOLOGY_PROBE
+	uint16_t main_lcore = rte_get_main_lcore();
+
+	uint16_t sel_domain = 0xffff;
+	uint16_t sel_domain_core_index = 0xffff;
+
+	uint16_t domain_count = 0;
+	uint16_t domain_core_count = 0;
+	uint16_t *domain_core_list = NULL;
+
+	domain_count = rte_get_domain_count(flag);
+	if (domain_count == 0) {
+		EAL_LOG(ERR, "No Domains found for the flag (%u)!!!", flag);
+		return sel_domain_core;
+	}
+
+	/* identify the lcore to get the domain to start from */
+	struct core_domain_mapping *ptr = NULL;
+	for (int i = 0; (i < domain_count) && (sel_domain_core_index == 0xffff); i++) {
+		ptr = get_domain_lcore_mapping(flag, i);
+		domain_core_count = ptr->core_count;
+		domain_core_list = ptr->cores;
+
+		for (int j = 0; j < domain_core_count; j++) {
+			if (usr_lcore == domain_core_list[j]) {
+				sel_domain_core_index = j;
+				sel_domain = i;
+				break;
+			}
+		}
+	}
+
+	if (sel_domain_core_index == 0xffff) {
+		EAL_LOG(ERR, "Invalid lcore %u for the flag (%u)!!!", indx, flag);
+		return sel_domain_core;
+	}
+
+	EAL_LOG(DEBUG, "Selected - core_index (%u); domain (%u), core_count (%u), cores (%p)",
+		sel_domain_core_index, sel_domain, domain_core_count, domain_core_list);
+
+	uint16_t skip_cores = (cores_to_skip >= 0) ? cores_to_skip : (0 - cores_to_skip);
+
+	/* get the next domain & valid lcore */
+	sel_domain = (((1 + sel_domain) == domain_count) && (wrap)) ? 0 : (1 + sel_domain);
+	sel_domain_core_index = 0xffff;
+
+	bool iter_loop = false;
+	for (int i = sel_domain; (i < domain_count) && (sel_domain_core == RTE_MAX_LCORE); i++) {
+		ptr = (flag & RTE_LCORE_DOMAIN_L1) ? topo_cnfg.l1[i] :
+			(flag & RTE_LCORE_DOMAIN_L2) ? topo_cnfg.l2[i] :
+			(flag & RTE_LCORE_DOMAIN_L3) ? topo_cnfg.l3[i] :
+			(flag & RTE_LCORE_DOMAIN_IO) ? topo_cnfg.io[i] : NULL;
+
+		domain_core_count = ptr->core_count;
+		domain_core_list = ptr->cores;
+
+		/* check if we have cores to iterate from this domain */
+		if (skip_cores >= domain_core_count)
+			continue;
+
+		if (((1 + sel_domain) == domain_count) && (wrap)) {
+			if (iter_loop == true)
+				break;
+
+			iter_loop = true;
+		}
+
+		sel_domain_core_index = (cores_to_skip >= 0) ? skip_cores :
+					(domain_core_count - skip_cores);
+		sel_domain_core = domain_core_list[sel_domain_core_index];
+
+		if ((skip_main) && (sel_domain_core == main_lcore)) {
+			sel_domain_core_index = 0xffff;
+			sel_domain_core = RTE_MAX_LCORE;
+			continue;
+		}
+	}
+#endif
+
+	EAL_LOG(DEBUG, "Selected core (%u)", sel_domain_core);
+	return sel_domain_core;
+}
+
 unsigned int
 rte_lcore_to_socket_id(unsigned int lcore_id)
 {
@@ -131,6 +427,290 @@ socket_id_cmp(const void *a, const void *b)
 	return 0;
 }
 
+
+
+/*
+ * Use HWLOC library to parse L1|L2|L3|NUMA-IO on the running target machine.
+ * Store the topology structure in memory.
+ */
+int
+rte_eal_topology_init(void)
+{
+#ifdef RTE_EAL_HWLOC_TOPOLOGY_PROBE
+	memset(&topo_cnfg, 0, sizeof(struct topology_config));
+
+	hwloc_topology_init(&topo_cnfg.topology);
+	hwloc_topology_load(topo_cnfg.topology);
+
+	int l1_depth = hwloc_get_type_depth(topo_cnfg.topology, HWLOC_OBJ_L1CACHE);
+	int l2_depth = hwloc_get_type_depth(topo_cnfg.topology, HWLOC_OBJ_L2CACHE);
+	int l3_depth = hwloc_get_type_depth(topo_cnfg.topology, HWLOC_OBJ_L3CACHE);
+	int io_depth = hwloc_get_type_depth(topo_cnfg.topology, HWLOC_OBJ_NUMANODE);
+
+	if (l1_depth) {
+		topo_cnfg.l1_count = hwloc_get_nbobjs_by_depth(topo_cnfg.topology, l1_depth);
+		topo_cnfg.l1 = (struct core_domain_mapping **)
+			malloc(sizeof(struct core_domain_mapping *) * topo_cnfg.l1_count);
+		if (topo_cnfg.l1 == NULL) {
+			rte_eal_topology_release();
+			return -1;
+		}
+
+		for (int j = 0; j < topo_cnfg.l1_count; j++) {
+			hwloc_obj_t obj = hwloc_get_obj_by_depth(topo_cnfg.topology, l1_depth, j);
+			unsigned int first_cpu = hwloc_bitmap_first(obj->cpuset);
+			unsigned int cpu_count = hwloc_bitmap_weight(obj->cpuset);
+
+			topo_cnfg.l1[j] = (struct core_domain_mapping *)
+					malloc(sizeof(struct core_domain_mapping));
+			if (topo_cnfg.l1[j] == NULL) {
+				rte_eal_topology_release();
+				return -1;
+			}
+
+			topo_cnfg.l1[j]->core_count = 0;
+			topo_cnfg.l1[j]->cores = (uint16_t *)
+						malloc(sizeof(uint16_t) * cpu_count);
+			if (topo_cnfg.l1[j]->cores == NULL) {
+				rte_eal_topology_release();
+				return -1;
+			}
+
+			signed int cpu_id = first_cpu;
+			unsigned int cpu_index = 0;
+			do {
+				if (rte_lcore_is_enabled(cpu_id)) {
+					EAL_LOG(DEBUG, " L1|SMT domain (%u) lcore %u", j, cpu_id);
+					topo_cnfg.l1[j]->cores[cpu_index] = cpu_id;
+					cpu_index++;
+
+					topo_cnfg.l1[j]->core_count += 1;
+					topo_cnfg.l1_core_count += 1;
+				}
+				cpu_id = hwloc_bitmap_next(obj->cpuset, cpu_id);
+				cpu_count -= 1;
+			} while ((cpu_id != -1) && (cpu_count));
+		}
+	}
+
+	if (l2_depth) {
+		topo_cnfg.l2_count = hwloc_get_nbobjs_by_depth(topo_cnfg.topology, l2_depth);
+		topo_cnfg.l2 = (struct core_domain_mapping **)
+				malloc(sizeof(struct core_domain_mapping *) * topo_cnfg.l2_count);
+		if (topo_cnfg.l2 == NULL) {
+			rte_eal_topology_release();
+			return -1;
+		}
+
+		for (int j = 0; j < topo_cnfg.l2_count; j++) {
+			hwloc_obj_t obj = hwloc_get_obj_by_depth(topo_cnfg.topology, l2_depth, j);
+			unsigned int first_cpu = hwloc_bitmap_first(obj->cpuset);
+			unsigned int cpu_count = hwloc_bitmap_weight(obj->cpuset);
+
+			topo_cnfg.l2[j] = (struct core_domain_mapping *)
+					malloc(sizeof(struct core_domain_mapping));
+			if (topo_cnfg.l2[j] == NULL) {
+				rte_eal_topology_release();
+				return -1;
+			}
+
+			topo_cnfg.l2[j]->core_count = 0;
+			topo_cnfg.l2[j]->cores = (uint16_t *)
+						malloc(sizeof(uint16_t) * cpu_count);
+			if (topo_cnfg.l2[j]->cores == NULL) {
+				rte_eal_topology_release();
+				return -1;
+			}
+
+			signed int cpu_id = first_cpu;
+			unsigned int cpu_index = 0;
+			do {
+				if (rte_lcore_is_enabled(cpu_id)) {
+					EAL_LOG(DEBUG, " L2 domain (%u) lcore %u", j, cpu_id);
+					topo_cnfg.l2[j]->cores[cpu_index] = cpu_id;
+					cpu_index++;
+
+					topo_cnfg.l2[j]->core_count += 1;
+					topo_cnfg.l2_core_count += 1;
+				}
+				cpu_id = hwloc_bitmap_next(obj->cpuset, cpu_id);
+				cpu_count -= 1;
+			} while ((cpu_id != -1) && (cpu_count));
+		}
+	}
+	if (l3_depth) {
+		topo_cnfg.l3_count = hwloc_get_nbobjs_by_depth(topo_cnfg.topology, l3_depth);
+		topo_cnfg.l3 = (struct core_domain_mapping **)
+				malloc(sizeof(struct core_domain_mapping *) * topo_cnfg.l3_count);
+		if (topo_cnfg.l3 == NULL) {
+			rte_eal_topology_release();
+			return -1;
+		}
+
+		for (int j = 0; j < topo_cnfg.l3_count; j++) {
+			hwloc_obj_t obj = hwloc_get_obj_by_depth(topo_cnfg.topology, l3_depth, j);
+			unsigned int first_cpu = hwloc_bitmap_first(obj->cpuset);
+			unsigned int cpu_count = hwloc_bitmap_weight(obj->cpuset);
+
+			topo_cnfg.l3[j] = (struct core_domain_mapping *)
+					malloc(sizeof(struct core_domain_mapping));
+			if (topo_cnfg.l3[j] == NULL) {
+				rte_eal_topology_release();
+				return -1;
+			}
+
+			topo_cnfg.l3[j]->core_count = 0;
+			topo_cnfg.l3[j]->cores = (uint16_t *)
+						malloc(sizeof(uint16_t) * cpu_count);
+			if (topo_cnfg.l3[j]->cores == NULL) {
+				rte_eal_topology_release();
+				return -1;
+			}
+
+			signed int cpu_id = first_cpu;
+			unsigned int cpu_index = 0;
+			do {
+				if (rte_lcore_is_enabled(cpu_id)) {
+					EAL_LOG(DEBUG, " L3 domain (%u) lcore %u", j, cpu_id);
+					topo_cnfg.l3[j]->cores[cpu_index] = cpu_id;
+					cpu_index++;
+
+					topo_cnfg.l3[j]->core_count += 1;
+					topo_cnfg.l3_core_count += 1;
+				}
+				cpu_id = hwloc_bitmap_next(obj->cpuset, cpu_id);
+				cpu_count -= 1;
+			} while ((cpu_id != -1) && (cpu_count));
+		}
+	}
+	if (io_depth) {
+		topo_cnfg.io_count = hwloc_get_nbobjs_by_depth(topo_cnfg.topology, io_depth);
+		topo_cnfg.io = (struct core_domain_mapping **)
+				malloc(sizeof(struct core_domain_mapping *) * topo_cnfg.io_count);
+		if (topo_cnfg.io == NULL) {
+			rte_eal_topology_release();
+			return -1;
+		}
+
+		for (int j = 0; j < topo_cnfg.io_count; j++) {
+			hwloc_obj_t obj = hwloc_get_obj_by_depth(topo_cnfg.topology, io_depth, j);
+			unsigned int first_cpu = hwloc_bitmap_first(obj->cpuset);
+			unsigned int cpu_count = hwloc_bitmap_weight(obj->cpuset);
+
+			topo_cnfg.io[j] = (struct core_domain_mapping *)
+					malloc(sizeof(struct core_domain_mapping));
+			if (topo_cnfg.io[j] == NULL) {
+				rte_eal_topology_release();
+				return -1;
+			}
+
+			topo_cnfg.io[j]->core_count = 0;
+			topo_cnfg.io[j]->cores = (uint16_t *)
+						malloc(sizeof(uint16_t) * cpu_count);
+			if (topo_cnfg.io[j]->cores == NULL) {
+				rte_eal_topology_release();
+				return -1;
+			}
+
+			signed int cpu_id = first_cpu;
+			unsigned int cpu_index = 0;
+			do {
+				if (rte_lcore_is_enabled(cpu_id)) {
+					EAL_LOG(DEBUG, " IO domain (%u) lcore %u", j, cpu_id);
+					topo_cnfg.io[j]->cores[cpu_index] = cpu_id;
+					cpu_index++;
+
+					topo_cnfg.io[j]->core_count += 1;
+					topo_cnfg.io_core_count += 1;
+				}
+				cpu_id = hwloc_bitmap_next(obj->cpuset, cpu_id);
+				cpu_count -= 1;
+			} while ((cpu_id != -1) && (cpu_count));
+		}
+	}
+
+	hwloc_topology_destroy(topo_cnfg.topology);
+	topo_cnfg.topology = NULL;
+
+	EAL_LOG(INFO, "Details of Topology:");
+	EAL_LOG(INFO, " - domain count: l1 %u, l2 %u, l3 %u, io %u",
+		topo_cnfg.l1_count, topo_cnfg.l2_count,
+		topo_cnfg.l3_count, topo_cnfg.io_count);
+	EAL_LOG(INFO, " - core count: l1 %u, l2 %u, l3 %u, io %u",
+		topo_cnfg.l1_core_count, topo_cnfg.l2_core_count,
+		topo_cnfg.l3_core_count, topo_cnfg.io_core_count);
+#endif
+
+	return 0;
+}
+
+/*
+ * release HWLOC topology structure memory
+ */
+int
+rte_eal_topology_release(void)
+{
+#ifdef RTE_EAL_HWLOC_TOPOLOGY_PROBE
+	EAL_LOG(DEBUG, "release l1 domain memory!");
+	for (int i = 0; i < topo_cnfg.l1_count; i++) {
+		if (topo_cnfg.l1[i]->cores) {
+			free(topo_cnfg.l1[i]->cores);
+			topo_cnfg.l1[i]->core_count = 0;
+		}
+	}
+
+	if (topo_cnfg.l1) {
+		free(topo_cnfg.l1);
+		topo_cnfg.l1 = NULL;
+	}
+	topo_cnfg.l1_count = 0;
+
+	EAL_LOG(DEBUG, "release l2 domain memory!");
+	for (int i = 0; i < topo_cnfg.l2_count; i++) {
+		if (topo_cnfg.l2[i]->cores) {
+			free(topo_cnfg.l2[i]->cores);
+			topo_cnfg.l2[i]->core_count = 0;
+		}
+	}
+
+	if (topo_cnfg.l2) {
+		free(topo_cnfg.l2);
+		topo_cnfg.l2 = NULL;
+	}
+	topo_cnfg.l2_count = 0;
+
+	EAL_LOG(DEBUG, "release l3 domain memory!");
+	for (int i = 0; i < topo_cnfg.l3_count; i++) {
+		if (topo_cnfg.l3[i]->cores) {
+			free(topo_cnfg.l3[i]->cores);
+			topo_cnfg.l3[i]->core_count = 0;
+		}
+	}
+
+	if (topo_cnfg.l3) {
+		free(topo_cnfg.l3);
+		topo_cnfg.l3 = NULL;
+	}
+	topo_cnfg.l3_count = 0;
+
+	EAL_LOG(DEBUG, "release IO domain memory!");
+	for (int i = 0; i < topo_cnfg.io_count; i++) {
+		if (topo_cnfg.io[i]->cores) {
+			free(topo_cnfg.io[i]->cores);
+			topo_cnfg.io[i]->core_count = 0;
+		}
+	}
+
+	if (topo_cnfg.io) {
+		free(topo_cnfg.io);
+		topo_cnfg.io = NULL;
+	}
+	topo_cnfg.io_count = 0;
+#endif
+
+	return 0;
+}
+
 /*
  * Parse /sys/devices/system/cpu to get the number of physical and logical
  * processors on the machine. The function will fill the cpu_info
diff --git a/lib/eal/common/eal_private.h b/lib/eal/common/eal_private.h
index bb315dab04..ed97f112ca 100644
--- a/lib/eal/common/eal_private.h
+++ b/lib/eal/common/eal_private.h
@@ -17,6 +17,10 @@
 
 #include "eal_internal_cfg.h"
 
+#ifdef RTE_EAL_HWLOC_TOPOLOGY_PROBE
+#include <hwloc.h>
+#endif
+
 /**
  * Structure storing internal configuration (per-lcore)
  */
@@ -40,6 +44,36 @@ struct lcore_config {
 
 extern struct lcore_config lcore_config[RTE_MAX_LCORE];
 
+struct core_domain_mapping {
+	uint16_t core_count;
+	uint16_t *cores;
+};
+
+struct topology_config {
+#ifdef RTE_EAL_HWLOC_TOPOLOGY_PROBE
+	hwloc_topology_t topology;
+#endif
+
+	/* domain count */
+	uint16_t l1_count;
+	uint16_t l2_count;
+	uint8_t l3_count;
+	uint8_t io_count;
+
+	/* total cores under all domain */
+	uint16_t l1_core_count;
+	uint16_t l2_core_count;
+	uint16_t l3_core_count;
+	uint16_t io_core_count;
+
+	/* two dimensional array for each domain */
+	struct core_domain_mapping **l1;
+	struct core_domain_mapping **l2;
+	struct core_domain_mapping **l3;
+	struct core_domain_mapping **io;
+};
+extern struct topology_config topo_cnfg;
+
 /**
  * The global RTE configuration structure.
  */
@@ -81,6 +115,20 @@ struct rte_config *rte_eal_get_configuration(void);
  */
 int rte_eal_memzone_init(void);
 
+
+/**
+ * Initialize the topology structure using HWLOC Library
+ */
+__rte_internal
+int rte_eal_topology_init(void);
+
+/**
+ * Release the memory held by Topology structure
+ */
+__rte_internal
+int rte_eal_topology_release(void);
+
+
 /**
  * Fill configuration with number of physical and logical processors
  *
diff --git a/lib/eal/freebsd/eal.c b/lib/eal/freebsd/eal.c
index 1229230063..301f993748 100644
--- a/lib/eal/freebsd/eal.c
+++ b/lib/eal/freebsd/eal.c
@@ -73,6 +73,8 @@ struct lcore_config lcore_config[RTE_MAX_LCORE];
 /* used by rte_rdtsc() */
 int rte_cycles_vmware_tsc_map;
 
+/* holds topology information */
+struct topology_config topo_cnfg;
 
 int
 eal_clean_runtime_dir(void)
@@ -912,6 +914,12 @@ rte_eal_init(int argc, char **argv)
 			return -1;
 	}
 
+	if (rte_eal_topology_init()) {
+		rte_eal_init_alert("Cannot initialize topology!!!");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
+
 	eal_mcfg_complete();
 
 	return fctret;
@@ -932,6 +940,8 @@ rte_eal_cleanup(void)
 
 	struct internal_config *internal_conf =
 		eal_get_internal_configuration();
+
+	rte_eal_topology_release();
 	rte_service_finalize();
 	rte_mp_channel_cleanup();
 	eal_bus_cleanup();
diff --git a/lib/eal/include/rte_lcore.h b/lib/eal/include/rte_lcore.h
index 7deae47af3..f6c3597656 100644
--- a/lib/eal/include/rte_lcore.h
+++ b/lib/eal/include/rte_lcore.h
@@ -18,6 +18,7 @@
 #include <rte_eal.h>
 #include <rte_launch.h>
 #include <rte_thread.h>
+#include <rte_bitset.h>
 
 #ifdef __cplusplus
 extern "C" {
@@ -37,6 +38,39 @@ enum rte_lcore_role_t {
 	ROLE_NON_EAL,
 };
 
+/**
+ * The lcore grouping within the L1 Domain.
+ */
+#define RTE_LCORE_DOMAIN_L1  RTE_BIT32(0)
+/**
+ * The lcore grouping within the L2 Domain.
+ */
+#define RTE_LCORE_DOMAIN_L2  RTE_BIT32(1)
+/**
+ * The lcore grouping within the L3 Domain.
+ */
+#define RTE_LCORE_DOMAIN_L3  RTE_BIT32(2)
+/**
+ * The lcore grouping within the IO Domain.
+ */
+#define RTE_LCORE_DOMAIN_IO  RTE_BIT32(3)
+/**
+ * The lcore grouping within the SMT Domain (same as the L1 Domain).
+ */
+#define RTE_LCORE_DOMAIN_SMT RTE_LCORE_DOMAIN_L1
+/**
+ * The lcore grouping based on Domains (L1|L2|L3|IO).
+ */
+#define RTE_LCORE_DOMAIN_ALL (RTE_LCORE_DOMAIN_L1 |     \
+				RTE_LCORE_DOMAIN_L2 |   \
+				RTE_LCORE_DOMAIN_L3 |   \
+				RTE_LCORE_DOMAIN_IO)
+/**
+ * The mask for getting all cores under same topology.
+ */
+#define RTE_LCORE_DOMAIN_LCORES_ALL RTE_GENMASK32(31, 0)
+
+
 /**
  * Get a lcore's role.
  *
@@ -211,6 +245,108 @@ int rte_lcore_is_enabled(unsigned int lcore_id);
  */
 unsigned int rte_get_next_lcore(unsigned int i, int skip_main, int wrap);
 
+/**
+ * Get the count of domains for the selected topology.
+ *
+ * @param domain_sel
+ *   Domain selection, RTE_LCORE_DOMAIN_[L1|L2|L3|IO].
+ * @return
+ *   Total domain count for the selected topology.
+ *
+ * @note valid for EAL args of lcore and coremask.
+ *
+ */
+__rte_experimental
+unsigned int rte_get_domain_count(unsigned int domain_sel);
+
+/**
+ * Get the count of lcores for a domain.
+ *
+ * @param domain_sel
+ *   Domain selection, RTE_LCORE_DOMAIN_[L1|L2|L3|IO].
+ * @param domain_indx
+ *   Domain Index, valid range from 0 to (rte_get_domain_count - 1).
+ * @return
+ *   Total lcore count for the selected index of a domain.
+ *
+ * @note valid for EAL args of lcore and coremask.
+ *
+ */
+__rte_experimental
+unsigned int
+rte_lcore_count_from_domain(unsigned int domain_sel, unsigned int domain_indx);
+
+/**
+ * Get n'th lcore from a selected domain.
+ *
+ * @param domain_sel
+ *   Domain selection, RTE_LCORE_DOMAIN_[L1|L2|L3|IO].
+ * @param domain_indx
+ *   Domain Index, valid range from 0 to (rte_get_domain_count - 1).
+ * @param lcore_pos
+ *   lcore position, valid range from 0 to (dpdk_enabled_lcores in the domain -1)
+ * @return
+ *   lcore from the list for the selected domain.
+ *
+ * @note valid for EAL args of lcore and coremask.
+ *
+ */
+__rte_experimental
+unsigned int
+rte_get_lcore_in_domain(unsigned int domain_sel,
+unsigned int domain_indx, unsigned int lcore_pos);
+
+/**
+ * Get the next enabled lcore within the same domain, based on the domain flag.
+ *
+ * @param i
+ *   The current lcore (reference).
+ * @param skip_main
+ *   If true, do not return the ID of the main lcore.
+ * @param wrap
+ *   If true, go back to first core of flag based domain when last core is reached.
+ *   If false, return RTE_MAX_LCORE when no more cores are available.
+ * @param flag
+ *   Allows user to select various domain as specified under RTE_LCORE_DOMAIN_[L1|L2|L3|IO]
+ *
+ * @return
+ *   The next lcore_id or RTE_MAX_LCORE if not found.
+ *
+ * @note valid for EAL args of lcore and coremask.
+ *
+ */
+__rte_experimental
+unsigned int
+rte_get_next_lcore_from_domain(unsigned int i, int skip_main, int wrap,
+uint32_t flag);
+
+/**
+ * Get the Nth (from first or last) lcore from the next domain, based on the domain flag.
+ *
+ * @param i
+ *   The current lcore (reference).
+ * @param skip_main
+ *   If true, do not return the ID of the main lcore.
+ * @param wrap
+ *   If true, go back to first core of flag based domain when last core is reached.
+ *   If false, return RTE_MAX_LCORE when no more cores are available.
+ * @param flag
+ *   Allows user to select various domain as specified under RTE_LCORE_DOMAIN_(L1|L2|L3|IO)
+ * @param cores_to_skip
+ *   If set to positive value, will skip to Nth lcore from start.
+ *   If set to negative value, will skip to the Nth lcore from the last.
+ *
+ * @return
+ *   The next lcore_id or RTE_MAX_LCORE if not found.
+ *
+ * @note valid for EAL args of lcore and coremask.
+ *
+ */
+__rte_experimental
+unsigned int
+rte_get_next_lcore_from_next_domain(unsigned int i,
+int skip_main, int wrap, uint32_t flag, int cores_to_skip);
+
 /**
  * Macro to browse all running lcores.
  */
@@ -227,6 +363,38 @@ unsigned int rte_get_next_lcore(unsigned int i, int skip_main, int wrap);
 	     i < RTE_MAX_LCORE;						\
 	     i = rte_get_next_lcore(i, 1, 0))
 
+/**
+ * Macro to browse all running lcores in a domain.
+ */
+#define RTE_LCORE_FOREACH_DOMAIN(i, flag)				\
+	for (i = rte_get_next_lcore_from_domain(-1, 0, 0, flag);	\
+	     i < RTE_MAX_LCORE;						\
+	     i = rte_get_next_lcore_from_domain(i, 0, 0, flag))
+
+/**
+ * Macro to browse all running lcores except the main lcore in domain.
+ */
+#define RTE_LCORE_FOREACH_WORKER_DOMAIN(i, flag)				\
+	for (i = rte_get_next_lcore_from_domain(-1, 1, 0, flag);	\
+	     i < RTE_MAX_LCORE;						\
+	     i = rte_get_next_lcore_from_domain(i, 1, 0, flag))
+
+/**
+ * Macro to browse Nth lcores on each domain.
+ */
+#define RTE_LCORE_FORN_NEXT_DOMAIN(i, flag, n)				\
+	for (i = rte_get_next_lcore_from_next_domain(-1, 0, 0, flag, n);\
+	     i < RTE_MAX_LCORE;						\
+	     i = rte_get_next_lcore_from_next_domain(i, 0, 0, flag, n))
+
+/**
+ * Macro to browse all Nth lcores except the main lcore on each domain.
+ */
+#define RTE_LCORE_FORN_WORKER_NEXT_DOMAIN(i, flag, n)			\
+	for (i = rte_get_next_lcore_from_next_domain(-1, 1, 0, flag, n);\
+	     i < RTE_MAX_LCORE;						\
+	     i = rte_get_next_lcore_from_next_domain(i, 1, 0, flag, n))
+
 /**
  * Callback prototype for initializing lcores.
  *
diff --git a/lib/eal/linux/eal.c b/lib/eal/linux/eal.c
index 54577b7718..093f208319 100644
--- a/lib/eal/linux/eal.c
+++ b/lib/eal/linux/eal.c
@@ -65,6 +65,9 @@
  * duration of the program, as we hold a write lock on it in the primary proc */
 static int mem_cfg_fd = -1;
 
+/* holds topology information */
+struct topology_config topo_cnfg;
+
 static struct flock wr_lock = {
 		.l_type = F_WRLCK,
 		.l_whence = SEEK_SET,
@@ -1311,6 +1314,12 @@ rte_eal_init(int argc, char **argv)
 			return -1;
 	}
 
+	if (rte_eal_topology_init()) {
+		rte_eal_init_alert("Cannot initialize topology!!!");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
+
 	eal_mcfg_complete();
 
 	return fctret;
@@ -1352,6 +1361,8 @@ rte_eal_cleanup(void)
 	struct internal_config *internal_conf =
 		eal_get_internal_configuration();
 
+	rte_eal_topology_release();
+
 	if (rte_eal_process_type() == RTE_PROC_PRIMARY &&
 			internal_conf->hugepage_file.unlink_existing)
 		rte_memseg_walk(mark_freeable, NULL);
diff --git a/lib/eal/meson.build b/lib/eal/meson.build
index e1d6c4cf17..690b95d5df 100644
--- a/lib/eal/meson.build
+++ b/lib/eal/meson.build
@@ -31,3 +31,7 @@ endif
 if is_freebsd
     annotate_locks = false
 endif
+
+if has_libhwloc
+    dpdk_conf.set10('RTE_EAL_HWLOC_TOPOLOGY_PROBE', true)
+endif
diff --git a/lib/eal/version.map b/lib/eal/version.map
index f493cd1ca7..6c5b3ad205 100644
--- a/lib/eal/version.map
+++ b/lib/eal/version.map
@@ -399,6 +399,13 @@ EXPERIMENTAL {
 
 	# added in 24.11
 	rte_bitset_to_str;
+
+	# added in 25.03
+	rte_get_domain_count;
+	rte_lcore_count_from_domain;
+	rte_get_lcore_in_domain;
+	rte_get_next_lcore_from_domain;
+	rte_get_next_lcore_from_next_domain;
 };
 
 INTERNAL {
@@ -408,6 +415,8 @@ INTERNAL {
 	rte_bus_unregister;
 	rte_eal_get_baseaddr;
 	rte_eal_parse_coremask;
+	rte_eal_topology_init;
+	rte_eal_topology_release;
 	rte_firmware_read;
 	rte_intr_allow_others;
 	rte_intr_cap_multiple;
diff --git a/lib/eal/windows/eal.c b/lib/eal/windows/eal.c
index 28b78a95a6..2edfc4128c 100644
--- a/lib/eal/windows/eal.c
+++ b/lib/eal/windows/eal.c
@@ -40,6 +40,10 @@ static int mem_cfg_fd = -1;
 /* internal configuration (per-core) */
 struct lcore_config lcore_config[RTE_MAX_LCORE];
 
+/* holds topology information */
+struct topology_config topo_cnfg;
+
+
 /* Detect if we are a primary or a secondary process */
 enum rte_proc_type_t
 eal_proc_type_detect(void)
@@ -262,6 +266,8 @@ rte_eal_cleanup(void)
 	struct internal_config *internal_conf =
 		eal_get_internal_configuration();
 
+	rte_eal_topology_release();
+
 	eal_intr_thread_cancel();
 	eal_mem_virt2iova_cleanup();
 	eal_bus_cleanup();
@@ -505,6 +511,12 @@ rte_eal_init(int argc, char **argv)
 	rte_eal_mp_remote_launch(sync_func, NULL, SKIP_MAIN);
 	rte_eal_mp_wait_lcore();
 
+	if (rte_eal_topology_init()) {
+		rte_eal_init_alert("Cannot initialize topology!!!");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
+
 	eal_mcfg_complete();
 
 	return fctret;
-- 
2.34.1



* [RFC v3 2/3] test/lcore: enable tests for topology
  2024-10-30  5:41 [RFC v3 0/3] Introduce Topology NUMA grouping for lcores Vipin Varghese
  2024-10-30  5:41 ` [RFC v3 1/3] eal/lcore: add topology based functions Vipin Varghese
@ 2024-10-30  5:41 ` Vipin Varghese
  2024-10-30 11:50   ` [EXTERNAL] " Pavan Nikhilesh Bhagavatula
  2024-10-30  5:41 ` [RFC v3 3/3] examples: add lcore topology API calls Vipin Varghese
  2 siblings, 1 reply; 28+ messages in thread
From: Vipin Varghese @ 2024-10-30  5:41 UTC (permalink / raw)
  To: dev, roretzla, bruce.richardson, john.mcnamara, dmitry.kozliuk, jerinj
  Cc: ruifeng.wang, mattias.ronnblom, anatoly.burakov, stephen,
	ferruh.yigit, honnappa.nagarahalli, wathsala.vithanage,
	konstantin.ananyev

Add functional test cases to validate the topology-based lcore
API.

v3 changes:
 - fix test failure reported in test-report/2024-October/817748.html
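
The new cases run under the existing lcores test command, e.g.
(build path illustrative):

  echo lcores_autotest | ./build/app/test/dpdk-test --no-huge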

Signed-off-by: Vipin Varghese <vipin.varghese@amd.com>
---
 app/test/test_lcores.c | 189 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 189 insertions(+)

diff --git a/app/test/test_lcores.c b/app/test/test_lcores.c
index bd5c0dd94b..a40763d890 100644
--- a/app/test/test_lcores.c
+++ b/app/test/test_lcores.c
@@ -389,6 +389,186 @@ test_ctrl_thread(void)
 	return 0;
 }
 
+static int
+test_topology_macro(void)
+{
+	unsigned int total_lcores = 0;
+	unsigned int total_wrkr_lcores = 0;
+
+	unsigned int total_lcore_io = 0;
+	unsigned int total_lcore_l3 = 0;
+	unsigned int total_lcore_l2 = 0;
+	unsigned int total_lcore_l1 = 0;
+
+	unsigned int total_wrkr_lcore_io = 0;
+	unsigned int total_wrkr_lcore_l3 = 0;
+	unsigned int total_wrkr_lcore_l2 = 0;
+	unsigned int total_wrkr_lcore_l1 = 0;
+
+	unsigned int lcore;
+
+	/* get topology core count */
+	lcore = -1;
+	RTE_LCORE_FOREACH(lcore)
+		total_lcores += 1;
+
+	lcore = -1;
+	RTE_LCORE_FOREACH_DOMAIN(lcore, RTE_LCORE_DOMAIN_IO)
+		total_lcore_io += 1;
+
+	lcore = -1;
+	RTE_LCORE_FOREACH_DOMAIN(lcore, RTE_LCORE_DOMAIN_L3)
+		total_lcore_l3 += 1;
+
+	lcore = -1;
+	RTE_LCORE_FOREACH_DOMAIN(lcore, RTE_LCORE_DOMAIN_L2)
+		total_lcore_l2 += 1;
+
+	lcore = -1;
+	RTE_LCORE_FOREACH_DOMAIN(lcore, RTE_LCORE_DOMAIN_L1)
+		total_lcore_l1 += 1;
+
+	printf("INFO: lcore count topology: none (%u), io (%u), l3 (%u), l2 (%u), l1 (%u).\n",
+		total_lcores, total_lcore_io,
+		total_lcore_l3, total_lcore_l2,
+		total_lcore_l1);
+
+	lcore = -1;
+	RTE_LCORE_FOREACH_WORKER(lcore)
+		total_wrkr_lcores += 1;
+
+	lcore = -1;
+	RTE_LCORE_FOREACH_WORKER_DOMAIN(lcore, RTE_LCORE_DOMAIN_IO)
+		total_wrkr_lcore_io += 1;
+
+	lcore = -1;
+	RTE_LCORE_FOREACH_WORKER_DOMAIN(lcore, RTE_LCORE_DOMAIN_L3)
+		total_wrkr_lcore_l3 += 1;
+
+	lcore = -1;
+	RTE_LCORE_FOREACH_WORKER_DOMAIN(lcore, RTE_LCORE_DOMAIN_L2)
+		total_wrkr_lcore_l2 += 1;
+
+	lcore = -1;
+	RTE_LCORE_FOREACH_WORKER_DOMAIN(lcore, RTE_LCORE_DOMAIN_L1)
+		total_wrkr_lcore_l1 += 1;
+
+	printf("INFO: worker lcore count topology: none (%u), io (%u), l3 (%u), l2 (%u), l1 (%u).\n",
+		total_wrkr_lcores, total_wrkr_lcore_io,
+		total_wrkr_lcore_l3, total_wrkr_lcore_l2,
+		total_wrkr_lcore_l1);
+
+	if ((total_wrkr_lcores + 1) != total_lcores) {
+		printf("ERR: failed in MACRO for RTE_LCORE_FOREACH\n");
+		return -2;
+	}
+
+	if ((total_wrkr_lcore_io) > total_lcore_io) {
+		printf("ERR: failed in MACRO for RTE_LCORE_FOREACH_DOMAIN for IO\n");
+		return -2;
+	}
+
+	if ((total_wrkr_lcore_l3) > total_lcore_l3) {
+		printf("ERR: failed in MACRO for RTE_LCORE_FOREACH_DOMAIN for L3\n");
+		return -2;
+	}
+
+	if ((total_wrkr_lcore_l2) > total_lcore_l2) {
+		printf("ERR: failed in MACRO for RTE_LCORE_FOREACH_DOMAIN for L2\n");
+		return -2;
+	}
+
+	if ((total_wrkr_lcore_l1) > total_lcore_l1) {
+		printf("ERR: failed in MACRO for RTE_LCORE_FOREACH_DOMAIN for L1\n");
+		return -2;
+	}
+
+	printf("INFO: lcore DOMAIN macro: success!\n");
+
+	return 0;
+}
+
+static int
+test_lcore_count_from_domain(void)
+{
+	unsigned int total_lcores = 0;
+	unsigned int total_lcore_io = 0;
+	unsigned int total_lcore_l3 = 0;
+	unsigned int total_lcore_l2 = 0;
+	unsigned int total_lcore_l1 = 0;
+
+	unsigned int domain_count;
+	unsigned int i;
+
+	/* get topology core count */
+	total_lcores = rte_lcore_count();
+
+	domain_count = rte_get_domain_count(RTE_LCORE_DOMAIN_IO);
+	for (i = 0; i < domain_count; i++)
+		total_lcore_io += rte_lcore_count_from_domain(RTE_LCORE_DOMAIN_IO, i);
+
+	domain_count = rte_get_domain_count(RTE_LCORE_DOMAIN_L3);
+	for (i = 0; i < domain_count; i++)
+		total_lcore_l3 += rte_lcore_count_from_domain(RTE_LCORE_DOMAIN_L3, i);
+
+	domain_count = rte_get_domain_count(RTE_LCORE_DOMAIN_L2);
+	for (i = 0; i < domain_count; i++)
+		total_lcore_l2 += rte_lcore_count_from_domain(RTE_LCORE_DOMAIN_L2, i);
+
+	domain_count = rte_get_domain_count(RTE_LCORE_DOMAIN_L1);
+	for (i = 0; i < domain_count; i++)
+		total_lcore_l1 += rte_lcore_count_from_domain(RTE_LCORE_DOMAIN_L1, i);
+
+	printf("INFO: lcore count: none (%u), io (%u), l3 (%u), l2 (%u), l1 (%u).\n",
+		total_lcores, total_lcore_io, total_lcore_l3, total_lcore_l2, total_lcore_l1);
+
+	if ((total_lcores != total_lcore_l1) ||
+		(total_lcores != total_lcore_l2) ||
+		(total_lcores != total_lcore_l3) ||
+		(total_lcores != total_lcore_io)) {
+		printf("ERR: failed in domain API\n");
+		return -2;
+	}
+
+	printf("INFO: lcore count domain API: success\n");
+
+	return 0;
+}
+
+static int
+test_lcore_from_domain_negative(void)
+{
+	unsigned int domain_count;
+
+	domain_count = rte_get_domain_count(RTE_LCORE_DOMAIN_IO);
+	if (rte_lcore_count_from_domain(RTE_LCORE_DOMAIN_IO, domain_count)) {
+		printf("ERR: domain API inconsistent for IO\n");
+		return -1;
+	}
+
+	domain_count = rte_get_domain_count(RTE_LCORE_DOMAIN_L3);
+	if (rte_lcore_count_from_domain(RTE_LCORE_DOMAIN_L3, domain_count)) {
+		printf("ERR: domain API inconsistent for L3\n");
+		return -1;
+	}
+
+	domain_count = rte_get_domain_count(RTE_LCORE_DOMAIN_L2);
+	if (rte_lcore_count_from_domain(RTE_LCORE_DOMAIN_L2, domain_count)) {
+		printf("ERR: domain API inconsistent for L2\n");
+		return -1;
+	}
+
+	domain_count = rte_get_domain_count(RTE_LCORE_DOMAIN_L1);
+	if (rte_lcore_count_from_domain(RTE_LCORE_DOMAIN_L1, domain_count)) {
+		printf("ERR: domain API inconsistent for L1\n");
+		return -1;
+	}
+
+	printf("INFO: lcore domain API: success!\n");
+
+	return 0;
+}
+
 static int
 test_lcores(void)
 {
@@ -419,6 +599,15 @@ test_lcores(void)
 	if (test_ctrl_thread() < 0)
 		return TEST_FAILED;
 
+	if (test_topology_macro() < 0)
+		return TEST_FAILED;
+
+	if (test_lcore_count_from_domain() < 0)
+		return TEST_FAILED;
+
+	if (test_lcore_from_domain_negative() < 0)
+		return TEST_FAILED;
+
 	return TEST_SUCCESS;
 }
 
-- 
2.34.1



* [RFC v3 3/3] examples: add lcore topology API calls
  2024-10-30  5:41 [RFC v3 0/3] Introduce Topology NUMA grouping for lcores Vipin Varghese
  2024-10-30  5:41 ` [RFC v3 1/3] eal/lcore: add topology based functions Vipin Varghese
  2024-10-30  5:41 ` [RFC v3 2/3] test/lcore: enable tests for topology Vipin Varghese
@ 2024-10-30  5:41 ` Vipin Varghese
  2024-10-30 11:49   ` [EXTERNAL] " Pavan Nikhilesh Bhagavatula
  2 siblings, 1 reply; 28+ messages in thread
From: Vipin Varghese @ 2024-10-30  5:41 UTC (permalink / raw)
  To: dev, roretzla, bruce.richardson, john.mcnamara, dmitry.kozliuk, jerinj
  Cc: ruifeng.wang, mattias.ronnblom, anatoly.burakov, stephen,
	ferruh.yigit, honnappa.nagarahalli, wathsala.vithanage,
	konstantin.ananyev

Enhance the example code to use the topology-based lcore API, while
retaining the default behaviour.

 - helloworld: allow lcores to send hello to lcores in the selected topology.
 - l2fwd: allow use of the IO lcore topology.
 - skeleton: choose the lcore from the IO domain which has the most ports.
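
For instance, a hypothetical invocation of the updated helloworld, where
-T 2 selects the L3 domain (the lcore list is illustrative):

  ./build/examples/dpdk-helloworld -l 0-7 -- -T 2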

v3 changes:
 - fix typo from SE_NO_TOPOLOGY to USE_NO_TOPOLOGY

Signed-off-by: Vipin Varghese <vipin.varghese@amd.com>
---
 examples/helloworld/main.c   | 142 ++++++++++++++++++++++++++++++++++-
 examples/l2fwd/main.c        |  56 ++++++++++++--
 examples/skeleton/basicfwd.c |  22 ++++++
 3 files changed, 210 insertions(+), 10 deletions(-)

diff --git a/examples/helloworld/main.c b/examples/helloworld/main.c
index af509138da..9845c3775c 100644
--- a/examples/helloworld/main.c
+++ b/examples/helloworld/main.c
@@ -7,6 +7,7 @@
 #include <stdint.h>
 #include <errno.h>
 #include <sys/queue.h>
+#include <getopt.h>
 
 #include <rte_memory.h>
 #include <rte_launch.h>
@@ -14,6 +15,14 @@
 #include <rte_per_lcore.h>
 #include <rte_lcore.h>
 #include <rte_debug.h>
+#include <rte_log.h>
+
+#define RTE_LOGTYPE_HELLOWORLD RTE_LOGTYPE_USER1
+#define USE_NO_TOPOLOGY 0xffff
+
+static uint16_t topo_sel = USE_NO_TOPOLOGY;
+/* lcore selector based on Topology */
+static const char short_options[] = "T:";
 
 /* Launch a function on lcore. 8< */
 static int
@@ -21,11 +30,109 @@ lcore_hello(__rte_unused void *arg)
 {
 	unsigned lcore_id;
 	lcore_id = rte_lcore_id();
+
 	printf("hello from core %u\n", lcore_id);
 	return 0;
 }
+
+static int
+send_lcore_hello(__rte_unused void *arg)
+{
+	unsigned int lcore_id;
+	uint16_t send_lcore_id;
+	uint16_t send_count = 0;
+
+	lcore_id = rte_lcore_id();
+
+	send_lcore_id = rte_get_next_lcore_from_domain(lcore_id, false, true, topo_sel);
+
+	while ((send_lcore_id != RTE_MAX_LCORE) && (lcore_id != send_lcore_id)) {
+		printf("hello from core %u to core %u\n", lcore_id, send_lcore_id);
+		send_lcore_id = rte_get_next_lcore_from_domain(send_lcore_id,
+				false, true, topo_sel);
+		send_count += 1;
+	}
+
+	if (send_count == 0)
+		RTE_LOG(INFO, HELLOWORLD, "for lcore %u; no lcores in same domain!!!\n", lcore_id);
+
+	return 0;
+}
 /* >8 End of launching function on lcore. */
 
+/* display usage. 8< */
+static void
+helloworld_usage(const char *prgname)
+{
+	printf("%s [EAL options] -- [-T TOPO]\n"
+		"  -T TOPO: choose topology to send hello to\n"
+		"	- 0: send cores sharing L1 (SMT)\n"
+		"	- 1: send cores sharing L2\n"
+		"	- 2: send cores sharing L3\n"
+		"	- 3: send cores sharing IO\n\n",
+		prgname);
+}
+
+static unsigned int
+parse_topology(const char *q_arg)
+{
+	char *end = NULL;
+	unsigned long n;
+
+	/* parse the topology option */
+	n = strtoul(q_arg, &end, 10);
+
+	if ((q_arg[0] == '\0') || (end == NULL) || (*end != '\0'))
+		return USE_NO_TOPOLOGY;
+
+	if (n > 3)
+		return USE_NO_TOPOLOGY;
+
+	n = (n == 0) ? RTE_LCORE_DOMAIN_L1 :
+		(n == 1) ? RTE_LCORE_DOMAIN_L2 :
+		(n == 2) ? RTE_LCORE_DOMAIN_L3 :
+		RTE_LCORE_DOMAIN_IO;
+
+	return n;
+}
+
+/* Parse the argument given in the command line of the application */
+static int
+helloworld_parse_args(int argc, char **argv)
+{
+	int opt, ret;
+	char **argvopt = argv;
+	int option_index;
+	char *prgname = argv[0];
+	while ((opt = getopt_long(argc, argvopt, short_options,
+				NULL, &option_index)) != EOF) {
+		switch (opt) {
+		/* Topology selection */
+		case 'T':
+			topo_sel = parse_topology(optarg);
+			if (topo_sel == USE_NO_TOPOLOGY) {
+				helloworld_usage(prgname);
+				rte_exit(EXIT_FAILURE, "Invalid Topology selection\n");
+			}
+
+			RTE_LOG(DEBUG, HELLOWORLD, "USR selects (%s) domain cores!\n",
+				(topo_sel == RTE_LCORE_DOMAIN_L1) ? "L1" :
+				(topo_sel == RTE_LCORE_DOMAIN_L2) ? "L2" :
+				(topo_sel == RTE_LCORE_DOMAIN_L3) ? "L3" : "IO");
+			ret = 0;
+			break;
+		default:
+			helloworld_usage(prgname);
+			return -1;
+		}
+	}
+	if (optind >= 0)
+		argv[optind-1] = prgname;
+	ret = optind-1;
+	optind = 1; /* reset getopt lib */
+	return ret;
+}
+
 /* Initialization of Environment Abstraction Layer (EAL). 8< */
 int
 main(int argc, char **argv)
@@ -38,15 +145,46 @@ main(int argc, char **argv)
 		rte_panic("Cannot init EAL\n");
 	/* >8 End of initialization of Environment Abstraction Layer */
 
+	argc -= ret;
+	argv += ret;
+
+	ret = helloworld_parse_args(argc, argv);
+	if (ret < 0)
+		rte_exit(EXIT_FAILURE, "Invalid arguments\n");
+
+	if (topo_sel != USE_NO_TOPOLOGY) {
+		uint16_t domain_count = rte_get_domain_count(topo_sel);
+		RTE_LOG(DEBUG, HELLOWORLD, "selected Domain (%s)\n",
+			(topo_sel == RTE_LCORE_DOMAIN_L1) ? "L1" :
+			(topo_sel == RTE_LCORE_DOMAIN_L2) ? "L2" :
+			(topo_sel == RTE_LCORE_DOMAIN_L3) ? "L3" : "IO");
+
+		for (int i = 0; i < domain_count; i++) {
+			uint16_t domain_lcore_count = rte_lcore_count_from_domain(topo_sel, i);
+			uint16_t domain_lcore = rte_get_lcore_in_domain(topo_sel, i, 0);
+
+			if (domain_lcore_count)
+				RTE_LOG(DEBUG, HELLOWORLD, "at index (%u), %u cores, lcore (%u) at index 0\n",
+					i,
+					domain_lcore_count,
+					domain_lcore);
+		}
+	}
+
 	/* Launches the function on each lcore. 8< */
 	RTE_LCORE_FOREACH_WORKER(lcore_id) {
 		/* Simpler equivalent. 8< */
-		rte_eal_remote_launch(lcore_hello, NULL, lcore_id);
+		rte_eal_remote_launch((topo_sel == USE_NO_TOPOLOGY) ?
+					lcore_hello : send_lcore_hello, NULL, lcore_id);
 		/* >8 End of simpler equivalent. */
 	}
 
 	/* call it on main lcore too */
-	lcore_hello(NULL);
+	if (topo_sel == USE_NO_TOPOLOGY)
+		lcore_hello(NULL);
+	else
+		send_lcore_hello(NULL);
+
 	/* >8 End of launching the function on each lcore. */
 
 	rte_eal_mp_wait_lcore();
diff --git a/examples/l2fwd/main.c b/examples/l2fwd/main.c
index c6fafdd019..398dd15502 100644
--- a/examples/l2fwd/main.c
+++ b/examples/l2fwd/main.c
@@ -46,6 +46,9 @@ static int mac_updating = 1;
 /* Ports set in promiscuous mode off by default. */
 static int promiscuous_on;
 
+/* select lcores based on ports numa (RTE_LCORE_DOMAIN_IO). */
+static bool select_port_from_io_domain;
+
 #define RTE_LOGTYPE_L2FWD RTE_LOGTYPE_USER1
 
 #define MAX_PKT_BURST 32
@@ -314,6 +317,7 @@ l2fwd_usage(const char *prgname)
 	       "  -P : Enable promiscuous mode\n"
 	       "  -q NQ: number of queue (=ports) per lcore (default is 1)\n"
 	       "  -T PERIOD: statistics will be refreshed each PERIOD seconds (0 to disable, 10 default, 86400 maximum)\n"
+	       "  -t : Enable IO domain lcores mapping to Ports\n"
 	       "  --no-mac-updating: Disable MAC addresses updating (enabled by default)\n"
 	       "      When enabled:\n"
 	       "       - The source MAC address is replaced by the TX port MAC address\n"
@@ -431,6 +435,7 @@ static const char short_options[] =
 	"P"   /* promiscuous */
 	"q:"  /* number of queues */
 	"T:"  /* timer period */
+	"t"  /* lcore from port io numa */
 	;
 
 #define CMD_LINE_OPT_NO_MAC_UPDATING "no-mac-updating"
@@ -502,6 +507,11 @@ l2fwd_parse_args(int argc, char **argv)
 			timer_period = timer_secs;
 			break;
 
+		/* lcores from port io numa */
+		case 't':
+			select_port_from_io_domain = true;
+			break;
+
 		/* long options */
 		case CMD_LINE_OPT_PORTMAP_NUM:
 			ret = l2fwd_parse_port_pair_config(optarg);
@@ -654,7 +664,7 @@ main(int argc, char **argv)
 	uint16_t nb_ports;
 	uint16_t nb_ports_available = 0;
 	uint16_t portid, last_port;
-	unsigned lcore_id, rx_lcore_id;
+	uint16_t lcore_id, rx_lcore_id;
 	unsigned nb_ports_in_mask = 0;
 	unsigned int nb_lcores = 0;
 	unsigned int nb_mbufs;
@@ -738,18 +748,48 @@ main(int argc, char **argv)
 	qconf = NULL;
 
 	/* Initialize the port/queue configuration of each logical core */
+	if (rte_get_domain_count(RTE_LCORE_DOMAIN_IO) == 0)
+		rte_exit(EXIT_FAILURE, "we do not have enough cores in IO numa!\n");
+
+	uint16_t coreindx_io_domain[RTE_MAX_ETHPORTS] = {0};
+	uint16_t lcore_io_domain[RTE_MAX_ETHPORTS] = {RTE_MAX_LCORE};
+	uint16_t io_domain_count = rte_get_domain_count(RTE_LCORE_DOMAIN_IO);
+
+	for (int i = 0; i < io_domain_count; i++)
+		lcore_io_domain[i] = rte_get_lcore_in_domain(RTE_LCORE_DOMAIN_IO, i, 0);
+
 	RTE_ETH_FOREACH_DEV(portid) {
 		/* skip ports that are not enabled */
 		if ((l2fwd_enabled_port_mask & (1 << portid)) == 0)
 			continue;
 
-		/* get the lcore_id for this port */
-		while (rte_lcore_is_enabled(rx_lcore_id) == 0 ||
-		       lcore_queue_conf[rx_lcore_id].n_rx_port ==
-		       l2fwd_rx_queue_per_lcore) {
-			rx_lcore_id++;
-			if (rx_lcore_id >= RTE_MAX_LCORE)
-				rte_exit(EXIT_FAILURE, "Not enough cores\n");
+		/* get IO NUMA for the port */
+		int port_socket = rte_eth_dev_socket_id(portid);
+
+		if (select_port_from_io_domain == false) {
+			/* get the lcore_id for this port */
+			while ((rte_lcore_is_enabled(rx_lcore_id) == 0) ||
+			       (lcore_queue_conf[rx_lcore_id].n_rx_port ==
+				l2fwd_rx_queue_per_lcore)) {
+				rx_lcore_id++;
+				if (rx_lcore_id >= RTE_MAX_LCORE)
+					rte_exit(EXIT_FAILURE, "Not enough cores\n");
+			}
+		} else {
+			/* get lcore from IO numa for this port */
+			rx_lcore_id = lcore_io_domain[port_socket];
+
+			if (lcore_queue_conf[rx_lcore_id].n_rx_port == l2fwd_rx_queue_per_lcore) {
+				coreindx_io_domain[port_socket] += 1;
+				rx_lcore_id = rte_get_lcore_in_domain(RTE_LCORE_DOMAIN_IO,
+						port_socket, coreindx_io_domain[port_socket]);
+			}
+
+			if (rx_lcore_id == RTE_MAX_LCORE)
+				rte_exit(EXIT_FAILURE, "unable to find IO (%u) NUMA lcore for port (%u)\n",
+					 port_socket, portid);
+
+			lcore_io_domain[port_socket] = rx_lcore_id;
 		}
 
 		if (qconf != &lcore_queue_conf[rx_lcore_id]) {
diff --git a/examples/skeleton/basicfwd.c b/examples/skeleton/basicfwd.c
index 133293cf15..6d3786b33f 100644
--- a/examples/skeleton/basicfwd.c
+++ b/examples/skeleton/basicfwd.c
@@ -176,6 +176,11 @@ main(int argc, char *argv[])
 	unsigned nb_ports;
 	uint16_t portid;
 
+	uint16_t ports_socket_domain[RTE_MAX_ETHPORTS] = {0};
+	uint16_t sel_io_socket = 0;
+	uint16_t sel_io_indx = 0;
+	uint16_t core_count_from_io = 0;
+
 	/* Initializion the Environment Abstraction Layer (EAL). 8< */
 	int ret = rte_eal_init(argc, argv);
 	if (ret < 0)
@@ -190,6 +195,20 @@ main(int argc, char *argv[])
 	if (nb_ports < 2 || (nb_ports & 1))
 		rte_exit(EXIT_FAILURE, "Error: number of ports must be even\n");
 
+	/* get the socket of each port */
+	RTE_ETH_FOREACH_DEV(portid) {
+		ports_socket_domain[rte_eth_dev_socket_id(portid)] += 1;
+
+		if (ports_socket_domain[rte_eth_dev_socket_id(portid)] > sel_io_socket) {
+			sel_io_socket = ports_socket_domain[rte_eth_dev_socket_id(portid)];
+			sel_io_indx = rte_eth_dev_socket_id(portid);
+		}
+	}
+
+	core_count_from_io = rte_lcore_count_from_domain(RTE_LCORE_DOMAIN_IO, sel_io_indx);
+	if (core_count_from_io == 0)
+		printf("\nWARNING: select main_lcore from IO domain (%u)\n", sel_io_indx);
+
 	/* Creates a new mempool in memory to hold the mbufs. */
 
 	/* Allocates mempool to hold the mbufs. 8< */
@@ -210,6 +229,9 @@ main(int argc, char *argv[])
 	if (rte_lcore_count() > 1)
 		printf("\nWARNING: Too many lcores enabled. Only 1 used.\n");
 
+	if (rte_lcore_to_socket_id(rte_lcore_id()) != sel_io_indx)
+		printf("\nWARNING: please use lcore from IO domain %u.\n", sel_io_indx);
+
 	/* Call lcore_main on the main core only. Called on single lcore. 8< */
 	lcore_main();
 	/* >8 End of called on single lcore. */
-- 
2.34.1



* RE: [EXTERNAL] [RFC v3 3/3] examples: add lcore topology API calls
  2024-10-30  5:41 ` [RFC v3 3/3] examples: add lcore topology API calls Vipin Varghese
@ 2024-10-30 11:49   ` Pavan Nikhilesh Bhagavatula
  2024-10-30 12:06     ` Varghese, Vipin
  0 siblings, 1 reply; 28+ messages in thread
From: Pavan Nikhilesh Bhagavatula @ 2024-10-30 11:49 UTC (permalink / raw)
  To: Vipin Varghese, dev, roretzla, bruce.richardson, john.mcnamara,
	dmitry.kozliuk, Jerin Jacob
  Cc: ruifeng.wang, mattias.ronnblom, anatoly.burakov, stephen,
	ferruh.yigit, honnappa.nagarahalli, wathsala.vithanage,
	konstantin.ananyev

> Enhance the example code to use the topology-based lcore API, while
> retaining the default behaviour.
> 
>  - helloworld: allow lcores to send hello to lcores in the selected topology.
>  - l2fwd: allow use of the IO lcore topology.
>  - skeleton: choose the lcore from the IO domain which has the most ports.
> 
> v3 changes:
>  - fix typo from SE_NO_TOPOLOGY to USE_NO_TOPOLOGY
> 
> Signed-off-by: Vipin Varghese <vipin.varghese@amd.com>
> ---

I see compilation failure on ARM platforms due to missing header include.

../examples/helloworld/main.c: In function 'parse_topology':
../examples/helloworld/main.c:83:13: error: implicit declaration of function 'strtoul'; did you mean 'strtok'? [-Wimplicit-function-declaration]
   83 |         n = strtoul(q_arg, &end, 10);
      |             ^~~~~~~
      |             strtok
../examples/helloworld/main.c:83:13: warning: nested extern declaration of 'strtoul' [-Wnested-externs]
../examples/helloworld/main.c: In function 'helloworld_parse_args':
../examples/helloworld/main.c:115:42: error: 'EXIT_FAILURE' undeclared (first use in this function)
  115 |                                 rte_exit(EXIT_FAILURE, "Invalid Topology selection\n");
      |                                          ^~~~~~~~~~~~
../examples/helloworld/main.c:13:1: note: 'EXIT_FAILURE' is defined in header '<stdlib.h>'; this is probably fixable by adding '#include <stdlib.h>'
   12 | #include <rte_memory.h>
  +++ |+#include <stdlib.h>
   13 | #include <rte_launch.h>
../examples/helloworld/main.c:115:42: note: each undeclared identifier is reported only once for each function it appears in
  115 |                                 rte_exit(EXIT_FAILURE, "Invalid Topology selection\n");
      |                                          ^~~~~~~~~~~~
../examples/helloworld/main.c: In function 'main':
../examples/helloworld/main.c:153:26: error: 'EXIT_FAILURE' undeclared (first use in this function)
  153 |                 rte_exit(EXIT_FAILURE, "Invalid arguments\n");
      |                          ^~~~~~~~~~~~
../examples/helloworld/main.c:153:26: note: 'EXIT_FAILURE' is defined in header '<stdlib.h>'; thi

Below header include should fix it.

diff --git a/examples/helloworld/main.c b/examples/helloworld/main.c
index 9845c3775c3a..f49bd0108f74 100644
--- a/examples/helloworld/main.c
+++ b/examples/helloworld/main.c
@@ -3,6 +3,7 @@
  */
 
 #include <stdio.h>
+#include <stdlib.h>
 #include <string.h>
 #include <stdint.h>
 #include <errno.h>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [EXTERNAL] [RFC v3 2/3] test/lcore: enable tests for topology
  2024-10-30  5:41 ` [RFC v3 2/3] test/lcore: enable tests for topology Vipin Varghese
@ 2024-10-30 11:50   ` Pavan Nikhilesh Bhagavatula
  2024-11-04  3:07     ` Varghese, Vipin
  0 siblings, 1 reply; 28+ messages in thread
From: Pavan Nikhilesh Bhagavatula @ 2024-10-30 11:50 UTC (permalink / raw)
  To: Vipin Varghese, dev, roretzla, bruce.richardson, john.mcnamara,
	dmitry.kozliuk, Jerin Jacob
  Cc: ruifeng.wang, mattias.ronnblom, anatoly.burakov, stephen,
	ferruh.yigit, honnappa.nagarahalli, wathsala.vithanage,
	konstantin.ananyev

> add functional test cases to validate topology supported lcore
> API.
> 
> v3 changes:
>  - fix test test-report/2024-October/817748.html
> 
> Signed-off-by: Vipin Varghese <vipin.varghese@amd.com>
> ---


Test fails on ARM platform:


lcore 19, socket 0, role RTE, cpuset 19
Control thread running successfully
INFO: lcore count topology: none (12), io (1), l3 (1), l2 (1), l1 (1).
INFO: worker lcore count topology: none (11), io (1), l3 (1), l2 (1), l1 (1).
INFO: lcore DOMAIN macro: success!
INFO: lcore count: none (12), io (0), l3 (0), l2 (0), l1 (0).
ERR: failed in domain API
Test Failed



^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [EXTERNAL] [RFC v3 3/3] examples: add lcore topology API calls
  2024-10-30 11:49   ` [EXTERNAL] " Pavan Nikhilesh Bhagavatula
@ 2024-10-30 12:06     ` Varghese, Vipin
  2024-10-30 12:37       ` Varghese, Vipin
  0 siblings, 1 reply; 28+ messages in thread
From: Varghese, Vipin @ 2024-10-30 12:06 UTC (permalink / raw)
  To: Pavan Nikhilesh Bhagavatula, dev, roretzla, bruce.richardson,
	john.mcnamara, dmitry.kozliuk, Jerin Jacob
  Cc: ruifeng.wang, mattias.ronnblom, anatoly.burakov, stephen, Yigit,
	Ferruh, honnappa.nagarahalli, wathsala.vithanage,
	konstantin.ananyev

Hi Pavan,

Snipped

> 
> I see compilation failure on ARM platforms due to missing header include.
> 
> ../examples/helloworld/main.c: In function 'parse_topology':
> ../examples/helloworld/main.c:83:13: error: implicit declaration of function 'strtoul';
> did you mean 'strtok'? [-Wimplicit-function-declaration]
>    83 |         n = strtoul(q_arg, &end, 10);
>       |             ^~~~~~~
>       |             strtok
> ../examples/helloworld/main.c:83:13: warning: nested extern declaration of 'strtoul' [-
> Wnested-externs]
> ../examples/helloworld/main.c: In function 'helloworld_parse_args':
> ../examples/helloworld/main.c:115:42: error: 'EXIT_FAILURE' undeclared (first use
> in this function)
>   115 |                                 rte_exit(EXIT_FAILURE, "Invalid Topology selection\n");
>       |                                          ^~~~~~~~~~~~
> ../examples/helloworld/main.c:13:1: note: 'EXIT_FAILURE' is defined in header
> '<stdlib.h>'; this is probably fixable by adding '#include <stdlib.h>'
>    12 | #include <rte_memory.h>
>   +++ |+#include <stdlib.h>
>    13 | #include <rte_launch.h>
> ../examples/helloworld/main.c:115:42: note: each undeclared identifier is reported
> only once for each function it appears in
>   115 |                                 rte_exit(EXIT_FAILURE, "Invalid Topology selection\n");
>       |                                          ^~~~~~~~~~~~
> ../examples/helloworld/main.c: In function 'main':
> ../examples/helloworld/main.c:153:26: error: 'EXIT_FAILURE' undeclared (first use
> in this function)
>   153 |                 rte_exit(EXIT_FAILURE, "Invalid arguments\n");
>       |                          ^~~~~~~~~~~~
> ../examples/helloworld/main.c:153:26: note: 'EXIT_FAILURE' is defined in header
> '<stdlib.h>'; thi
> 
> Below header include should fix it.
> 
> diff --git a/examples/helloworld/main.c b/examples/helloworld/main.c index
> 9845c3775c3a..f49bd0108f74 100644
> --- a/examples/helloworld/main.c
> +++ b/examples/helloworld/main.c
> @@ -3,6 +3,7 @@
>   */
> 
>  #include <stdio.h>
> +#include <stdlib.h>
>  #include <string.h>
>  #include <stdint.h>
>  #include <errno.h>

Thank you for helping me here. I ran with `check_includes & developer_mode`, and it did not throw this error.
Before patch submission I also tried `devtools/test-meson-builds.sh`; I suspect it does not use `check_includes & developer_mode` internally.
Let me recheck and fix this in version 4.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [EXTERNAL] [RFC v3 3/3] examples: add lcore topology API calls
  2024-10-30 12:06     ` Varghese, Vipin
@ 2024-10-30 12:37       ` Varghese, Vipin
  2024-10-30 19:34         ` Pavan Nikhilesh Bhagavatula
  0 siblings, 1 reply; 28+ messages in thread
From: Varghese, Vipin @ 2024-10-30 12:37 UTC (permalink / raw)
  To: Varghese, Vipin, Pavan Nikhilesh Bhagavatula, dev, roretzla,
	bruce.richardson, john.mcnamara, dmitry.kozliuk, Jerin Jacob
  Cc: ruifeng.wang, mattias.ronnblom, anatoly.burakov, stephen, Yigit,
	Ferruh, honnappa.nagarahalli, wathsala.vithanage,
	konstantin.ananyev

 
> Hi Pavan,
> 
> Snipped
> 
> >
> > I see compilation failure on ARM platforms due to missing header include.
> >
> > ../examples/helloworld/main.c: In function 'parse_topology':
> > ../examples/helloworld/main.c:83:13: error: implicit declaration of
> > function 'strtoul'; did you mean 'strtok'? [-Wimplicit-function-declaration]
> >    83 |         n = strtoul(q_arg, &end, 10);
> >       |             ^~~~~~~
> >       |             strtok
> > ../examples/helloworld/main.c:83:13: warning: nested extern
> > declaration of 'strtoul' [- Wnested-externs]
> > ../examples/helloworld/main.c: In function 'helloworld_parse_args':
> > ../examples/helloworld/main.c:115:42: error: 'EXIT_FAILURE' undeclared
> > (first use in this function)
> >   115 |                                 rte_exit(EXIT_FAILURE, "Invalid Topology
> selection\n");
> >       |                                          ^~~~~~~~~~~~
> > ../examples/helloworld/main.c:13:1: note: 'EXIT_FAILURE' is defined in
> > header '<stdlib.h>'; this is probably fixable by adding '#include <stdlib.h>'
> >    12 | #include <rte_memory.h>
> >   +++ |+#include <stdlib.h>
> >    13 | #include <rte_launch.h>
> > ../examples/helloworld/main.c:115:42: note: each undeclared identifier
> > is reported only once for each function it appears in
> >   115 |                                 rte_exit(EXIT_FAILURE, "Invalid Topology
> selection\n");
> >       |                                          ^~~~~~~~~~~~
> > ../examples/helloworld/main.c: In function 'main':
> > ../examples/helloworld/main.c:153:26: error: 'EXIT_FAILURE' undeclared
> > (first use in this function)
> >   153 |                 rte_exit(EXIT_FAILURE, "Invalid arguments\n");
> >       |                          ^~~~~~~~~~~~
> > ../examples/helloworld/main.c:153:26: note: 'EXIT_FAILURE' is defined
> > in header '<stdlib.h>'; thi
> >
> > Below header include should fix it.
> >
> > diff --git a/examples/helloworld/main.c b/examples/helloworld/main.c
> > index
> > 9845c3775c3a..f49bd0108f74 100644
> > --- a/examples/helloworld/main.c
> > +++ b/examples/helloworld/main.c
> > @@ -3,6 +3,7 @@
> >   */
> >
> >  #include <stdio.h>
> > +#include <stdlib.h>
> >  #include <string.h>
> >  #include <stdint.h>
> >  #include <errno.h>
> 
> Thank you for helping me here, I did run with `check_includes & developer_mode`,
> it did not throw this error.
> Before patch submission I tried `devtools/test-meson-builds.sh` too. I think internally
> this is not using ` check_includes & developer_mode `.
> Let me recheck and fix this in version 4.

Thank you, I found the reason why I missed this.

When building with the cross compiler manually, there are no issues with these steps:
```
meson arm64-build --cross-file config/arm/arm64_armv8_linux_gcc
ninja -C arm64-build
cd example/helloworld
make
```

But when building with the cross compiler and examples enabled, the failure appears with these steps:
```
meson arm64-build --cross-file config/arm/arm64_armv8_linux_gcc -Dexamples=helloworld
ninja -C arm64-build
```

We get the following logs:
```
../examples/helloworld/main.c: In function 'parse_topology':
../examples/helloworld/main.c:83:13: warning: implicit declaration of function 'strtoul'; did you mean 'strtok'? [-Wimplicit-function-declaration]
   83 |         n = strtoul(q_arg, &end, 10);
      |             ^~~~~~~
      |             strtok
../examples/helloworld/main.c:83:13: warning: nested extern declaration of 'strtoul' [-Wnested-externs]
../examples/helloworld/main.c: In function 'helloworld_parse_args':
../examples/helloworld/main.c:115:42: error: 'EXIT_FAILURE' undeclared (first use in this function)
  115 |                                 rte_exit(EXIT_FAILURE, "Invalid Topology selection\n");
      |                                          ^~~~~~~~~~~~
../examples/helloworld/main.c:13:1: note: 'EXIT_FAILURE' is defined in header '<stdlib.h>'; did you forget to '#include <stdlib.h>'?
   12 | #include <rte_memory.h>
  +++ |+#include <stdlib.h>
   13 | #include <rte_launch.h>
../examples/helloworld/main.c:115:42: note: each undeclared identifier is reported only once for each function it appears in
  115 |                                 rte_exit(EXIT_FAILURE, "Invalid Topology selection\n");
      |                                          ^~~~~~~~~~~~
../examples/helloworld/main.c: In function 'main':
../examples/helloworld/main.c:153:26: error: 'EXIT_FAILURE' undeclared (first use in this function)
  153 |                 rte_exit(EXIT_FAILURE, "Invalid arguments\n");
      |                          ^~~~~~~~~~~~
../examples/helloworld/main.c:153:26: note: 'EXIT_FAILURE' is defined in header '<stdlib.h>'; did you forget to '#include <stdlib.h>'?
[2963/4590] Compiling C object drivers/libtmp_rte_event_cnxk.a.p/event_cnxk_tx_cn10k_tx_112_127_seg.c.o
ninja: build stopped: subcommand failed.
```


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC v3 1/3] eal/lcore: add topology based functions
  2024-10-30  5:41 ` [RFC v3 1/3] eal/lcore: add topology based functions Vipin Varghese
@ 2024-10-30 15:43   ` Stephen Hemminger
  2024-11-04  3:09     ` Varghese, Vipin
  2024-10-30 15:44   ` Stephen Hemminger
                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 28+ messages in thread
From: Stephen Hemminger @ 2024-10-30 15:43 UTC (permalink / raw)
  To: Vipin Varghese
  Cc: dev, roretzla, bruce.richardson, john.mcnamara, dmitry.kozliuk,
	jerinj, ruifeng.wang, mattias.ronnblom, anatoly.burakov,
	ferruh.yigit, honnappa.nagarahalli, wathsala.vithanage,
	konstantin.ananyev

On Wed, 30 Oct 2024 11:11:31 +0530
Vipin Varghese <vipin.varghese@amd.com> wrote:

> 			topo_cnfg.l3[j] = (struct core_domain_mapping *)
> +					malloc(sizeof(struct core_domain_mapping));

Minor nit, no need to cast return value from malloc().
In C void * can be assigned to a type directly.
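
For illustration, a minimal before/after sketch of that idiom, reusing the
names from the patch:

```
/* with the explicit cast (legal, but unnecessary in C) */
topo_cnfg.l3[j] = (struct core_domain_mapping *)
		malloc(sizeof(struct core_domain_mapping));

/* idiomatic C: void * converts implicitly, and sizeof(*ptr) stays
 * correct even if the element type is ever renamed */
topo_cnfg.l3[j] = malloc(sizeof(*topo_cnfg.l3[j]));
```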

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC v3 1/3] eal/lcore: add topology based functions
  2024-10-30  5:41 ` [RFC v3 1/3] eal/lcore: add topology based functions Vipin Varghese
  2024-10-30 15:43   ` Stephen Hemminger
@ 2024-10-30 15:44   ` Stephen Hemminger
  2024-11-04  3:13     ` Varghese, Vipin
  2024-11-04  8:45     ` Mattias Rönnblom
  2024-10-30 15:45   ` Stephen Hemminger
                     ` (2 subsequent siblings)
  4 siblings, 2 replies; 28+ messages in thread
From: Stephen Hemminger @ 2024-10-30 15:44 UTC (permalink / raw)
  To: Vipin Varghese
  Cc: dev, roretzla, bruce.richardson, john.mcnamara, dmitry.kozliuk,
	jerinj, ruifeng.wang, mattias.ronnblom, anatoly.burakov,
	ferruh.yigit, honnappa.nagarahalli, wathsala.vithanage,
	konstantin.ananyev

On Wed, 30 Oct 2024 11:11:31 +0530
Vipin Varghese <vipin.varghese@amd.com> wrote:

> +	if (topo_cnfg.io) {
> +		free(topo_cnfg.io);
> +		topo_cnfg.io = NULL;
> +	}

No need to check for NULL before calling free.
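
Since free(NULL) is defined as a no-op by the C standard, the cleanup can
simply be (sketch):

```
/* no NULL guard needed; resetting the pointer still protects
 * against accidental double-free elsewhere */
free(topo_cnfg.io);
topo_cnfg.io = NULL;
```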

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC v3 1/3] eal/lcore: add topology based functions
  2024-10-30  5:41 ` [RFC v3 1/3] eal/lcore: add topology based functions Vipin Varghese
  2024-10-30 15:43   ` Stephen Hemminger
  2024-10-30 15:44   ` Stephen Hemminger
@ 2024-10-30 15:45   ` Stephen Hemminger
  2024-11-04  3:13     ` Varghese, Vipin
  2024-10-30 15:47   ` Stephen Hemminger
  2024-11-04  7:57   ` Morten Brørup
  4 siblings, 1 reply; 28+ messages in thread
From: Stephen Hemminger @ 2024-10-30 15:45 UTC (permalink / raw)
  To: Vipin Varghese
  Cc: dev, roretzla, bruce.richardson, john.mcnamara, dmitry.kozliuk,
	jerinj, ruifeng.wang, mattias.ronnblom, anatoly.burakov,
	ferruh.yigit, honnappa.nagarahalli, wathsala.vithanage,
	konstantin.ananyev

On Wed, 30 Oct 2024 11:11:31 +0530
Vipin Varghese <vipin.varghese@amd.com> wrote:

> +	/* domain count */
> +	uint16_t l1_count;
> +	uint16_t l2_count;
> +	uint8_t l3_count;
> +	uint8_t io_count;

Make them all uint16_t, the space is there already.
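
i.e., a sketch of the suggested change:

```
/* domain count: all widened to uint16_t as suggested */
uint16_t l1_count;
uint16_t l2_count;
uint16_t l3_count;
uint16_t io_count;
```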

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC v3 1/3] eal/lcore: add topology based functions
  2024-10-30  5:41 ` [RFC v3 1/3] eal/lcore: add topology based functions Vipin Varghese
                     ` (2 preceding siblings ...)
  2024-10-30 15:45   ` Stephen Hemminger
@ 2024-10-30 15:47   ` Stephen Hemminger
  2024-11-04  3:16     ` Varghese, Vipin
  2024-11-04  7:57   ` Morten Brørup
  4 siblings, 1 reply; 28+ messages in thread
From: Stephen Hemminger @ 2024-10-30 15:47 UTC (permalink / raw)
  To: Vipin Varghese
  Cc: dev, roretzla, bruce.richardson, john.mcnamara, dmitry.kozliuk,
	jerinj, ruifeng.wang, mattias.ronnblom, anatoly.burakov,
	ferruh.yigit, honnappa.nagarahalli, wathsala.vithanage,
	konstantin.ananyev

On Wed, 30 Oct 2024 11:11:31 +0530
Vipin Varghese <vipin.varghese@amd.com> wrote:

> +struct topology_config {
> +#ifdef RTE_EAL_HWLOC_TOPOLOGY_PROBE
> +	hwloc_topology_t topology;
> +#endif
> +
> +	/* domain count */
> +	uint16_t l1_count;
> +	uint16_t l2_count;
> +	uint8_t l3_count;
> +	uint8_t io_count;
> +
> +	/* total cores under all domain */
> +	uint16_t l1_core_count;
> +	uint16_t l2_core_count;
> +	uint16_t l3_core_count;
> +	uint16_t io_core_count;
> +
> +	/* two dimensional array for each domain */
> +	struct core_domain_mapping **l1;
> +	struct core_domain_mapping **l2;
> +	struct core_domain_mapping **l3;
> +	struct core_domain_mapping **io;
> +};
> +extern struct topology_config topo_cnfg;
> +

To work with primary/secondary process model, it might be better to keep
this info in hugpage/shared memory. I.e put topology_config into space
allocated with rte_malloc() and populated by primary process.
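
A rough sketch of that approach, assuming the topology_config layout from
this patch; note the hwloc_topology_t handle stays process-local, and the
per-domain pointer arrays would also need shared allocation (e.g. rte_malloc),
so this only covers the top-level struct:

```
#include <rte_eal.h>
#include <rte_memzone.h>

static struct topology_config *topo;

static int
topology_shm_init(void)
{
	const struct rte_memzone *mz;

	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
		mz = rte_memzone_reserve("dpdk_topo_cnfg", sizeof(*topo),
					 SOCKET_ID_ANY, 0);
		/* primary probes hwloc and fills *topo after this */
	} else {
		mz = rte_memzone_lookup("dpdk_topo_cnfg");
	}
	if (mz == NULL)
		return -1;
	topo = mz->addr;
	return 0;
}
```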

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [EXTERNAL] [RFC v3 3/3] examples: add lcore topology API calls
  2024-10-30 12:37       ` Varghese, Vipin
@ 2024-10-30 19:34         ` Pavan Nikhilesh Bhagavatula
  2024-11-04  3:02           ` Varghese, Vipin
  0 siblings, 1 reply; 28+ messages in thread
From: Pavan Nikhilesh Bhagavatula @ 2024-10-30 19:34 UTC (permalink / raw)
  To: Varghese, Vipin, dev, roretzla, bruce.richardson, john.mcnamara,
	dmitry.kozliuk, Jerin Jacob
  Cc: ruifeng.wang, mattias.ronnblom, anatoly.burakov, stephen, Yigit,
	Ferruh, honnappa.nagarahalli, wathsala.vithanage,
	konstantin.ananyev

> > Hi Pavan,
> >
> > Snipped
> >
> > >
> > > I see compilation failure on ARM platforms due to missing header include.
> > >
> > > ../examples/helloworld/main.c: In function 'parse_topology':
> > > ../examples/helloworld/main.c:83:13: error: implicit declaration of
> > > function 'strtoul'; did you mean 'strtok'? [-Wimplicit-function-declaration]
> > >    83 |         n = strtoul(q_arg, &end, 10);
> > >       |             ^~~~~~~
> > >       |             strtok
> > > ../examples/helloworld/main.c:83:13: warning: nested extern
> > > declaration of 'strtoul' [- Wnested-externs]
> > > ../examples/helloworld/main.c: In function 'helloworld_parse_args':
> > > ../examples/helloworld/main.c:115:42: error: 'EXIT_FAILURE' undeclared
> > > (first use in this function)
> > >   115 |                                 rte_exit(EXIT_FAILURE, "Invalid Topology
> > selection\n");
> > >       |                                          ^~~~~~~~~~~~
> > > ../examples/helloworld/main.c:13:1: note: 'EXIT_FAILURE' is defined in
> > > header '<stdlib.h>'; this is probably fixable by adding '#include <stdlib.h>'
> > >    12 | #include <rte_memory.h>
> > >   +++ |+#include <stdlib.h>
> > >    13 | #include <rte_launch.h>
> > > ../examples/helloworld/main.c:115:42: note: each undeclared identifier
> > > is reported only once for each function it appears in
> > >   115 |                                 rte_exit(EXIT_FAILURE, "Invalid Topology
> > selection\n");
> > >       |                                          ^~~~~~~~~~~~
> > > ../examples/helloworld/main.c: In function 'main':
> > > ../examples/helloworld/main.c:153:26: error: 'EXIT_FAILURE' undeclared
> > > (first use in this function)
> > >   153 |                 rte_exit(EXIT_FAILURE, "Invalid arguments\n");
> > >       |                          ^~~~~~~~~~~~
> > > ../examples/helloworld/main.c:153:26: note: 'EXIT_FAILURE' is defined
> > > in header '<stdlib.h>'; thi
> > >
> > > Below header include should fix it.
> > >
> > > diff --git a/examples/helloworld/main.c b/examples/helloworld/main.c
> > > index
> > > 9845c3775c3a..f49bd0108f74 100644
> > > --- a/examples/helloworld/main.c
> > > +++ b/examples/helloworld/main.c
> > > @@ -3,6 +3,7 @@
> > >   */
> > >
> > >  #include <stdio.h>
> > > +#include <stdlib.h>
> > >  #include <string.h>
> > >  #include <stdint.h>
> > >  #include <errno.h>
> >
> > Thank you for helping me here, I did run with `check_includes &
> developer_mode`,
> > it did not throw this error.
> > Before patch submission I tried `devtools/test-meson-builds.sh` too. I think
> internally
> > this is not using ` check_includes & developer_mode `.
> > Let me recheck and fix this in version 4.
> 
> Thank you, I found the reason for my miss on this.
> 
> When build using cross compiler manually, no issues with the steps
> ```
> meson arm64-build --cross-file config/arm/arm64_armv8_linux_gcc
> ninja -C arm64-build
> cd example/helloworld
> make
> ```
> 
> But building using cross compiler with examples manually, no issues with the
> steps
> ```
> meson arm64-build --cross-file config/arm/arm64_armv8_linux_gcc -
> Dexamples=helloworld
> ninja -C arm64-build
> ```
> 

I generally run ./devtools/test-meson-builds.sh as a catch-all 😊

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [EXTERNAL] [RFC v3 3/3] examples: add lcore topology API calls
  2024-10-30 19:34         ` Pavan Nikhilesh Bhagavatula
@ 2024-11-04  3:02           ` Varghese, Vipin
  0 siblings, 0 replies; 28+ messages in thread
From: Varghese, Vipin @ 2024-11-04  3:02 UTC (permalink / raw)
  To: Pavan Nikhilesh Bhagavatula, dev, roretzla, bruce.richardson,
	john.mcnamara, dmitry.kozliuk, Jerin Jacob
  Cc: ruifeng.wang, mattias.ronnblom, anatoly.burakov, stephen, Yigit,
	Ferruh, honnappa.nagarahalli, wathsala.vithanage,
	konstantin.ananyev

Snipped

> > > >
> > > > I see compilation failure on ARM platforms due to missing header include.
> > > >
> > > > ../examples/helloworld/main.c: In function 'parse_topology':
> > > > ../examples/helloworld/main.c:83:13: error: implicit declaration
> > > > of function 'strtoul'; did you mean 'strtok'? [-Wimplicit-function-declaration]
> > > >    83 |         n = strtoul(q_arg, &end, 10);
> > > >       |             ^~~~~~~
> > > >       |             strtok
> > > > ../examples/helloworld/main.c:83:13: warning: nested extern
> > > > declaration of 'strtoul' [- Wnested-externs]
> > > > ../examples/helloworld/main.c: In function 'helloworld_parse_args':
> > > > ../examples/helloworld/main.c:115:42: error: 'EXIT_FAILURE'
> > > > undeclared (first use in this function)
> > > >   115 |                                 rte_exit(EXIT_FAILURE, "Invalid Topology
> > > selection\n");
> > > >       |                                          ^~~~~~~~~~~~
> > > > ../examples/helloworld/main.c:13:1: note: 'EXIT_FAILURE' is
> > > > defined in header '<stdlib.h>'; this is probably fixable by adding '#include
> <stdlib.h>'
> > > >    12 | #include <rte_memory.h>
> > > >   +++ |+#include <stdlib.h>
> > > >    13 | #include <rte_launch.h>
> > > > ../examples/helloworld/main.c:115:42: note: each undeclared
> > > > identifier is reported only once for each function it appears in
> > > >   115 |                                 rte_exit(EXIT_FAILURE, "Invalid Topology
> > > selection\n");
> > > >       |                                          ^~~~~~~~~~~~
> > > > ../examples/helloworld/main.c: In function 'main':
> > > > ../examples/helloworld/main.c:153:26: error: 'EXIT_FAILURE'
> > > > undeclared (first use in this function)
> > > >   153 |                 rte_exit(EXIT_FAILURE, "Invalid arguments\n");
> > > >       |                          ^~~~~~~~~~~~
> > > > ../examples/helloworld/main.c:153:26: note: 'EXIT_FAILURE' is
> > > > defined in header '<stdlib.h>'; thi
> > > >
> > > > Below header include should fix it.
> > > >
> > > > diff --git a/examples/helloworld/main.c
> > > > b/examples/helloworld/main.c index
> > > > 9845c3775c3a..f49bd0108f74 100644
> > > > --- a/examples/helloworld/main.c
> > > > +++ b/examples/helloworld/main.c
> > > > @@ -3,6 +3,7 @@
> > > >   */
> > > >
> > > >  #include <stdio.h>
> > > > +#include <stdlib.h>
> > > >  #include <string.h>
> > > >  #include <stdint.h>
> > > >  #include <errno.h>
> > >
> > > Thank you for helping me here, I did run with `check_includes &
> > developer_mode`,
> > > it did not throw this error.
> > > Before patch submission I tried `devtools/test-meson-builds.sh` too.
> > > I think
> > internally
> > > this is not using ` check_includes & developer_mode `.
> > > Let me recheck and fix this in version 4.
> >
> > Thank you, I found the reason for my miss on this.
> >
> > When build using cross compiler manually, no issues with the steps ```
> > meson arm64-build --cross-file config/arm/arm64_armv8_linux_gcc ninja
> > -C arm64-build cd example/helloworld make ```
> >
> > But building using cross compiler with examples manually, no issues
> > with the steps ``` meson arm64-build --cross-file
> > config/arm/arm64_armv8_linux_gcc - Dexamples=helloworld ninja -C
> > arm64-build ```
> >
> 
> I generally run ./devtools/test-meson-builds.sh as a catch all 😊

Thank you, sharing v4 with the fix.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [EXTERNAL] [RFC v3 2/3] test/lcore: enable tests for topology
  2024-10-30 11:50   ` [EXTERNAL] " Pavan Nikhilesh Bhagavatula
@ 2024-11-04  3:07     ` Varghese, Vipin
  0 siblings, 0 replies; 28+ messages in thread
From: Varghese, Vipin @ 2024-11-04  3:07 UTC (permalink / raw)
  To: Pavan Nikhilesh Bhagavatula, dev, roretzla, bruce.richardson,
	john.mcnamara, dmitry.kozliuk, Jerin Jacob
  Cc: ruifeng.wang, mattias.ronnblom, anatoly.burakov, stephen, Yigit,
	Ferruh, honnappa.nagarahalli, wathsala.vithanage,
	konstantin.ananyev

Snipped
> > ---
> 
> 
> Test fails on ARM platform:
> 
> 
> lcore 19, socket 0, role RTE, cpuset 19
> Control thread running successfully
> INFO: lcore count topology: none (12), io (1), l3 (1), l2 (1), l1 (1).
> INFO: worker lcore count topology: none (11), io (1), l3 (1), l2 (1), l1 (1).
> INFO: lcore DOMAIN macro: success!
> INFO: lcore count: none (12), io (0), l3 (0), l2 (0), l1 (0).
> ERR: failed in domain API
> Test Failed
> 
Thank you, Pavan, for helping me on this. Sharing the fix in v4.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [RFC v3 1/3] eal/lcore: add topology based functions
  2024-10-30 15:43   ` Stephen Hemminger
@ 2024-11-04  3:09     ` Varghese, Vipin
  0 siblings, 0 replies; 28+ messages in thread
From: Varghese, Vipin @ 2024-11-04  3:09 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: dev, roretzla, bruce.richardson, john.mcnamara, dmitry.kozliuk,
	jerinj, ruifeng.wang, mattias.ronnblom, anatoly.burakov, Yigit,
	Ferruh, honnappa.nagarahalli, wathsala.vithanage,
	konstantin.ananyev

Snipped

> 
> On Wed, 30 Oct 2024 11:11:31 +0530
> Vipin Varghese <vipin.varghese@amd.com> wrote:
> 
> >                       topo_cnfg.l3[j] = (struct core_domain_mapping *)
> > +                                     malloc(sizeof(struct core_domain_mapping));
> 
> Minor nit, no need to cast return value from malloc().
> In C void * can be assigned to a type directly.

Thank you Stephen, will address this in v4.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [RFC v3 1/3] eal/lcore: add topology based functions
  2024-10-30 15:44   ` Stephen Hemminger
@ 2024-11-04  3:13     ` Varghese, Vipin
  2024-11-04  8:45     ` Mattias Rönnblom
  1 sibling, 0 replies; 28+ messages in thread
From: Varghese, Vipin @ 2024-11-04  3:13 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: dev, roretzla, bruce.richardson, john.mcnamara, dmitry.kozliuk,
	jerinj, ruifeng.wang, mattias.ronnblom, anatoly.burakov, Yigit,
	Ferruh, honnappa.nagarahalli, wathsala.vithanage,
	konstantin.ananyev

Snipped
> 
> On Wed, 30 Oct 2024 11:11:31 +0530
> Vipin Varghese <vipin.varghese@amd.com> wrote:
> 
> > +     if (topo_cnfg.io) {
> > +             free(topo_cnfg.io);
> > +             topo_cnfg.io = NULL;
> > +     }
> 
> No need to check for NULL before calling free.

Thank you Stephen. The reason we specifically added this check is to address 2 scenarios:
 1. as pointed out by Wathsala, where the System Level Cache is disabled and L3 will not be available at all
 2. versions where hwloc does not detect the specific cache level.

By checking for NULL we can avoid such cases.

Note: I will recheck and add|remove this in v4

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [RFC v3 1/3] eal/lcore: add topology based functions
  2024-10-30 15:45   ` Stephen Hemminger
@ 2024-11-04  3:13     ` Varghese, Vipin
  0 siblings, 0 replies; 28+ messages in thread
From: Varghese, Vipin @ 2024-11-04  3:13 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: dev, roretzla, bruce.richardson, john.mcnamara, dmitry.kozliuk,
	jerinj, ruifeng.wang, mattias.ronnblom, anatoly.burakov, Yigit,
	Ferruh, honnappa.nagarahalli, wathsala.vithanage,
	konstantin.ananyev

Snipped
> 
> > +     /* domain count */
> > +     uint16_t l1_count;
> > +     uint16_t l2_count;
> > +     uint8_t l3_count;
> > +     uint8_t io_count;
> 
> Make them all uint16_t, the space is there already.

Thank you, fixing this in v4

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [RFC v3 1/3] eal/lcore: add topology based functions
  2024-10-30 15:47   ` Stephen Hemminger
@ 2024-11-04  3:16     ` Varghese, Vipin
  0 siblings, 0 replies; 28+ messages in thread
From: Varghese, Vipin @ 2024-11-04  3:16 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: dev, roretzla, bruce.richardson, john.mcnamara, dmitry.kozliuk,
	jerinj, ruifeng.wang, mattias.ronnblom, anatoly.burakov, Yigit,
	Ferruh, honnappa.nagarahalli, wathsala.vithanage,
	konstantin.ananyev

Snipped
> 
> > +struct topology_config {
> > +#ifdef RTE_EAL_HWLOC_TOPOLOGY_PROBE
> > +     hwloc_topology_t topology;
> > +#endif
> > +
> > +     /* domain count */
> > +     uint16_t l1_count;
> > +     uint16_t l2_count;
> > +     uint8_t l3_count;
> > +     uint8_t io_count;
> > +
> > +     /* total cores under all domain */
> > +     uint16_t l1_core_count;
> > +     uint16_t l2_core_count;
> > +     uint16_t l3_core_count;
> > +     uint16_t io_core_count;
> > +
> > +     /* two dimensional array for each domain */
> > +     struct core_domain_mapping **l1;
> > +     struct core_domain_mapping **l2;
> > +     struct core_domain_mapping **l3;
> > +     struct core_domain_mapping **io; }; extern struct
> > +topology_config topo_cnfg;
> > +
> 
> To work with primary/secondary process model, it might be better to keep this info
> in hugpage/shared memory. I.e put topology_config into space allocated with
> rte_malloc() and populated by primary process.

Initially we did use `rte_memzone_reserve` and kept the data under the `rte_cnfg` structure for primary and secondary processes.
Exploring the use cases, we noticed that in multi-process scenarios the primary's lcore grouping was not shared. Hence we moved to using `malloc` as a local allocator.

We can easily move to rte_malloc, but do we need to share the HW-topology-based lcore grouping with secondary processes?

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [RFC v3 1/3] eal/lcore: add topology based functions
  2024-10-30  5:41 ` [RFC v3 1/3] eal/lcore: add topology based functions Vipin Varghese
                     ` (3 preceding siblings ...)
  2024-10-30 15:47   ` Stephen Hemminger
@ 2024-11-04  7:57   ` Morten Brørup
  2024-11-04  9:56     ` Varghese, Vipin
  4 siblings, 1 reply; 28+ messages in thread
From: Morten Brørup @ 2024-11-04  7:57 UTC (permalink / raw)
  To: Vipin Varghese, dev, roretzla, bruce.richardson, john.mcnamara,
	dmitry.kozliuk, jerinj, David Christensen, Wathsala Vithanage
  Cc: ruifeng.wang, mattias.ronnblom, anatoly.burakov, stephen,
	ferruh.yigit, honnappa.nagarahalli, konstantin.ananyev

> From: Vipin Varghese [mailto:vipin.varghese@amd.com]
> Sent: Wednesday, 30 October 2024 06.42
> 
> Introduce topology aware lcore mapping into lcore API.
> With higher core density, more and more cores are categorized
> into various chiplets based on IO (memory and PCIe) and
> Last Level Cache (mainly L3).
> 
> Using hwloc library, the dpdk available lcores can be grouped
> into various groups nameley L1, L2, L3 and IO. This patch
> introduces functions and MACRO that helps to identify such
> groups.

Does the API need to be prepared for L4 cache?
https://www.anandtech.com/show/16924/did-ibm-just-preview-the-future-of-caches

And - just a thought:
Since this library and the Cache Stashing (PCIe TLP) library are somewhat related, would it be beneficial to combine them into one patch series, primarily to make their APIs more uniform?


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC v3 1/3] eal/lcore: add topology based functions
  2024-10-30 15:44   ` Stephen Hemminger
  2024-11-04  3:13     ` Varghese, Vipin
@ 2024-11-04  8:45     ` Mattias Rönnblom
  1 sibling, 0 replies; 28+ messages in thread
From: Mattias Rönnblom @ 2024-11-04  8:45 UTC (permalink / raw)
  To: Stephen Hemminger, Vipin Varghese
  Cc: dev, roretzla, bruce.richardson, john.mcnamara, dmitry.kozliuk,
	jerinj, ruifeng.wang, mattias.ronnblom, anatoly.burakov,
	ferruh.yigit, honnappa.nagarahalli, wathsala.vithanage,
	konstantin.ananyev

On 2024-10-30 16:44, Stephen Hemminger wrote:
> On Wed, 30 Oct 2024 11:11:31 +0530
> Vipin Varghese <vipin.varghese@amd.com> wrote:
> 
>> +	if (topo_cnfg.io) {
>> +		free(topo_cnfg.io);
>> +		topo_cnfg.io = NULL;
>> +	}
> 
> No need to check for NULL before calling free.

If you do need to check for NULL, do it explicitly.

if (topo_cnfg.io != NULL) /../

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [RFC v3 1/3] eal/lcore: add topology based functions
  2024-11-04  7:57   ` Morten Brørup
@ 2024-11-04  9:56     ` Varghese, Vipin
  2024-11-04 11:21       ` Morten Brørup
  0 siblings, 1 reply; 28+ messages in thread
From: Varghese, Vipin @ 2024-11-04  9:56 UTC (permalink / raw)
  To: Morten Brørup, dev, roretzla, bruce.richardson,
	john.mcnamara, dmitry.kozliuk, jerinj, David Christensen,
	Wathsala Vithanage
  Cc: ruifeng.wang, mattias.ronnblom, anatoly.burakov, stephen, Yigit,
	Ferruh, honnappa.nagarahalli, konstantin.ananyev

Snipped

> 
> Does the API need to be prepared for L4 cache?
> https://www.anandtech.com/show/16924/did-ibm-just-preview-the-future-of-caches
Thank you for the pointer. Yes, the initial patch considered L4 cache too, but I was not able to get hold of a system or find someone to test it with L4.
Hence I removed the L4 instance from the dpdk_topology structure.

We can introduce it in v4. Can we get someone from IBM to confirm the same?

> 
> And - just a thought:
> Since this library and the Cache Stashing (PCIe TLP) library are somewhat related,
> would it be beneficial to combine them into one patch series, primarily to make their
> APIs more uniform?

There was an initial Zoom invite for understanding and usage; we expected a follow-up on the same to close the loop.
Based on my current understanding, the API to be used as hints to the PMD should pass only the `lcore id`.
Hence these can be independent.



^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [RFC v3 1/3] eal/lcore: add topology based functions
  2024-11-04  9:56     ` Varghese, Vipin
@ 2024-11-04 11:21       ` Morten Brørup
  2024-11-04 12:14         ` Varghese, Vipin
  0 siblings, 1 reply; 28+ messages in thread
From: Morten Brørup @ 2024-11-04 11:21 UTC (permalink / raw)
  To: Varghese, Vipin, dev, roretzla, bruce.richardson, john.mcnamara,
	dmitry.kozliuk, jerinj, David Christensen, Wathsala Vithanage,
	Min Zhou, Stanislaw Kardach, konstantin.ananyev
  Cc: ruifeng.wang, mattias.ronnblom, anatoly.burakov, stephen, Yigit,
	Ferruh, honnappa.nagarahalli

> > Does the API need to be prepared for L4 cache?
> > https://www.anandtech.com/show/16924/did-ibm-just-preview-the-future-
> of-caches
> Thank you for the pointer, yes initial patch was considering L4 cache
> too. But I was not able to get hold of system or get someone to test
> this with L4.
> Hence removed the L4 instance from dpdk_topology structure.
> 
> We can introduce into v4. Can we get someone from IBM to confirm the
> same?

If any of the CPU folks think L4 cache might become relevant in the foreseeable future, it should be added to the API without testing at this time. Adding it now would prevent API breakage whenever L4 cache actually becomes relevant.
Otherwise please don't add support for it - considering it would be dead and untested code.
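
For concreteness, preparing the API could be as small as reserving the flag
now, mirroring the existing L1/L2/L3/IO domain flags (name and value below
are hypothetical):

```
/* hypothetical placeholder, kept untested until L4 hardware shows up;
 * RTE_BIT32() comes from rte_bitops.h */
#define RTE_LCORE_DOMAIN_L4	RTE_BIT32(4)
```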

> 
> >
> > And - just a thought:
> > Since this library and the Cache Stashing (PCIe TLP) library are
> somewhat related,
> > would it be beneficial to combine them into one patch series,
> primarily to make their
> > APIs more uniform?
> 
> There was initial zoom invite for understanding and usage, we expected
> a follow up on the same to close the loop.
> Based on my current understanding, the API to be used as hints to PMD
> should be passing `lcore id` only.
> Hence these can be independent.

The Cache Stashing hints API uses a more specific destination than just "lcore id".
The implementations of this Topology library and the Cache Stashing library may be completely independent, but the structure describing a location in the topology could be common.

Maybe I should say this differently:
Wathsala and other folks working on the Cache Stashing API, you need to keep a close eye on this Topology patch series!
We might expect the Cache Stashing API to reuse relevant structures from it.


^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [RFC v3 1/3] eal/lcore: add topology based functions
  2024-11-04 11:21       ` Morten Brørup
@ 2024-11-04 12:14         ` Varghese, Vipin
  2024-11-04 12:29           ` Morten Brørup
  0 siblings, 1 reply; 28+ messages in thread
From: Varghese, Vipin @ 2024-11-04 12:14 UTC (permalink / raw)
  To: Morten Brørup, dev, roretzla, bruce.richardson,
	john.mcnamara, dmitry.kozliuk, jerinj, David Christensen,
	Wathsala Vithanage, Min Zhou, Stanislaw Kardach,
	konstantin.ananyev
  Cc: ruifeng.wang, mattias.ronnblom, anatoly.burakov, stephen, Yigit,
	Ferruh, honnappa.nagarahalli

Snipped
 
> 
> > > Does the API need to be prepared for L4 cache?
> > > https://www.anandtech.com/show/16924/did-ibm-just-preview-the-future
> > > -
> > of-caches
> > Thank you for the pointer, yes initial patch was considering L4 cache
> > too. But I was not able to get hold of system or get someone to test
> > this with L4.
> > Hence removed the L4 instance from dpdk_topology structure.
> >
> > We can introduce into v4. Can we get someone from IBM to confirm the
> > same?
> 
> If any of the CPU folks think L4 cache might become relevant in the foreseeable
> future, it should be added to the API without testing at this time. 

I remember 3 L4 cache scenarios:
 1. the IBM POWER9 variant presented at Hot Chips in 2020-2021
 2. from Intel:
  a. the Haswell|Broadwell series with eDRAM as L4
  b. a future product (at least in desktop) with L4 cache.

> Adding it now would prevent API breakage whenever L4 cache actually becomes relevant.
> Otherwise please don't support for it - considering it would be dead and untested
> code.

I can add the same to the DPDK topology probe in v4. On AMD EPYC, V-Cache is treated as extended L3 and not L4, hence I will not be able to test it on AMD EPYC.

> 
> >
> > >
> > > And - just a thought:
> > > Since this library and the Cache Stashing (PCIe TLP) library are
> > somewhat related,
> > > would it be beneficial to combine them into one patch series,
> > primarily to make their
> > > APIs more uniform?
> >
> > There was initial zoom invite for understanding and usage, we expected
> > a follow up on the same to close the loop.
> > Based on my current understanding, the API to be used as hints to PMD
> > should be passing `lcore id` only.
> > Hence these can be independent.
> 
> The Cache Stashing hints API uses a more specific destination than just "lcore id".
> The implementations of this Topology library and the Cache Stashing library may be
> completely independent, but the structure describing a location in the topology could
> be common.
Thank you, now I understand.

> 
> Maybe I should say this differently:
> Wathsala and other folks working on the Cache Stashing API, you need to keep a
> close eye on this Topology patch series!
> We might expect the Cache Stashing API to reuse relevant structures from it.


^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [RFC v3 1/3] eal/lcore: add topology based functions
  2024-11-04 12:14         ` Varghese, Vipin
@ 2024-11-04 12:29           ` Morten Brørup
  2024-11-04 13:08             ` Varghese, Vipin
  0 siblings, 1 reply; 28+ messages in thread
From: Morten Brørup @ 2024-11-04 12:29 UTC (permalink / raw)
  To: Varghese, Vipin, dev, roretzla, bruce.richardson, john.mcnamara,
	dmitry.kozliuk, jerinj, David Christensen, Wathsala Vithanage,
	Min Zhou, Stanislaw Kardach, konstantin.ananyev
  Cc: ruifeng.wang, mattias.ronnblom, anatoly.burakov, stephen, Yigit,
	Ferruh, honnappa.nagarahalli

> > > > Does the API need to be prepared for L4 cache?
> > > > https://www.anandtech.com/show/16924/did-ibm-just-preview-the-
> future
> > > > -
> > > of-caches
> > > Thank you for the pointer, yes initial patch was considering L4
> cache
> > > too. But I was not able to get hold of system or get someone to
> test
> > > this with L4.
> > > Hence removed the L4 instance from dpdk_topology structure.
> > >
> > > We can introduce into v4. Can we get someone from IBM to confirm
> the
> > > same?
> >
> > If any of the CPU folks think L4 cache might become relevant in the
> foreseeable
> > future, it should be added to the API without testing at this time.
> 
> I remember 3 L4 cache scenario
>  1. from IBM power-9 variant suggested in 2020-2021 in hot chips
>  2. from Intel
>   a. Haswell|Broadwell series with eDram as L4
>   b. future product (at least in desktop) with L4 cache.
> 
> > Adding it now would prevent API breakage whenever L4 cache actually
> becomes relevant.
> > Otherwise please don't support for it - considering it would be dead
> and untested
> > code.
> 
> I can add the same to the DPDK topology probe in v4, on AMD EPYC v-
> cache is treated as extended L3 and not L4. Hence will not be able to
> test on AMD EPYC.

I recall from the Cache Stashing community call... There is some ACPI function to get the (opaque) "location IDs" of various parts in the system, to be used for setting the Cache Stashing hints.
Is there only one "ACPI location ID" (I don't know the correct name) shared by the L3 cache and the v-cache in AMD EPYC, or do they each have their own?
If they are not exposed as one ID, but two separate IDs, the Topology API might need to reflect this, so it can be used in the Cache Stashing API.

> 
> >
> > >
> > > >
> > > > And - just a thought:
> > > > Since this library and the Cache Stashing (PCIe TLP) library are
> > > somewhat related,
> > > > would it be beneficial to combine them into one patch series,
> > > primarily to make their
> > > > APIs more uniform?
> > >
> > > There was initial zoom invite for understanding and usage, we
> expected
> > > a follow up on the same to close the loop.
> > > Based on my current understanding, the API to be used as hints to
> PMD
> > > should be passing `lcore id` only.
> > > Hence these can be independent.
> >
> > The Cache Stashing hints API uses a more specific destination than
> just "lcore id".
> > The implementations of this Topology library and the Cache Stashing
> library may be
> > completely independent, but the structure describing a location in
> the topology could
> > be common.
> Thank you, now I understand,
> 
> >
> > Maybe I should say this differently:
> > Wathsala and other folks working on the Cache Stashing API, you need
> to keep a
> > close eye on this Topology patch series!
> > We might expect the Cache Stashing API to reuse relevant structures
> from it.


^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [RFC v3 1/3] eal/lcore: add topology based functions
  2024-11-04 12:29           ` Morten Brørup
@ 2024-11-04 13:08             ` Varghese, Vipin
  2024-11-04 14:04               ` Morten Brørup
  0 siblings, 1 reply; 28+ messages in thread
From: Varghese, Vipin @ 2024-11-04 13:08 UTC (permalink / raw)
  To: Morten Brørup, dev, roretzla, bruce.richardson,
	john.mcnamara, dmitry.kozliuk, jerinj, David Christensen,
	Wathsala Vithanage, Min Zhou, Stanislaw Kardach,
	konstantin.ananyev
  Cc: ruifeng.wang, mattias.ronnblom, anatoly.burakov, stephen, Yigit,
	Ferruh, honnappa.nagarahalli

Snipped

> 
> 
> > > > > Does the API need to be prepared for L4 cache?
> > > > > https://www.anandtech.com/show/16924/did-ibm-just-preview-the-
> > future
> > > > > -
> > > > of-caches
> > > > Thank you for the pointer, yes initial patch was considering L4
> > cache
> > > > too. But I was not able to get hold of system or get someone to
> > test
> > > > this with L4.
> > > > Hence removed the L4 instance from dpdk_topology structure.
> > > >
> > > > We can introduce into v4. Can we get someone from IBM to confirm
> > the
> > > > same?
> > >
> > > If any of the CPU folks think L4 cache might become relevant in the
> > foreseeable
> > > future, it should be added to the API without testing at this time.
> >
> > I remember 3 L4 cache scenario
> >  1. from IBM power-9 variant suggested in 2020-2021 in hot chips  2.
> > from Intel
> >   a. Haswell|Broadwell series with eDram as L4
> >   b. future product (at least in desktop) with L4 cache.
> >
> > > Adding it now would prevent API breakage whenever L4 cache actually
> > becomes relevant.
> > > Otherwise please don't support for it - considering it would be dead
> > and untested
> > > code.
> >
> > I can add the same to the DPDK topology probe in v4, on AMD EPYC v-
> > cache is treated as extended L3 and not L4. Hence will not be able to
> > test on AMD EPYC.
> 
> I recall from the Cache Stashing community call... There is some ACPI function to
> get the (opaque) "location IDs" of various parts in the system, to be used for setting
> the Cache Stashing hints.
> Is there only one "ACPI location ID" (I don't know the correct name) shared by the
> L3 cache and the v-cache in AMD EPYC, or do they have each their own?

At least on AMD EPYC, the stashing ID updated in either the MSI-X table or Device Specific Mode is the core-id.


> If they are not exposed as one ID, but two separate IDs, the Topology API might
> need to reflect this, so it can be used in the Cache Stashing API.

I have a different view on this and had shared it with Ajit (Broadcom) and others. To my understanding, the rte_ethdev API used
for caching hints should stay in line with rte_lcore. Depending upon the platform (e.g., Arm's specific implementation), the lcore gets translated to an L2 or L3 cache ID within the PMD.

Note: the current patch introduces topology-aware grouping, which helps applications run better on tiles or chiplets sharing the same L2|L3 or IO domain.

Snipped

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [RFC v3 1/3] eal/lcore: add topology based functions
  2024-11-04 13:08             ` Varghese, Vipin
@ 2024-11-04 14:04               ` Morten Brørup
  2024-11-05  2:17                 ` Varghese, Vipin
  0 siblings, 1 reply; 28+ messages in thread
From: Morten Brørup @ 2024-11-04 14:04 UTC (permalink / raw)
  To: Varghese, Vipin, dev, roretzla, bruce.richardson, john.mcnamara,
	dmitry.kozliuk, jerinj, David Christensen, Wathsala Vithanage,
	Min Zhou, Stanislaw Kardach, konstantin.ananyev
  Cc: ruifeng.wang, mattias.ronnblom, anatoly.burakov, stephen, Yigit,
	Ferruh, honnappa.nagarahalli

> > > > > > Does the API need to be prepared for L4 cache?
> > > > > > https://www.anandtech.com/show/16924/did-ibm-just-preview-
> the-
> > > future
> > > > > > -
> > > > > of-caches
> > > > > Thank you for the pointer, yes initial patch was considering L4
> > > cache
> > > > > too. But I was not able to get hold of system or get someone to
> > > test
> > > > > this with L4.
> > > > > Hence removed the L4 instance from dpdk_topology structure.
> > > > >
> > > > > We can introduce into v4. Can we get someone from IBM to
> confirm
> > > the
> > > > > same?
> > > >
> > > > If any of the CPU folks think L4 cache might become relevant in
> the
> > > foreseeable
> > > > future, it should be added to the API without testing at this
> time.
> > >
> > > I remember 3 L4 cache scenario
> > >  1. from IBM power-9 variant suggested in 2020-2021 in hot chips
> 2.
> > > from Intel
> > >   a. Haswell|Broadwell series with eDram as L4
> > >   b. future product (at least in desktop) with L4 cache.
> > >
> > > > Adding it now would prevent API breakage whenever L4 cache
> actually
> > > becomes relevant.
> > > > Otherwise please don't support for it - considering it would be
> dead
> > > and untested
> > > > code.
> > >
> > > I can add the same to the DPDK topology probe in v4, on AMD EPYC v-
> > > cache is treated as extended L3 and not L4. Hence will not be able
> to
> > > test on AMD EPYC.
> >
> > I recall from the Cache Stashing community call... There is some ACPI
> function to
> > get the (opaque) "location IDs" of various parts in the system, to be
> used for setting
> > the Cache Stashing hints.
> > Is there only one "ACPI location ID" (I don't know the correct name)
> shared by the
> > L3 cache and the v-cache in AMD EPYC, or do they have each their own?
> 
> At least on AMD EPYC, the stashing ID updated to either MSI-X table or
> Device Specific Mode is core-id.

Are you saying that on AMD EPYC only L2 caches have a Stashing ID, so no other CPU caches can be stashed into?
If yes, then it's a non-issue for Cache Stashing, since it doesn't need to care about L3 cache or v-cache.

> 
> 
> > If they are not exposed as one ID, but two separate IDs, the Topology
> API might
> > need to reflect this, so it can be used in the Cache Stashing API.
> 
> I have different view on the same and had shared this with Ajit
> (Broadcom) and others. To my understanding, use of rte_ethdev API used
> for caching hints should be inline to rte_lcore. Depending upon the
> platform (ARM's specific implementation, the lcore gets translated to
> L2 or L3 cache ID within the PMD.

The rte_ethdev API for cache stashing provides a higher level of abstraction, yes.

But the layer below it - the Stashing API used by the PMDs to obtain Stashing ID from "location ID" - could use the "location ID" structure type defined by the Topology library's lower layer.

> 
> Note:  the current patch introduces of Topology aware grouping, which
> helps to run application better or tiles or chiplets sharing same L2|L3
> or IO domain.

Both libraries (Topology and Cache Stashing) need to have detailed information about the hardware, although they use the information for two different purposes.

Maybe they could share a common lower layer for the system topology, perhaps just a few header files. Or maybe the Cache Stashing library should depend on the Topology library as its "lower layer" to provide the hardware information it needs.

I'm not saying that it must be so. I'm only saying that I suppose these two libraries have a lot in common, and they could try to take advantage of this, to provide a more uniform API at their lower layers.


^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [RFC v3 1/3] eal/lcore: add topology based functions
  2024-11-04 14:04               ` Morten Brørup
@ 2024-11-05  2:17                 ` Varghese, Vipin
  0 siblings, 0 replies; 28+ messages in thread
From: Varghese, Vipin @ 2024-11-05  2:17 UTC (permalink / raw)
  To: Morten Brørup, dev, roretzla, bruce.richardson,
	john.mcnamara, dmitry.kozliuk, jerinj, David Christensen,
	Wathsala Vithanage, Min Zhou, Stanislaw Kardach,
	konstantin.ananyev
  Cc: ruifeng.wang, mattias.ronnblom, anatoly.burakov, stephen, Yigit,
	Ferruh, honnappa.nagarahalli

Snipped

> > >
> > > I recall from the Cache Stashing community call... There is some
> > > ACPI
> > function to
> > > get the (opaque) "location IDs" of various parts in the system, to
> > > be
> > used for setting
> > > the Cache Stashing hints.
> > > Is there only one "ACPI location ID" (I don't know the correct name)
> > shared by the
> > > L3 cache and the v-cache in AMD EPYC, or do they have each their own?
> >
> > At least on AMD EPYC, the stashing ID updated to either MSI-X table or
> > Device Specific Mode is core-id.
> 
> Are you saying that on AMD EPYC only L2 caches have a Stashing ID, so no other
> CPU caches can be stashed into?
On AMD EPYC Zen 4 (limited Operating Part Numbers) & Zen 5, cache stashing is done at the L2 cache level.
This is done by passing the core-id as the steering tag (an opaque value).

> If yes, then it's a non-issue for Cache Stashing, since it doesn't need to care about
> L3 cache or v-cache.
Yes, that is what I am trying to imply: irrespective of the platform (Arm, PowerPC, RISC-V, Intel or AMD), cache
stashing based on the PCIe SIG spec follows each vendor's implementation. On AMD EPYC this is currently based
on the core-id (which is the hint to the platform to put the data into L2 cache).

Other platforms might support stashing into L1, L2 or L3, and for these I agree that a cache id might be used as the steering tag.

> 
> >
> >
> > > If they are not exposed as one ID, but two separate IDs, the
> > > Topology
> > API might
> > > need to reflect this, so it can be used in the Cache Stashing API.
> >
> > I have different view on the same and had shared this with Ajit
> > (Broadcom) and others. To my understanding, use of rte_ethdev API used
> > for caching hints should be inline to rte_lcore. Depending upon the
> > platform (ARM's specific implementation, the lcore gets translated to
> > L2 or L3 cache ID within the PMD.
> 
> The rte_ethdev API for cache stashing provides a higher level of abstraction, yes.
> 
> But the layer below it - the Stashing API used by the PMDs to obtain Stashing ID
> from "location ID" - could use the "location ID" structure type defined by the
> Topology library's lower layer.

Yes: consume the lcore-id from the user via the ethdev API, and internally the PMD translates it.
Based on my current understanding this can be done in two ways:

1. the translation is done using hwloc library API calls
2. using the rte_topology structure during probing, also probe the cache id for L1, L2, L3 and L4.

To achieve this, one has to add `unsigned int cache_id` to `struct core_domain_mapping`.
This allows the `rte_eal_topology_init` probe to store the cache_id.
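
As a rough sketch, the extension discussed here might look like the below;
the surrounding fields are assumed for illustration, only cache_id is the
proposed addition:

```
#include <stdint.h>

struct core_domain_mapping {
	uint16_t core_count;    /* number of lcores in this domain */
	uint16_t *cores;        /* lcore ids grouped under this domain */
	unsigned int cache_id;  /* L1/L2/L3 (or L4) id probed via hwloc,
				 * usable later as a stashing hint */
};
```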

> 
> >
> > Note:  the current patch introduces of Topology aware grouping, which
> > helps to run application better or tiles or chiplets sharing same
> > L2|L3 or IO domain.
> 
> Both libraries (Topology and Cache Stashing) need to have detailed information
> about the hardware, although they use the information for two different purposes.
> 
> Maybe they could share a common lower layer for the system topology, perhaps
> just a few header files. Or maybe the Cache Stashing library should depend on the
> Topology library as its "lower layer" to provide the hardware information it needs.
> 
> I'm not saying that it must be so. I'm only saying that I suppose these two libraries
> have a lot in common, and they could try to take advantage of this, to provide a
> more uniform API at their lower layers.

Yes, I agree; my only ask here is to first get rte_topology into the DPDK ecosystem.
Since `struct core_domain_mapping` can be easily extended, this can be done as a next step.

Note: assuming we can merge this for release 25.03 and we all concur on the `cache stashing API`, we can get it tested on AMD EPYC and other platforms alike.

I will share v4 by today.


^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2024-11-05  2:17 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-10-30  5:41 [RFC v3 0/3] Introduce Topology NUMA grouping for lcores Vipin Varghese
2024-10-30  5:41 ` [RFC v3 1/3] eal/lcore: add topology based functions Vipin Varghese
2024-10-30 15:43   ` Stephen Hemminger
2024-11-04  3:09     ` Varghese, Vipin
2024-10-30 15:44   ` Stephen Hemminger
2024-11-04  3:13     ` Varghese, Vipin
2024-11-04  8:45     ` Mattias Rönnblom
2024-10-30 15:45   ` Stephen Hemminger
2024-11-04  3:13     ` Varghese, Vipin
2024-10-30 15:47   ` Stephen Hemminger
2024-11-04  3:16     ` Varghese, Vipin
2024-11-04  7:57   ` Morten Brørup
2024-11-04  9:56     ` Varghese, Vipin
2024-11-04 11:21       ` Morten Brørup
2024-11-04 12:14         ` Varghese, Vipin
2024-11-04 12:29           ` Morten Brørup
2024-11-04 13:08             ` Varghese, Vipin
2024-11-04 14:04               ` Morten Brørup
2024-11-05  2:17                 ` Varghese, Vipin
2024-10-30  5:41 ` [RFC v3 2/3] test/lcore: enable tests for topology Vipin Varghese
2024-10-30 11:50   ` [EXTERNAL] " Pavan Nikhilesh Bhagavatula
2024-11-04  3:07     ` Varghese, Vipin
2024-10-30  5:41 ` [RFC v3 3/3] examples: add lcore topology API calls Vipin Varghese
2024-10-30 11:49   ` [EXTERNAL] " Pavan Nikhilesh Bhagavatula
2024-10-30 12:06     ` Varghese, Vipin
2024-10-30 12:37       ` Varghese, Vipin
2024-10-30 19:34         ` Pavan Nikhilesh Bhagavatula
2024-11-04  3:02           ` Varghese, Vipin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).