From: Tomasz Duszynski <tduszynski@marvell.com>
To: <tduszynski@marvell.com>, Thomas Monjalon <thomas@monjalon.net>
Cc: <Ruifeng.Wang@arm.com>, <bruce.richardson@intel.com>,
<david.marchand@redhat.com>, <dev@dpdk.org>, <jerinj@marvell.com>,
<konstantin.v.ananyev@yandex.ru>, <mattias.ronnblom@ericsson.com>,
<mb@smartsharesystems.com>, <roretzla@linux.microsoft.com>,
<zhoumin@loongson.cn>
Subject: [PATCH v13 1/4] lib: add generic support for reading PMU events
Date: Wed, 9 Oct 2024 13:23:05 +0200
Message-ID: <20241009112308.2973903-2-tduszynski@marvell.com>
In-Reply-To: <20241009112308.2973903-1-tduszynski@marvell.com>
Add support for programming PMU counters and reading their values
at runtime, bypassing the kernel completely.
This is especially useful in cases where CPU cores are isolated,
i.e. run dedicated tasks. In such cases one cannot use the standard
perf utility without sacrificing latency and performance.
Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
---
MAINTAINERS | 5 +
app/test/meson.build | 1 +
app/test/test_pmu.c | 53 +++
doc/api/doxy-api-index.md | 3 +-
doc/api/doxy-api.conf.in | 1 +
doc/guides/prog_guide/profile_app.rst | 29 ++
doc/guides/rel_notes/release_24_11.rst | 7 +
lib/meson.build | 1 +
lib/pmu/meson.build | 12 +
lib/pmu/pmu_private.h | 32 ++
lib/pmu/rte_pmu.c | 462 +++++++++++++++++++++++++
lib/pmu/rte_pmu.h | 227 ++++++++++++
lib/pmu/version.map | 14 +
13 files changed, 846 insertions(+), 1 deletion(-)
create mode 100644 app/test/test_pmu.c
create mode 100644 lib/pmu/meson.build
create mode 100644 lib/pmu/pmu_private.h
create mode 100644 lib/pmu/rte_pmu.c
create mode 100644 lib/pmu/rte_pmu.h
create mode 100644 lib/pmu/version.map
diff --git a/MAINTAINERS b/MAINTAINERS
index c5a703b5c0..80bf5968de 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1811,6 +1811,11 @@ M: Nithin Dabilpuram <ndabilpuram@marvell.com>
M: Pavan Nikhilesh <pbhagavatula@marvell.com>
F: lib/node/
+PMU - EXPERIMENTAL
+M: Tomasz Duszynski <tduszynski@marvell.com>
+F: lib/pmu/
+F: app/test/test_pmu*
+
Test Applications
-----------------
diff --git a/app/test/meson.build b/app/test/meson.build
index e29258e6ec..45f56d8aae 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -139,6 +139,7 @@ source_file_deps = {
'test_pmd_perf.c': ['ethdev', 'net'] + packet_burst_generator_deps,
'test_pmd_ring.c': ['net_ring', 'ethdev', 'bus_vdev'],
'test_pmd_ring_perf.c': ['ethdev', 'net_ring', 'bus_vdev'],
+ 'test_pmu.c': ['pmu'],
'test_power.c': ['power'],
'test_power_cpufreq.c': ['power'],
'test_power_intel_uncore.c': ['power'],
diff --git a/app/test/test_pmu.c b/app/test/test_pmu.c
new file mode 100644
index 0000000000..79376ea2e8
--- /dev/null
+++ b/app/test/test_pmu.c
@@ -0,0 +1,53 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2024 Marvell International Ltd.
+ */
+
+#include <rte_pmu.h>
+
+#include "test.h"
+
+static int
+test_pmu_read(void)
+{
+ const char *name = NULL;
+	int tries = 10, event, ret;
+	uint64_t val = 0;
+
+	if (name == NULL) {
+		printf("PMU not supported on this arch\n");
+		return TEST_SKIPPED;
+	}
+
+	ret = rte_pmu_init();
+	if (ret == -ENOTSUP) {
+		printf("pmu_autotest only supported on Linux, skipping test\n");
+		return TEST_SKIPPED;
+	}
+	if (ret < 0)
+		return TEST_SKIPPED;
+
+	event = rte_pmu_add_event(name);
+	if (event < 0) {
+		rte_pmu_fini();
+		return TEST_FAILED;
+	}
+
+	while (tries--)
+		val += rte_pmu_read(event);
+
+ rte_pmu_fini();
+
+ return val ? TEST_SUCCESS : TEST_FAILED;
+}
+
+static struct unit_test_suite pmu_tests = {
+ .suite_name = "pmu autotest",
+ .setup = NULL,
+ .teardown = NULL,
+ .unit_test_cases = {
+ TEST_CASE(test_pmu_read),
+ TEST_CASES_END()
+ }
+};
+
+static int
+test_pmu(void)
+{
+ return unit_test_suite_runner(&pmu_tests);
+}
+
+REGISTER_FAST_TEST(pmu_autotest, true, true, test_pmu);
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index f9f0300126..805efc6520 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -237,7 +237,8 @@ The public API headers are grouped by topics:
[log](@ref rte_log.h),
[errno](@ref rte_errno.h),
[trace](@ref rte_trace.h),
- [trace_point](@ref rte_trace_point.h)
+ [trace_point](@ref rte_trace_point.h),
+ [pmu](@ref rte_pmu.h)
- **misc**:
[EAL config](@ref rte_eal.h),
diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in
index a8823c046f..658490b6a2 100644
--- a/doc/api/doxy-api.conf.in
+++ b/doc/api/doxy-api.conf.in
@@ -69,6 +69,7 @@ INPUT = @TOPDIR@/doc/api/doxy-api-index.md \
@TOPDIR@/lib/pdcp \
@TOPDIR@/lib/pdump \
@TOPDIR@/lib/pipeline \
+ @TOPDIR@/lib/pmu \
@TOPDIR@/lib/port \
@TOPDIR@/lib/power \
@TOPDIR@/lib/ptr_compress \
diff --git a/doc/guides/prog_guide/profile_app.rst b/doc/guides/prog_guide/profile_app.rst
index a6b5fb4d5e..854c515a61 100644
--- a/doc/guides/prog_guide/profile_app.rst
+++ b/doc/guides/prog_guide/profile_app.rst
@@ -7,6 +7,35 @@ Profile Your Application
The following sections describe methods of profiling DPDK applications on
different architectures.
+Performance counter based profiling
+-----------------------------------
+
+The majority of architectures support some kind of performance monitoring unit (PMU).
+Such a unit provides programmable counters that monitor specific events.
+
+Different tools can gather that information, perf being one example.
+However, in some scenarios, when CPU cores are isolated and run
+dedicated tasks, interrupting those tasks with perf may be undesirable.
+
+In such cases, an application can use the PMU library to read such events via ``rte_pmu_read()``.
+
+By default, userspace applications are not allowed to access PMU internals. That can be changed
+by setting ``/proc/sys/kernel/perf_event_paranoid`` to 2 (which should be the default value
+anyway) and adding the ``CAP_PERFMON`` capability to the DPDK application. Please refer to
+``Documentation/admin-guide/perf-security.rst`` in the Linux sources for more information. A
+fairly recent kernel, i.e. >= 5.9, is advised too.
+
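The access setup described above can be done, for example, as follows. The sysctl key and capability name are as documented in perf-security.rst; the application binary name is a placeholder:

```shell
# allow unprivileged PMU access at the documented paranoid level
sudo sysctl -w kernel.perf_event_paranoid=2

# grant the DPDK application the CAP_PERFMON capability (kernel >= 5.8)
sudo setcap cap_perfmon=ep ./dpdk-app   # ./dpdk-app is a placeholder name
```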
+As of now the implementation imposes certain limitations:
+
+* Management APIs that normally return a non-negative value will return an error
+  (``-ENOTSUP``), while ``rte_pmu_read()`` will return ``UINT64_MAX``, when running on an
+  unsupported operating system.
+
+* Only EAL lcores are supported.
+
+* EAL lcores must not share a CPU.
+
+* Each EAL lcore measures the same group of events.
+
Profiling on x86
----------------
diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
index 52841d8d56..bafa24f11a 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -74,6 +74,13 @@ New Features
* Added SR-IOV VF support.
* Added recent 1400/14000 and 15000 models to the supported list.
+* **Added PMU library.**
+
+  Added a new performance monitoring unit (PMU) library which allows applications
+  to perform self-monitoring without depending on external utilities like perf.
+  After integration with :doc:`../prog_guide/trace_lib`, data gathered from hardware
+  counters can be stored in CTF format for further analysis.
+
Removed Items
-------------
diff --git a/lib/meson.build b/lib/meson.build
index 162287753f..cc7a4bd535 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -13,6 +13,7 @@ libraries = [
'kvargs', # eal depends on kvargs
'argparse',
'telemetry', # basic info querying
+ 'pmu',
'eal', # everything depends on eal
'ptr_compress',
'ring',
diff --git a/lib/pmu/meson.build b/lib/pmu/meson.build
new file mode 100644
index 0000000000..386232e5c7
--- /dev/null
+++ b/lib/pmu/meson.build
@@ -0,0 +1,12 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(C) 2024 Marvell International Ltd.
+
+headers = files('rte_pmu.h')
+
+if not is_linux
+ subdir_done()
+endif
+
+sources = files('rte_pmu.c')
+
+deps += ['log']
diff --git a/lib/pmu/pmu_private.h b/lib/pmu/pmu_private.h
new file mode 100644
index 0000000000..d2b15615bf
--- /dev/null
+++ b/lib/pmu/pmu_private.h
@@ -0,0 +1,32 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Marvell
+ */
+
+#ifndef _PMU_PRIVATE_H_
+#define _PMU_PRIVATE_H_
+
+/**
+ * Architecture specific PMU init callback.
+ *
+ * @return
+ * 0 in case of success, negative value otherwise.
+ */
+int
+pmu_arch_init(void);
+
+/**
+ * Architecture specific PMU cleanup callback.
+ */
+void
+pmu_arch_fini(void);
+
+/**
+ * Apply architecture specific settings to config before passing it to syscall.
+ *
+ * @param config
+ * Architecture specific event configuration. Consult kernel sources for available options.
+ */
+void
+pmu_arch_fixup_config(uint64_t config[3]);
+
+#endif /* _PMU_PRIVATE_H_ */
diff --git a/lib/pmu/rte_pmu.c b/lib/pmu/rte_pmu.c
new file mode 100644
index 0000000000..5c38a309d8
--- /dev/null
+++ b/lib/pmu/rte_pmu.c
@@ -0,0 +1,462 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2024 Marvell International Ltd.
+ */
+
+#include <ctype.h>
+#include <dirent.h>
+#include <errno.h>
+#include <regex.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/ioctl.h>
+#include <sys/mman.h>
+#include <sys/queue.h>
+#include <sys/syscall.h>
+#include <unistd.h>
+
+#include <rte_atomic.h>
+#include <rte_lcore.h>
+#include <rte_per_lcore.h>
+#include <rte_pmu.h>
+#include <rte_spinlock.h>
+#include <rte_tailq.h>
+
+#include "pmu_private.h"
+
+#define EVENT_SOURCE_DEVICES_PATH "/sys/bus/event_source/devices"
+
+#define GENMASK_ULL(h, l) ((~0ULL - (1ULL << (l)) + 1) & (~0ULL >> ((64 - 1 - (h)))))
+#define FIELD_PREP(m, v) (((uint64_t)(v) << (__builtin_ffsll(m) - 1)) & (m))
+
+/* A structure describing an event */
+struct rte_pmu_event {
+ char *name;
+ unsigned int index;
+ TAILQ_ENTRY(rte_pmu_event) next;
+};
+
+RTE_DEFINE_PER_LCORE(struct rte_pmu_event_group, _event_group);
+struct rte_pmu rte_pmu;
+
+/*
+ * Following __rte_weak functions provide default no-op. Architectures should override them if
+ * necessary.
+ */
+
+int
+__rte_weak pmu_arch_init(void)
+{
+ return 0;
+}
+
+void
+__rte_weak pmu_arch_fini(void)
+{
+}
+
+void
+__rte_weak pmu_arch_fixup_config(uint64_t __rte_unused config[3])
+{
+}
+
+static int
+get_term_format(const char *name, int *num, uint64_t *mask)
+{
+ char path[PATH_MAX];
+ char *config = NULL;
+ int high, low, ret;
+ FILE *fp;
+
+ *num = *mask = 0;
+ snprintf(path, sizeof(path), EVENT_SOURCE_DEVICES_PATH "/%s/format/%s", rte_pmu.name, name);
+ fp = fopen(path, "r");
+ if (fp == NULL)
+ return -errno;
+
+ errno = 0;
+ ret = fscanf(fp, "%m[^:]:%d-%d", &config, &low, &high);
+ if (ret < 2) {
+ ret = -ENODATA;
+ goto out;
+ }
+ if (errno) {
+ ret = -errno;
+ goto out;
+ }
+
+ if (ret == 2)
+ high = low;
+
+ *mask = GENMASK_ULL(high, low);
+ /* Last digit should be [012]. If last digit is missing 0 is implied. */
+ *num = config[strlen(config) - 1];
+ *num = isdigit(*num) ? *num - '0' : 0;
+
+ ret = 0;
+out:
+ free(config);
+ fclose(fp);
+
+ return ret;
+}
+
+static int
+parse_event(char *buf, uint64_t config[3])
+{
+ char *token, *term;
+ int num, ret, val;
+ uint64_t mask;
+
+ config[0] = config[1] = config[2] = 0;
+
+ token = strtok(buf, ",");
+ while (token) {
+ errno = 0;
+ /* <term>=<value> */
+ ret = sscanf(token, "%m[^=]=%i", &term, &val);
+ if (ret < 1)
+ return -ENODATA;
+ if (errno)
+ return -errno;
+ if (ret == 1)
+ val = 1;
+
+ ret = get_term_format(term, &num, &mask);
+ free(term);
+ if (ret)
+ return ret;
+
+ config[num] |= FIELD_PREP(mask, val);
+ token = strtok(NULL, ",");
+ }
+
+ return 0;
+}
+
+static int
+get_event_config(const char *name, uint64_t config[3])
+{
+ char path[PATH_MAX], buf[BUFSIZ];
+ FILE *fp;
+ int ret;
+
+ snprintf(path, sizeof(path), EVENT_SOURCE_DEVICES_PATH "/%s/events/%s", rte_pmu.name, name);
+ fp = fopen(path, "r");
+ if (fp == NULL)
+ return -errno;
+
+ ret = fread(buf, 1, sizeof(buf), fp);
+ if (ret == 0) {
+ fclose(fp);
+
+ return -EINVAL;
+ }
+ fclose(fp);
+ buf[ret] = '\0';
+
+ return parse_event(buf, config);
+}
+
+static int
+do_perf_event_open(uint64_t config[3], int group_fd)
+{
+ struct perf_event_attr attr = {
+ .size = sizeof(struct perf_event_attr),
+ .type = PERF_TYPE_RAW,
+ .exclude_kernel = 1,
+ .exclude_hv = 1,
+ .disabled = 1,
+ .pinned = group_fd == -1,
+ };
+
+ pmu_arch_fixup_config(config);
+
+ attr.config = config[0];
+ attr.config1 = config[1];
+ attr.config2 = config[2];
+
+ return syscall(SYS_perf_event_open, &attr, 0, -1, group_fd, 0);
+}
+
+static int
+open_events(struct rte_pmu_event_group *group)
+{
+ struct rte_pmu_event *event;
+ uint64_t config[3];
+ int num = 0, ret;
+
+ /* group leader gets created first, with fd = -1 */
+ group->fds[0] = -1;
+
+ TAILQ_FOREACH(event, &rte_pmu.event_list, next) {
+ ret = get_event_config(event->name, config);
+ if (ret)
+ continue;
+
+ ret = do_perf_event_open(config, group->fds[0]);
+ if (ret == -1) {
+ ret = -errno;
+ goto out;
+ }
+
+ group->fds[event->index] = ret;
+ num++;
+ }
+
+ return 0;
+out:
+ for (--num; num >= 0; num--) {
+ close(group->fds[num]);
+ group->fds[num] = -1;
+ }
+
+ return ret;
+}
+
+static int
+mmap_events(struct rte_pmu_event_group *group)
+{
+ long page_size = sysconf(_SC_PAGE_SIZE);
+ unsigned int i;
+ void *addr;
+ int ret;
+
+ for (i = 0; i < rte_pmu.num_group_events; i++) {
+ addr = mmap(0, page_size, PROT_READ, MAP_SHARED, group->fds[i], 0);
+ if (addr == MAP_FAILED) {
+ ret = -errno;
+ goto out;
+ }
+
+ group->mmap_pages[i] = addr;
+ }
+
+ return 0;
+out:
+ for (; i; i--) {
+ munmap(group->mmap_pages[i - 1], page_size);
+ group->mmap_pages[i - 1] = NULL;
+ }
+
+ return ret;
+}
+
+static void
+cleanup_events(struct rte_pmu_event_group *group)
+{
+ unsigned int i;
+
+ if (group->fds[0] != -1)
+ ioctl(group->fds[0], PERF_EVENT_IOC_DISABLE, PERF_IOC_FLAG_GROUP);
+
+ for (i = 0; i < rte_pmu.num_group_events; i++) {
+ if (group->mmap_pages[i]) {
+ munmap(group->mmap_pages[i], sysconf(_SC_PAGE_SIZE));
+ group->mmap_pages[i] = NULL;
+ }
+
+ if (group->fds[i] != -1) {
+ close(group->fds[i]);
+ group->fds[i] = -1;
+ }
+ }
+
+ group->enabled = false;
+}
+
+int
+__rte_pmu_enable_group(void)
+{
+ struct rte_pmu_event_group *group = &RTE_PER_LCORE(_event_group);
+ int ret;
+
+ if (rte_pmu.num_group_events == 0)
+ return -ENODEV;
+
+ ret = open_events(group);
+ if (ret)
+ goto out;
+
+ ret = mmap_events(group);
+ if (ret)
+ goto out;
+
+ if (ioctl(group->fds[0], PERF_EVENT_IOC_RESET, PERF_IOC_FLAG_GROUP) == -1) {
+ ret = -errno;
+ goto out;
+ }
+
+ if (ioctl(group->fds[0], PERF_EVENT_IOC_ENABLE, PERF_IOC_FLAG_GROUP) == -1) {
+ ret = -errno;
+ goto out;
+ }
+
+ rte_spinlock_lock(&rte_pmu.lock);
+ TAILQ_INSERT_TAIL(&rte_pmu.event_group_list, group, next);
+ rte_spinlock_unlock(&rte_pmu.lock);
+ group->enabled = true;
+
+ return 0;
+
+out:
+ cleanup_events(group);
+
+ return ret;
+}
+
+static int
+scan_pmus(void)
+{
+ char path[PATH_MAX];
+ struct dirent *dent;
+ const char *name;
+ DIR *dirp;
+
+ dirp = opendir(EVENT_SOURCE_DEVICES_PATH);
+ if (dirp == NULL)
+ return -errno;
+
+ while ((dent = readdir(dirp))) {
+ name = dent->d_name;
+ if (name[0] == '.')
+ continue;
+
+		/* sysfs entry should either be named "cpu" or contain a "cpus" file */
+ if (!strcmp(name, "cpu"))
+ break;
+
+ snprintf(path, sizeof(path), EVENT_SOURCE_DEVICES_PATH "/%s/cpus", name);
+ if (access(path, F_OK) == 0)
+ break;
+ }
+
+ if (dent) {
+ rte_pmu.name = strdup(name);
+ if (rte_pmu.name == NULL) {
+ closedir(dirp);
+
+ return -ENOMEM;
+ }
+ }
+
+ closedir(dirp);
+
+ return rte_pmu.name ? 0 : -ENODEV;
+}
+
+static struct rte_pmu_event *
+new_event(const char *name)
+{
+ struct rte_pmu_event *event;
+
+ event = calloc(1, sizeof(*event));
+ if (event == NULL)
+ goto out;
+
+ event->name = strdup(name);
+ if (event->name == NULL) {
+ free(event);
+ event = NULL;
+ }
+
+out:
+ return event;
+}
+
+static void
+free_event(struct rte_pmu_event *event)
+{
+ free(event->name);
+ free(event);
+}
+
+int
+rte_pmu_add_event(const char *name)
+{
+ struct rte_pmu_event *event;
+ char path[PATH_MAX];
+
+ if (rte_pmu.name == NULL)
+ return -ENODEV;
+
+ if (rte_pmu.num_group_events + 1 >= RTE_MAX_NUM_GROUP_EVENTS)
+ return -ENOSPC;
+
+ snprintf(path, sizeof(path), EVENT_SOURCE_DEVICES_PATH "/%s/events/%s", rte_pmu.name, name);
+ if (access(path, R_OK))
+ return -ENODEV;
+
+	TAILQ_FOREACH(event, &rte_pmu.event_list, next) {
+		if (!strcmp(event->name, name))
+			return event->index;
+	}
+
+ event = new_event(name);
+ if (event == NULL)
+ return -ENOMEM;
+
+ event->index = rte_pmu.num_group_events++;
+ TAILQ_INSERT_TAIL(&rte_pmu.event_list, event, next);
+
+ return event->index;
+}
+
+int
+rte_pmu_init(void)
+{
+ int ret;
+
+	/* Allow calling init from multiple contexts within a single thread. This simplifies
+	 * resource management a bit, e.g. in case a fast-path tracepoint has already been
+	 * enabled via the command line but the application performs init/fini again anyway.
+	 */
+ if (__atomic_fetch_add(&rte_pmu.initialized, 1, __ATOMIC_SEQ_CST) != 0)
+ return 0;
+
+ ret = scan_pmus();
+ if (ret)
+ goto out;
+
+ ret = pmu_arch_init();
+ if (ret)
+ goto out;
+
+ TAILQ_INIT(&rte_pmu.event_list);
+ TAILQ_INIT(&rte_pmu.event_group_list);
+ rte_spinlock_init(&rte_pmu.lock);
+
+ return 0;
+out:
+ free(rte_pmu.name);
+ rte_pmu.name = NULL;
+
+ return ret;
+}
+
+void
+rte_pmu_fini(void)
+{
+ struct rte_pmu_event_group *group, *tmp_group;
+ struct rte_pmu_event *event, *tmp_event;
+
+ /* cleanup once init count drops to zero */
+ if (__atomic_fetch_sub(&rte_pmu.initialized, 1, __ATOMIC_SEQ_CST) - 1 != 0)
+ return;
+
+ RTE_TAILQ_FOREACH_SAFE(event, &rte_pmu.event_list, next, tmp_event) {
+ TAILQ_REMOVE(&rte_pmu.event_list, event, next);
+ free_event(event);
+ }
+
+ RTE_TAILQ_FOREACH_SAFE(group, &rte_pmu.event_group_list, next, tmp_group) {
+ TAILQ_REMOVE(&rte_pmu.event_group_list, group, next);
+ cleanup_events(group);
+ }
+
+ pmu_arch_fini();
+ free(rte_pmu.name);
+ rte_pmu.name = NULL;
+ rte_pmu.num_group_events = 0;
+}
diff --git a/lib/pmu/rte_pmu.h b/lib/pmu/rte_pmu.h
new file mode 100644
index 0000000000..09238ee33d
--- /dev/null
+++ b/lib/pmu/rte_pmu.h
@@ -0,0 +1,227 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Marvell
+ */
+
+#ifndef _RTE_PMU_H_
+#define _RTE_PMU_H_
+
+/**
+ * @file
+ *
+ * PMU event tracing operations
+ *
+ * This file defines the generic API and types necessary to set up the PMU
+ * and read selected counters at runtime.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <errno.h>
+
+#include <rte_common.h>
+#include <rte_compat.h>
+
+#ifdef RTE_EXEC_ENV_LINUX
+
+#include <linux/perf_event.h>
+
+#include <rte_atomic.h>
+#include <rte_branch_prediction.h>
+#include <rte_spinlock.h>
+
+/** Maximum number of events in a group */
+#define RTE_MAX_NUM_GROUP_EVENTS 8
+
+/**
+ * A structure describing a group of events.
+ */
+struct __rte_cache_aligned rte_pmu_event_group {
+ /** array of user pages */
+ struct perf_event_mmap_page *mmap_pages[RTE_MAX_NUM_GROUP_EVENTS];
+ int fds[RTE_MAX_NUM_GROUP_EVENTS]; /**< array of event descriptors */
+ bool enabled; /**< true if group was enabled on particular lcore */
+ TAILQ_ENTRY(rte_pmu_event_group) next; /**< list entry */
+};
+
+/**
+ * A PMU state container.
+ */
+struct rte_pmu {
+ char *name; /**< name of core PMU listed under /sys/bus/event_source/devices */
+ rte_spinlock_t lock; /**< serialize access to event group list */
+ TAILQ_HEAD(, rte_pmu_event_group) event_group_list; /**< list of event groups */
+ unsigned int num_group_events; /**< number of events in a group */
+ TAILQ_HEAD(, rte_pmu_event) event_list; /**< list of matching events */
+ unsigned int initialized; /**< initialization counter */
+};
+
+/** lcore event group */
+RTE_DECLARE_PER_LCORE(struct rte_pmu_event_group, _event_group);
+
+/** PMU state container */
+extern struct rte_pmu rte_pmu;
+
+/** Each architecture supporting PMU needs to provide its own version */
+#ifndef rte_pmu_pmc_read
+#define rte_pmu_pmc_read(index) ({ (void)(index); 0; })
+#endif
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Read PMU counter.
+ *
+ * @warning This should not be called directly.
+ *
+ * @param pc
+ * Pointer to the mmapped user page.
+ * @return
+ * Counter value read from hardware.
+ */
+__rte_experimental
+static __rte_always_inline uint64_t
+__rte_pmu_read_userpage(struct perf_event_mmap_page *pc)
+{
+#define __RTE_PMU_READ_ONCE(x) (*(const volatile typeof(x) *)&(x))
+ uint64_t width, offset;
+ uint32_t seq, index;
+ int64_t pmc;
+
+ for (;;) {
+ seq = __RTE_PMU_READ_ONCE(pc->lock);
+ rte_compiler_barrier();
+ index = __RTE_PMU_READ_ONCE(pc->index);
+ offset = __RTE_PMU_READ_ONCE(pc->offset);
+ width = __RTE_PMU_READ_ONCE(pc->pmc_width);
+
+ /* index set to 0 means that particular counter cannot be used */
+ if (likely(pc->cap_user_rdpmc && index)) {
+ pmc = rte_pmu_pmc_read(index - 1);
+ pmc <<= 64 - width;
+ pmc >>= 64 - width;
+ offset += pmc;
+ }
+
+ rte_compiler_barrier();
+
+ if (likely(__RTE_PMU_READ_ONCE(pc->lock) == seq))
+ return offset;
+	}
+}
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Enable group of events on the calling lcore.
+ *
+ * @warning This should not be called directly.
+ *
+ * @return
+ * 0 in case of success, negative value otherwise.
+ */
+__rte_experimental
+int
+__rte_pmu_enable_group(void);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Initialize PMU library.
+ *
+ * @warning This should not be called directly.
+ *
+ * @return
+ * 0 in case of success, negative value otherwise.
+ */
+__rte_experimental
+int
+rte_pmu_init(void);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Finalize PMU library. This should be called after PMU counters are no longer being read.
+ */
+__rte_experimental
+void
+rte_pmu_fini(void);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Add event to the group of enabled events.
+ *
+ * @param name
+ * Name of an event listed under /sys/bus/event_source/devices/pmu/events.
+ * @return
+ * Event index in case of success, negative value otherwise.
+ */
+__rte_experimental
+int
+rte_pmu_add_event(const char *name);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Read hardware counter configured to count occurrences of an event.
+ *
+ * @param index
+ * Index of an event to be read.
+ * @return
+ *   Event value read from the register. In case of errors or lack of support
+ *   0 is returned. In other words, a stream of zeros in a trace file
+ *   indicates a problem with reading a particular PMU event register.
+ */
+__rte_experimental
+static __rte_always_inline uint64_t
+rte_pmu_read(unsigned int index)
+{
+ struct rte_pmu_event_group *group = &RTE_PER_LCORE(_event_group);
+ int ret;
+
+ if (unlikely(!rte_pmu.initialized))
+ return 0;
+
+ if (unlikely(!group->enabled)) {
+ ret = __rte_pmu_enable_group();
+ if (ret)
+ return 0;
+ }
+
+ if (unlikely(index >= rte_pmu.num_group_events))
+ return 0;
+
+ return __rte_pmu_read_userpage(group->mmap_pages[index]);
+}
+
+#else /* !RTE_EXEC_ENV_LINUX */
+
+__rte_experimental
+static inline int rte_pmu_init(void) { return -ENOTSUP; }
+
+__rte_experimental
+static inline void rte_pmu_fini(void) { }
+
+__rte_experimental
+static inline int rte_pmu_add_event(const char *name __rte_unused) { return -ENOTSUP; }
+
+__rte_experimental
+static inline uint64_t rte_pmu_read(unsigned int index __rte_unused) { return UINT64_MAX; }
+
+#endif /* RTE_EXEC_ENV_LINUX */
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_PMU_H_ */
diff --git a/lib/pmu/version.map b/lib/pmu/version.map
new file mode 100644
index 0000000000..d7c80ce4ce
--- /dev/null
+++ b/lib/pmu/version.map
@@ -0,0 +1,14 @@
+EXPERIMENTAL {
+ global:
+
+ # added in 24.11
+ __rte_pmu_enable_group;
+ per_lcore__event_group;
+ rte_pmu;
+ rte_pmu_add_event;
+ rte_pmu_fini;
+ rte_pmu_init;
+ rte_pmu_read;
+
+ local: *;
+};
--
2.34.1
2024-10-25 8:54 ` [PATCH v15 2/4] pmu: support reading ARM PMU events in runtime Tomasz Duszynski
2024-10-25 8:54 ` [PATCH v15 3/4] pmu: support reading Intel x86_64 " Tomasz Duszynski
2024-10-25 8:54 ` [PATCH v15 4/4] eal: add PMU support to tracing library Tomasz Duszynski
2024-10-25 11:02 ` Jerin Jacob
2024-10-28 10:32 ` [EXTERNAL] " Tomasz Duszynski
2024-11-05 7:41 ` Morten Brørup
2024-11-08 10:36 ` Tomasz Duszynski
2024-11-05 11:04 ` Konstantin Ananyev
2024-11-08 11:44 ` Tomasz Duszynski
2024-11-12 23:09 ` Stephen Hemminger
2024-11-15 10:24 ` [EXTERNAL] " Tomasz Duszynski
2024-11-05 4:04 ` [PATCH v15 0/4] add support for self monitoring Tomasz Duszynski
2023-01-25 10:33 ` [PATCH 0/2] add platform bus Tomasz Duszynski
2023-01-25 10:33 ` [PATCH 1/2] lib: add helper to read strings from sysfs files Tomasz Duszynski
2023-01-25 10:39 ` Thomas Monjalon
2023-01-25 16:16 ` Tyler Retzlaff
2023-01-26 8:30 ` [EXT] " Tomasz Duszynski
2023-01-26 17:21 ` Tyler Retzlaff
2023-01-26 8:35 ` Tomasz Duszynski
2023-01-25 10:33 ` [PATCH 2/2] bus: add platform bus Tomasz Duszynski
2023-01-25 10:41 ` [PATCH 0/2] " Tomasz Duszynski
2023-02-16 20:56 ` [PATCH v5 0/4] add support for self monitoring Liang Ma