DPDK patches and discussions
From: Huisong Li <lihuisong@huawei.com>
To: <dev@dpdk.org>
Cc: <mb@smartsharesystems.com>, <thomas@monjalon.net>,
	<ferruh.yigit@amd.com>,  <anatoly.burakov@intel.com>,
	<david.hunt@intel.com>, <sivaprasad.tummala@amd.com>,
	<stephen@networkplumber.org>, <konstantin.ananyev@huawei.com>,
	<david.marchand@redhat.com>, <fengchengwen@huawei.com>,
	<liuyonglong@huawei.com>, <lihuisong@huawei.com>
Subject: [PATCH v11 1/2] power: introduce PM QoS API on CPU wide
Date: Mon, 21 Oct 2024 19:42:52 +0800
Message-ID: <20241021114253.31216-2-lihuisong@huawei.com>
In-Reply-To: <20241021114253.31216-1-lihuisong@huawei.com>

The deeper the idle state, the lower the power consumption, but the longer
the resume time. Some services are delay-sensitive and expect a low resume
time, for example the interrupt packet receiving mode.

The "/sys/devices/system/cpu/cpuX/power/pm_qos_resume_latency_us" sysfs
interface lets userspace set and get the resume latency limit of cpuX.
Each cpuidle governor in Linux selects which idle state to enter based on
this CPU resume latency limit in its idle task.

The per-CPU PM QoS API can be used to control this CPU's idle state
selection. Setting a strict resume latency (zero value) restricts the CPU
to the shallowest idle state, which lowers the wake-up delay.

Signed-off-by: Huisong Li <lihuisong@huawei.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
---
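Editorial note, not part of the commit: the value translation done in
rte_power_qos.c can be hard to follow from the diff alone, so here is a
minimal standalone sketch of it. The helper names latency_to_sysfs() and
sysfs_to_latency() are hypothetical and exist only for illustration; the
patch performs the same mapping inline. NO_CONSTRAINT mirrors the
RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT definition from the header.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Same value as RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT in this patch. */
#define NO_CONSTRAINT ((int)(UINT32_MAX >> 1))

/* Hypothetical helper: API latency value -> string written to sysfs. */
static int
latency_to_sysfs(int latency, char *buf, size_t len)
{
	if (latency < 0)
		return -1;                         /* the API rejects negative values */
	if (latency == 0)
		snprintf(buf, len, "%s", "n/a");   /* strict: no resume latency tolerated */
	else if (latency == NO_CONSTRAINT)
		snprintf(buf, len, "%d", 0);       /* "0" means no constraint */
	else
		snprintf(buf, len, "%d", latency); /* limit in microseconds */
	return 0;
}

/* Hypothetical helper: string read from sysfs -> API latency value. */
static int
sysfs_to_latency(const char *buf)
{
	int latency;

	if (strcmp(buf, "n/a") == 0)
		return 0;
	latency = (int)strtoul(buf, NULL, 10);
	return latency == 0 ? NO_CONSTRAINT : latency;
}
```

The point of the sketch is the inversion: the API's zero (strict) maps to
"n/a" in sysfs, while the sysfs "0" maps to the API's no-constraint value.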
 doc/guides/prog_guide/power_man.rst    |  19 ++++
 doc/guides/rel_notes/release_24_11.rst |   5 +
 lib/power/meson.build                  |   2 +
 lib/power/rte_power_qos.c              | 123 +++++++++++++++++++++++++
 lib/power/rte_power_qos.h              |  73 +++++++++++++++
 lib/power/version.map                  |   4 +
 6 files changed, 226 insertions(+)
 create mode 100644 lib/power/rte_power_qos.c
 create mode 100644 lib/power/rte_power_qos.h

diff --git a/doc/guides/prog_guide/power_man.rst b/doc/guides/prog_guide/power_man.rst
index f6674efe2d..91358b04f3 100644
--- a/doc/guides/prog_guide/power_man.rst
+++ b/doc/guides/prog_guide/power_man.rst
@@ -107,6 +107,25 @@ User Cases
 The power management mechanism is used to save power when performing L3 forwarding.
 
 
+PM QoS
+------
+
+The "/sys/devices/system/cpu/cpuX/power/pm_qos_resume_latency_us" sysfs
+interface lets userspace set and get the resume latency limit of cpuX.
+Each cpuidle governor in Linux selects which idle state to enter based on
+this CPU resume latency limit in its idle task.
+
+The deeper the idle state, the lower the power consumption, but the longer
+the resume time. Some services are latency-sensitive and expect a low
+resume time, for example the interrupt packet receiving mode.
+
+Applications can set and get the CPU resume latency with
+``rte_power_qos_set_cpu_resume_latency()`` and ``rte_power_qos_get_cpu_resume_latency()``
+respectively. Setting a strict resume latency (zero value) with
+``rte_power_qos_set_cpu_resume_latency()`` lowers the resume latency and
+improves performance, although the platform's power consumption may increase.
+
+
 Ethernet PMD Power Management API
 ---------------------------------
 
diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
index fa4822d928..d9e268274b 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -237,6 +237,11 @@ New Features
   This field is used to pass an extra configuration settings such as ability
   to lookup IPv4 addresses in network byte order.
 
+* **Introduce per-CPU PM QoS interface.**
+
+  * Added per-CPU PM QoS interface to lower the resume latency when waking up
+    from an idle state.
+
 * **Added new API to register telemetry endpoint callbacks with private arguments.**
 
   A new ``rte_telemetry_register_cmd_arg`` function is available to pass an opaque value to
diff --git a/lib/power/meson.build b/lib/power/meson.build
index 2f0f3d26e9..9b5d3e8315 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -23,12 +23,14 @@ sources = files(
         'rte_power.c',
         'rte_power_uncore.c',
         'rte_power_pmd_mgmt.c',
+	'rte_power_qos.c',
 )
 headers = files(
         'rte_power.h',
         'rte_power_guest_channel.h',
         'rte_power_pmd_mgmt.h',
         'rte_power_uncore.h',
+	'rte_power_qos.h',
 )
 
 deps += ['timer', 'ethdev']
diff --git a/lib/power/rte_power_qos.c b/lib/power/rte_power_qos.c
new file mode 100644
index 0000000000..09692b2161
--- /dev/null
+++ b/lib/power/rte_power_qos.c
@@ -0,0 +1,123 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 HiSilicon Limited
+ */
+
+#include <errno.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include <rte_lcore.h>
+#include <rte_log.h>
+
+#include "power_common.h"
+#include "rte_power_qos.h"
+
+#define PM_QOS_SYSFILE_RESUME_LATENCY_US	\
+	"/sys/devices/system/cpu/cpu%u/power/pm_qos_resume_latency_us"
+
+#define PM_QOS_CPU_RESUME_LATENCY_BUF_LEN	32
+
+int
+rte_power_qos_set_cpu_resume_latency(uint16_t lcore_id, int latency)
+{
+	char buf[PM_QOS_CPU_RESUME_LATENCY_BUF_LEN];
+	uint32_t cpu_id;
+	FILE *f;
+	int ret;
+
+	if (!rte_lcore_is_enabled(lcore_id)) {
+		POWER_LOG(ERR, "lcore id %u is not enabled", lcore_id);
+		return -EINVAL;
+	}
+	ret = power_get_lcore_mapped_cpu_id(lcore_id, &cpu_id);
+	if (ret != 0)
+		return ret;
+
+	if (latency < 0) {
+		POWER_LOG(ERR, "latency should be greater than or equal to 0");
+		return -EINVAL;
+	}
+
+	ret = open_core_sysfs_file(&f, "w", PM_QOS_SYSFILE_RESUME_LATENCY_US, cpu_id);
+	if (ret != 0) {
+		POWER_LOG(ERR, "Failed to open "PM_QOS_SYSFILE_RESUME_LATENCY_US" : %s",
+			  cpu_id, strerror(errno));
+		return ret;
+	}
+
+	/*
+	 * Based on the sysfs interface pm_qos_resume_latency_us, i.e. the
+	 * file @PM_QOS_SYSFILE_RESUME_LATENCY_US in the kernel, the input
+	 * string is interpreted as follows:
+	 * 1> the resume latency is 0 if the input is "n/a".
+	 * 2> there is no resume latency constraint if the input is "0".
+	 * 3> otherwise the input is the actual resume latency to be set.
+	 */
+	if (latency == 0)
+		snprintf(buf, sizeof(buf), "%s", "n/a");
+	else if (latency == RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT)
+		snprintf(buf, sizeof(buf), "%u", 0);
+	else
+		snprintf(buf, sizeof(buf), "%d", latency);
+
+	ret = write_core_sysfs_s(f, buf);
+	if (ret != 0)
+		POWER_LOG(ERR, "Failed to write "PM_QOS_SYSFILE_RESUME_LATENCY_US" : %s",
+			  cpu_id, strerror(errno));
+
+	fclose(f);
+
+	return ret;
+}
+
+int
+rte_power_qos_get_cpu_resume_latency(uint16_t lcore_id)
+{
+	char buf[PM_QOS_CPU_RESUME_LATENCY_BUF_LEN];
+	int latency = -1;
+	uint32_t cpu_id;
+	FILE *f;
+	int ret;
+
+	if (!rte_lcore_is_enabled(lcore_id)) {
+		POWER_LOG(ERR, "lcore id %u is not enabled", lcore_id);
+		return -EINVAL;
+	}
+	ret = power_get_lcore_mapped_cpu_id(lcore_id, &cpu_id);
+	if (ret != 0)
+		return ret;
+
+	ret = open_core_sysfs_file(&f, "r", PM_QOS_SYSFILE_RESUME_LATENCY_US, cpu_id);
+	if (ret != 0) {
+		POWER_LOG(ERR, "Failed to open "PM_QOS_SYSFILE_RESUME_LATENCY_US" : %s",
+			  cpu_id, strerror(errno));
+		return ret;
+	}
+
+	ret = read_core_sysfs_s(f, buf, sizeof(buf));
+	if (ret != 0) {
+		POWER_LOG(ERR, "Failed to read "PM_QOS_SYSFILE_RESUME_LATENCY_US" : %s",
+			  cpu_id, strerror(errno));
+		goto out;
+	}
+
+	/*
+	 * Based on the sysfs interface pm_qos_resume_latency_us, i.e. the
+	 * file @PM_QOS_SYSFILE_RESUME_LATENCY_US in the kernel, the output
+	 * string is interpreted as follows:
+	 * 1> the resume latency is 0 if the output is "n/a".
+	 * 2> there is no resume latency constraint if the output is "0".
+	 * 3> otherwise the output is the actual resume latency in use.
+	 */
+	if (strcmp(buf, "n/a") == 0)
+		latency = 0;
+	else {
+		latency = strtoul(buf, NULL, 10);
+		latency = latency == 0 ? RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT : latency;
+	}
+
+out:
+	fclose(f);
+
+	return latency != -1 ? latency : ret;
+}
diff --git a/lib/power/rte_power_qos.h b/lib/power/rte_power_qos.h
new file mode 100644
index 0000000000..990c488373
--- /dev/null
+++ b/lib/power/rte_power_qos.h
@@ -0,0 +1,73 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 HiSilicon Limited
+ */
+
+#ifndef RTE_POWER_QOS_H
+#define RTE_POWER_QOS_H
+
+#include <stdint.h>
+
+#include <rte_compat.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * @file rte_power_qos.h
+ *
+ * PM QoS API.
+ *
+ * The CPU-wide resume latency limit affects this CPU's idle state
+ * selection in each cpuidle governor.
+ * For details on the resume latency limit, see the following link:
+ * https://www.kernel.org/doc/html/latest/admin-guide/abi-testing.html?highlight=pm_qos_resume_latency_us#abi-sys-devices-power-pm-qos-resume-latency-us
+ *
+ * The deeper the idle state, the lower the power consumption, but the
+ * longer the resume time. Some services are delay-sensitive and expect a
+ * low resume time, for example the interrupt packet receiving mode.
+ *
+ * In these cases, the per-CPU PM QoS API can be used to control this CPU's
+ * idle state selection. Setting a strict resume latency (zero value) limits
+ * the CPU to the shallowest idle state, lowering the delay after sleep.
+ */
+
+#define RTE_POWER_QOS_STRICT_LATENCY_VALUE             0
+#define RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT    ((int)(UINT32_MAX >> 1))
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * @param lcore_id
+ *   target logical core id
+ *
+ * @param latency
+ *   The resume latency in microseconds; must be greater than or equal to zero.
+ *
+ * @return
+ *   0 on success. Otherwise negative value is returned.
+ */
+__rte_experimental
+int rte_power_qos_set_cpu_resume_latency(uint16_t lcore_id, int latency);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get the current resume latency of this logical core.
+ * The kernel default is RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT
+ * if the limit has not been set.
+ *
+ * @return
+ *   Negative value on failure.
+ *   >= 0 means the actual resume latency limit on this core.
+ */
+__rte_experimental
+int rte_power_qos_get_cpu_resume_latency(uint16_t lcore_id);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_POWER_QOS_H */
diff --git a/lib/power/version.map b/lib/power/version.map
index c9a226614e..08f178a39d 100644
--- a/lib/power/version.map
+++ b/lib/power/version.map
@@ -51,4 +51,8 @@ EXPERIMENTAL {
 	rte_power_set_uncore_env;
 	rte_power_uncore_freqs;
 	rte_power_unset_uncore_env;
+
+	# added in 24.11
+	rte_power_qos_get_cpu_resume_latency;
+	rte_power_qos_set_cpu_resume_latency;
 };
-- 
2.22.0

