DPDK patches and discussions
From: Huisong Li <lihuisong@huawei.com>
To: <dev@dpdk.org>
Cc: <thomas@monjalon.net>, <ferruh.yigit@amd.com>,
	<anatoly.burakov@intel.com>, <david.hunt@intel.com>,
	<sivaprasad.tummala@amd.com>, <liuyonglong@huawei.com>,
	<lihuisong@huawei.com>
Subject: [PATCH 1/2] power: introduce PM QoS interface
Date: Wed, 20 Mar 2024 18:55:28 +0800
Message-ID: <20240320105529.5626-2-lihuisong@huawei.com>
In-Reply-To: <20240320105529.5626-1-lihuisong@huawei.com>

The system-wide CPU latency QoS limit constrains which idle states the
cpuidle governor may select.

Linux provides a cpu_dma_latency device under the '/dev' directory that
lets userspace read the system-wide CPU latency QoS limit and send its own
QoS request.
Please see the PM QoS framework in the following link:
https://docs.kernel.org/power/pm_qos_interface.html?highlight=qos
This feature has been supported since kernel v2.6.25.

The deeper the idle state, the lower the power consumption, but the longer
the resume time. Some services are delay sensitive and expect a very low
resume time, for example in interrupt packet receiving mode.

So this PM QoS API makes it easy to obtain the system-wide CPU latency
limit and to send a CPU latency QoS request for applications that need it.

The recommended usage is as follows:
1) The application process first creates a QoS request.
2) Update the CPU latency request to zero when needed.
3) Restore the default value when low latency is no longer needed (this
   step is optional).
4) Release the QoS request when the process exits.

Signed-off-by: Huisong Li <lihuisong@huawei.com>
---
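Note: the four usage steps in the commit message can be sketched directly
against the kernel device, without the library. This is only a minimal
illustration, not the code in this patch. The qos_request_*() helper names
are hypothetical, and the path is passed as a parameter (an assumption made
so the helpers can be exercised on a regular file). It relies on the kernel
interface accepting a 32-bit latency value in microseconds and holding the
request for as long as the file descriptor stays open.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/* Device node described in the commit message. */
#define CPU_DMA_LATENCY_DEV "/dev/cpu_dma_latency"

/*
 * Hypothetical helper: open the QoS device for writing. The request stays
 * active for as long as the returned descriptor is open.
 * Returns the descriptor, or -errno on failure.
 */
static int
qos_request_create(const char *path)
{
	int fd = open(path, O_WRONLY);

	return fd < 0 ? -errno : fd;
}

/*
 * Hypothetical helper: write a 32-bit latency target in microseconds.
 * Returns 0 on success, -EINVAL for a negative value, and -errno or -EIO
 * on a failed or short write.
 */
static int
qos_request_update(int fd, int32_t latency_us)
{
	ssize_t n;

	if (latency_us < 0)
		return -EINVAL;

	n = write(fd, &latency_us, sizeof(latency_us));
	if (n < 0)
		return -errno;

	return n == (ssize_t)sizeof(latency_us) ? 0 : -EIO;
}

/* Hypothetical helper: closing the descriptor drops the request. */
static void
qos_request_release(int fd)
{
	if (fd >= 0)
		close(fd);
}
```

With the API proposed in this patch, the same sequence would be
rte_power_create_qos_request(), rte_power_qos_update_request(0),
optionally rte_power_qos_update_request(PM_QOS_CPU_LATENCY_DEFAULT_VALUE),
and finally rte_power_release_qos_request().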
 doc/guides/prog_guide/power_man.rst    |  16 ++++
 doc/guides/rel_notes/release_24_03.rst |   4 +
 lib/power/meson.build                  |   2 +
 lib/power/rte_power_qos.c              |  98 ++++++++++++++++++++++++
 lib/power/rte_power_qos.h              | 101 +++++++++++++++++++++++++
 lib/power/version.map                  |   4 +
 6 files changed, 225 insertions(+)
 create mode 100644 lib/power/rte_power_qos.c
 create mode 100644 lib/power/rte_power_qos.h

diff --git a/doc/guides/prog_guide/power_man.rst b/doc/guides/prog_guide/power_man.rst
index f6674efe2d..493c75bf9d 100644
--- a/doc/guides/prog_guide/power_man.rst
+++ b/doc/guides/prog_guide/power_man.rst
@@ -249,6 +249,22 @@ Get Num Pkgs
 Get Num Dies
   Get the number of die's on a given package.
 
+PM QoS API
+----------
+The deeper the idle state, the lower the power consumption, but the longer
+the resume time. Some service threads are delay sensitive and expect a very
+low resume time, for example in interrupt packet receiving mode.
+
+This PM QoS API is used to obtain the system-wide CPU latency limit and to
+send a CPU latency QoS request for applications that need it.
+
+* ``rte_power_qos_get_curr_cpu_latency()`` is used to get the current CPU
+  latency limit on the system.
+* To send a CPU latency QoS request, first call ``rte_power_create_qos_request()``
+  to create a QoS request, then update the CPU latency value by calling
+  ``rte_power_qos_update_request()``. ``rte_power_release_qos_request()`` is
+  used to release this QoS request when the process exits.
+
 References
 ----------
 
diff --git a/doc/guides/rel_notes/release_24_03.rst b/doc/guides/rel_notes/release_24_03.rst
index 14826ea08f..b5be724133 100644
--- a/doc/guides/rel_notes/release_24_03.rst
+++ b/doc/guides/rel_notes/release_24_03.rst
@@ -196,6 +196,10 @@ New Features
   Added DMA producer mode to measure performance of ``OP_FORWARD`` mode
   of event DMA adapter.
 
+* **Added CPU latency PM QoS support.**
+
+  Added an interface to query the system-wide CPU latency PM QoS limit and
+  an interface to send a CPU latency QoS request in the power library.
 
 Removed Items
 -------------
diff --git a/lib/power/meson.build b/lib/power/meson.build
index b8426589b2..8222e178b0 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -23,12 +23,14 @@ sources = files(
         'rte_power.c',
         'rte_power_uncore.c',
         'rte_power_pmd_mgmt.c',
+        'rte_power_qos.c',
 )
 headers = files(
         'rte_power.h',
         'rte_power_guest_channel.h',
         'rte_power_pmd_mgmt.h',
         'rte_power_uncore.h',
+        'rte_power_qos.h',
 )
 if cc.has_argument('-Wno-cast-qual')
     cflags += '-Wno-cast-qual'
diff --git a/lib/power/rte_power_qos.c b/lib/power/rte_power_qos.c
new file mode 100644
index 0000000000..d2b55923a0
--- /dev/null
+++ b/lib/power/rte_power_qos.c
@@ -0,0 +1,98 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 HiSilicon Limited
+ */
+
+#include <errno.h>
+#include <fcntl.h>
+#include <unistd.h>
+
+#include <rte_log.h>
+
+#include "power_common.h"
+#include "rte_power_qos.h"
+
+#define QOS_CPU_DMA_LATENCY_DEV "/dev/cpu_dma_latency"
+
+struct rte_power_qos_info {
+	/*
+	 * Keep the file descriptor open so that the QoS request can be
+	 * updated until it is no longer needed.
+	 */
+	int fd;
+	int cur_cpu_latency; /* unit microseconds */
+};
+
+static struct rte_power_qos_info g_qos = {
+	.fd = -1,
+	.cur_cpu_latency = -1,
+};
+
+int
+rte_power_qos_get_curr_cpu_latency(int *latency)
+{
+	int fd, ret;
+
+	fd = open(QOS_CPU_DMA_LATENCY_DEV, O_RDONLY);
+	if (fd < 0) {
+		POWER_LOG(ERR, "Failed to open %s", QOS_CPU_DMA_LATENCY_DEV);
+		return -1;
+	}
+
+	ret = read(fd, latency, sizeof(*latency));
+	close(fd);
+	if (ret != (int)sizeof(*latency)) {
+		POWER_LOG(ERR, "Failed to read %s", QOS_CPU_DMA_LATENCY_DEV);
+		return -1;
+	}
+
+	return 0;
+}
+
+int
+rte_power_qos_update_request(int latency)
+{
+	int ret;
+
+	if (g_qos.fd == -1) {
+		POWER_LOG(ERR, "please create QoS request first.");
+		return -EINVAL;
+	}
+
+	if (latency < 0) {
+		POWER_LOG(ERR, "latency should be a non-negative number.");
+		return -EINVAL;
+	}
+
+	if (g_qos.cur_cpu_latency != -1 && latency == g_qos.cur_cpu_latency)
+		return 0;
+
+	ret = write(g_qos.fd, &latency, sizeof(latency));
+	if (ret != (int)sizeof(latency)) {
+		POWER_LOG(ERR, "Failed to write %s", QOS_CPU_DMA_LATENCY_DEV);
+		return -1;
+	}
+	g_qos.cur_cpu_latency = latency;
+
+	return 0;
+}
+
+int
+rte_power_create_qos_request(void)
+{
+	g_qos.fd = open(QOS_CPU_DMA_LATENCY_DEV, O_WRONLY);
+	if (g_qos.fd < 0) {
+		POWER_LOG(ERR, "Failed to open %s.", QOS_CPU_DMA_LATENCY_DEV);
+		return -1;
+	}
+
+	return 0;
+}
+
+void
+rte_power_release_qos_request(void)
+{
+	if (g_qos.fd != -1) {
+		close(g_qos.fd);
+		g_qos.fd = -1;
+	}
+}
diff --git a/lib/power/rte_power_qos.h b/lib/power/rte_power_qos.h
new file mode 100644
index 0000000000..d39f5d0c0f
--- /dev/null
+++ b/lib/power/rte_power_qos.h
@@ -0,0 +1,101 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 HiSilicon Limited
+ */
+
+#ifndef RTE_POWER_QOS_H
+#define RTE_POWER_QOS_H
+
+#include <rte_compat.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * @file rte_power_qos.h
+ *
+ * PM QoS API.
+ *
+ * The system-wide CPU latency QoS limit constrains which idle states the
+ * cpuidle governor may select.
+ *
+ * Linux provides a cpu_dma_latency device under the '/dev' directory that
+ * lets userspace read the system-wide CPU latency QoS limit and send its own
+ * QoS request. Please see the PM QoS framework in the following link:
+ * https://docs.kernel.org/power/pm_qos_interface.html?highlight=qos
+ *
+ * The deeper the idle state, the lower the power consumption, but the longer
+ * the resume time. Some services are delay sensitive and expect a very low
+ * resume time, for example in interrupt packet receiving mode.
+ *
+ * So this PM QoS API makes it easy to obtain the system-wide CPU latency limit
+ * and to send a CPU latency QoS request for applications that need it.
+ *
+ * The recommended usage is as follows:
+ * 1) The application process first creates a QoS request.
+ * 2) Update the CPU latency request to zero when needed.
+ * 3) Restore the default value @see PM_QOS_CPU_LATENCY_DEFAULT_VALUE when
+ *    low latency is no longer needed (this step is optional).
+ * 4) Release the QoS request when the process exits.
+ */
+
+#define QOS_USEC_PER_SEC                        1000000
+#define PM_QOS_CPU_LATENCY_DEFAULT_VALUE        (2000 * QOS_USEC_PER_SEC)
+#define PM_QOS_STRICT_LATENCY_VALUE             0
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Create a CPU latency QoS request. Release it with
+ * rte_power_release_qos_request().
+ *
+ * @return
+ *   0 on success. Otherwise negative value is returned.
+ */
+__rte_experimental
+int rte_power_create_qos_request(void);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Release the CPU latency QoS request.
+ */
+__rte_experimental
+void rte_power_release_qos_request(void);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get the current CPU latency QoS limit on the system.
+ * The default value in the kernel is PM_QOS_CPU_LATENCY_DEFAULT_VALUE.
+ *
+ * @return
+ *   0 on success. Otherwise negative value is returned.
+ */
+__rte_experimental
+int rte_power_qos_get_curr_cpu_latency(int *latency);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Update the CPU latency QoS request.
+ * Note: a QoS request must be created before calling this API.
+ *
+ * @param latency
+ *   The latency in microseconds, greater than or equal to zero.
+ *
+ * @return
+ *   0 on success. Otherwise negative value is returned.
+ */
+__rte_experimental
+int rte_power_qos_update_request(int latency);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_POWER_QOS_H */
diff --git a/lib/power/version.map b/lib/power/version.map
index ad92a65f91..42770762b1 100644
--- a/lib/power/version.map
+++ b/lib/power/version.map
@@ -51,4 +51,8 @@ EXPERIMENTAL {
 	rte_power_set_uncore_env;
 	rte_power_uncore_freqs;
 	rte_power_unset_uncore_env;
+	rte_power_create_qos_request;
+	rte_power_release_qos_request;
+	rte_power_qos_get_curr_cpu_latency;
+	rte_power_qos_update_request;
 };
-- 
2.22.0


Thread overview: 16+ messages
2024-03-20 10:55 [PATCH 0/2] " Huisong Li
2024-03-20 10:55 ` Huisong Li [this message]
2024-03-20 10:55 ` [PATCH 2/2] examples/l3fwd-power: add PM QoS request configuration Huisong Li
2024-03-20 14:05 ` [PATCH 0/2] introduce PM QoS interface Morten Brørup
2024-03-21  3:04   ` lihuisong (C)
2024-03-21 13:30     ` Morten Brørup
2024-03-22  8:54       ` lihuisong (C)
2024-03-22 12:35         ` Morten Brørup
2024-03-26  2:11           ` lihuisong (C)
2024-03-26  8:27             ` Morten Brørup
2024-03-26 12:15               ` lihuisong (C)
2024-03-26 12:46                 ` Morten Brørup
2024-03-29  1:59                   ` lihuisong (C)
2024-03-22 17:55         ` Tyler Retzlaff
2024-03-26  2:20           ` lihuisong (C)
2024-03-26 16:04             ` Tyler Retzlaff
