DPDK patches and discussions
From: Konstantin Ananyev <konstantin.ananyev@huawei.com>
To: "lihuisong (C)" <lihuisong@huawei.com>, "dev@dpdk.org" <dev@dpdk.org>
Cc: "mb@smartsharesystems.com" <mb@smartsharesystems.com>,
	"thomas@monjalon.net" <thomas@monjalon.net>,
	"ferruh.yigit@amd.com" <ferruh.yigit@amd.com>,
	"anatoly.burakov@intel.com" <anatoly.burakov@intel.com>,
	"david.hunt@intel.com" <david.hunt@intel.com>,
	"sivaprasad.tummala@amd.com" <sivaprasad.tummala@amd.com>,
	"stephen@networkplumber.org" <stephen@networkplumber.org>,
	"david.marchand@redhat.com" <david.marchand@redhat.com>,
	Fengchengwen <fengchengwen@huawei.com>,
	liuyonglong <liuyonglong@huawei.com>,
	"lihuisong (C)" <lihuisong@huawei.com>
Subject: RE: [PATCH v10 1/2] power: introduce PM QoS API on CPU wide
Date: Mon, 14 Oct 2024 08:29:56 +0000
Message-ID: <773b9cf3df354a168e42aecb627b0b2c@huawei.com>
In-Reply-To: <20240912023812.30885-2-lihuisong@huawei.com>


> The deeper the idle state, the lower the power consumption, but the longer
> the resume time. Some service are delay sensitive and very except the low
> resume time, like interrupt packet receiving mode.
> 
> And the "/sys/devices/system/cpu/cpuX/power/pm_qos_resume_latency_us" sysfs
> interface is used to set and get the resume latency limit on the cpuX for
> userspace. Each cpuidle governor in Linux select which idle state to enter
> based on this CPU resume latency in their idle task.
> 
> The per-CPU PM QoS API can be used to control this CPU's idle state
> selection and limit just enter the shallowest idle state to low the delay
> after sleep by setting strict resume latency (zero value).
> 
> Signed-off-by: Huisong Li <lihuisong@huawei.com>
> Acked-by: Morten Brørup <mb@smartsharesystems.com>
> ---
>  doc/guides/prog_guide/power_man.rst    |  24 ++++++
>  doc/guides/rel_notes/release_24_11.rst |   5 ++
>  lib/power/meson.build                  |   2 +
>  lib/power/rte_power_qos.c              | 111 +++++++++++++++++++++++++
>  lib/power/rte_power_qos.h              |  73 ++++++++++++++++
>  lib/power/version.map                  |   4 +
>  6 files changed, 219 insertions(+)
>  create mode 100644 lib/power/rte_power_qos.c
>  create mode 100644 lib/power/rte_power_qos.h
> 
> diff --git a/doc/guides/prog_guide/power_man.rst b/doc/guides/prog_guide/power_man.rst
> index f6674efe2d..faa32b4320 100644
> --- a/doc/guides/prog_guide/power_man.rst
> +++ b/doc/guides/prog_guide/power_man.rst
> @@ -249,6 +249,30 @@ Get Num Pkgs
>  Get Num Dies
>    Get the number of die's on a given package.
> 
> +
> +PM QoS
> +------
> +
> +The deeper the idle state, the lower the power consumption, but the longer
> +the resume time. Some service are delay sensitive and very except the low
> +resume time, like interrupt packet receiving mode.
> +
> +And the "/sys/devices/system/cpu/cpuX/power/pm_qos_resume_latency_us" sysfs
> +interface is used to set and get the resume latency limit on the cpuX for
> +userspace. Each cpuidle governor in Linux select which idle state to enter
> +based on this CPU resume latency in their idle task.
> +
> +The per-CPU PM QoS API can be used to set and get the CPU resume latency based
> +on this sysfs.
> +
> +The ``rte_power_qos_set_cpu_resume_latency()`` function can control the CPU's
> +idle state selection in Linux and limit just to enter the shallowest idle state
> +to low the delay of resuming service after sleeping by setting strict resume
> +latency (zero value).
> +
> +The ``rte_power_qos_get_cpu_resume_latency()`` function can get the resume
> +latency on specified CPU.
> +
>  References
>  ----------
> 
> diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
> index 0ff70d9057..bd72d0a595 100644
> --- a/doc/guides/rel_notes/release_24_11.rst
> +++ b/doc/guides/rel_notes/release_24_11.rst
> @@ -55,6 +55,11 @@ New Features
>       Also, make sure to start the actual text at the margin.
>       =======================================================
> 
> +* **Introduce per-CPU PM QoS interface.**
> +
> +  * Add per-CPU PM QoS interface to low the delay after sleep by controlling
> +    CPU idle state selection.
> +
> 
>  Removed Items
>  -------------
> diff --git a/lib/power/meson.build b/lib/power/meson.build
> index b8426589b2..8222e178b0 100644
> --- a/lib/power/meson.build
> +++ b/lib/power/meson.build
> @@ -23,12 +23,14 @@ sources = files(
>          'rte_power.c',
>          'rte_power_uncore.c',
>          'rte_power_pmd_mgmt.c',
> +        'rte_power_qos.c',
>  )
>  headers = files(
>          'rte_power.h',
>          'rte_power_guest_channel.h',
>          'rte_power_pmd_mgmt.h',
>          'rte_power_uncore.h',
> +        'rte_power_qos.h',
>  )
>  if cc.has_argument('-Wno-cast-qual')
>      cflags += '-Wno-cast-qual'
> diff --git a/lib/power/rte_power_qos.c b/lib/power/rte_power_qos.c
> new file mode 100644
> index 0000000000..8eb26cd41a
> --- /dev/null
> +++ b/lib/power/rte_power_qos.c
> @@ -0,0 +1,111 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2024 HiSilicon Limited
> + */
> +
> +#include <errno.h>
> +#include <stdlib.h>
> +#include <string.h>
> +
> +#include <rte_lcore.h>
> +#include <rte_log.h>
> +
> +#include "power_common.h"
> +#include "rte_power_qos.h"
> +
> +#define PM_QOS_SYSFILE_RESUME_LATENCY_US	\
> +	"/sys/devices/system/cpu/cpu%u/power/pm_qos_resume_latency_us"
> +
> +#define PM_QOS_CPU_RESUME_LATENCY_BUF_LEN	32
> +
> +int
> +rte_power_qos_set_cpu_resume_latency(uint16_t lcore_id, int latency)
> +{
> +	char buf[PM_QOS_CPU_RESUME_LATENCY_BUF_LEN];
> +	FILE *f;
> +	int ret;
> +
> +	if (!rte_lcore_is_enabled(lcore_id)) {
> +		POWER_LOG(ERR, "lcore id %u is not enabled", lcore_id);
> +		return -EINVAL;
> +	}
> +
> +	if (latency < 0) {
> +		POWER_LOG(ERR, "latency should be greater than and equal to 0");
> +		return -EINVAL;
> +	}
> +
> +	ret = open_core_sysfs_file(&f, "w", PM_QOS_SYSFILE_RESUME_LATENCY_US, lcore_id);

Morten has already raised this:
lcore_id is not always equal to the CPU core id (it depends on CPU affinity).
Looking through the power library, this is not specific to this particular
patch; it is a common limitation (bug?) of the rte_power lib.
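
To make the concern concrete, here is a minimal standalone sketch (not the
rte_power code). It assumes an application where lcores 0 and 1 were mapped to
CPUs 8 and 9 (e.g. via the EAL --lcores option) and lcore 2 may run on several
CPUs. The table and all example_* names are hypothetical and only illustrate
that the sysfs path should be built from the CPU id, not from the lcore id:

```c
#include <stddef.h>
#include <stdio.h>

#define RESUME_LATENCY_PATH_FMT \
	"/sys/devices/system/cpu/cpu%u/power/pm_qos_resume_latency_us"

/*
 * Hypothetical lcore -> CPU table standing in for the affinity the
 * application configured. -1 marks an lcore that is not pinned to
 * exactly one CPU, so no single sysfs file applies to it.
 */
static const int example_lcore_to_cpu[] = { 8, 9, -1 };

/* Return the CPU id backing lcore_id, or -1 if there is no unique one. */
static int
example_cpu_of_lcore(unsigned int lcore_id)
{
	size_t n = sizeof(example_lcore_to_cpu) / sizeof(example_lcore_to_cpu[0]);

	if (lcore_id >= n)
		return -1;
	return example_lcore_to_cpu[lcore_id];
}

/*
 * Build the per-CPU resume latency path for an lcore.
 * Returns 0 on success, -1 if the lcore has no unique CPU or the
 * buffer is too small.
 */
static int
example_resume_latency_path(char *buf, size_t len, unsigned int lcore_id)
{
	int cpu = example_cpu_of_lcore(lcore_id);
	int n;

	if (cpu < 0)
		return -1;

	n = snprintf(buf, len, RESUME_LATENCY_PATH_FMT, (unsigned int)cpu);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

With this table, lcore 0 resolves to the cpu8 file, whereas passing lcore_id
straight into the format string would touch cpu0. How the real library should
obtain the mapping is left open here.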
 

> +	if (ret != 0) {
> +		POWER_LOG(ERR, "Failed to open "PM_QOS_SYSFILE_RESUME_LATENCY_US, lcore_id);
> +		return ret;
> +	}
> +
> +	/*
> +	 * Based on the sysfs interface pm_qos_resume_latency_us under
> +	 * @PM_QOS_SYSFILE_RESUME_LATENCY_US directory in kernel, their meaning
> +	 * is as follows for different input string.
> +	 * 1> the resume latency is 0 if the input is "n/a".
> +	 * 2> the resume latency is no constraint if the input is "0".
> +	 * 3> the resume latency is the actual value to be set.
> +	 */
> +	if (latency == 0)
> +		snprintf(buf, sizeof(buf), "%s", "n/a");
> +	else if (latency == RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT)
> +		snprintf(buf, sizeof(buf), "%u", 0);
> +	else
> +		snprintf(buf, sizeof(buf), "%u", latency);
> +
> +	ret = write_core_sysfs_s(f, buf);
> +	if (ret != 0)
> +		POWER_LOG(ERR, "Failed to write "PM_QOS_SYSFILE_RESUME_LATENCY_US, lcore_id);
> +
> +	fclose(f);
> +
> +	return ret;
> +}
> +
> +int
> +rte_power_qos_get_cpu_resume_latency(uint16_t lcore_id)
> +{
> +	char buf[PM_QOS_CPU_RESUME_LATENCY_BUF_LEN];
> +	int latency = -1;
> +	FILE *f;
> +	int ret;
> +
> +	if (!rte_lcore_is_enabled(lcore_id)) {
> +		POWER_LOG(ERR, "lcore id %u is not enabled", lcore_id);
> +		return -EINVAL;
> +	}
> +
> +	ret = open_core_sysfs_file(&f, "r", PM_QOS_SYSFILE_RESUME_LATENCY_US, lcore_id);
> +	if (ret != 0) {
> +		POWER_LOG(ERR, "Failed to open "PM_QOS_SYSFILE_RESUME_LATENCY_US, lcore_id);
> +		return ret;
> +	}
> +
> +	ret = read_core_sysfs_s(f, buf, sizeof(buf));
> +	if (ret != 0) {
> +		POWER_LOG(ERR, "Failed to read "PM_QOS_SYSFILE_RESUME_LATENCY_US, lcore_id);
> +		goto out;
> +	}
> +
> +	/*
> +	 * Based on the sysfs interface pm_qos_resume_latency_us under
> +	 * @PM_QOS_SYSFILE_RESUME_LATENCY_US directory in kernel, their meaning
> +	 * is as follows for different output string.
> +	 * 1> the resume latency is 0 if the output is "n/a".
> +	 * 2> the resume latency is no constraint if the output is "0".
> +	 * 3> the resume latency is the actual value in used for other string.
> +	 */
> +	if (strcmp(buf, "n/a") == 0)
> +		latency = 0;
> +	else {
> +		latency = strtoul(buf, NULL, 10);
> +		latency = latency == 0 ? RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT : latency;
> +	}
> +
> +out:
> +	fclose(f);
> +
> +	return latency != -1 ? latency : ret;
> +}
> diff --git a/lib/power/rte_power_qos.h b/lib/power/rte_power_qos.h
> new file mode 100644
> index 0000000000..990c488373
> --- /dev/null
> +++ b/lib/power/rte_power_qos.h
> @@ -0,0 +1,73 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2024 HiSilicon Limited
> + */
> +
> +#ifndef RTE_POWER_QOS_H
> +#define RTE_POWER_QOS_H
> +
> +#include <stdint.h>
> +
> +#include <rte_compat.h>
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/**
> + * @file rte_power_qos.h
> + *
> + * PM QoS API.
> + *
> + * The CPU-wide resume latency limit has a positive impact on this CPU's idle
> + * state selection in each cpuidle governor.
> + * Please see the PM QoS on CPU wide in the following link:
> + * https://www.kernel.org/doc/html/latest/admin-guide/abi-testing.html?highlight=pm_qos_resume_latency_us#abi-sys-devices-power-pm-qos-resume-latency-us
> + *
> + * The deeper the idle state, the lower the power consumption, but the
> + * longer the resume time. Some service are delay sensitive and very except the
> + * low resume time, like interrupt packet receiving mode.
> + *
> + * In these case, per-CPU PM QoS API can be used to control this CPU's idle
> + * state selection and limit just enter the shallowest idle state to low the
> + * delay after sleep by setting strict resume latency (zero value).
> + */
> +
> +#define RTE_POWER_QOS_STRICT_LATENCY_VALUE             0
> +#define RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT    ((int)(UINT32_MAX >> 1))
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * @param lcore_id
> + *   target logical core id
> + *
> + * @param latency
> + *   The latency should be greater than and equal to zero in microseconds unit.
> + *
> + * @return
> + *   0 on success. Otherwise negative value is returned.
> + */
> +__rte_experimental
> +int rte_power_qos_set_cpu_resume_latency(uint16_t lcore_id, int latency);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Get the current resume latency of this logical core.
> + * The default value in kernel is @see RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT
> + * if don't set it.
> + *
> + * @return
> + *   Negative value on failure.
> + *   >= 0 means the actual resume latency limit on this core.
> + */
> +__rte_experimental
> +int rte_power_qos_get_cpu_resume_latency(uint16_t lcore_id);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* RTE_POWER_QOS_H */
> diff --git a/lib/power/version.map b/lib/power/version.map
> index c9a226614e..08f178a39d 100644
> --- a/lib/power/version.map
> +++ b/lib/power/version.map
> @@ -51,4 +51,8 @@ EXPERIMENTAL {
>  	rte_power_set_uncore_env;
>  	rte_power_uncore_freqs;
>  	rte_power_unset_uncore_env;
> +
> +	# added in 24.11
> +	rte_power_qos_get_cpu_resume_latency;
> +	rte_power_qos_set_cpu_resume_latency;
>  };
> --
> 2.22.0

