From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 1EE4E45954; Tue, 10 Sep 2024 11:32:45 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 110CB42D28; Tue, 10 Sep 2024 11:32:45 +0200 (CEST) Received: from szxga05-in.huawei.com (szxga05-in.huawei.com [45.249.212.191]) by mails.dpdk.org (Postfix) with ESMTP id 6F3CC402A1 for ; Tue, 10 Sep 2024 11:32:42 +0200 (CEST) Received: from mail.maildlp.com (unknown [172.19.162.112]) by szxga05-in.huawei.com (SkyGuard) with ESMTP id 4X2z4Y0jzsz1j7lq; Tue, 10 Sep 2024 17:32:13 +0800 (CST) Received: from kwepemm600004.china.huawei.com (unknown [7.193.23.242]) by mail.maildlp.com (Postfix) with ESMTPS id DC48B1402CF; Tue, 10 Sep 2024 17:32:40 +0800 (CST) Received: from [10.67.121.59] (10.67.121.59) by kwepemm600004.china.huawei.com (7.193.23.242) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.1.2507.39; Tue, 10 Sep 2024 17:32:40 +0800 Message-ID: <23c6701f-9901-1ddd-ca24-c4ea12ace45a@huawei.com> Date: Tue, 10 Sep 2024 17:32:11 +0800 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.2.0 Subject: Re: [PATCH v9 1/2] power: introduce PM QoS API on CPU wide To: fengchengwen , CC: , , , , , , , , References: <20240320105529.5626-1-lihuisong@huawei.com> <20240809095012.16717-1-lihuisong@huawei.com> <20240809095012.16717-2-lihuisong@huawei.com> <77be0eac-5676-90af-9564-3c1a14779fd5@huawei.com> From: "lihuisong (C)" In-Reply-To: <77be0eac-5676-90af-9564-3c1a14779fd5@huawei.com> Content-Type: text/plain; charset="UTF-8"; format=flowed Content-Transfer-Encoding: 8bit X-Originating-IP: [10.67.121.59] X-ClientProxiedBy: dggems701-chm.china.huawei.com (10.3.19.178) To kwepemm600004.china.huawei.com (7.193.23.242) X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Hi Chengwen, Thanks for your review. 在 2024/9/10 10:00, fengchengwen 写道: > Hi Huisong > > Please see comments inline. > > Thanks > > On 2024/8/9 17:50, Huisong Li wrote: >> The deeper the idle state, the lower the power consumption, but the longer >> the resume time. Some service are delay sensitive and very except the low >> resume time, like interrupt packet receiving mode. >> >> And the "/sys/devices/system/cpu/cpuX/power/pm_qos_resume_latency_us" sysfs >> interface is used to set and get the resume latency limit on the cpuX for >> userspace. Each cpuidle governor in Linux select which idle state to enter >> based on this CPU resume latency in their idle task. >> >> The per-CPU PM QoS API can be used to control this CPU's idle state >> selection and limit just enter the shallowest idle state to low the delay >> after sleep by setting strict resume latency (zero value). >> >> Signed-off-by: Huisong Li >> Acked-by: Morten Brørup >> --- > ... > >> diff --git a/lib/power/rte_power_qos.c b/lib/power/rte_power_qos.c >> new file mode 100644 >> index 0000000000..375746f832 >> --- /dev/null >> +++ b/lib/power/rte_power_qos.c >> @@ -0,0 +1,114 @@ >> +/* SPDX-License-Identifier: BSD-3-Clause >> + * Copyright(c) 2024 HiSilicon Limited >> + */ >> + >> +#include >> +#include >> +#include >> + >> +#include >> +#include >> + >> +#include "power_common.h" >> +#include "rte_power_qos.h" >> + >> +#define PM_QOS_SYSFILE_RESUME_LATENCY_US \ >> + "/sys/devices/system/cpu/cpu%u/power/pm_qos_resume_latency_us" >> + >> +int >> +rte_power_qos_set_cpu_resume_latency(uint16_t lcore_id, int latency) >> +{ >> + char buf[LINE_MAX]; > no need LINE_MAX, [32] would enough. Ack > >> + FILE *f; >> + int ret; >> + >> + if (!rte_lcore_is_enabled(lcore_id)) { >> + POWER_LOG(ERR, "lcore id %u is not enabled", lcore_id); >> + return -EINVAL; >> + } >> + >> + if (latency < 0) { >> + POWER_LOG(ERR, "latency should be greater than and equal to 0"); >> + return -EINVAL; >> + } >> + >> + ret = open_core_sysfs_file(&f, "w", PM_QOS_SYSFILE_RESUME_LATENCY_US, lcore_id); >> + if (ret != 0) { >> + POWER_LOG(ERR, "Failed to open "PM_QOS_SYSFILE_RESUME_LATENCY_US, lcore_id); >> + return ret; >> + } >> + >> + /* >> + * Based on the sysfs interface pm_qos_resume_latency_us under >> + * @PM_QOS_SYSFILE_RESUME_LATENCY_US directory in kernel, their meanning > meanning -> meaning Ack > >> + * is as follows for different input string. >> + * 1> the resume latency is 0 if the input is "n/a". >> + * 2> the resume latency is no constraint if the input is "0". >> + * 3> the resume latency is the actual value to be set. >> + */ >> + if (latency == 0) >> + snprintf(buf, sizeof(buf), "%s", "n/a"); >> + else if (latency == RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT) >> + snprintf(buf, sizeof(buf), "%u", 0); >> + else >> + snprintf(buf, sizeof(buf), "%u", latency); >> + >> + ret = write_core_sysfs_s(f, buf); >> + if (ret != 0) { >> + POWER_LOG(ERR, "Failed to write "PM_QOS_SYSFILE_RESUME_LATENCY_US, lcore_id); >> + goto out; > no need of goto Ack > >> + } >> + >> +out: >> + if (f != NULL) >> + fclose(f); > just fclose(f) because f is valid here. Ack >> + >> + return ret; >> +} >> + >> +int >> +rte_power_qos_get_cpu_resume_latency(uint16_t lcore_id) >> +{ >> + char buf[LINE_MAX]; >> + int latency = -1; >> + FILE *f; >> + int ret; >> + >> + if (!rte_lcore_is_enabled(lcore_id)) { >> + POWER_LOG(ERR, "lcore id %u is not enabled", lcore_id); >> + return -EINVAL; >> + } >> + >> + ret = open_core_sysfs_file(&f, "r", PM_QOS_SYSFILE_RESUME_LATENCY_US, lcore_id); >> + if (ret != 0) { >> + POWER_LOG(ERR, "Failed to open "PM_QOS_SYSFILE_RESUME_LATENCY_US, lcore_id); >> + return ret; >> + } >> + >> + ret = read_core_sysfs_s(f, buf, sizeof(buf)); >> + if (ret != 0) { >> + POWER_LOG(ERR, "Failed to read "PM_QOS_SYSFILE_RESUME_LATENCY_US, lcore_id); >> + goto out; >> + } >> + >> + /* >> + * Based on the sysfs interface pm_qos_resume_latency_us under >> + * @PM_QOS_SYSFILE_RESUME_LATENCY_US directory in kernel, their meanning > meanning -> meaning Ack > >> + * is as follows for different output string. >> + * 1> the resume latency is 0 if the output is "n/a". >> + * 2> the resume latency is no constraint if the output is "0". >> + * 3> the resume latency is the actual value in used for other string. >> + */ >> + if (strcmp(buf, "n/a") == 0) >> + latency = 0; >> + else { >> + latency = strtoul(buf, NULL, 10); >> + latency = latency == 0 ? RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT : latency; >> + } >> + >> +out: >> + if (f != NULL) >> + fclose(f); > just fclose(f) because f is valid here. Ack > >> + >> + return latency != -1 ? latency : ret; >> +} >> diff --git a/lib/power/rte_power_qos.h b/lib/power/rte_power_qos.h >> new file mode 100644 >> index 0000000000..990c488373 >> --- /dev/null >> +++ b/lib/power/rte_power_qos.h >> @@ -0,0 +1,73 @@ >> +/* SPDX-License-Identifier: BSD-3-Clause >> + * Copyright(c) 2024 HiSilicon Limited >> + */ >> + >> +#ifndef RTE_POWER_QOS_H >> +#define RTE_POWER_QOS_H >> + >> +#include >> + >> +#include >> + >> +#ifdef __cplusplus >> +extern "C" { >> +#endif >> + >> +/** >> + * @file rte_power_qos.h >> + * >> + * PM QoS API. >> + * >> + * The CPU-wide resume latency limit has a positive impact on this CPU's idle >> + * state selection in each cpuidle governor. >> + * Please see the PM QoS on CPU wide in the following link: >> + * https://www.kernel.org/doc/html/latest/admin-guide/abi-testing.html?highlight=pm_qos_resume_latency_us#abi-sys-devices-power-pm-qos-resume-latency-us >> + * >> + * The deeper the idle state, the lower the power consumption, but the >> + * longer the resume time. Some service are delay sensitive and very except the >> + * low resume time, like interrupt packet receiving mode. >> + * >> + * In these case, per-CPU PM QoS API can be used to control this CPU's idle >> + * state selection and limit just enter the shallowest idle state to low the >> + * delay after sleep by setting strict resume latency (zero value). >> + */ >> + >> +#define RTE_POWER_QOS_STRICT_LATENCY_VALUE 0 >> +#define RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT ((int)(UINT32_MAX >> 1)) >> + >> +/** >> + * @warning >> + * @b EXPERIMENTAL: this API may change without prior notice. >> + * >> + * @param lcore_id >> + * target logical core id >> + * >> + * @param latency >> + * The latency should be greater than and equal to zero in microseconds unit. >> + * >> + * @return >> + * 0 on success. Otherwise negative value is returned. >> + */ >> +__rte_experimental >> +int rte_power_qos_set_cpu_resume_latency(uint16_t lcore_id, int latency); >> + >> +/** >> + * @warning >> + * @b EXPERIMENTAL: this API may change without prior notice. >> + * >> + * Get the current resume latency of this logical core. >> + * The default value in kernel is @see RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT >> + * if don't set it. >> + * >> + * @return >> + * Negative value on failure. >> + * >= 0 means the actual resume latency limit on this core. >> + */ >> +__rte_experimental >> +int rte_power_qos_get_cpu_resume_latency(uint16_t lcore_id); >> + >> +#ifdef __cplusplus >> +} >> +#endif >> + >> +#endif /* RTE_POWER_QOS_H */ >> diff --git a/lib/power/version.map b/lib/power/version.map >> index c9a226614e..4e4955a4cf 100644 >> --- a/lib/power/version.map >> +++ b/lib/power/version.map >> @@ -51,4 +51,8 @@ EXPERIMENTAL { >> rte_power_set_uncore_env; >> rte_power_uncore_freqs; >> rte_power_unset_uncore_env; >> + >> + # added in 24.11 >> + rte_power_qos_set_cpu_resume_latency; >> + rte_power_qos_get_cpu_resume_latency; > order by alphabetic. Ack > > another question, I think rename cpu with core maybe more accurate, despite sysfs export with cpu, but in DPDK it means core. > and there are some rte_power_core_xxx name in rte_power library, I think better to keep the same. Firstly, the rte_power_qos_set/get_cpu_resume_latency is just consistent with linux sysfs interface. Having the same name is more releative for user. In addition, Sivaprasad Tummala is reworking power library and the name of rte_power_core_xxx also might be changed. > >> }; >> > .