From: Huisong Li
Subject: [PATCH 1/2] power: introduce PM QoS interface
Date: Wed, 20 Mar 2024 18:55:28 +0800
Message-ID: <20240320105529.5626-2-lihuisong@huawei.com>
In-Reply-To: <20240320105529.5626-1-lihuisong@huawei.com>
References: <20240320105529.5626-1-lihuisong@huawei.com>

The system-wide CPU latency QoS limit influences the idle state selection
in the cpuidle governor.
Linux creates a cpu_dma_latency device under the '/dev' directory so that
userspace can read the CPU latency QoS limit and send QoS requests.
Please see the PM QoS framework in the following link:
https://docs.kernel.org/power/pm_qos_interface.html?highlight=qos
This feature has been supported since kernel v2.6.25.

The deeper the idle state, the lower the power consumption, but the longer
the resume time. Some services are delay sensitive and expect a low resume
time, for example in interrupt packet receiving mode.

So this PM QoS API makes it easy to obtain the CPU latency limit on the system
and to send CPU latency QoS requests for the applications that need them.

The recommended usage is as follows:
1) An application process first creates a QoS request.
2) Update the CPU latency request to zero when low latency is needed.
3) Go back to the default value when it is no longer needed (this step is
   optional).
4) Release the QoS request when the process exits.

Signed-off-by: Huisong Li
---
 doc/guides/prog_guide/power_man.rst    |  16 ++++
 doc/guides/rel_notes/release_24_03.rst |   4 +
 lib/power/meson.build                  |   2 +
 lib/power/rte_power_qos.c              |  98 ++++++++++++++++++++++++
 lib/power/rte_power_qos.h              | 101 +++++++++++++++++++++++++
 lib/power/version.map                  |   4 +
 6 files changed, 225 insertions(+)
 create mode 100644 lib/power/rte_power_qos.c
 create mode 100644 lib/power/rte_power_qos.h

diff --git a/doc/guides/prog_guide/power_man.rst b/doc/guides/prog_guide/power_man.rst
index f6674efe2d..493c75bf9d 100644
--- a/doc/guides/prog_guide/power_man.rst
+++ b/doc/guides/prog_guide/power_man.rst
@@ -249,6 +249,22 @@ Get Num Pkgs
 Get Num Dies
   Get the number of die's on a given package.
 
+PM QoS API
+----------
+The deeper the idle state, the lower the power consumption, but the longer
+the resume time. Some service threads are delay sensitive and expect
+a low resume time, for example in interrupt packet receiving mode.
+
+This PM QoS API is intended to obtain the CPU latency limit on the system and
+to send CPU latency QoS requests for the applications that need them.
+
+* ``rte_power_qos_get_curr_cpu_latency()`` is used to get the current CPU
+  latency limit on the system.
+* To send a CPU latency QoS request, first call ``rte_power_create_qos_request()``
+  to create a QoS request, then update the CPU latency value by calling
+  ``rte_power_qos_update_request()``. ``rte_power_release_qos_request()`` is
+  used to release this QoS request when the process exits.
+
 References
 ----------
 
diff --git a/doc/guides/rel_notes/release_24_03.rst b/doc/guides/rel_notes/release_24_03.rst
index 14826ea08f..b5be724133 100644
--- a/doc/guides/rel_notes/release_24_03.rst
+++ b/doc/guides/rel_notes/release_24_03.rst
@@ -196,6 +196,10 @@ New Features
   Added DMA producer mode to measure performance of ``OP_FORWARD`` mode
   of event DMA adapter.
 
+* **Added CPU latency PM QoS support.**
+
+  Added an interface to query the CPU latency PM QoS limit on the system and
+  an interface to send CPU latency QoS requests in the power library.
 
 Removed Items
 -------------
diff --git a/lib/power/meson.build b/lib/power/meson.build
index b8426589b2..8222e178b0 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -23,12 +23,14 @@ sources = files(
         'rte_power.c',
         'rte_power_uncore.c',
         'rte_power_pmd_mgmt.c',
+        'rte_power_qos.c',
 )
 headers = files(
         'rte_power.h',
         'rte_power_guest_channel.h',
         'rte_power_pmd_mgmt.h',
         'rte_power_uncore.h',
+        'rte_power_qos.h',
 )
 if cc.has_argument('-Wno-cast-qual')
     cflags += '-Wno-cast-qual'
diff --git a/lib/power/rte_power_qos.c b/lib/power/rte_power_qos.c
new file mode 100644
index 0000000000..d2b55923a0
--- /dev/null
+++ b/lib/power/rte_power_qos.c
@@ -0,0 +1,98 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 HiSilicon Limited
+ */
+
+#include <errno.h>
+#include <fcntl.h>
+#include <unistd.h>
+
+#include <rte_log.h>
+
+#include "power_common.h"
+#include "rte_power_qos.h"
+
+#define QOS_CPU_DMA_LATENCY_DEV "/dev/cpu_dma_latency"
+
+struct rte_power_qos_info {
+	/*
+	 * Keep the file descriptor open so that the QoS request can be
+	 * updated until it is no longer needed.
+	 */
+	int fd;
+	int cur_cpu_latency; /* unit microseconds */
+};
+
+static struct rte_power_qos_info g_qos = {
+	.fd = -1,
+	.cur_cpu_latency = -1,
+};
+
+int
+rte_power_qos_get_curr_cpu_latency(int *latency)
+{
+	int fd, ret;
+
+	fd = open(QOS_CPU_DMA_LATENCY_DEV, O_RDONLY);
+	if (fd < 0) {
+		POWER_LOG(ERR, "Failed to open %s", QOS_CPU_DMA_LATENCY_DEV);
+		return -1;
+	}
+
+	ret = read(fd, latency, sizeof(*latency));
+	close(fd); /* close on both paths so a failed read does not leak fd */
+	if (ret != (int)sizeof(*latency)) { /* read() returns -1 on error */
+		POWER_LOG(ERR, "Failed to read %s", QOS_CPU_DMA_LATENCY_DEV);
+		return -1;
+	}
+
+	return 0;
+}
+
+int
+rte_power_qos_update_request(int latency)
+{
+	int ret;
+
+	if (g_qos.fd == -1) {
+		POWER_LOG(ERR, "Please create a QoS request first.");
+		return -EINVAL;
+	}
+
+	if (latency < 0) {
+		POWER_LOG(ERR, "Latency must be a non-negative number.");
+		return -EINVAL;
+	}
+
+	if (g_qos.cur_cpu_latency != -1 && latency == g_qos.cur_cpu_latency)
+		return 0;
+
+	ret = write(g_qos.fd, &latency, sizeof(latency));
+	if (ret != (int)sizeof(latency)) { /* write() returns -1 on error */
+		POWER_LOG(ERR, "Failed to write %s", QOS_CPU_DMA_LATENCY_DEV);
+		return -1;
+	}
+	g_qos.cur_cpu_latency = latency;
+
+	return 0;
+}
+
+int
+rte_power_create_qos_request(void)
+{
+	g_qos.fd = open(QOS_CPU_DMA_LATENCY_DEV, O_WRONLY);
+	if (g_qos.fd < 0) {
+		POWER_LOG(ERR, "Failed to open %s.", QOS_CPU_DMA_LATENCY_DEV);
+		return -1;
+	}
+
+	return 0;
+}
+
+void
+rte_power_release_qos_request(void)
+{
+	if (g_qos.fd != -1) {
+		close(g_qos.fd);
+		g_qos.fd = -1;
+	}
+}
diff --git a/lib/power/rte_power_qos.h b/lib/power/rte_power_qos.h
new file mode 100644
index 0000000000..d39f5d0c0f
--- /dev/null
+++ b/lib/power/rte_power_qos.h
@@ -0,0 +1,101 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 HiSilicon Limited
+ */
+
+#ifndef RTE_POWER_QOS_H
+#define RTE_POWER_QOS_H
+
+#include <rte_compat.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * @file rte_power_qos.h
+ *
+ * PM QoS API.
+ *
+ * The system-wide CPU latency QoS limit influences the idle state
+ * selection in the cpuidle governor.
+ *
+ * Linux creates a cpu_dma_latency device under the '/dev' directory so that
+ * userspace can read the CPU latency QoS limit and send QoS requests.
+ * Please see the PM QoS framework in the following link:
+ * https://docs.kernel.org/power/pm_qos_interface.html?highlight=qos
+ *
+ * The deeper the idle state, the lower the power consumption, but the longer
+ * the resume time. Some services are delay sensitive and expect a low
+ * resume time, for example in interrupt packet receiving mode.
+ *
+ * So this PM QoS API makes it easy to obtain the CPU latency limit on the
+ * system and to send CPU latency QoS requests for applications that need them.
+ *
+ * The recommended usage is as follows:
+ * 1) An application process first creates a QoS request.
+ * 2) Update the CPU latency request to zero when low latency is needed.
+ * 3) Go back to the default value @see PM_QOS_CPU_LATENCY_DEFAULT_VALUE when
+ *    it is no longer needed (this step is optional).
+ * 4) Release the QoS request when the process exits.
+ */
+
+#define QOS_USEC_PER_SEC 1000000
+#define PM_QOS_CPU_LATENCY_DEFAULT_VALUE (2000 * QOS_USEC_PER_SEC)
+#define PM_QOS_STRICT_LATENCY_VALUE 0
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Create a CPU latency QoS request. Release this request with
+ * @see rte_power_release_qos_request.
+ *
+ * @return
+ *   0 on success. Otherwise negative value is returned.
+ */
+__rte_experimental
+int rte_power_create_qos_request(void);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Release the CPU latency QoS request.
+ */
+__rte_experimental
+void rte_power_release_qos_request(void);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get the current CPU latency QoS limit on the system.
+ * The default value in kernel is @see PM_QOS_CPU_LATENCY_DEFAULT_VALUE.
+ *
+ * @return
+ *   0 on success. Otherwise negative value is returned.
+ */
+__rte_experimental
+int rte_power_qos_get_curr_cpu_latency(int *latency);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Update the CPU latency QoS request.
+ * Note: a QoS request must be created before calling this API.
+ *
+ * @param latency
+ *   The latency in microseconds; must be greater than or equal to zero.
+ *
+ * @return
+ *   0 on success. Otherwise negative value is returned.
+ */
+__rte_experimental
+int rte_power_qos_update_request(int latency);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_POWER_QOS_H */
diff --git a/lib/power/version.map b/lib/power/version.map
index ad92a65f91..42770762b1 100644
--- a/lib/power/version.map
+++ b/lib/power/version.map
@@ -51,4 +51,8 @@ EXPERIMENTAL {
 	rte_power_set_uncore_env;
 	rte_power_uncore_freqs;
 	rte_power_unset_uncore_env;
+	rte_power_create_qos_request;
+	rte_power_release_qos_request;
+	rte_power_qos_get_curr_cpu_latency;
+	rte_power_qos_update_request;
 };
-- 
2.22.0
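
Note for reviewers: the kernel interface that this patch wraps can also be tried without DPDK. The following is only a rough sketch, not code from the patch. The helper names qos_read_latency() and qos_write_latency() are hypothetical. It assumes the device exchanges a 32-bit signed value in microseconds, and that a request stays in effect only while the descriptor it was written to remains open.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch): read one 32-bit latency
 * value, in microseconds, from a PM QoS device node such as
 * /dev/cpu_dma_latency. Returns 0 on success or a negative errno value.
 */
static int
qos_read_latency(const char *path, int32_t *latency)
{
	ssize_t n;
	int fd, err;

	assert(path != NULL && latency != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -errno;

	n = read(fd, latency, sizeof(*latency));
	if (n < 0)
		err = -errno;	/* read() reports errors as -1 */
	else if (n != (ssize_t)sizeof(*latency))
		err = -EIO;	/* short read */
	else
		err = 0;
	close(fd);		/* closed on every path, no descriptor leak */

	return err;
}

/*
 * Hypothetical helper: write a latency request to an already open
 * descriptor. The caller keeps fd open for as long as the request
 * should stay active and closes it to release the request.
 */
static int
qos_write_latency(int fd, int32_t latency)
{
	ssize_t n;

	if (latency < 0)
		return -EINVAL;

	n = write(fd, &latency, sizeof(latency));
	if (n < 0)
		return -errno;

	return n == (ssize_t)sizeof(latency) ? 0 : -EIO;
}
```

On a real system the caller would open /dev/cpu_dma_latency with O_WRONLY (this usually needs root), call qos_write_latency(fd, 0) when low latency is needed, and close the descriptor when the process exits. This mirrors the create, update, and release steps in the commit message.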