From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 7B09B4548E; Wed, 19 Jun 2024 08:46:20 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 6B33D4275C; Wed, 19 Jun 2024 08:46:20 +0200 (CEST) Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by mails.dpdk.org (Postfix) with ESMTP id 3BFF942670 for ; Wed, 19 Jun 2024 08:46:18 +0200 (CEST) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 39EEFDA7; Tue, 18 Jun 2024 23:46:42 -0700 (PDT) Received: from ampere-altra-2-1.usa.Arm.com (ampere-altra-2-1.usa.arm.com [10.118.91.158]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 763673F7B4; Tue, 18 Jun 2024 23:46:17 -0700 (PDT) From: Wathsala Vithanage To: Thomas Monjalon , Tyler Retzlaff , Ruifeng Wang Cc: dev@dpdk.org, nd@arm.com, Wathsala Vithanage , Dhruv Tripathi , Honnappa Nagarahalli , Jack Bond-Preston , Nick Connolly , Vinod Krishna Subject: [PATCH v2 2/2] eal: add Arm WFET in power management intrinsics Date: Wed, 19 Jun 2024 06:45:28 +0000 Message-Id: <20240619064529.619526-2-wathsala.vithanage@arm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240619064529.619526-1-wathsala.vithanage@arm.com> References: <20240604044401.3577707-2-wathsala.vithanage@arm.com> <20240619064529.619526-1-wathsala.vithanage@arm.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Wait for event with timeout (WFET) puts the CPU in a low power mode and stays there until an event is signalled (SEV), loss of an exclusive monitor or a timeout. WFET is enabled selectively by checking FEAT_WFxT in Linux auxiliary vector. If FEAT_WFxT is not available power management will fallback to WFE. RTE_ARM_USE_WFE macro is not required to enable WFE feature for PMD power monitoring. Signed-off-by: Wathsala Vithanage Reviewed-by: Dhruv Tripathi Reviewed-by: Honnappa Nagarahalli Reviewed-by: Jack Bond-Preston Reviewed-by: Nick Connolly Reviewed-by: Vinod Krishna --- .mailmap | 2 ++ app/test/test_cpuflags.c | 3 ++ lib/eal/arm/include/rte_cpuflags_64.h | 1 + lib/eal/arm/include/rte_pause_64.h | 21 ++++++++++--- lib/eal/arm/rte_cpuflags.c | 3 +- lib/eal/arm/rte_power_intrinsics.c | 45 +++++++++++++++++---------- 6 files changed, 52 insertions(+), 23 deletions(-) diff --git a/.mailmap b/.mailmap index 3843868716..31995d492d 100644 --- a/.mailmap +++ b/.mailmap @@ -332,6 +332,7 @@ Dexia Li Dexuan Cui Dharmik Thakkar Dheemanth Mallikarjun +Dhruv Tripathi Diana Wang Didier Pallard Dilshod Urazov @@ -1516,6 +1517,7 @@ Vincent Jardin Vincent Li Vincent S. Cojot Vinh Tran +Vinod Krishna Vipin Varghese Vipul Ashri Visa Hankala diff --git a/app/test/test_cpuflags.c b/app/test/test_cpuflags.c index a0ff74720c..22ab4dff0a 100644 --- a/app/test/test_cpuflags.c +++ b/app/test/test_cpuflags.c @@ -156,6 +156,9 @@ test_cpuflags(void) printf("Check for SVEBF16:\t"); CHECK_FOR_FLAG(RTE_CPUFLAG_SVEBF16); + + printf("Check for WFXT:\t"); + CHECK_FOR_FLAG(RTE_CPUFLAG_WFXT); #endif #if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686) diff --git a/lib/eal/arm/include/rte_cpuflags_64.h b/lib/eal/arm/include/rte_cpuflags_64.h index afe70209c3..1945a97ca1 100644 --- a/lib/eal/arm/include/rte_cpuflags_64.h +++ b/lib/eal/arm/include/rte_cpuflags_64.h @@ -36,6 +36,7 @@ enum rte_cpu_flag_t { RTE_CPUFLAG_SVEF64MM, RTE_CPUFLAG_SVEBF16, RTE_CPUFLAG_AARCH64, + RTE_CPUFLAG_WFXT, }; #include "generic/rte_cpuflags.h" diff --git a/lib/eal/arm/include/rte_pause_64.h b/lib/eal/arm/include/rte_pause_64.h index 5cb8b59056..f732407425 100644 --- a/lib/eal/arm/include/rte_pause_64.h +++ b/lib/eal/arm/include/rte_pause_64.h @@ -1,6 +1,6 @@ /* SPDX-License-Identifier: BSD-3-Clause * Copyright(c) 2017 Cavium, Inc - * Copyright(c) 2019 Arm Limited + * Copyright(c) 2024 Arm Limited */ #ifndef _RTE_PAUSE_ARM64_H_ @@ -23,17 +23,28 @@ static inline void rte_pause(void) asm volatile("yield" ::: "memory"); } -#ifdef RTE_WAIT_UNTIL_EQUAL_ARCH_DEFINED -/* Send a local event to quit WFE. */ +/* Send a local event to quit WFE/WFxT. */ #define __RTE_ARM_SEVL() { asm volatile("sevl" : : : "memory"); } -/* Send a global event to quit WFE for all cores. */ +/* Send a global event to quit WFE/WFxT for all cores. */ #define __RTE_ARM_SEV() { asm volatile("sev" : : : "memory"); } /* Put processor into low power WFE(Wait For Event) state. */ #define __RTE_ARM_WFE() { asm volatile("wfe" : : : "memory"); } +/* Put processor into low power WFET (WFE with Timeout) state. */ +#ifdef RTE_ARM_FEATURE_WFXT +#define __RTE_ARM_WFET(t) { \ + asm volatile("wfet %x[to]" \ + : \ + : [to] "r" (t) \ + : "memory"); \ + } +#else +#define __RTE_ARM_WFET(t) { RTE_SET_USED(t); } +#endif + /* * Atomic exclusive load from addr, it returns the 8-bit content of * *addr while making it 'monitored', when it is written by someone @@ -147,6 +158,8 @@ static inline void rte_pause(void) __RTE_ARM_LOAD_EXC_128(src, dst, memorder) \ } +#ifdef RTE_WAIT_UNTIL_EQUAL_ARCH_DEFINED + static __rte_always_inline void rte_wait_until_equal_16(volatile uint16_t *addr, uint16_t expected, int memorder) diff --git a/lib/eal/arm/rte_cpuflags.c b/lib/eal/arm/rte_cpuflags.c index 7ba4f8ba97..ad76f2448b 100644 --- a/lib/eal/arm/rte_cpuflags.c +++ b/lib/eal/arm/rte_cpuflags.c @@ -115,6 +115,7 @@ const struct feature_entry rte_cpu_feature_table[] = { FEAT_DEF(SVEF32MM, REG_HWCAP2, 10) FEAT_DEF(SVEF64MM, REG_HWCAP2, 11) FEAT_DEF(SVEBF16, REG_HWCAP2, 12) + FEAT_DEF(WFXT, REG_HWCAP2, 31) FEAT_DEF(AARCH64, REG_PLATFORM, 0) }; #endif /* RTE_ARCH */ @@ -163,7 +164,5 @@ void rte_cpu_get_intrinsics_support(struct rte_cpu_intrinsics *intrinsics) { memset(intrinsics, 0, sizeof(*intrinsics)); -#ifdef RTE_ARM_USE_WFE intrinsics->power_monitor = 1; -#endif } diff --git a/lib/eal/arm/rte_power_intrinsics.c b/lib/eal/arm/rte_power_intrinsics.c index f54cf59e80..fc7a0c61f0 100644 --- a/lib/eal/arm/rte_power_intrinsics.c +++ b/lib/eal/arm/rte_power_intrinsics.c @@ -4,20 +4,31 @@ #include +#include "rte_cpuflags.h" #include "rte_power_intrinsics.h" /** - * This function uses WFE instruction to make lcore suspend + * Set wfet_en if WFET is supported + */ +uint8_t wfet_en; + +RTE_INIT(rte_power_intrinsics_init) +{ +#ifdef RTE_ARCH_64 + if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_WFXT)) + wfet_en = 1; +#endif +} + +/** + * This function uses WFE/WFET instruction to make lcore suspend * execution on ARM. - * Note that timestamp based timeout is not supported yet. */ int rte_power_monitor(const struct rte_power_monitor_cond *pmc, const uint64_t tsc_timestamp) { - RTE_SET_USED(tsc_timestamp); - -#ifdef RTE_ARM_USE_WFE +#ifdef RTE_ARCH_64 const unsigned int lcore_id = rte_lcore_id(); uint64_t cur_value; @@ -33,28 +44,30 @@ rte_power_monitor(const struct rte_power_monitor_cond *pmc, switch (pmc->size) { case sizeof(uint8_t): - __RTE_ARM_LOAD_EXC_8(pmc->addr, cur_value, rte_memory_order_relaxed) - __RTE_ARM_WFE() + __RTE_ARM_LOAD_EXC_8(pmc->addr, cur_value, rte_memory_order_relaxed); break; case sizeof(uint16_t): - __RTE_ARM_LOAD_EXC_16(pmc->addr, cur_value, rte_memory_order_relaxed) - __RTE_ARM_WFE() + __RTE_ARM_LOAD_EXC_16(pmc->addr, cur_value, rte_memory_order_relaxed); break; case sizeof(uint32_t): - __RTE_ARM_LOAD_EXC_32(pmc->addr, cur_value, rte_memory_order_relaxed) - __RTE_ARM_WFE() + __RTE_ARM_LOAD_EXC_32(pmc->addr, cur_value, rte_memory_order_relaxed); break; case sizeof(uint64_t): - __RTE_ARM_LOAD_EXC_64(pmc->addr, cur_value, rte_memory_order_relaxed) - __RTE_ARM_WFE() + __RTE_ARM_LOAD_EXC_64(pmc->addr, cur_value, rte_memory_order_relaxed); break; default: return -EINVAL; /* unexpected size */ } + if (wfet_en) + __RTE_ARM_WFET(tsc_timestamp) + else + __RTE_ARM_WFE() + return 0; #else RTE_SET_USED(pmc); + RTE_SET_USED(tsc_timestamp); return -ENOTSUP; #endif @@ -80,10 +93,8 @@ int rte_power_monitor_wakeup(const unsigned int lcore_id) { RTE_SET_USED(lcore_id); - -#ifdef RTE_ARM_USE_WFE - __RTE_ARM_SEV() - +#ifdef RTE_ARCH_64 + __RTE_ARM_SEV(); return 0; #else return -ENOTSUP; -- 2.34.1