From: Konstantin Ananyev
To: "Tummala, Sivaprasad", Tyler Retzlaff
CC: "david.hunt@intel.com", "anatoly.burakov@intel.com", "Yigit, Ferruh", "david.marchand@redhat.com", "thomas@monjalon.net", "dev@dpdk.org"
Subject: RE: [PATCH v5 3/3] power: amd power monitor support
Date: Thu, 17 Aug 2023 14:18:10 +0000
References: <20230418082529.544777-2-sivaprasad.tummala@amd.com> <20230816185959.1331336-1-sivaprasad.tummala@amd.com> <20230816185959.1331336-3-sivaprasad.tummala@amd.com> <20230816192758.GA12453@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net>
> > On Wed, Aug 16, 2023 at 11:59:59AM -0700, Sivaprasad Tummala wrote:
> > > mwaitx allows EPYC processors to enter an implementation-dependent
> > > power/performance optimized state (C1 state) for a specific period or
> > > until a store to the monitored address range.
> > >
> > > Signed-off-by: Sivaprasad Tummala
> > > Acked-by: Anatoly Burakov
> > > ---
> > >  lib/eal/x86/rte_power_intrinsics.c | 77 +++++++++++++++++++++++++-----
> > >  1 file changed, 66 insertions(+), 11 deletions(-)
> > >
> > > diff --git a/lib/eal/x86/rte_power_intrinsics.c b/lib/eal/x86/rte_power_intrinsics.c
> > > index 6eb9e50807..b4754e17da 100644
> > > --- a/lib/eal/x86/rte_power_intrinsics.c
> > > +++ b/lib/eal/x86/rte_power_intrinsics.c
> > > @@ -17,6 +17,60 @@ static struct power_wait_status {
> > >  	volatile void *monitor_addr; /**< NULL if not currently sleeping */
> > >  } __rte_cache_aligned wait_status[RTE_MAX_LCORE];
> > >
> > > +/**
> > > + * These functions use UMONITOR/UMWAIT instructions and will enter C0.2 state.
> > > + * For more information about usage of these instructions, please refer to
> > > + * Intel(R) 64 and IA-32 Architectures Software Developer's Manual.
> > > + */
> > > +static void intel_umonitor(volatile void *addr)
> > > +{
> > > +	/* UMONITOR */
> > > +	asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
> > > +			:
> > > +			: "D"(addr));
> > > +}
> > > +
> > > +static void intel_umwait(const uint64_t timeout)
> > > +{
> > > +	const uint32_t tsc_l = (uint32_t)timeout;
> > > +	const uint32_t tsc_h = (uint32_t)(timeout >> 32);
> > > +	/* UMWAIT */
> > > +	asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
> > > +			: /* ignore rflags */
> > > +			: "D"(0), /* enter C0.2 */
> > > +			  "a"(tsc_l), "d"(tsc_h));
> > > +}
> >
> > question, and perhaps Anatoly Burakov can chime in with expertise.
> >
> > gcc/clang have built-in intrinsics for umonitor and umwait, i believe; as per our other
> > thread of discussion, is there a benefit to also providing inline assembly over just
> > using the intrinsics? I understand that the intrinsics may not exist for the monitorx
> > and mwaitx below, so it is probably necessary for amd.
> >
> > so the suggestion here is: when they are available, just use the intrinsics.
> >
> > thanks
>
> The gcc built-in functions __builtin_ia32_monitorx()/__builtin_ia32_mwaitx() are available only when -mmwaitx
> is used, specifically for AMD platforms. On generic builds, these built-ins are not available and hence inline
> assembly is required here.

Ok... but we can probably put them into a separate .c file that will be compiled with that specific flag?
Same thing can probably be done for Intel-specific instructions.
In general, I think it is much more preferable to use built-ins vs inline assembly (if possible, of course).
Konstantin

>
> > > +
> > > +/**
> > > + * These functions use MONITORX/MWAITX instructions and will enter C1 state.
> > > + * For more information about usage of these instructions, please refer to
> > > + * AMD64 Architecture Programmer's Manual.
> > > + */
> > > +static void amd_monitorx(volatile void *addr)
> > > +{
> > > +	/* MONITORX */
> > > +	asm volatile(".byte 0x0f, 0x01, 0xfa;"
> > > +			:
> > > +			: "a"(addr),
> > > +			  "c"(0), /* no extensions */
> > > +			  "d"(0)); /* no hints */
> > > +}
> > > +
> > > +static void amd_mwaitx(const uint64_t timeout)
> > > +{
> > > +	/* MWAITX */
> > > +	asm volatile(".byte 0x0f, 0x01, 0xfb;"
> > > +			: /* ignore rflags */
> > > +			: "a"(0), /* enter C1 */
> > > +			  "c"(2), /* enable timer */
> > > +			  "b"(timeout));
> > > +}
> > > +
> > > +static struct {
> > > +	void (*mmonitor)(volatile void *addr);
> > > +	void (*mwait)(const uint64_t timeout);
> > > +} __rte_cache_aligned power_monitor_ops;
> > > +
> > >  static inline void
> > >  __umwait_wakeup(volatile void *addr)
> > >  {
> > > @@ -75,8 +129,6 @@ int
> > >  rte_power_monitor(const struct rte_power_monitor_cond *pmc,
> > >  		const uint64_t tsc_timestamp)
> > >  {
> > > -	const uint32_t tsc_l = (uint32_t)tsc_timestamp;
> > > -	const uint32_t tsc_h = (uint32_t)(tsc_timestamp >> 32);
> > >  	const unsigned int lcore_id = rte_lcore_id();
> > >  	struct power_wait_status *s;
> > >  	uint64_t cur_value;
> > > @@ -109,10 +161,8 @@ rte_power_monitor(const struct rte_power_monitor_cond *pmc,
> > >  	 * versions support this instruction natively.
> > >  	 */
> > >
> > > -	/* set address for UMONITOR */
> > > -	asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
> > > -			:
> > > -			: "D"(pmc->addr));
> > > +	/* set address for mmonitor */
> > > +	power_monitor_ops.mmonitor(pmc->addr);
> > >
> > >  	/* now that we've put this address into monitor, we can unlock */
> > >  	rte_spinlock_unlock(&s->lock);
> > > @@ -123,11 +173,8 @@ rte_power_monitor(const struct rte_power_monitor_cond *pmc,
> > >  	if (pmc->fn(cur_value, pmc->opaque) != 0)
> > >  		goto end;
> > >
> > > -	/* execute UMWAIT */
> > > -	asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
> > > -			: /* ignore rflags */
> > > -			: "D"(0), /* enter C0.2 */
> > > -			  "a"(tsc_l), "d"(tsc_h));
> > > +	/* execute mwait */
> > > +	power_monitor_ops.mwait(tsc_timestamp);
> > >
> > >  end:
> > >  	/* erase sleep address */
> > > @@ -173,6 +220,14 @@ RTE_INIT(rte_power_intrinsics_init)
> > >  		wait_multi_supported = 1;
> > >  	if (i.power_monitor)
> > >  		monitor_supported = 1;
> > > +
> > > +	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_MONITORX)) { /* AMD */
> > > +		power_monitor_ops.mmonitor = &amd_monitorx;
> > > +		power_monitor_ops.mwait = &amd_mwaitx;
> > > +	} else { /* Intel */
> > > +		power_monitor_ops.mmonitor = &intel_umonitor;
> > > +		power_monitor_ops.mwait = &intel_umwait;
> > > +	}
> > >  }
> > >
> > >  int
> > > --
> > > 2.34.1