From: "Tummala, Sivaprasad"
To: Tyler Retzlaff
CC: "david.hunt@intel.com", "anatoly.burakov@intel.com", "Yigit, Ferruh", "david.marchand@redhat.com", "thomas@monjalon.net", "dev@dpdk.org"
Subject: RE: [PATCH v5 3/3] power: amd power monitor support
Date: Thu, 17 Aug 2023 11:34:10 +0000
References: <20230418082529.544777-2-sivaprasad.tummala@amd.com> <20230816185959.1331336-1-sivaprasad.tummala@amd.com> <20230816185959.1331336-3-sivaprasad.tummala@amd.com> <20230816192758.GA12453@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net>
In-Reply-To: <20230816192758.GA12453@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net>
List-Id: DPDK patches and discussions

[AMD Official Use Only - General]

Hi Tyler,

> -----Original Message-----
> From: Tyler Retzlaff
> Sent: Thursday, August 17, 2023 12:58 AM
> To: Tummala, Sivaprasad
> Cc: david.hunt@intel.com; anatoly.burakov@intel.com; Yigit, Ferruh;
> david.marchand@redhat.com; thomas@monjalon.net; dev@dpdk.org
> Subject: Re: [PATCH v5 3/3] power: amd power monitor support
>
> On Wed, Aug 16, 2023 at 11:59:59AM -0700, Sivaprasad Tummala wrote:
> > mwaitx allows EPYC processors to enter a implementation dependent
> > power/performance optimized state (C1 state) for a specific period or
> > until a store to the monitored address range.
> >
> > Signed-off-by: Sivaprasad Tummala
> > Acked-by: Anatoly Burakov
> > ---
> >  lib/eal/x86/rte_power_intrinsics.c | 77 +++++++++++++++++++++++++-----
> >  1 file changed, 66 insertions(+), 11 deletions(-)
> >
> > diff --git a/lib/eal/x86/rte_power_intrinsics.c b/lib/eal/x86/rte_power_intrinsics.c
> > index 6eb9e50807..b4754e17da 100644
> > --- a/lib/eal/x86/rte_power_intrinsics.c
> > +++ b/lib/eal/x86/rte_power_intrinsics.c
> > @@ -17,6 +17,60 @@ static struct power_wait_status {
> >      volatile void *monitor_addr; /**< NULL if not currently sleeping */
> >  } __rte_cache_aligned wait_status[RTE_MAX_LCORE];
> >
> > +/**
> > + * These functions uses UMONITOR/UMWAIT instructions and will enter C0.2 state.
> > + * For more information about usage of these instructions, please refer to
> > + * Intel(R) 64 and IA-32 Architectures Software Developer's Manual.
> > + */
> > +static void intel_umonitor(volatile void *addr) {
> > +    /* UMONITOR */
> > +    asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
> > +            :
> > +            : "D"(addr));
> > +}
> > +
> > +static void intel_umwait(const uint64_t timeout) {
> > +    const uint32_t tsc_l = (uint32_t)timeout;
> > +    const uint32_t tsc_h = (uint32_t)(timeout >> 32);
> > +    /* UMWAIT */
> > +    asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
> > +            : /* ignore rflags */
> > +            : "D"(0), /* enter C0.2 */
> > +            "a"(tsc_l), "d"(tsc_h)); }
>
> question and perhaps Anatoly Burakov can chime in with expertise.
>
> gcc/clang have built-in intrinsics for umonitor and umwait i believe as per our other
> thread of discussion is there a benefit to also providing inline assembly over just
> using the intrinsics? I understand that the intrinsics may not exist for the monitorx
> and mwaitx below so it is probably necessary for amd.
>
> so the suggestion here is when they are available just use the intrinsics.
>
> thanks
>

The GCC built-in functions __builtin_ia32_monitorx()/__builtin_ia32_mwaitx()
are available only when -mmwaitx is used, which is specific to AMD platforms.
On generic builds these built-ins are not available, hence inline assembly
is required here.

> > +
> > +/**
> > + * These functions uses MONITORX/MWAITX instructions and will enter C1 state.
> > + * For more information about usage of these instructions, please refer to
> > + * AMD64 Architecture Programmer's Manual.
> > + */
> > +static void amd_monitorx(volatile void *addr) {
> > +    /* MONITORX */
> > +    asm volatile(".byte 0x0f, 0x01, 0xfa;"
> > +            :
> > +            : "a"(addr),
> > +            "c"(0), /* no extensions */
> > +            "d"(0)); /* no hints */ }
> > +
> > +static void amd_mwaitx(const uint64_t timeout) {
> > +    /* MWAITX */
> > +    asm volatile(".byte 0x0f, 0x01, 0xfb;"
> > +            : /* ignore rflags */
> > +            : "a"(0), /* enter C1 */
> > +            "c"(2), /* enable timer */
> > +            "b"(timeout));
> > +}
> > +
> > +static struct {
> > +    void (*mmonitor)(volatile void *addr);
> > +    void (*mwait)(const uint64_t timeout); } __rte_cache_aligned
> > +power_monitor_ops;
> > +
> >  static inline void
> >  __umwait_wakeup(volatile void *addr)
> >  {
> > @@ -75,8 +129,6 @@ int
> >  rte_power_monitor(const struct rte_power_monitor_cond *pmc,
> >          const uint64_t tsc_timestamp) {
> > -    const uint32_t tsc_l = (uint32_t)tsc_timestamp;
> > -    const uint32_t tsc_h = (uint32_t)(tsc_timestamp >> 32);
> >      const unsigned int lcore_id = rte_lcore_id();
> >      struct power_wait_status *s;
> >      uint64_t cur_value;
> > @@ -109,10 +161,8 @@ rte_power_monitor(const struct rte_power_monitor_cond *pmc,
> >       * versions support this instruction natively.
> >       */
> >
> > -    /* set address for UMONITOR */
> > -    asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
> > -            :
> > -            : "D"(pmc->addr));
> > +    /* set address for mmonitor */
> > +    power_monitor_ops.mmonitor(pmc->addr);
> >
> >      /* now that we've put this address into monitor, we can unlock */
> >      rte_spinlock_unlock(&s->lock);
> > @@ -123,11 +173,8 @@ rte_power_monitor(const struct rte_power_monitor_cond *pmc,
> >      if (pmc->fn(cur_value, pmc->opaque) != 0)
> >          goto end;
> >
> > -    /* execute UMWAIT */
> > -    asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
> > -            : /* ignore rflags */
> > -            : "D"(0), /* enter C0.2 */
> > -            "a"(tsc_l), "d"(tsc_h));
> > +    /* execute mwait */
> > +    power_monitor_ops.mwait(tsc_timestamp);
> >
> >  end:
> >      /* erase sleep address */
> > @@ -173,6 +220,14 @@ RTE_INIT(rte_power_intrinsics_init) {
> >          wait_multi_supported = 1;
> >      if (i.power_monitor)
> >          monitor_supported = 1;
> > +
> > +    if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_MONITORX)) { /* AMD */
> > +        power_monitor_ops.mmonitor = &amd_monitorx;
> > +        power_monitor_ops.mwait = &amd_mwaitx;
> > +    } else { /* Intel */
> > +        power_monitor_ops.mmonitor = &intel_umonitor;
> > +        power_monitor_ops.mwait = &intel_umwait;
> > +    }
> >  }
> >
> >  int
> > --
> > 2.34.1
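
For readers following the thread, the runtime dispatch that the quoted patch introduces can be looked at in isolation. What follows is only a minimal sketch under stated assumptions, not the DPDK code: the stub_* functions stand in for the inline-assembly helpers and merely record which backend ran, has_monitorx stands in for the rte_cpu_get_flag_enabled(RTE_CPUFLAG_MONITORX) check, and struct monitor_ops, ops, monitor_and_wait(), tsc_lo(), tsc_hi() and the last_* variables are hypothetical names chosen for illustration. The two small helpers at the end show the 64-bit timeout split that the Intel path performs before loading EAX and EDX.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ops table, mirroring the shape of power_monitor_ops. */
struct monitor_ops {
	void (*mmonitor)(volatile void *addr);
	void (*mwait)(uint64_t timeout);
};

/* Bookkeeping so the stubs can be observed; not part of the patch. */
static int last_backend;            /* 1 = AMD stub, 2 = Intel stub */
static volatile void *last_addr;
static uint64_t last_timeout;

/* Stubs: the patch issues MONITORX/MWAITX or UMONITOR/UMWAIT here. */
static void stub_amd_monitorx(volatile void *addr)
{
	last_backend = 1;
	last_addr = addr;
}

static void stub_amd_mwaitx(uint64_t timeout)
{
	last_backend = 1;
	last_timeout = timeout;
}

static void stub_intel_umonitor(volatile void *addr)
{
	last_backend = 2;
	last_addr = addr;
}

static void stub_intel_umwait(uint64_t timeout)
{
	last_backend = 2;
	last_timeout = timeout;
}

static struct monitor_ops ops;

/* has_monitorx is an assumed stand-in for the CPU flag query. */
static void monitor_ops_init(int has_monitorx)
{
	if (has_monitorx) {
		ops.mmonitor = stub_amd_monitorx;
		ops.mwait = stub_amd_mwaitx;
	} else {
		ops.mmonitor = stub_intel_umonitor;
		ops.mwait = stub_intel_umwait;
	}
}

/* Arm the monitor, then wait; callers never see which backend runs. */
static void monitor_and_wait(volatile void *addr, uint64_t timeout)
{
	assert(ops.mmonitor != NULL && ops.mwait != NULL);
	ops.mmonitor(addr);
	ops.mwait(timeout);
}

/* The Intel path splits the 64-bit deadline into two 32-bit halves. */
static uint32_t tsc_lo(uint64_t t) { return (uint32_t)t; }
static uint32_t tsc_hi(uint64_t t) { return (uint32_t)(t >> 32); }
```

Selecting the function pointers once at init time keeps the hot path free of per-call vendor checks, which appears to be the intent of the RTE_INIT hunk quoted above.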