From: "Ananyev, Konstantin"
To: "Burakov, Anatoly", "dev@dpdk.org"
CC: "Ma, Liang J", Jan Viktorin, Ruifeng Wang, "David Christensen", "Richardson, Bruce", "Hunt, David", "jerinjacobk@gmail.com", "thomas@monjalon.net", "McDaniel, Timothy", "Eads, Gage", "Macnamara, Chris"
Thread-Topic: [PATCH v6 02/10] eal: add power management intrinsics
Date: Wed, 14 Oct 2020 17:48:23 +0000
References: <532f45c5d79b4c30a919553d322bb66e91534466.1602258833.git.anatoly.burakov@intel.com>
Subject: Re: [dpdk-dev] [PATCH v6 02/10] eal: add power management intrinsics
List-Id: DPDK patches and discussions

>
> From: Liang Ma
>
> Add two new power management intrinsics, and provide an implementation
> in eal/x86 based on UMONITOR/UMWAIT instructions. The instructions
> are implemented as raw byte opcodes because there is not yet widespread
> compiler support for these instructions.
>
> The power management instructions provide an architecture-specific
> function to either wait until a specified TSC timestamp is reached, or
> optionally wait until either a TSC timestamp is reached or a memory
> location is written to. The monitor function also provides an optional
> comparison, to avoid sleeping when the expected write has already
> happened, and no more writes are expected.
>
> For more details, please refer to Intel(R) 64 and IA-32 Architectures
> Software Developer's Manual, Volume 2.
>
> Signed-off-by: Liang Ma
> Signed-off-by: Anatoly Burakov
> Acked-by: David Christensen
> ---
>
> Notes:
>     v6:
>     - Add spinlock-enabled version to allow pthread-wait-like
>       constructs with umwait
>     - Clarify comments
>     - Added experimental tags to intrinsics
>     - Added endianness support
>     v5:
>     - Removed return values
>     - Simplified intrinsics and hardcoded C0.2 state
>     - Added other arch stubs
>
>  lib/librte_eal/arm/include/meson.build        |   1 +
>  .../arm/include/rte_power_intrinsics.h        |  58 ++++++++
>  .../include/generic/rte_power_intrinsics.h    | 111 +++++++++++++++
>  lib/librte_eal/include/meson.build            |   1 +
>  lib/librte_eal/ppc/include/meson.build        |   1 +
>  .../ppc/include/rte_power_intrinsics.h        |  58 ++++++++
>  lib/librte_eal/x86/include/meson.build        |   1 +
>  .../x86/include/rte_power_intrinsics.h        | 132 ++++++++++++++++++
>  8 files changed, 363 insertions(+)
>  create mode 100644 lib/librte_eal/arm/include/rte_power_intrinsics.h
>  create mode 100644 lib/librte_eal/include/generic/rte_power_intrinsics.h
>  create mode 100644 lib/librte_eal/ppc/include/rte_power_intrinsics.h
>  create mode 100644 lib/librte_eal/x86/include/rte_power_intrinsics.h
>
> diff --git a/lib/librte_eal/arm/include/meson.build b/lib/librte_eal/arm/include/meson.build
> index 73b750a18f..c6a9f70d73 100644
> --- a/lib/librte_eal/arm/include/meson.build
> +++ b/lib/librte_eal/arm/include/meson.build
> @@ -20,6 +20,7 @@ arch_headers = files(
>  	'rte_pause_32.h',
>  	'rte_pause_64.h',
>  	'rte_pause.h',
> +	'rte_power_intrinsics.h',
>  	'rte_prefetch_32.h',
>  	'rte_prefetch_64.h',
>  	'rte_prefetch.h',
> diff --git a/lib/librte_eal/arm/include/rte_power_intrinsics.h b/lib/librte_eal/arm/include/rte_power_intrinsics.h
> new file mode 100644
> index 0000000000..b04ba10c76
> --- /dev/null
> +++ b/lib/librte_eal/arm/include/rte_power_intrinsics.h
> @@ -0,0 +1,58 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Intel Corporation
> + */
> +
> +#ifndef _RTE_POWER_INTRINSIC_ARM_H_
> +#define _RTE_POWER_INTRINSIC_ARM_H_
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include
> +
> +#include "generic/rte_power_intrinsics.h"
> +
> +/**
> + * This function is not supported on ARM.
> + */

Here and in other places, please follow the DPDK coding convention
for function definitions, i.e. put the return type on its own line:
static inline void
rte_power_monitor(...

> +static inline void rte_power_monitor(const volatile void *p,
> +		const uint64_t expected_value, const uint64_t value_mask,
> +		const uint64_t tsc_timestamp, const uint8_t data_sz)
> +{
> +	RTE_SET_USED(p);
> +	RTE_SET_USED(expected_value);
> +	RTE_SET_USED(value_mask);
> +	RTE_SET_USED(tsc_timestamp);
> +	RTE_SET_USED(data_sz);
> +}

You can probably put the NOP implementations of these rte_power_* functions
into generic/rte_power_intrinsics.h.
That way we wouldn't need to duplicate them for every non-supported arch.
Same as it was done for rte_wait_until_equal_*().

> +
> +/**
> + * This function is not supported on ARM.
> + */
> +static inline void rte_power_monitor_sync(const volatile void *p,
> +		const uint64_t expected_value, const uint64_t value_mask,
> +		const uint64_t tsc_timestamp, const uint8_t data_sz,
> +		rte_spinlock_t *lck)
> +{
> +	RTE_SET_USED(p);
> +	RTE_SET_USED(expected_value);
> +	RTE_SET_USED(value_mask);
> +	RTE_SET_USED(tsc_timestamp);
> +	RTE_SET_USED(lck);
> +	RTE_SET_USED(data_sz);
> +}
> +
> +/**
> + * This function is not supported on ARM.
> + */
> +static inline void rte_power_pause(const uint64_t tsc_timestamp)
> +{
> +	RTE_SET_USED(tsc_timestamp);
> +}
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_POWER_INTRINSIC_ARM_H_ */
> diff --git a/lib/librte_eal/include/generic/rte_power_intrinsics.h b/lib/librte_eal/include/generic/rte_power_intrinsics.h
> new file mode 100644
> index 0000000000..f9522f2776
> --- /dev/null
> +++ b/lib/librte_eal/include/generic/rte_power_intrinsics.h
> @@ -0,0 +1,111 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Intel Corporation
> + */
> +
> +#ifndef _RTE_POWER_INTRINSIC_H_
> +#define _RTE_POWER_INTRINSIC_H_
> +
> +#include
> +
> +#include
> +#include
> +
> +/**
> + * @file
> + * Advanced power management operations.
> + *
> + * This file define APIs for advanced power management,
> + * which are architecture-dependent.
> + */
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice
> + *
> + * Monitor specific address for changes. This will cause the CPU to enter an
> + * architecture-defined optimized power state until either the specified
> + * memory address is written to, a certain TSC timestamp is reached, or other
> + * reasons cause the CPU to wake up.
> + *
> + * Additionally, an `expected` 64-bit value and 64-bit mask are provided. If
> + * mask is non-zero, the current value pointed to by the `p` pointer will be
> + * checked against the expected value, and if they match, the entering of
> + * optimized power state may be aborted.
> + *
> + * @param p
> + *   Address to monitor for changes. Must be aligned on an 64-byte boundary.

Is 64B alignment really needed?

> + * @param expected_value
> + *   Before attempting the monitoring, the `p` address may be read and compared
> + *   against this value. If `value_mask` is zero, this step will be skipped.
> + * @param value_mask
> + *   The 64-bit mask to use to extract current value from `p`.
> + * @param tsc_timestamp
> + *   Maximum TSC timestamp to wait for. Note that the wait behavior is
> + *   architecture-dependent.
> + * @param data_sz
> + *   Data size (in bytes) that will be used to compare expected value with the
> + *   memory address. Can be 1, 2, 4 or 8. Supplying any other value will lead
> + *   to undefined result.
> + */
> +__rte_experimental
> +static inline void rte_power_monitor(const volatile void *p,
> +		const uint64_t expected_value, const uint64_t value_mask,
> +		const uint64_t tsc_timestamp, const uint8_t data_sz);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice
> + *
> + * Monitor specific address for changes. This will cause the CPU to enter an
> + * architecture-defined optimized power state until either the specified
> + * memory address is written to, a certain TSC timestamp is reached, or other
> + * reasons cause the CPU to wake up.
> + *
> + * Additionally, an `expected` 64-bit value and 64-bit mask are provided. If
> + * mask is non-zero, the current value pointed to by the `p` pointer will be
> + * checked against the expected value, and if they match, the entering of
> + * optimized power state may be aborted.
> + *
> + * This call will also lock a spinlock on entering sleep, and release it on
> + * waking up the CPU.
> + *
> + * @param p
> + *   Address to monitor for changes. Must be aligned on an 64-byte boundary.
> + * @param expected_value
> + *   Before attempting the monitoring, the `p` address may be read and compared
> + *   against this value. If `value_mask` is zero, this step will be skipped.
> + * @param value_mask
> + *   The 64-bit mask to use to extract current value from `p`.
> + * @param tsc_timestamp
> + *   Maximum TSC timestamp to wait for. Note that the wait behavior is
> + *   architecture-dependent.
> + * @param data_sz
> + *   Data size (in bytes) that will be used to compare expected value with the
> + *   memory address. Can be 1, 2, 4 or 8. Supplying any other value will lead
> + *   to undefined result.
> + * @param lck
> + *   A spinlock that must be locked before entering the function, will be
> + *   unlocked while the CPU is sleeping, and will be locked again once the CPU
> + *   wakes up.
> + */
> +__rte_experimental
> +static inline void rte_power_monitor_sync(const volatile void *p,
> +		const uint64_t expected_value, const uint64_t value_mask,
> +		const uint64_t tsc_timestamp, const uint8_t data_sz,
> +		rte_spinlock_t *lck);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice
> + *
> + * Enter an architecture-defined optimized power state until a certain TSC
> + * timestamp is reached.
> + *
> + * @param tsc_timestamp
> + *   Maximum TSC timestamp to wait for. Note that the wait behavior is
> + *   architecture-dependent.
> + */
> +__rte_experimental
> +static inline void rte_power_pause(const uint64_t tsc_timestamp);
> +
> +#endif /* _RTE_POWER_INTRINSIC_H_ */
> diff --git a/lib/librte_eal/include/meson.build b/lib/librte_eal/include/meson.build
> index cd09027958..3a12e87e19 100644
> --- a/lib/librte_eal/include/meson.build
> +++ b/lib/librte_eal/include/meson.build
> @@ -60,6 +60,7 @@ generic_headers = files(
>  	'generic/rte_memcpy.h',
>  	'generic/rte_pause.h',
>  	'generic/rte_prefetch.h',
> +	'generic/rte_power_intrinsics.h',
>  	'generic/rte_rwlock.h',
>  	'generic/rte_spinlock.h',
>  	'generic/rte_ticketlock.h',
> diff --git a/lib/librte_eal/ppc/include/meson.build b/lib/librte_eal/ppc/include/meson.build
> index ab4bd28092..0873b2aecb 100644
> --- a/lib/librte_eal/ppc/include/meson.build
> +++ b/lib/librte_eal/ppc/include/meson.build
> @@ -10,6 +10,7 @@ arch_headers = files(
>  	'rte_io.h',
>  	'rte_memcpy.h',
>  	'rte_pause.h',
> +	'rte_power_intrinsics.h',
>  	'rte_prefetch.h',
>  	'rte_rwlock.h',
>  	'rte_spinlock.h',
> diff --git a/lib/librte_eal/ppc/include/rte_power_intrinsics.h b/lib/librte_eal/ppc/include/rte_power_intrinsics.h
> new file mode 100644
> index 0000000000..3bceefdc3f
> --- /dev/null
> +++ b/lib/librte_eal/ppc/include/rte_power_intrinsics.h
> @@ -0,0 +1,58 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Intel Corporation
> + */
> +
> +#ifndef _RTE_POWER_INTRINSIC_PPC_H_
> +#define _RTE_POWER_INTRINSIC_PPC_H_
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include
> +
> +#include "generic/rte_power_intrinsics.h"
> +
> +/**
> + * This function is not supported on PPC64.
> + */
> +static inline void rte_power_monitor(const volatile void *p,
> +		const uint64_t expected_value, const uint64_t value_mask,
> +		const uint64_t tsc_timestamp, const uint8_t data_sz)
> +{
> +	RTE_SET_USED(p);
> +	RTE_SET_USED(expected_value);
> +	RTE_SET_USED(value_mask);
> +	RTE_SET_USED(tsc_timestamp);
> +	RTE_SET_USED(data_sz);
> +}
> +
> +/**
> + * This function is not supported on PPC64.
> + */
> +static inline void rte_power_monitor_sync(const volatile void *p,
> +		const uint64_t expected_value, const uint64_t value_mask,
> +		const uint64_t tsc_timestamp, const uint8_t data_sz,
> +		rte_spinlock_t *lck)
> +{
> +	RTE_SET_USED(p);
> +	RTE_SET_USED(expected_value);
> +	RTE_SET_USED(value_mask);
> +	RTE_SET_USED(tsc_timestamp);
> +	RTE_SET_USED(lck);
> +	RTE_SET_USED(data_sz);
> +}
> +
> +/**
> + * This function is not supported on PPC64.
> + */
> +static inline void rte_power_pause(const uint64_t tsc_timestamp)
> +{
> +	RTE_SET_USED(tsc_timestamp);
> +}
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_POWER_INTRINSIC_PPC_H_ */
> diff --git a/lib/librte_eal/x86/include/meson.build b/lib/librte_eal/x86/include/meson.build
> index f0e998c2fe..494a8142a2 100644
> --- a/lib/librte_eal/x86/include/meson.build
> +++ b/lib/librte_eal/x86/include/meson.build
> @@ -13,6 +13,7 @@ arch_headers = files(
>  	'rte_io.h',
>  	'rte_memcpy.h',
>  	'rte_prefetch.h',
> +	'rte_power_intrinsics.h',
>  	'rte_pause.h',
>  	'rte_rtm.h',
>  	'rte_rwlock.h',
> diff --git a/lib/librte_eal/x86/include/rte_power_intrinsics.h b/lib/librte_eal/x86/include/rte_power_intrinsics.h
> new file mode 100644
> index 0000000000..9ac8e6eef6
> --- /dev/null
> +++ b/lib/librte_eal/x86/include/rte_power_intrinsics.h
> @@ -0,0 +1,132 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Intel Corporation
> + */
> +
> +#ifndef _RTE_POWER_INTRINSIC_X86_64_H_
> +#define _RTE_POWER_INTRINSIC_X86_64_H_

Why '_64_H'? My understanding was that these ops are supported in 32-bit mode too.

> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include
> +
> +#include "generic/rte_power_intrinsics.h"
> +
> +static inline uint64_t __get_umwait_val(const volatile void *p,
> +		const uint8_t sz)
> +{
> +	switch (sz) {
> +	case 1:

Just as a nit:
case sizeof(type_x):
	return *(const volatile type_x *)p;

> +		return *(const volatile uint8_t *)p;
> +	case 2:
> +		return *(const volatile uint16_t *)p;
> +	case 4:
> +		return *(const volatile uint32_t *)p;
> +	case 8:
> +		return *(const volatile uint64_t *)p;
> +	default:
> +		/* this is an intrinsic, so we can't have any error handling */

RTE_ASSERT(0); ?

> +		return 0;
> +	}
> +}
> +
> +/**
> + * This function uses UMONITOR/UMWAIT instructions and will enter C0.2 state.
> + * For more information about usage of these instructions, please refer to
> + * Intel(R) 64 and IA-32 Architectures Software Developer's Manual.
> + */
> +static inline void rte_power_monitor(const volatile void *p,
> +		const uint64_t expected_value, const uint64_t value_mask,
> +		const uint64_t tsc_timestamp, const uint8_t data_sz)
> +{
> +	const uint32_t tsc_l = (uint32_t)tsc_timestamp;
> +	const uint32_t tsc_h = (uint32_t)(tsc_timestamp >> 32);
> +	/*
> +	 * we're using raw byte codes for now as only the newest compiler
> +	 * versions support this instruction natively.
> +	 */
> +
> +	/* set address for UMONITOR */
> +	asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
> +			:
> +			: "D"(p));
> +
> +	if (value_mask) {
> +		const uint64_t cur_value = __get_umwait_val(p, data_sz);
> +		const uint64_t masked = cur_value & value_mask;
> +
> +		/* if the masked value is already matching, abort */
> +		if (masked == expected_value)
> +			return;
> +	}
> +	/* execute UMWAIT */
> +	asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
> +			: /* ignore rflags */
> +			: "D"(0), /* enter C0.2 */
> +			  "a"(tsc_l), "d"(tsc_h));
> +}
> +
> +/**
> + * This function uses UMONITOR/UMWAIT instructions and will enter C0.2 state.
> + * For more information about usage of these instructions, please refer to
> + * Intel(R) 64 and IA-32 Architectures Software Developer's Manual.
> + */
> +static inline void rte_power_monitor_sync(const volatile void *p,
> +		const uint64_t expected_value, const uint64_t value_mask,
> +		const uint64_t tsc_timestamp, const uint8_t data_sz,
> +		rte_spinlock_t *lck)
> +{
> +	const uint32_t tsc_l = (uint32_t)tsc_timestamp;
> +	const uint32_t tsc_h = (uint32_t)(tsc_timestamp >> 32);
> +	/*
> +	 * we're using raw byte codes for now as only the newest compiler
> +	 * versions support this instruction natively.
> +	 */
> +
> +	/* set address for UMONITOR */
> +	asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
> +			:
> +			: "D"(p));
> +
> +	if (value_mask) {
> +		const uint64_t cur_value = __get_umwait_val(p, data_sz);
> +		const uint64_t masked = cur_value & value_mask;
> +
> +		/* if the masked value is already matching, abort */
> +		if (masked == expected_value)
> +			return;
> +	}
> +	rte_spinlock_unlock(lck);
> +
> +	/* execute UMWAIT */
> +	asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
> +			: /* ignore rflags */
> +			: "D"(0), /* enter C0.2 */
> +			  "a"(tsc_l), "d"(tsc_h));
> +
> +	rte_spinlock_lock(lck);
> +}
> +
> +/**
> + * This function uses TPAUSE instruction and will enter C0.2 state. For more
> + * information about usage of this instruction, please refer to Intel(R) 64 and
> + * IA-32 Architectures Software Developer's Manual.
> + */
> +static inline void rte_power_pause(const uint64_t tsc_timestamp)
> +{
> +	const uint32_t tsc_l = (uint32_t)tsc_timestamp;
> +	const uint32_t tsc_h = (uint32_t)(tsc_timestamp >> 32);
> +
> +	/* execute TPAUSE */
> +	asm volatile(".byte 0x66, 0x0f, 0xae, 0xf7;"
> +			: /* ignore rflags */
> +			: "D"(0), /* enter C0.2 */
> +			  "a"(tsc_l), "d"(tsc_h));
> +}
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_POWER_INTRINSIC_X86_64_H_ */
> --
> 2.17.1
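
To make the two nits on __get_umwait_val() more concrete, here is a rough standalone sketch of what they might look like together. It is only an illustration, not DPDK code: read_monitored_value() and value_already_matches() are made-up names, plain assert() stands in for RTE_ASSERT(), and the masked comparison simply mirrors the pre-sleep check in the quoted rte_power_monitor(). The return type is on its own line, as the coding convention asks.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper modelled on __get_umwait_val() from the patch:
 * the case labels use sizeof(type) instead of bare 1/2/4/8.
 */
static inline uint64_t
read_monitored_value(const volatile void *p, const uint8_t sz)
{
	switch (sz) {
	case sizeof(uint8_t):
		return *(const volatile uint8_t *)p;
	case sizeof(uint16_t):
		return *(const volatile uint16_t *)p;
	case sizeof(uint32_t):
		return *(const volatile uint32_t *)p;
	case sizeof(uint64_t):
		return *(const volatile uint64_t *)p;
	default:
		/* assumption: assert() stands in for RTE_ASSERT(0) here */
		assert(0);
		return 0;
	}
}

/*
 * Hypothetical helper: returns 1 when the masked value already equals
 * the expected one, i.e. when the caller should skip sleeping.
 * A zero mask disables the check, as in the quoted doc comment.
 */
static inline int
value_already_matches(const volatile void *p, const uint64_t expected_value,
		const uint64_t value_mask, const uint8_t data_sz)
{
	if (value_mask == 0)
		return 0;
	return (read_monitored_value(p, data_sz) & value_mask) ==
			expected_value;
}
```

With a 32-bit word holding 0xdeadbeef, a mask of 0xffff and an expected value of 0xbeef, the check reports a match, so the sleep would be skipped; with a zero mask it never does.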