From: Konstantin Ananyev
To: Bruce Richardson, "dev@dpdk.org"
CC: "david.marchand@redhat.com", "wathsala.vithanage@arm.com"
Subject: RE: [PATCH v2] eal/x86: cache queried CPU flags
Date: Fri, 11 Oct 2024 13:37:23 +0000
References: <20241007110725.377550-1-bruce.richardson@intel.com> <20241011133303.1496014-1-bruce.richardson@intel.com>
In-Reply-To: <20241011133303.1496014-1-bruce.richardson@intel.com>
Content-Type: text/plain; charset="us-ascii"
List-Id: DPDK patches and discussions

> Rather than re-querying the HW each time a CPU flag is requested, we can
> just save the return value in the flags array. This should speed up
> repeated querying of CPU flags, and provides a workaround for a reported
> issue where errors are seen with constant querying of the AVX-512 CPU
> flag from a non-AVX VM.
>
> Bugzilla ID: 1501
>
> Signed-off-by: Bruce Richardson
>
> ---
> V2: Add compiler barrier to prevent issues with multi-threaded calls
> ---
>  lib/eal/x86/rte_cpuflags.c | 22 +++++++++++++++++-----
>  1 file changed, 17 insertions(+), 5 deletions(-)
>
> diff --git a/lib/eal/x86/rte_cpuflags.c b/lib/eal/x86/rte_cpuflags.c
> index 26163ab746..90389c66fc 100644
> --- a/lib/eal/x86/rte_cpuflags.c
> +++ b/lib/eal/x86/rte_cpuflags.c
> @@ -8,8 +8,10 @@
>  #include
>  #include
>  #include
> +#include
>
>  #include "rte_cpuid.h"
> +#include "rte_atomic.h"
>
>  /**
>   * Struct to hold a processor feature entry
> @@ -21,12 +23,14 @@ struct feature_entry {
>  	uint32_t bit; /**< cpuid register bit */
>  #define CPU_FLAG_NAME_MAX_LEN 64
>  	char name[CPU_FLAG_NAME_MAX_LEN]; /**< String for printing */
> +	bool has_value;
> +	bool value;
>  };
>
>  #define FEAT_DEF(name, leaf, subleaf, reg, bit) \
>  	[RTE_CPUFLAG_##name] = {leaf, subleaf, reg, bit, #name },
>
> -const struct feature_entry rte_cpu_feature_table[] = {
> +struct feature_entry rte_cpu_feature_table[] = {
>  	FEAT_DEF(SSE3, 0x00000001, 0, RTE_REG_ECX, 0)
>  	FEAT_DEF(PCLMULQDQ, 0x00000001, 0, RTE_REG_ECX, 1)
>  	FEAT_DEF(DTES64, 0x00000001, 0, RTE_REG_ECX, 2)
> @@ -147,7 +151,7 @@ const struct feature_entry rte_cpu_feature_table[] = {
>  int
>  rte_cpu_get_flag_enabled(enum rte_cpu_flag_t feature)
>  {
> -	const struct feature_entry *feat;
> +	struct feature_entry *feat;
>  	cpuid_registers_t regs;
>  	unsigned int maxleaf;
>
> @@ -156,6 +160,8 @@ rte_cpu_get_flag_enabled(enum rte_cpu_flag_t feature)
>  		return -ENOENT;
>
>  	feat = &rte_cpu_feature_table[feature];
> +	if (feat->has_value)
> +		return feat->value;
>
>  	if (!feat->leaf)
>  		/* This entry in the table wasn't filled out! */
> @@ -163,8 +169,10 @@ rte_cpu_get_flag_enabled(enum rte_cpu_flag_t feature)
>
>  	maxleaf = __get_cpuid_max(feat->leaf & 0x80000000, NULL);
>
> -	if (maxleaf < feat->leaf)
> -		return 0;
> +	if (maxleaf < feat->leaf) {
> +		feat->value = 0;
> +		goto out;
> +	}
>
>  #ifdef RTE_TOOLCHAIN_MSVC
>  	__cpuidex(regs, feat->leaf, feat->subleaf);
> @@ -175,7 +183,11 @@ rte_cpu_get_flag_enabled(enum rte_cpu_flag_t feature)
>  #endif
>
>  	/* check if the feature is enabled */
> -	return (regs[feat->reg] >> feat->bit) & 1;
> +	feat->value = (regs[feat->reg] >> feat->bit) & 1;
> +out:
> +	rte_compiler_barrier();
> +	feat->has_value = true;
> +	return feat->value;
>  }
>
>  const char *
> --

Acked-by: Konstantin Ananyev

> 2.43.0
>
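
For anyone reading the thread later, here is a minimal standalone sketch of
the "query once, then serve from the table" pattern the patch uses. It is
not the DPDK code: query_hw(), flag_cache, get_flag_cached() and
NUM_FEATURES are hypothetical names made up for illustration, and the
sketch swaps the patch's rte_compiler_barrier() for C11 release/acquire
atomics, so that publishing has_value after value is expressed portably
rather than relying on x86 store ordering.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NUM_FEATURES 4 /* hypothetical table size, for illustration only */

/* Counts how often the (pretend) hardware query actually runs. */
static int hw_query_count;

/* Hypothetical stand-in for the CPUID lookup: even-numbered features exist. */
static bool
query_hw(unsigned int feature)
{
	assert(feature < NUM_FEATURES); /* callers must range-check first */
	hw_query_count++;
	return (feature % 2) == 0;
}

struct cached_flag {
	atomic_bool has_value; /* set only after 'value' has been stored */
	atomic_bool value;     /* cached query result */
};

/* Static storage, so every entry starts out zeroed (has_value == false). */
static struct cached_flag flag_cache[NUM_FEATURES];

/* Returns 1 or 0 for a known feature, -ENOENT if the index is out of range. */
static int
get_flag_cached(unsigned int feature)
{
	struct cached_flag *c;
	bool v;

	if (feature >= NUM_FEATURES)
		return -ENOENT;

	c = &flag_cache[feature];

	/* Fast path: the acquire load pairs with the release store below, so
	 * a thread that observes has_value == true also observes 'value'. */
	if (atomic_load_explicit(&c->has_value, memory_order_acquire))
		return atomic_load_explicit(&c->value, memory_order_relaxed);

	/* Slow path: racing threads may both get here, but they compute and
	 * store the same result, so the duplicate work is harmless. */
	v = query_hw(feature);
	atomic_store_explicit(&c->value, v, memory_order_relaxed);
	atomic_store_explicit(&c->has_value, true, memory_order_release);
	return v;
}
```

Calling get_flag_cached() twice for the same feature runs query_hw() only
once; the second call returns from the table. Both fields are atomic in the
sketch so that two threads racing on the first query do not constitute a
data race under the C11 memory model.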