From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 11 Oct 2024 13:48:03 +0100
From: Bruce Richardson
To: Konstantin Ananyev
CC: "dev@dpdk.org", "david.marchand@redhat.com"
Subject: Re: [PATCH] eal/x86: cache queried CPU flags
References: <20241007110725.377550-1-bruce.richardson@intel.com> <94d414e76dad4378881efd1e6533c06e@huawei.com>
In-Reply-To: <94d414e76dad4378881efd1e6533c06e@huawei.com>
Content-Type: text/plain; charset="us-ascii"
List-Id: DPDK patches and discussions

On Fri, Oct 11, 2024 at 12:42:01PM +0000, Konstantin Ananyev wrote:
>
> > Rather than re-querying the HW each time a CPU flag is requested, we can
> > just save the return value in the flags array. This should speed up
> > repeated querying of CPU flags, and provides a workaround for a reported
> > issue where errors are seen with constant querying of the AVX-512 CPU
> > flag from a non-AVX VM.
> >
> > Bugzilla Id: 1501
> >
> > Signed-off-by: Bruce Richardson
> > ---
> >  lib/eal/x86/rte_cpuflags.c | 20 +++++++++++++++-----
> >  1 file changed, 15 insertions(+), 5 deletions(-)
> >
> > diff --git a/lib/eal/x86/rte_cpuflags.c b/lib/eal/x86/rte_cpuflags.c
> > index 26163ab746..62e782fb4b 100644
> > --- a/lib/eal/x86/rte_cpuflags.c
> > +++ b/lib/eal/x86/rte_cpuflags.c
> > @@ -8,6 +8,7 @@
> >  #include
> >  #include
> >  #include
> > +#include
> >
> >  #include "rte_cpuid.h"
> >
> > @@ -21,12 +22,14 @@ struct feature_entry {
> >  	uint32_t bit; /**< cpuid register bit */
> >  #define CPU_FLAG_NAME_MAX_LEN 64
> >  	char name[CPU_FLAG_NAME_MAX_LEN]; /**< String for printing */
> > +	bool has_value;
> > +	bool value;
> >  };
> >
> >  #define FEAT_DEF(name, leaf, subleaf, reg, bit) \
> >  	[RTE_CPUFLAG_##name] = {leaf, subleaf, reg, bit, #name },
> >
> > -const struct feature_entry rte_cpu_feature_table[] = {
> > +struct feature_entry rte_cpu_feature_table[] = {
> >  	FEAT_DEF(SSE3, 0x00000001, 0, RTE_REG_ECX, 0)
> >  	FEAT_DEF(PCLMULQDQ, 0x00000001, 0, RTE_REG_ECX, 1)
> >  	FEAT_DEF(DTES64, 0x00000001, 0, RTE_REG_ECX, 2)
> > @@ -147,7 +150,7 @@ const struct feature_entry rte_cpu_feature_table[] = {
> >  int
> >  rte_cpu_get_flag_enabled(enum rte_cpu_flag_t feature)
> >  {
> > -	const struct feature_entry *feat;
> > +	struct feature_entry *feat;
> >  	cpuid_registers_t regs;
> >  	unsigned int maxleaf;
> >
> > @@ -156,6 +159,8 @@ rte_cpu_get_flag_enabled(enum rte_cpu_flag_t feature)
> >  		return -ENOENT;
> >
> >  	feat = &rte_cpu_feature_table[feature];
> > +	if (feat->has_value)
> > +		return feat->value;
> >
> >  	if (!feat->leaf)
> >  		/* This entry in the table wasn't filled out! */
> > @@ -163,8 +168,10 @@ rte_cpu_get_flag_enabled(enum rte_cpu_flag_t feature)
> >
> >  	maxleaf = __get_cpuid_max(feat->leaf & 0x80000000, NULL);
> >
> > -	if (maxleaf < feat->leaf)
> > -		return 0;
> > +	if (maxleaf < feat->leaf) {
> > +		feat->value = 0;
> > +		goto out;
> > +	}
> >
> >  #ifdef RTE_TOOLCHAIN_MSVC
> >  	__cpuidex(regs, feat->leaf, feat->subleaf);
> > @@ -175,7 +182,10 @@ rte_cpu_get_flag_enabled(enum rte_cpu_flag_t feature)
> >  #endif
> >
> >  	/* check if the feature is enabled */
> > -	return (regs[feat->reg] >> feat->bit) & 1;
> > +	feat->value = (regs[feat->reg] >> feat->bit) & 1;
> > +out:
> > +	feat->has_value = true;
> > +	return feat->value;
>
> If that function can be called by 2 (or more) threads simultaneously,
> then in theory 'feat->has_value = true;' can be reordered with
> 'feat->value = (regs[feat->reg] >> feat->bit) & 1;' (by cpu or compiler)
> and some thread(s) can get a wrong feat->value.
> The probability of such a collision is really low, but it still seems not impossible.
>

Well, since this code is x86-specific, the externally visible store ordering
will match the instruction store ordering. Therefore, I think a compiler
barrier is all that is necessary before the feat->has_value assignment, correct?

/Bruce
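
To make the ordering concern concrete, here is a minimal standalone sketch of the "write value first, then publish has_value" pattern using C11 atomics. It is not the patch and not DPDK code: struct cached_flag, slow_query(), get_flag() and query_count are hypothetical names standing in for the feature table entry and the CPUID query. The release store on has_value keeps the compiler (and, on weakly ordered CPUs, the hardware) from moving the value store after it; on x86 a release store is typically just a plain store plus a compiler-level ordering constraint, which is in line with the compiler-barrier suggestion above.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the cached fields added to struct feature_entry. */
struct cached_flag {
	atomic_bool value;      /* cached result of the query */
	atomic_bool has_value;  /* published only after value is written */
};

/* Counts how often the (assumed expensive) query actually runs. */
static int query_count;

/* Hypothetical stand-in for the CPUID-based lookup. */
static bool
slow_query(void)
{
	query_count++;
	return true;
}

static bool
get_flag(struct cached_flag *f)
{
	bool v;

	/*
	 * Acquire pairs with the release store below: a thread that
	 * observes has_value == true also observes the earlier store
	 * to value.
	 */
	if (atomic_load_explicit(&f->has_value, memory_order_acquire))
		return atomic_load_explicit(&f->value, memory_order_relaxed);

	v = slow_query();
	atomic_store_explicit(&f->value, v, memory_order_relaxed);
	/* Release: the value store above cannot be reordered after this. */
	atomic_store_explicit(&f->has_value, true, memory_order_release);
	return v;
}
```

Calling get_flag() twice on a zero-initialized struct cached_flag runs slow_query() only once. Two threads racing on the first call may both run the query, but each stores the same result, so a reader never sees has_value set alongside a stale value.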