From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Ananyev, Konstantin"
To: "Power, Ciara", "dev@dpdk.org"
CC: "viktorin@rehivetech.com", "ruifeng.wang@arm.com", "jerinj@marvell.com", "drc@linux.vnet.ibm.com", "Richardson, Bruce", "Singh, Jasvinder", Olivier Matz
Date: Tue, 13 Oct 2020 13:57:02 +0000
Subject: Re: [dpdk-dev] [PATCH v5 16/17] net: add checks for max SIMD bitwidth
References: <20200807155859.63888-1-ciara.power@intel.com> <20201013110437.309110-1-ciara.power@intel.com> <20201013110437.309110-17-ciara.power@intel.com>
List-Id: DPDK patches and discussions

> >
> > > When choosing a vector path to take, an extra condition must be
> > > satisfied to ensure the max SIMD bitwidth allows for the CPU enabled
> > > path.
> > >
> > > The vector path was initially chosen in RTE_INIT, however this is no
> > > longer suitable as we cannot check the max SIMD bitwidth at that time.
> > > Default handlers are now chosen in RTE_INIT, these default handlers
> > > are used the first time the crc calc is called, and they set the suitable
> > > handlers to be used going forward.
> > >
> > > Suggested-by: Jasvinder Singh
> > > Suggested-by: Olivier Matz
> > >
> > > Signed-off-by: Ciara Power
> > >
> > > ---
> > > v4:
> > >   - Added default handlers to be set at RTE_INIT time, rather than
> > >     choosing scalar handlers.
> > >   - Modified logging.
> > >   - Updated enum name.
> > > v3:
> > >   - Moved choosing vector paths out of RTE_INIT.
> > >   - Moved checking max_simd_bitwidth into the set_alg function.
> > > ---
> > >  lib/librte_net/rte_net_crc.c | 75 ++++++++++++++++++++++++++++++------
> > >  lib/librte_net/rte_net_crc.h |  8 ++++
> > >  2 files changed, 72 insertions(+), 11 deletions(-)
> > >
> > > diff --git a/lib/librte_net/rte_net_crc.c b/lib/librte_net/rte_net_crc.c
> > > index 4f5b9e8286..11d0161a32 100644
> > > --- a/lib/librte_net/rte_net_crc.c
> > > +++ b/lib/librte_net/rte_net_crc.c
> > > @@ -9,6 +9,7 @@
> > >  #include
> > >  #include
> > >  #include
> > > +#include
> > >
> > >  #if defined(RTE_ARCH_X86_64) && defined(__PCLMUL__)
> > >  #define X86_64_SSE42_PCLMULQDQ 1
> > > @@ -32,6 +33,12 @@
> > >  static uint32_t crc32_eth_lut[CRC_LUT_SIZE];
> > >  static uint32_t crc16_ccitt_lut[CRC_LUT_SIZE];
> > >
> > > +static uint32_t
> > > +rte_crc16_ccitt_default_handler(const uint8_t *data, uint32_t data_len);
> > > +
> > > +static uint32_t
> > > +rte_crc32_eth_default_handler(const uint8_t *data, uint32_t data_len);
> > > +
> > >  static uint32_t
> > >  rte_crc16_ccitt_handler(const uint8_t *data, uint32_t data_len);
> > >
> > > @@ -41,7 +48,12 @@ rte_crc32_eth_handler(const uint8_t *data, uint32_t data_len);
> > >  typedef uint32_t
> > >  (*rte_net_crc_handler)(const uint8_t *data, uint32_t data_len);
> > >
> > > -static rte_net_crc_handler *handlers;
> > > +static rte_net_crc_handler handlers_default[] = {
> > > +	[RTE_NET_CRC16_CCITT] = rte_crc16_ccitt_default_handler,
> > > +	[RTE_NET_CRC32_ETH] = rte_crc32_eth_default_handler,
> > > +};
> > > +
> > > +static rte_net_crc_handler *handlers = handlers_default;
> > >
> > >  static rte_net_crc_handler handlers_scalar[] = {
> > >  	[RTE_NET_CRC16_CCITT] = rte_crc16_ccitt_handler,
> > > @@ -60,6 +72,9 @@ static rte_net_crc_handler handlers_neon[] = {
> > >  };
> > >  #endif
> > >
> > > +static uint16_t max_simd_bitwidth;
> > > +RTE_LOG_REGISTER(libnet_logtype, lib.net, INFO);
> > > +
> > >  /**
> > >   * Reflect the bits about the middle
> > >   *
> > > @@ -112,6 +127,42 @@ crc32_eth_calc_lut(const uint8_t *data,
> > >  	return crc;
> > >  }
> > >
> > > +static uint32_t
> > > +rte_crc16_ccitt_default_handler(const uint8_t *data, uint32_t data_len)
> > > +{
> > > +	if (max_simd_bitwidth == 0)
> > > +		max_simd_bitwidth = rte_get_max_simd_bitwidth();
> > > +	handlers = handlers_scalar;
> > > +#ifdef X86_64_SSE42_PCLMULQDQ
> > > +	if (max_simd_bitwidth >= RTE_SIMD_128)
> > > +		handlers = handlers_sse42;
> > > +#endif
> > > +#ifdef ARM64_NEON_PMULL
> > > +	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_PMULL) &&
> > > +			max_simd_bitwidth >= RTE_SIMD_128) {
> > > +		handlers = handlers_neon;
> > > +#endif
> >
> > You probably don't want to make all these checks for *every* invocation
> > of that function. I think it would be better:
> > if (max_simd_bitwidth == 0) {....}
> > return handlers[..](...);
>
> As another thought - it is probably a bit safer to update max_simd_bitwidth
> after the handlers value is updated.
>
> handlers = ...;
> rte_smp_wmb();
> max_simd_bitwidth = ...;
>
> >
> > BTW, while it allows us to use the best possible handler,
> > such an approach means an extra indirect call(/jump) anyway.

Ah sorry, that would only happen once, please ignore that one.

> > Hard to say off-hand whether it would affect performance,
> > and if so how significantly.
> > Couldn't find any perf tests in our UT for it...
> >
> > > +	return handlers[RTE_NET_CRC16_CCITT](data, data_len);
> > > +}
> > > +
> > > +static uint32_t
> > > +rte_crc32_eth_default_handler(const uint8_t *data, uint32_t data_len)
> > > +{
> > > +	if (max_simd_bitwidth == 0)
> > > +		max_simd_bitwidth = rte_get_max_simd_bitwidth();
> > > +	handlers = handlers_scalar;
> > > +#ifdef X86_64_SSE42_PCLMULQDQ
> > > +	if (max_simd_bitwidth >= RTE_SIMD_128)
> > > +		handlers = handlers_sse42;
> > > +#endif
> > > +#ifdef ARM64_NEON_PMULL
> > > +	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_PMULL) &&
> > > +			max_simd_bitwidth >= RTE_SIMD_128) {
> > > +		handlers = handlers_neon;
> > > +#endif
> > > +	return handlers[RTE_NET_CRC32_ETH](data, data_len);
> > > +}
> > > +
> > >  static void
> > >  rte_net_crc_scalar_init(void)
> > >  {
> > > @@ -145,18 +196,26 @@ rte_crc32_eth_handler(const uint8_t *data, uint32_t data_len)
> > >  void
> > >  rte_net_crc_set_alg(enum rte_net_crc_alg alg)
> > >  {
> > > +	if (max_simd_bitwidth == 0)
> > > +		max_simd_bitwidth = rte_get_max_simd_bitwidth();
> > > +
> > >  	switch (alg) {
> > >  #ifdef X86_64_SSE42_PCLMULQDQ
> > >  	case RTE_NET_CRC_SSE42:
> > > -		handlers = handlers_sse42;
> > > -		break;
> > > +		if (max_simd_bitwidth >= RTE_SIMD_128) {
> > > +			handlers = handlers_sse42;
> > > +			return;
> > > +		}
> > > +		NET_LOG(INFO, "Max SIMD Bitwidth too low, can't use SSE\n");
> > >  #elif defined ARM64_NEON_PMULL
> > >  		/* fall-through */
> > >  	case RTE_NET_CRC_NEON:
> > > -		if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_PMULL)) {
> > > +		if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_PMULL) &&
> > > +				max_simd_bitwidth >= RTE_SIMD_128) {
> > >  			handlers = handlers_neon;
> > > -			break;
> > > +			return;
> > >  		}
> > > +		NET_LOG(INFO, "Max SIMD Bitwidth too low or CPU flag not enabled, can't use NEON\n");
> > >  #endif
> > >  		/* fall-through */
> > >  	case RTE_NET_CRC_SCALAR:
> > > @@ -184,19 +243,13 @@ rte_net_crc_calc(const void *data,
> > >  /* Select highest available crc algorithm as default one */
> > >  RTE_INIT(rte_net_crc_init)
> > >  {
> > > -	enum rte_net_crc_alg alg = RTE_NET_CRC_SCALAR;
> > > -
> > >  	rte_net_crc_scalar_init();
> > >
> > >  #ifdef X86_64_SSE42_PCLMULQDQ
> > > -	alg = RTE_NET_CRC_SSE42;
> > >  	rte_net_crc_sse42_init();
> > >  #elif defined ARM64_NEON_PMULL
> > >  	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_PMULL)) {
> > > -		alg = RTE_NET_CRC_NEON;
> > >  		rte_net_crc_neon_init();
> > >  	}
> > >  #endif
> > > -
> > > -	rte_net_crc_set_alg(alg);
> > >  }
> > > diff --git a/lib/librte_net/rte_net_crc.h b/lib/librte_net/rte_net_crc.h
> > > index 16e85ca970..c942865ecf 100644
> > > --- a/lib/librte_net/rte_net_crc.h
> > > +++ b/lib/librte_net/rte_net_crc.h
> > > @@ -7,6 +7,8 @@
> > >
> > >  #include
> > >
> > > +#include
> > > +
> > >  #ifdef __cplusplus
> > >  extern "C" {
> > >  #endif
> > > @@ -25,6 +27,12 @@ enum rte_net_crc_alg {
> > >  	RTE_NET_CRC_NEON,
> > >  };
> > >
> > > +extern int libnet_logtype;
> > > +
> > > +#define NET_LOG(level, fmt, args...) \
> > > +	rte_log(RTE_LOG_ ## level, libnet_logtype, "%s(): " fmt "\n", \
> > > +		__func__, ## args)
> > > +
> > >  /**
> > >   * This API set the CRC computation algorithm (i.e. scalar version,
> > >   * x86 64-bit sse4.2 intrinsic version, etc.) and internal data
> > > --
> > > 2.22.0
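
For reference, the two review suggestions in this thread (resolve the handlers only while max_simd_bitwidth is still 0, and publish the handlers before the width) could look roughly like the standalone sketch below. It is not the DPDK code and makes several assumptions: C11 atomics stand in for rte_smp_wmb(), get_max_simd_bitwidth() stands in for rte_get_max_simd_bitwidth(), and the handler bodies, resolve_handlers(), default_dispatch(), crc_calc() and resolve_count are hypothetical names that only return a tag for the path taken.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

enum crc_type { CRC16_CCITT, CRC32_ETH, CRC_TYPES };
enum { PATH_SCALAR = 1, PATH_VECTOR = 2 };

typedef uint32_t (*crc_handler)(const uint8_t *data, uint32_t len);

/* Hypothetical stand-ins for the real CRC routines: each returns a tag
 * naming the path taken so the selection logic can be observed. */
static uint32_t scalar16(const uint8_t *d, uint32_t n) { (void)d; (void)n; return PATH_SCALAR; }
static uint32_t scalar32(const uint8_t *d, uint32_t n) { (void)d; (void)n; return PATH_SCALAR; }
static uint32_t vector16(const uint8_t *d, uint32_t n) { (void)d; (void)n; return PATH_VECTOR; }
static uint32_t vector32(const uint8_t *d, uint32_t n) { (void)d; (void)n; return PATH_VECTOR; }

static uint32_t default16(const uint8_t *d, uint32_t n);
static uint32_t default32(const uint8_t *d, uint32_t n);

static const crc_handler handlers_default[CRC_TYPES] = { default16, default32 };
static const crc_handler handlers_scalar[CRC_TYPES] = { scalar16, scalar32 };
static const crc_handler handlers_vector[CRC_TYPES] = { vector16, vector32 };

static _Atomic(const crc_handler *) handlers = handlers_default;
static _Atomic uint16_t max_simd_bitwidth; /* 0 means "not resolved yet" */
static unsigned int resolve_count;         /* illustration only, not thread-safe */

/* Assumption: stands in for rte_get_max_simd_bitwidth(). */
static uint16_t get_max_simd_bitwidth(void) { return 128; }

static void resolve_handlers(void)
{
	uint16_t width = get_max_simd_bitwidth();
	const crc_handler *tbl = width >= 128 ? handlers_vector : handlers_scalar;

	resolve_count++;
	/* Store the handler table first ... */
	atomic_store_explicit(&handlers, tbl, memory_order_relaxed);
	/* ... then the width; the release store keeps it ordered after the table. */
	atomic_store_explicit(&max_simd_bitwidth, width, memory_order_release);
}

static uint32_t default_dispatch(enum crc_type type, const uint8_t *d, uint32_t n)
{
	const crc_handler *tbl;

	/* Run the selection only while the width is still unset. */
	if (atomic_load_explicit(&max_simd_bitwidth, memory_order_acquire) == 0)
		resolve_handlers();
	tbl = atomic_load_explicit(&handlers, memory_order_relaxed);
	return tbl[type](d, n);
}

static uint32_t default16(const uint8_t *d, uint32_t n)
{
	return default_dispatch(CRC16_CCITT, d, n);
}

static uint32_t default32(const uint8_t *d, uint32_t n)
{
	return default_dispatch(CRC32_ETH, d, n);
}

static uint32_t crc_calc(enum crc_type type, const uint8_t *d, uint32_t n)
{
	const crc_handler *tbl;

	assert(type < CRC_TYPES);
	tbl = atomic_load_explicit(&handlers, memory_order_relaxed);
	return tbl[type](d, n);
}
```

With the opposite store order, another thread could observe a non-zero width while handlers still points at the default table and re-enter the default handler, which is presumably the case the suggested ordering is meant to avoid. After the first call, crc_calc() goes straight to the selected table, so the extra indirection is paid only once.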