From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Ananyev, Konstantin"
To: "Ananyev, Konstantin", "Power, Ciara", "dev@dpdk.org"
CC: "viktorin@rehivetech.com", "ruifeng.wang@arm.com", "jerinj@marvell.com", "drc@linux.vnet.ibm.com", "Richardson, Bruce", "Singh, Jasvinder", Olivier Matz
Thread-Topic: [dpdk-dev] [PATCH v5 16/17] net: add checks for max SIMD bitwidth
Date: Tue, 13 Oct 2020 13:25:40 +0000
References: <20200807155859.63888-1-ciara.power@intel.com> <20201013110437.309110-1-ciara.power@intel.com> <20201013110437.309110-17-ciara.power@intel.com>
Subject: Re: [dpdk-dev] [PATCH v5 16/17] net: add checks for max SIMD bitwidth
List-Id: DPDK patches and discussions

>
> > When choosing a vector path to take, an extra condition must be
> > satisfied to ensure the max SIMD bitwidth allows for the CPU enabled
> > path.
> >
> > The vector path was initially chosen in RTE_INIT, however this is no
> > longer suitable as we cannot check the max SIMD bitwidth at that time.
> > Default handlers are now chosen in RTE_INIT, these default handlers
> > are used the first time the crc calc is called, and they set the suitable
> > handlers to be used going forward.
> >
> > Suggested-by: Jasvinder Singh
> > Suggested-by: Olivier Matz
> >
> > Signed-off-by: Ciara Power
> >
> > ---
> > v4:
> >  - Added default handlers to be set at RTE_INIT time, rather than
> >    choosing scalar handlers.
> >  - Modified logging.
> >  - Updated enum name.
> > v3:
> >  - Moved choosing vector paths out of RTE_INIT.
> >  - Moved checking max_simd_bitwidth into the set_alg function.
> > ---
> >  lib/librte_net/rte_net_crc.c | 75 ++++++++++++++++++++++++++++++------
> >  lib/librte_net/rte_net_crc.h |  8 ++++
> >  2 files changed, 72 insertions(+), 11 deletions(-)
> >
> > diff --git a/lib/librte_net/rte_net_crc.c b/lib/librte_net/rte_net_crc.c
> > index 4f5b9e8286..11d0161a32 100644
> > --- a/lib/librte_net/rte_net_crc.c
> > +++ b/lib/librte_net/rte_net_crc.c
> > @@ -9,6 +9,7 @@
> >  #include
> >  #include
> >  #include
> > +#include
> >
> >  #if defined(RTE_ARCH_X86_64) && defined(__PCLMUL__)
> >  #define X86_64_SSE42_PCLMULQDQ 1
> > @@ -32,6 +33,12 @@
> >  static uint32_t crc32_eth_lut[CRC_LUT_SIZE];
> >  static uint32_t crc16_ccitt_lut[CRC_LUT_SIZE];
> >
> > +static uint32_t
> > +rte_crc16_ccitt_default_handler(const uint8_t *data, uint32_t data_len);
> > +
> > +static uint32_t
> > +rte_crc32_eth_default_handler(const uint8_t *data, uint32_t data_len);
> > +
> >  static uint32_t
> >  rte_crc16_ccitt_handler(const uint8_t *data, uint32_t data_len);
> >
> > @@ -41,7 +48,12 @@ rte_crc32_eth_handler(const uint8_t *data, uint32_t data_len);
> >  typedef uint32_t
> >  (*rte_net_crc_handler)(const uint8_t *data, uint32_t data_len);
> >
> > -static rte_net_crc_handler *handlers;
> > +static rte_net_crc_handler handlers_default[] = {
> > +	[RTE_NET_CRC16_CCITT] = rte_crc16_ccitt_default_handler,
> > +	[RTE_NET_CRC32_ETH] = rte_crc32_eth_default_handler,
> > +};
> > +
> > +static rte_net_crc_handler *handlers = handlers_default;
> >
> >  static rte_net_crc_handler handlers_scalar[] = {
> >  	[RTE_NET_CRC16_CCITT] = rte_crc16_ccitt_handler,
> > @@ -60,6 +72,9 @@ static rte_net_crc_handler handlers_neon[] = {
> >  };
> >  #endif
> >
> > +static uint16_t max_simd_bitwidth;
> > +RTE_LOG_REGISTER(libnet_logtype, lib.net, INFO);
> > +
> >  /**
> >   * Reflect the bits about the middle
> >   *
> > @@ -112,6 +127,42 @@ crc32_eth_calc_lut(const uint8_t *data,
> >  	return crc;
> >  }
> >
> > +static uint32_t
> > +rte_crc16_ccitt_default_handler(const uint8_t *data, uint32_t data_len)
> > +{
> > +	if (max_simd_bitwidth == 0)
> > +		max_simd_bitwidth = rte_get_max_simd_bitwidth();
> > +	handlers = handlers_scalar;
> > +#ifdef X86_64_SSE42_PCLMULQDQ
> > +	if (max_simd_bitwidth >= RTE_SIMD_128)
> > +		handlers = handlers_sse42;
> > +#endif
> > +#ifdef ARM64_NEON_PMULL
> > +	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_PMULL) &&
> > +			max_simd_bitwidth >= RTE_SIMD_128) {
> > +		handlers = handlers_neon;
> > +#endif
>
> You probably don't want to make all these checks for *every* invocation
> of that function. I think it would be better:
> if (max_simd_bitwidth == 0) {....}
> return handlers[..](...);

As another thought - it is probably a bit safer to update max_simd_bitwidth
only after the handlers value has been updated:

handlers = ...;
rte_smp_wmb();
max_simd_bitwidth = ...;

>
> BTW, while it allows us to use the best possible handler,
> such an approach means an extra indirect call(/jump) anyway.
> Hard to say off-hand whether it would affect performance,
> and if so, how significantly.
> Couldn't find any perf tests in our UT for it...
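
To make the ordering argument concrete, here is a minimal standalone sketch
in plain C11, with no DPDK headers. It is not the rte_net_crc code: C11
release/acquire atomics stand in for rte_smp_wmb(), and
get_max_simd_bitwidth(), the tbl_* tables and the byte-sum "CRC" bodies are
hypothetical placeholders. The point it tries to show: a caller that is
still inside a default handler and sees a non-zero max_simd_bitwidth skips
the selection and dispatches through handlers, so handlers should already
point at a real table by then. Otherwise it would call the default handler
again.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

typedef uint32_t (*crc_handler)(const uint8_t *data, uint32_t len);

enum { CRC16_CCITT, CRC32_ETH, CRC_TYPES };

/* Hypothetical stand-in for rte_get_max_simd_bitwidth(); assumed non-zero. */
static uint16_t get_max_simd_bitwidth(void) { return 128; }

/* Placeholder bodies: a byte sum instead of a real CRC. */
static uint32_t byte_sum(const uint8_t *data, uint32_t len)
{
	uint32_t sum = 0;

	while (len--)
		sum += *data++;
	return sum;
}
static uint32_t scalar_crc(const uint8_t *d, uint32_t n) { return byte_sum(d, n); }
static uint32_t vector_crc(const uint8_t *d, uint32_t n) { return byte_sum(d, n); }

static uint32_t default_crc16(const uint8_t *d, uint32_t n);
static uint32_t default_crc32(const uint8_t *d, uint32_t n);

static const crc_handler tbl_default[CRC_TYPES] = { default_crc16, default_crc32 };
static const crc_handler tbl_scalar[CRC_TYPES] = { scalar_crc, scalar_crc };
static const crc_handler tbl_vector[CRC_TYPES] = { vector_crc, vector_crc };

static _Atomic(const crc_handler *) handlers = tbl_default;
static _Atomic uint16_t max_simd_bitwidth; /* 0 means "not selected yet" */

static void select_handlers(void)
{
	uint16_t width = get_max_simd_bitwidth();
	const crc_handler *best = width >= 128 ? tbl_vector : tbl_scalar;

	/* 1. Publish the table first. */
	atomic_store_explicit(&handlers, best, memory_order_relaxed);
	/* 2. Then the flag; the release store plays the role of rte_smp_wmb(). */
	atomic_store_explicit(&max_simd_bitwidth, width, memory_order_release);
}

static uint32_t default_dispatch(int type, const uint8_t *d, uint32_t n)
{
	const crc_handler *tbl;

	/* The selection checks run only while the flag is still zero. */
	if (atomic_load_explicit(&max_simd_bitwidth, memory_order_acquire) == 0)
		select_handlers();
	tbl = atomic_load_explicit(&handlers, memory_order_relaxed);
	/* With the flag written first, this could still be tbl_default. */
	assert(tbl != tbl_default);
	return tbl[type](d, n);
}

static uint32_t default_crc16(const uint8_t *d, uint32_t n)
{
	return default_dispatch(CRC16_CCITT, d, n);
}

static uint32_t default_crc32(const uint8_t *d, uint32_t n)
{
	return default_dispatch(CRC32_ETH, d, n);
}

/* Public entry point: one indirect call through the current table. */
static uint32_t crc_calc(int type, const uint8_t *d, uint32_t n)
{
	const crc_handler *tbl =
		atomic_load_explicit(&handlers, memory_order_relaxed);

	return tbl[type](d, n);
}
```

The first crc_calc() call goes through a default handler, which runs the
selection once; later calls go straight to the selected table. Two threads
racing through select_handlers() store the same values, so the selection is
idempotent in this sketch.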
>
> > +	return handlers[RTE_NET_CRC16_CCITT](data, data_len);
> > +}
> > +
> > +static uint32_t
> > +rte_crc32_eth_default_handler(const uint8_t *data, uint32_t data_len)
> > +{
> > +	if (max_simd_bitwidth == 0)
> > +		max_simd_bitwidth = rte_get_max_simd_bitwidth();
> > +	handlers = handlers_scalar;
> > +#ifdef X86_64_SSE42_PCLMULQDQ
> > +	if (max_simd_bitwidth >= RTE_SIMD_128)
> > +		handlers = handlers_sse42;
> > +#endif
> > +#ifdef ARM64_NEON_PMULL
> > +	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_PMULL) &&
> > +			max_simd_bitwidth >= RTE_SIMD_128) {
> > +		handlers = handlers_neon;
> > +#endif
> > +	return handlers[RTE_NET_CRC32_ETH](data, data_len);
> > +}
> > +
> >  static void
> >  rte_net_crc_scalar_init(void)
> >  {
> > @@ -145,18 +196,26 @@ rte_crc32_eth_handler(const uint8_t *data, uint32_t data_len)
> >  void
> >  rte_net_crc_set_alg(enum rte_net_crc_alg alg)
> >  {
> > +	if (max_simd_bitwidth == 0)
> > +		max_simd_bitwidth = rte_get_max_simd_bitwidth();
> > +
> >  	switch (alg) {
> >  #ifdef X86_64_SSE42_PCLMULQDQ
> >  	case RTE_NET_CRC_SSE42:
> > -		handlers = handlers_sse42;
> > -		break;
> > +		if (max_simd_bitwidth >= RTE_SIMD_128) {
> > +			handlers = handlers_sse42;
> > +			return;
> > +		}
> > +		NET_LOG(INFO, "Max SIMD Bitwidth too low, can't use SSE\n");
> >  #elif defined ARM64_NEON_PMULL
> >  		/* fall-through */
> >  	case RTE_NET_CRC_NEON:
> > -		if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_PMULL)) {
> > +		if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_PMULL) &&
> > +				max_simd_bitwidth >= RTE_SIMD_128) {
> >  			handlers = handlers_neon;
> > -			break;
> > +			return;
> >  		}
> > +		NET_LOG(INFO, "Max SIMD Bitwidth too low or CPU flag not enabled, can't use NEON\n");
> >  #endif
> >  		/* fall-through */
> >  	case RTE_NET_CRC_SCALAR:
> > @@ -184,19 +243,13 @@ rte_net_crc_calc(const void *data,
> >  /* Select highest available crc algorithm as default one */
> >  RTE_INIT(rte_net_crc_init)
> >  {
> > -	enum rte_net_crc_alg alg = RTE_NET_CRC_SCALAR;
> > -
> >  	rte_net_crc_scalar_init();
> >
> >  #ifdef X86_64_SSE42_PCLMULQDQ
> > -	alg = RTE_NET_CRC_SSE42;
> >  	rte_net_crc_sse42_init();
> >  #elif defined ARM64_NEON_PMULL
> >  	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_PMULL)) {
> > -		alg = RTE_NET_CRC_NEON;
> >  		rte_net_crc_neon_init();
> >  	}
> >  #endif
> > -
> > -	rte_net_crc_set_alg(alg);
> >  }
> > diff --git a/lib/librte_net/rte_net_crc.h b/lib/librte_net/rte_net_crc.h
> > index 16e85ca970..c942865ecf 100644
> > --- a/lib/librte_net/rte_net_crc.h
> > +++ b/lib/librte_net/rte_net_crc.h
> > @@ -7,6 +7,8 @@
> >
> >  #include
> >
> > +#include
> > +
> >  #ifdef __cplusplus
> >  extern "C" {
> >  #endif
> > @@ -25,6 +27,12 @@ enum rte_net_crc_alg {
> >  	RTE_NET_CRC_NEON,
> >  };
> >
> > +extern int libnet_logtype;
> > +
> > +#define NET_LOG(level, fmt, args...) \
> > +	rte_log(RTE_LOG_ ## level, libnet_logtype, "%s(): " fmt "\n", \
> > +		__func__, ## args)
> > +
> >  /**
> >   * This API set the CRC computation algorithm (i.e. scalar version,
> >   * x86 64-bit sse4.2 intrinsic version, etc.) and internal data
> > --
> > 2.22.0
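
On the indirect-call question raised in the comments above: in the absence
of a perf test in the UT, a rough first number could come from a small
standalone harness along these lines. This is only a sketch under stated
assumptions: crc32_bitwise() is a placeholder workload (a plain bitwise
CRC-32), not one of the rte_net_crc handlers, and time_crc_calls() and
indirect_handler are hypothetical names. It times the same routine called
directly and through a function pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

typedef uint32_t (*crc_fn)(const uint8_t *data, uint32_t len);

/* Placeholder workload: bitwise reflected CRC-32 (polynomial 0xEDB88320). */
static uint32_t crc32_bitwise(const uint8_t *data, uint32_t len)
{
	uint32_t crc = 0xFFFFFFFFu;

	for (uint32_t i = 0; i < len; i++) {
		crc ^= data[i];
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
	}
	return ~crc;
}

/* volatile forces a fresh load of the pointer and a real indirect call. */
static crc_fn volatile indirect_handler = crc32_bitwise;

/*
 * Run iters CRC calls, either directly or through the pointer.
 * Returns elapsed CPU time in seconds; the XOR of all results is stored
 * in *sink so the calls cannot be optimized away.
 */
static double time_crc_calls(int use_indirect, uint8_t *buf, uint32_t len,
			     uint32_t iters, uint32_t *sink)
{
	uint32_t acc = 0;
	clock_t start, end;

	assert(buf != NULL && len > 0 && sink != NULL);
	start = clock();
	for (uint32_t i = 0; i < iters; i++) {
		buf[0] = (uint8_t)i; /* vary the input on every iteration */
		acc ^= use_indirect ? indirect_handler(buf, len)
				    : crc32_bitwise(buf, len);
	}
	end = clock();
	*sink = acc;
	return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling time_crc_calls() twice on the same buffer size, once with
use_indirect set to 0 and once with 1, gives the two timings to compare.
The placeholder CRC is much slower per byte than a table-driven or vector
handler, so any difference measured here would understate the relative cost
of the indirect call for the real handlers.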