From: "Ananyev, Konstantin"
To: "Power, Ciara", "dev@dpdk.org"
CC: "viktorin@rehivetech.com", "ruifeng.wang@arm.com", "jerinj@marvell.com", "drc@linux.vnet.ibm.com", "Richardson, Bruce", "Singh, Jasvinder", Olivier Matz
Date: Tue, 13 Oct 2020 13:07:52 +0000
Subject: Re: [dpdk-dev] [PATCH v5 16/17] net: add checks for max SIMD bitwidth
References: <20200807155859.63888-1-ciara.power@intel.com> <20201013110437.309110-1-ciara.power@intel.com> <20201013110437.309110-17-ciara.power@intel.com>
In-Reply-To: <20201013110437.309110-17-ciara.power@intel.com>
List-Id: DPDK patches and discussions

> When choosing a vector path to take, an extra condition must be
> satisfied to ensure the max SIMD bitwidth allows for the CPU enabled
> path.
>
> The vector path was initially chosen in RTE_INIT, however this is no
> longer suitable as we cannot check the max SIMD bitwidth at that time.
> Default handlers are now chosen in RTE_INIT, these default handlers
> are used the first time the crc calc is called, and they set the suitable
> handlers to be used going forward.
>
> Suggested-by: Jasvinder Singh
> Suggested-by: Olivier Matz
>
> Signed-off-by: Ciara Power
>
> ---
> v4:
>   - Added default handlers to be set at RTE_INIT time, rather than
>     choosing scalar handlers.
>   - Modified logging.
>   - Updated enum name.
> v3:
>   - Moved choosing vector paths out of RTE_INIT.
>   - Moved checking max_simd_bitwidth into the set_alg function.
> ---
>  lib/librte_net/rte_net_crc.c | 75 ++++++++++++++++++++++++++++++------
>  lib/librte_net/rte_net_crc.h |  8 ++++
>  2 files changed, 72 insertions(+), 11 deletions(-)
>
> diff --git a/lib/librte_net/rte_net_crc.c b/lib/librte_net/rte_net_crc.c
> index 4f5b9e8286..11d0161a32 100644
> --- a/lib/librte_net/rte_net_crc.c
> +++ b/lib/librte_net/rte_net_crc.c
> @@ -9,6 +9,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  #if defined(RTE_ARCH_X86_64) && defined(__PCLMUL__)
>  #define X86_64_SSE42_PCLMULQDQ 1
> @@ -32,6 +33,12 @@
>  static uint32_t crc32_eth_lut[CRC_LUT_SIZE];
>  static uint32_t crc16_ccitt_lut[CRC_LUT_SIZE];
>
> +static uint32_t
> +rte_crc16_ccitt_default_handler(const uint8_t *data, uint32_t data_len);
> +
> +static uint32_t
> +rte_crc32_eth_default_handler(const uint8_t *data, uint32_t data_len);
> +
>  static uint32_t
>  rte_crc16_ccitt_handler(const uint8_t *data, uint32_t data_len);
>
> @@ -41,7 +48,12 @@ rte_crc32_eth_handler(const uint8_t *data, uint32_t data_len);
>  typedef uint32_t
>  (*rte_net_crc_handler)(const uint8_t *data, uint32_t data_len);
>
> -static rte_net_crc_handler *handlers;
> +static rte_net_crc_handler handlers_default[] = {
> +	[RTE_NET_CRC16_CCITT] = rte_crc16_ccitt_default_handler,
> +	[RTE_NET_CRC32_ETH] = rte_crc32_eth_default_handler,
> +};
> +
> +static rte_net_crc_handler *handlers = handlers_default;
>
>  static rte_net_crc_handler handlers_scalar[] = {
>  	[RTE_NET_CRC16_CCITT] = rte_crc16_ccitt_handler,
> @@ -60,6 +72,9 @@ static rte_net_crc_handler handlers_neon[] = {
>  };
>  #endif
>
> +static uint16_t max_simd_bitwidth;
> +RTE_LOG_REGISTER(libnet_logtype, lib.net, INFO);
> +
>  /**
>   * Reflect the bits about the middle
>   *
> @@ -112,6 +127,42 @@ crc32_eth_calc_lut(const uint8_t *data,
>  	return crc;
>  }
>
> +static uint32_t
> +rte_crc16_ccitt_default_handler(const uint8_t *data, uint32_t data_len)
> +{
> +	if (max_simd_bitwidth == 0)
> +		max_simd_bitwidth = rte_get_max_simd_bitwidth();
> +	handlers = handlers_scalar;
> +#ifdef X86_64_SSE42_PCLMULQDQ
> +	if (max_simd_bitwidth >= RTE_SIMD_128)
> +		handlers = handlers_sse42;
> +#endif
> +#ifdef ARM64_NEON_PMULL
> +	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_PMULL) &&
> +			max_simd_bitwidth >= RTE_SIMD_128) {
> +		handlers = handlers_neon;
> +#endif

You probably don't want to make all these checks for *every* invocation
of that function.
I think it would be better:

if (max_simd_bitwidth == 0) {....}
return handlers[..](...);

BTW, while it allows us to use the best possible handler, such an approach
means an extra indirect call (or jump) anyway.
It is hard to say off-hand whether it would affect performance and, if so,
how significantly.
I couldn't find any perf tests in our UT for it...

> +	return handlers[RTE_NET_CRC16_CCITT](data, data_len);
> +}
> +
> +static uint32_t
> +rte_crc32_eth_default_handler(const uint8_t *data, uint32_t data_len)
> +{
> +	if (max_simd_bitwidth == 0)
> +		max_simd_bitwidth = rte_get_max_simd_bitwidth();
> +	handlers = handlers_scalar;
> +#ifdef X86_64_SSE42_PCLMULQDQ
> +	if (max_simd_bitwidth >= RTE_SIMD_128)
> +		handlers = handlers_sse42;
> +#endif
> +#ifdef ARM64_NEON_PMULL
> +	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_PMULL) &&
> +			max_simd_bitwidth >= RTE_SIMD_128) {
> +		handlers = handlers_neon;
> +#endif
> +	return handlers[RTE_NET_CRC32_ETH](data, data_len);
> +}
> +
>  static void
>  rte_net_crc_scalar_init(void)
>  {
> @@ -145,18 +196,26 @@ rte_crc32_eth_handler(const uint8_t *data, uint32_t data_len)
>  void
>  rte_net_crc_set_alg(enum rte_net_crc_alg alg)
>  {
> +	if (max_simd_bitwidth == 0)
> +		max_simd_bitwidth = rte_get_max_simd_bitwidth();
> +
>  	switch (alg) {
>  #ifdef X86_64_SSE42_PCLMULQDQ
>  	case RTE_NET_CRC_SSE42:
> -		handlers = handlers_sse42;
> -		break;
> +		if (max_simd_bitwidth >= RTE_SIMD_128) {
> +			handlers = handlers_sse42;
> +			return;
> +		}
> +		NET_LOG(INFO, "Max SIMD Bitwidth too low, can't use SSE\n");
>  #elif defined ARM64_NEON_PMULL
>  	/* fall-through */
>  	case RTE_NET_CRC_NEON:
> -		if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_PMULL)) {
> +		if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_PMULL) &&
> +				max_simd_bitwidth >= RTE_SIMD_128) {
>  			handlers = handlers_neon;
> -			break;
> +			return;
>  		}
> +		NET_LOG(INFO, "Max SIMD Bitwidth too low or CPU flag not enabled, can't use NEON\n");
>  #endif
>  	/* fall-through */
>  	case RTE_NET_CRC_SCALAR:
> @@ -184,19 +243,13 @@ rte_net_crc_calc(const void *data,
>  /* Select highest available crc algorithm as default one */
>  RTE_INIT(rte_net_crc_init)
>  {
> -	enum rte_net_crc_alg alg = RTE_NET_CRC_SCALAR;
> -
>  	rte_net_crc_scalar_init();
>
>  #ifdef X86_64_SSE42_PCLMULQDQ
> -	alg = RTE_NET_CRC_SSE42;
>  	rte_net_crc_sse42_init();
>  #elif defined ARM64_NEON_PMULL
>  	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_PMULL)) {
> -		alg = RTE_NET_CRC_NEON;
>  		rte_net_crc_neon_init();
>  	}
>  #endif
> -
> -	rte_net_crc_set_alg(alg);
>  }
> diff --git a/lib/librte_net/rte_net_crc.h b/lib/librte_net/rte_net_crc.h
> index 16e85ca970..c942865ecf 100644
> --- a/lib/librte_net/rte_net_crc.h
> +++ b/lib/librte_net/rte_net_crc.h
> @@ -7,6 +7,8 @@
>
>  #include
>
> +#include
> +
>  #ifdef __cplusplus
>  extern "C" {
>  #endif
> @@ -25,6 +27,12 @@ enum rte_net_crc_alg {
>  	RTE_NET_CRC_NEON,
>  };
>
> +extern int libnet_logtype;
> +
> +#define NET_LOG(level, fmt, args...) \
> +	rte_log(RTE_LOG_ ## level, libnet_logtype, "%s(): " fmt "\n", \
> +		__func__, ## args)
> +
>  /**
>   * This API set the CRC computation algorithm (i.e. scalar version,
>   * x86 64-bit sse4.2 intrinsic version, etc.) and internal data
> --
> 2.22.0
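
For illustration, the guard suggested in the inline comment above could be
sketched roughly as below. This is a standalone sketch, not the DPDK code:
every name in it (crc_type, crc_calc, resolve_handlers,
get_max_simd_bitwidth, the handler tables) is a hypothetical stand-in, the
"CRC" bodies are placeholder byte sums, and the bitwidth query is a stub
assumed to return a non-zero value. The point is only that handler selection
runs once, inside the guard, and later calls go straight through the table.

```c
#include <assert.h>
#include <stdint.h>

/* All names here are hypothetical stand-ins, not the DPDK API. */
enum crc_type { CRC16_CCITT, CRC32_ETH, CRC_TYPE_COUNT };
enum { SIMD_128 = 128 };

typedef uint32_t (*crc_handler)(const uint8_t *data, uint32_t len);

/* Placeholder "CRC": a byte sum, only so the dispatch can be observed. */
static uint32_t byte_sum(const uint8_t *data, uint32_t len)
{
	uint32_t sum = 0;

	for (uint32_t i = 0; i < len; i++)
		sum += data[i];
	return sum;
}

static uint32_t crc16_scalar(const uint8_t *d, uint32_t n) { return byte_sum(d, n) & 0xffff; }
static uint32_t crc32_scalar(const uint8_t *d, uint32_t n) { return byte_sum(d, n); }
static uint32_t crc16_vector(const uint8_t *d, uint32_t n) { return byte_sum(d, n) & 0xffff; }
static uint32_t crc32_vector(const uint8_t *d, uint32_t n) { return byte_sum(d, n); }

static uint32_t crc16_default(const uint8_t *data, uint32_t len);
static uint32_t crc32_default(const uint8_t *data, uint32_t len);

static crc_handler handlers_default[] = {
	[CRC16_CCITT] = crc16_default,
	[CRC32_ETH] = crc32_default,
};
static crc_handler handlers_scalar[] = {
	[CRC16_CCITT] = crc16_scalar,
	[CRC32_ETH] = crc32_scalar,
};
static crc_handler handlers_vector[] = {
	[CRC16_CCITT] = crc16_vector,
	[CRC32_ETH] = crc32_vector,
};

static crc_handler *handlers = handlers_default;
static uint16_t max_simd_bitwidth;
static unsigned int resolve_calls; /* how many times selection has run */

/* Stub for the real bitwidth query; assumed to return a non-zero value. */
static uint16_t get_max_simd_bitwidth(void)
{
	return SIMD_128;
}

/* One-time selection: scalar by default, vector if the bitwidth allows. */
static void resolve_handlers(void)
{
	resolve_calls++;
	max_simd_bitwidth = get_max_simd_bitwidth();
	handlers = handlers_scalar;
	if (max_simd_bitwidth >= SIMD_128)
		handlers = handlers_vector;
}

static uint32_t crc16_default(const uint8_t *data, uint32_t len)
{
	if (max_simd_bitwidth == 0)
		resolve_handlers();
	return handlers[CRC16_CCITT](data, len);
}

static uint32_t crc32_default(const uint8_t *data, uint32_t len)
{
	if (max_simd_bitwidth == 0)
		resolve_handlers();
	return handlers[CRC32_ETH](data, len);
}

/* Public entry point: always one indirect call through the current table. */
uint32_t crc_calc(const uint8_t *data, uint32_t len, enum crc_type type)
{
	assert(type < CRC_TYPE_COUNT);
	return handlers[type](data, len);
}
```

With this shape the first crc_calc() call pays for the selection plus a
second indirect call, and every later call pays for a single indirect call
through the resolved table. Whether that indirection is measurable would
still need a perf test.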