From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 39FE7A04BC; Fri, 9 Oct 2020 15:55:38 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 8BB9D1D6C8; Fri, 9 Oct 2020 15:52:32 +0200 (CEST) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by dpdk.org (Postfix) with ESMTP id 7744E1D68D for ; Fri, 9 Oct 2020 15:52:28 +0200 (CEST) IronPort-SDR: jfee9W8x6gCtrPCuo9/ASGs5FReAMY9rEbGt+3mQAonxEvsPXeiv/1Uwx9iho0VVDpzysFPVCZ 88gIeve2uUZg== X-IronPort-AV: E=McAfee;i="6000,8403,9768"; a="164696531" X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="164696531" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 06:52:24 -0700 IronPort-SDR: uJSjEuvtspjDgfPymEv0UB3jScnl/ea3+zW/Sb/N5rUUy7UHEzAolJ9tTAbHc2CL4RE8KfqNsF clPP7yO+e4aA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="354854750" Received: from irvmail001.ir.intel.com ([163.33.26.43]) by FMSMGA003.fm.intel.com with ESMTP; 09 Oct 2020 06:52:23 -0700 Received: from sivswdev10.ir.intel.com (sivswdev10.ir.intel.com [10.237.217.4]) by irvmail001.ir.intel.com (8.14.3/8.13.6/MailSET/Hub) with ESMTP id 099DqMPY014781; Fri, 9 Oct 2020 14:52:22 +0100 Received: by sivswdev10.ir.intel.com (Postfix, from userid 28780) id 84BBC1800910; Fri, 9 Oct 2020 14:52:22 +0100 (IST) From: Mairtin o Loingsigh To: jasvinder.singh@intel.com, bruce.richardson@intel.com, pablo.de.lara.guarch@intel.com, konstantin.ananyev@intel.com Cc: dev@dpdk.org, brendan.ryan@intel.com, mairtin.oloingsigh@intel.com, david.coyle@intel.com Date: Fri, 9 Oct 2020 14:50:44 +0100 Message-Id: <20201009135045.8505-2-mairtin.oloingsigh@intel.com> X-Mailer: git-send-email 2.12.3 In-Reply-To: <20201009135045.8505-1-mairtin.oloingsigh@intel.com> References: <20201006162319.7981-1-mairtin.oloingsigh@intel.com> <20201009135045.8505-1-mairtin.oloingsigh@intel.com> Subject: [dpdk-dev] [PATCH v5 1/2] net: add run-time architecture specific CRC selection X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" This patch adds support for run-time selection of the optimal architecture-specific CRC path, based on the supported instruction set(s) of the CPU. The compiler option checks have been moved from the C files to the meson script. The rte_cpu_get_flag_enabled function is called automatically by the library at process initialization time to determine which instructions the CPU supports, with the most optimal supported CRC path ultimately selected. Signed-off-by: Mairtin o Loingsigh Signed-off-by: David Coyle Acked-by: Konstantin Ananyev --- doc/guides/rel_notes/release_20_11.rst | 4 + lib/librte_net/meson.build | 34 ++++++- lib/librte_net/net_crc.h | 34 +++++++ lib/librte_net/{net_crc_neon.h => net_crc_neon.c} | 26 ++--- lib/librte_net/{net_crc_sse.h => net_crc_sse.c} | 34 ++----- lib/librte_net/rte_net_crc.c | 116 +++++++++++++++------- 6 files changed, 168 insertions(+), 80 deletions(-) create mode 100644 lib/librte_net/net_crc.h rename lib/librte_net/{net_crc_neon.h => net_crc_neon.c} (95%) rename lib/librte_net/{net_crc_sse.h => net_crc_sse.c} (94%) diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst index 808bdc4e5..b77297f7e 100644 --- a/doc/guides/rel_notes/release_20_11.rst +++ b/doc/guides/rel_notes/release_20_11.rst @@ -55,6 +55,10 @@ New Features Also, make sure to start the actual text at the margin. ======================================================= +* **Updated CRC modules of rte_net library.** + + * Added run-time selection of the optimal architecture-specific CRC path. + * **Updated Broadcom bnxt driver.** Updated the Broadcom bnxt driver with new features and improvements, including: diff --git a/lib/librte_net/meson.build b/lib/librte_net/meson.build index 24ed8253b..fa439b9e5 100644 --- a/lib/librte_net/meson.build +++ b/lib/librte_net/meson.build @@ -1,5 +1,5 @@ # SPDX-License-Identifier: BSD-3-Clause -# Copyright(c) 2017 Intel Corporation +# Copyright(c) 2017-2020 Intel Corporation headers = files('rte_ip.h', 'rte_tcp.h', @@ -20,3 +20,35 @@ headers = files('rte_ip.h', sources = files('rte_arp.c', 'rte_ether.c', 'rte_net.c', 'rte_net_crc.c') deps += ['mbuf'] + +if dpdk_conf.has('RTE_ARCH_X86_64') + net_crc_sse42_cpu_support = ( + cc.get_define('__PCLMUL__', args: machine_args) != '') + net_crc_sse42_cc_support = ( + cc.has_argument('-mpclmul') and cc.has_argument('-maes')) + + build_static_net_crc_sse42_lib = 0 + + if net_crc_sse42_cpu_support == true + sources += files('net_crc_sse.c') + cflags += ['-DCC_X86_64_SSE42_PCLMULQDQ_SUPPORT'] + elif net_crc_sse42_cc_support == true + build_static_net_crc_sse42_lib = 1 + net_crc_sse42_lib_cflags = ['-mpclmul', '-maes'] + cflags += ['-DCC_X86_64_SSE42_PCLMULQDQ_SUPPORT'] + endif + + if build_static_net_crc_sse42_lib == 1 + net_crc_sse42_lib = static_library( + 'net_crc_sse42_lib', + 'net_crc_sse.c', + dependencies: static_rte_eal, + c_args: [cflags, + net_crc_sse42_lib_cflags]) + objs += net_crc_sse42_lib.extract_objects('net_crc_sse.c') + endif +elif (dpdk_conf.has('RTE_ARCH_ARM64') and + cc.get_define('__ARM_FEATURE_CRYPTO', args: machine_args) != '') + sources += files('net_crc_neon.c') + cflags += ['-DCC_ARM64_NEON_PMULL_SUPPORT'] +endif diff --git a/lib/librte_net/net_crc.h b/lib/librte_net/net_crc.h new file mode 100644 index 000000000..a1578a56c --- /dev/null +++ b/lib/librte_net/net_crc.h @@ -0,0 +1,34 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2020 Intel Corporation + */ + +#ifndef _NET_CRC_H_ +#define _NET_CRC_H_ + +/* + * Different implementations of CRC + */ + +/* SSE4.2 */ + +void +rte_net_crc_sse42_init(void); + +uint32_t +rte_crc16_ccitt_sse42_handler(const uint8_t *data, uint32_t data_len); + +uint32_t +rte_crc32_eth_sse42_handler(const uint8_t *data, uint32_t data_len); + +/* NEON */ + +void +rte_net_crc_neon_init(void); + +uint32_t +rte_crc16_ccitt_neon_handler(const uint8_t *data, uint32_t data_len); + +uint32_t +rte_crc32_eth_neon_handler(const uint8_t *data, uint32_t data_len); + +#endif /* _NET_CRC_H_ */ diff --git a/lib/librte_net/net_crc_neon.h b/lib/librte_net/net_crc_neon.c similarity index 95% rename from lib/librte_net/net_crc_neon.h rename to lib/librte_net/net_crc_neon.c index 63fa1d4a1..f61d75a8c 100644 --- a/lib/librte_net/net_crc_neon.h +++ b/lib/librte_net/net_crc_neon.c @@ -2,17 +2,15 @@ * Copyright(c) 2017 Cavium, Inc */ -#ifndef _NET_CRC_NEON_H_ -#define _NET_CRC_NEON_H_ +#include +#include #include #include #include #include -#ifdef __cplusplus -extern "C" { -#endif +#include "net_crc.h" /** PMULL CRC computation context structure */ struct crc_pmull_ctx { @@ -218,7 +216,7 @@ crc32_eth_calc_pmull( return n; } -static inline void +void rte_net_crc_neon_init(void) { /* Initialize CRC16 data */ @@ -242,9 +240,8 @@ rte_net_crc_neon_init(void) crc32_eth_pmull.rk7_rk8 = vld1q_u64(eth_k7_k8); } -static inline uint32_t -rte_crc16_ccitt_neon_handler(const uint8_t *data, - uint32_t data_len) +uint32_t +rte_crc16_ccitt_neon_handler(const uint8_t *data, uint32_t data_len) { return (uint16_t)~crc32_eth_calc_pmull(data, data_len, @@ -252,18 +249,11 @@ rte_crc16_ccitt_neon_handler(const uint8_t *data, &crc16_ccitt_pmull); } -static inline uint32_t -rte_crc32_eth_neon_handler(const uint8_t *data, - uint32_t data_len) +uint32_t +rte_crc32_eth_neon_handler(const uint8_t *data, uint32_t data_len) { return ~crc32_eth_calc_pmull(data, data_len, 0xffffffffUL, &crc32_eth_pmull); } - -#ifdef __cplusplus -} -#endif - -#endif /* _NET_CRC_NEON_H_ */ diff --git a/lib/librte_net/net_crc_sse.h b/lib/librte_net/net_crc_sse.c similarity index 94% rename from lib/librte_net/net_crc_sse.h rename to lib/librte_net/net_crc_sse.c index 1c7b7a548..053b54b39 100644 --- a/lib/librte_net/net_crc_sse.h +++ b/lib/librte_net/net_crc_sse.c @@ -1,18 +1,16 @@ /* SPDX-License-Identifier: BSD-3-Clause - * Copyright(c) 2017 Intel Corporation + * Copyright(c) 2017-2020 Intel Corporation */ -#ifndef _RTE_NET_CRC_SSE_H_ -#define _RTE_NET_CRC_SSE_H_ +#include +#include #include +#include -#include -#include +#include "net_crc.h" -#ifdef __cplusplus -extern "C" { -#endif +#include /** PCLMULQDQ CRC computation context structure */ struct crc_pclmulqdq_ctx { @@ -259,8 +257,7 @@ crc32_eth_calc_pclmulqdq( return n; } - -static inline void +void rte_net_crc_sse42_init(void) { uint64_t k1, k2, k5, k6; @@ -303,12 +300,10 @@ rte_net_crc_sse42_init(void) * use other data types such as float, double, etc. */ _mm_empty(); - } -static inline uint32_t -rte_crc16_ccitt_sse42_handler(const uint8_t *data, - uint32_t data_len) +uint32_t +rte_crc16_ccitt_sse42_handler(const uint8_t *data, uint32_t data_len) { /** return 16-bit CRC value */ return (uint16_t)~crc32_eth_calc_pclmulqdq(data, @@ -317,18 +312,11 @@ rte_crc16_ccitt_sse42_handler(const uint8_t *data, &crc16_ccitt_pclmulqdq); } -static inline uint32_t -rte_crc32_eth_sse42_handler(const uint8_t *data, - uint32_t data_len) +uint32_t +rte_crc32_eth_sse42_handler(const uint8_t *data, uint32_t data_len) { return ~crc32_eth_calc_pclmulqdq(data, data_len, 0xffffffffUL, &crc32_eth_pclmulqdq); } - -#ifdef __cplusplus -} -#endif - -#endif /* _RTE_NET_CRC_SSE_H_ */ diff --git a/lib/librte_net/rte_net_crc.c b/lib/librte_net/rte_net_crc.c index 4f5b9e828..d271d5205 100644 --- a/lib/librte_net/rte_net_crc.c +++ b/lib/librte_net/rte_net_crc.c @@ -1,5 +1,5 @@ /* SPDX-License-Identifier: BSD-3-Clause - * Copyright(c) 2017 Intel Corporation + * Copyright(c) 2017-2020 Intel Corporation */ #include @@ -10,17 +10,7 @@ #include #include -#if defined(RTE_ARCH_X86_64) && defined(__PCLMUL__) -#define X86_64_SSE42_PCLMULQDQ 1 -#elif defined(RTE_ARCH_ARM64) && defined(__ARM_FEATURE_CRYPTO) -#define ARM64_NEON_PMULL 1 -#endif - -#ifdef X86_64_SSE42_PCLMULQDQ -#include -#elif defined ARM64_NEON_PMULL -#include -#endif +#include "net_crc.h" /** CRC polynomials */ #define CRC32_ETH_POLYNOMIAL 0x04c11db7UL @@ -41,25 +31,27 @@ rte_crc32_eth_handler(const uint8_t *data, uint32_t data_len); typedef uint32_t (*rte_net_crc_handler)(const uint8_t *data, uint32_t data_len); -static rte_net_crc_handler *handlers; +static const rte_net_crc_handler *handlers; -static rte_net_crc_handler handlers_scalar[] = { +static const rte_net_crc_handler handlers_scalar[] = { [RTE_NET_CRC16_CCITT] = rte_crc16_ccitt_handler, [RTE_NET_CRC32_ETH] = rte_crc32_eth_handler, }; - -#ifdef X86_64_SSE42_PCLMULQDQ -static rte_net_crc_handler handlers_sse42[] = { +#ifdef CC_X86_64_SSE42_PCLMULQDQ_SUPPORT +static const rte_net_crc_handler handlers_sse42[] = { [RTE_NET_CRC16_CCITT] = rte_crc16_ccitt_sse42_handler, [RTE_NET_CRC32_ETH] = rte_crc32_eth_sse42_handler, }; -#elif defined ARM64_NEON_PMULL -static rte_net_crc_handler handlers_neon[] = { +#endif +#ifdef CC_ARM64_NEON_PMULL_SUPPORT +static const rte_net_crc_handler handlers_neon[] = { [RTE_NET_CRC16_CCITT] = rte_crc16_ccitt_neon_handler, [RTE_NET_CRC32_ETH] = rte_crc32_eth_neon_handler, }; #endif +/* Scalar handling */ + /** * Reflect the bits about the middle * @@ -142,29 +134,82 @@ rte_crc32_eth_handler(const uint8_t *data, uint32_t data_len) crc32_eth_lut); } +/* SSE4.2/PCLMULQDQ handling */ + +#define SSE42_PCLMULQDQ_CPU_SUPPORTED \ + rte_cpu_get_flag_enabled(RTE_CPUFLAG_PCLMULQDQ) + +static const rte_net_crc_handler * +sse42_pclmulqdq_get_handlers(void) +{ +#ifdef CC_X86_64_SSE42_PCLMULQDQ_SUPPORT + if (SSE42_PCLMULQDQ_CPU_SUPPORTED) + return handlers_sse42; +#endif + return NULL; +} + +static uint8_t +sse42_pclmulqdq_init(void) +{ +#ifdef CC_X86_64_SSE42_PCLMULQDQ_SUPPORT + if (SSE42_PCLMULQDQ_CPU_SUPPORTED) { + rte_net_crc_sse42_init(); + return 1; + } +#endif + return 0; +} + +/* NEON/PMULL handling */ + +#define NEON_PMULL_CPU_SUPPORTED \ + rte_cpu_get_flag_enabled(RTE_CPUFLAG_PMULL) + +static const rte_net_crc_handler * +neon_pmull_get_handlers(void) +{ +#ifdef CC_ARM64_NEON_PMULL_SUPPORT + if (NEON_PMULL_CPU_SUPPORTED) + return handlers_neon; +#endif + return NULL; +} + +static uint8_t +neon_pmull_init(void) +{ +#ifdef CC_ARM64_NEON_PMULL_SUPPORT + if (NEON_PMULL_CPU_SUPPORTED) { + rte_net_crc_neon_init(); + return 1; + } +#endif + return 0; +} + +/* Public API */ + void rte_net_crc_set_alg(enum rte_net_crc_alg alg) { + handlers = NULL; + switch (alg) { -#ifdef X86_64_SSE42_PCLMULQDQ case RTE_NET_CRC_SSE42: - handlers = handlers_sse42; - break; -#elif defined ARM64_NEON_PMULL - /* fall-through */ + handlers = sse42_pclmulqdq_get_handlers(); + break; /* for x86, always break here */ case RTE_NET_CRC_NEON: - if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_PMULL)) { - handlers = handlers_neon; - break; - } -#endif + handlers = neon_pmull_get_handlers(); /* fall-through */ case RTE_NET_CRC_SCALAR: /* fall-through */ default: - handlers = handlers_scalar; break; } + + if (handlers == NULL) + handlers = handlers_scalar; } uint32_t @@ -188,15 +233,10 @@ RTE_INIT(rte_net_crc_init) rte_net_crc_scalar_init(); -#ifdef X86_64_SSE42_PCLMULQDQ - alg = RTE_NET_CRC_SSE42; - rte_net_crc_sse42_init(); -#elif defined ARM64_NEON_PMULL - if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_PMULL)) { + if (sse42_pclmulqdq_init()) + alg = RTE_NET_CRC_SSE42; + if (neon_pmull_init()) alg = RTE_NET_CRC_NEON; - rte_net_crc_neon_init(); - } -#endif rte_net_crc_set_alg(alg); } -- 2.12.3