From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 2B27745614; Fri, 12 Jul 2024 17:47:10 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 3865B4066F; Fri, 12 Jul 2024 17:46:56 +0200 (CEST) Received: from mail-lj1-f171.google.com (mail-lj1-f171.google.com [209.85.208.171]) by mails.dpdk.org (Postfix) with ESMTP id 7CEB140666 for ; Fri, 12 Jul 2024 17:46:53 +0200 (CEST) Received: by mail-lj1-f171.google.com with SMTP id 38308e7fff4ca-2eede942e04so976521fa.3 for ; Fri, 12 Jul 2024 08:46:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1720799213; x=1721404013; darn=dpdk.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=GZz+VozUktku38s5C19MboKoiADUTAfocexFR02Zrdc=; b=CSiWQSbCRx7NwtQSG/EnD3j9HNTxhpxyBpd3RnkoYhtK8T+7aXGMQTgA5tGbFEfHV9 I+VhatZptfAiQPS7jBqWwiZh4BJvXV76VdwQaUiYiDm8debDE9iWWxEKLNBoIP+Lj9w3 /z2viDeO1SlJrlMSAU3Fui5p3z2wijyUIRRf8pU47WTy2qJTjytu6hixJDB/b0qDU876 mWYNkUzeQxeE85Bkf4NMlQbhtxMecetAY+dStroQG6wvFhX+FfSCyVlDo0eFoMln7dST RLccpXzXCY12+XQwipu+M2eszOSAd/eils97my7sbLmdPGe8+bOzSAqY8P5xKtj9IuUA lizw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1720799213; x=1721404013; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=GZz+VozUktku38s5C19MboKoiADUTAfocexFR02Zrdc=; b=uOdGmNcs+Izpe+c36lKrBsSWoNWUd2KMsk4/p20uq5/v06AmwII2Q6LYeqJ6Qt14LZ HIVXNDwRIAqcuVoRntM9Arwgwn5WqWrSF3GAHrMdZaz7AtMJkSjauJC4Cn3qxmakq5wo 7yw2s5gn0FuMOlHqEips4tIPoc/6gzU4z+3isxJTvhw+YPrfpU9L9PnR2AtRV8OUyssJ IiapU1OiWwANv1iMPEyVPWn+/NjkKgTelbUWvJGnmsFvXBXn9EUvLaDclS0Oh3oSfD8R xuOGdd4CypXPySweNh+RIphoV7TD0kfftJASJ9et5jk2Elx2BQh/Fa9Mk1WZFygW8Eg1 cgoQ== X-Gm-Message-State: AOJu0YzESL/3Gx9cC5JNyF8wpEsmY3Z6keoeQGBm7MKwUZaVLzwgLeUR 4tsW5WHsgvQeVgm09uOVcA2/eRUBRzufnPgwlPHJaosV8MIdN9hmNSmOuUK1nOo= X-Google-Smtp-Source: AGHT+IHAnYaRtp8OZnCi1gp01v/mDMmHPZG73qDkJbcXREskbtA+pgpFlSQjDGQBmHsoExJGhOaZLQ== X-Received: by 2002:a2e:81c7:0:b0:2ee:8d04:7689 with SMTP id 38308e7fff4ca-2eeb30e5f12mr76255501fa.20.1720799212547; Fri, 12 Jul 2024 08:46:52 -0700 (PDT) Received: from C02FF2N1MD6T.bytedance.net (ec2-3-9-240-80.eu-west-2.compute.amazonaws.com. [3.9.240.80]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-367cde7e023sm10468615f8f.13.2024.07.12.08.46.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 12 Jul 2024 08:46:52 -0700 (PDT) From: Daniel Gregory To: Stanislaw Kardach , Thomas Monjalon , Yipeng Wang , Sameh Gobriel , Bruce Richardson , Vladimir Medvedkin Cc: dev@dpdk.org, Punit Agrawal , Liang Ma , Pengcheng Wang , Chunsong Feng , Daniel Gregory Subject: [PATCH v2 2/9] hash: implement crc using riscv carryless multiply Date: Fri, 12 Jul 2024 16:46:38 +0100 Message-Id: <20240712154645.80622-3-daniel.gregory@bytedance.com> X-Mailer: git-send-email 2.39.3 (Apple Git-146) In-Reply-To: <20240712154645.80622-1-daniel.gregory@bytedance.com> References: <20240618174133.33457-1-daniel.gregory@bytedance.com> <20240712154645.80622-1-daniel.gregory@bytedance.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Using carryless multiply instructions from RISC-V's Zbc extension, implement a Barrett reduction that calculates CRC-32C checksums. Based on the approach described by Intel's whitepaper on "Fast CRC Computation for Generic Polynomials Using PCLMULQDQ Instruction", which is also described here (https://web.archive.org/web/20240111232520/https://mary.rs/lab/crc32/) Add a case to the autotest_hash unit test. Signed-off-by: Daniel Gregory --- MAINTAINERS | 1 + app/test/test_hash.c | 7 +++ lib/hash/meson.build | 1 + lib/hash/rte_crc_riscv64.h | 89 ++++++++++++++++++++++++++++++++++++++ lib/hash/rte_hash_crc.c | 13 +++++- lib/hash/rte_hash_crc.h | 6 ++- 6 files changed, 115 insertions(+), 2 deletions(-) create mode 100644 lib/hash/rte_crc_riscv64.h diff --git a/MAINTAINERS b/MAINTAINERS index 533f707d5f..81f13ebcf2 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -318,6 +318,7 @@ M: Stanislaw Kardach F: config/riscv/ F: doc/guides/linux_gsg/cross_build_dpdk_for_riscv.rst F: lib/eal/riscv/ +F: lib/hash/rte_crc_riscv64.h Intel x86 M: Bruce Richardson diff --git a/app/test/test_hash.c b/app/test/test_hash.c index 24d3b547ad..c8c4197ad8 100644 --- a/app/test/test_hash.c +++ b/app/test/test_hash.c @@ -205,6 +205,13 @@ test_crc32_hash_alg_equiv(void) printf("Failed checking CRC32_SW against CRC32_ARM64\n"); break; } + + /* Check against 8-byte-operand RISCV64 CRC32 if available */ + rte_hash_crc_set_alg(CRC32_RISCV64); + if (hash_val != rte_hash_crc(data64, data_len, init_val)) { + printf("Failed checking CRC32_SW against CRC32_RISC64\n"); + break; + } } /* Resetting to best available algorithm */ diff --git a/lib/hash/meson.build b/lib/hash/meson.build index 277eb9fa93..8355869a80 100644 --- a/lib/hash/meson.build +++ b/lib/hash/meson.build @@ -12,6 +12,7 @@ headers = files( indirect_headers += files( 'rte_crc_arm64.h', 'rte_crc_generic.h', + 'rte_crc_riscv64.h', 'rte_crc_sw.h', 'rte_crc_x86.h', 'rte_thash_x86_gfni.h', diff --git a/lib/hash/rte_crc_riscv64.h b/lib/hash/rte_crc_riscv64.h new file mode 100644 index 0000000000..94f6857c69 --- /dev/null +++ b/lib/hash/rte_crc_riscv64.h @@ -0,0 +1,89 @@ +/* SPDX-License_Identifier: BSD-3-Clause + * Copyright(c) ByteDance 2024 + */ + +#include +#include + +#include + +#ifndef _RTE_CRC_RISCV64_H_ +#define _RTE_CRC_RISCV64_H_ + +/* + * CRC-32C takes a reflected input (bit 7 is the lsb) and produces a reflected + * output. As reflecting the value we're checksumming is expensive, we instead + * reflect the polynomial P (0x11EDC6F41) and mu and our CRC32 algorithm. + * + * The mu constant is used for a Barrett reduction. It's 2^96 / P (0x11F91CAF6) + * reflected. Picking 2^96 rather than 2^64 means we can calculate a 64-bit crc + * using only two multiplications (https://mary.rs/lab/crc32/) + */ +static const uint64_t p = 0x105EC76F1; +static const uint64_t mu = 0x4869EC38DEA713F1UL; + +/* Calculate the CRC32C checksum using a Barrett reduction */ +static inline uint32_t +crc32c_riscv64(uint64_t data, uint32_t init_val, uint32_t bits) +{ + assert((bits == 64) || (bits == 32) || (bits == 16) || (bits == 8)); + + /* Combine data with the initial value */ + uint64_t crc = (uint64_t)(data ^ init_val) << (64 - bits); + + /* + * Multiply by mu, which is 2^96 / P. Division by 2^96 occurs by taking + * the lower 64 bits of the result (remember we're inverted) + */ + crc = __riscv_clmul_64(crc, mu); + /* Multiply by P */ + crc = __riscv_clmulh_64(crc, p); + + /* Subtract from original (only needed for smaller sizes) */ + if (bits == 16 || bits == 8) + crc ^= init_val >> bits; + + return crc; +} + +/* + * Use carryless multiply to perform hash on a value, falling back on the + * software in case the Zbc extension is not supported + */ +static inline uint32_t +rte_hash_crc_1byte(uint8_t data, uint32_t init_val) +{ + if (likely(rte_hash_crc32_alg & CRC32_RISCV64)) + return crc32c_riscv64(data, init_val, 8); + + return crc32c_1byte(data, init_val); +} + +static inline uint32_t +rte_hash_crc_2byte(uint16_t data, uint32_t init_val) +{ + if (likely(rte_hash_crc32_alg & CRC32_RISCV64)) + return crc32c_riscv64(data, init_val, 16); + + return crc32c_2bytes(data, init_val); +} + +static inline uint32_t +rte_hash_crc_4byte(uint32_t data, uint32_t init_val) +{ + if (likely(rte_hash_crc32_alg & CRC32_RISCV64)) + return crc32c_riscv64(data, init_val, 32); + + return crc32c_1word(data, init_val); +} + +static inline uint32_t +rte_hash_crc_8byte(uint64_t data, uint32_t init_val) +{ + if (likely(rte_hash_crc32_alg & CRC32_RISCV64)) + return crc32c_riscv64(data, init_val, 64); + + return crc32c_2words(data, init_val); +} + +#endif /* _RTE_CRC_RISCV64_H_ */ diff --git a/lib/hash/rte_hash_crc.c b/lib/hash/rte_hash_crc.c index c037cdb0f0..3eb696a576 100644 --- a/lib/hash/rte_hash_crc.c +++ b/lib/hash/rte_hash_crc.c @@ -15,7 +15,7 @@ RTE_LOG_REGISTER_SUFFIX(hash_crc_logtype, crc, INFO); uint8_t rte_hash_crc32_alg = CRC32_SW; /** - * Allow or disallow use of SSE4.2/ARMv8 intrinsics for CRC32 hash + * Allow or disallow use of SSE4.2/ARMv8/RISC-V intrinsics for CRC32 hash * calculation. * * @param alg @@ -24,6 +24,7 @@ uint8_t rte_hash_crc32_alg = CRC32_SW; * - (CRC32_SSE42) Use SSE4.2 intrinsics if available * - (CRC32_SSE42_x64) Use 64-bit SSE4.2 intrinsic if available (default x86) * - (CRC32_ARM64) Use ARMv8 CRC intrinsic if available (default ARMv8) + * - (CRC32_RISCV64) Use RISCV64 Zbc extension if available * */ void @@ -52,6 +53,14 @@ rte_hash_crc_set_alg(uint8_t alg) rte_hash_crc32_alg = CRC32_ARM64; #endif +#if defined(RTE_ARCH_RISCV) && defined(RTE_RISCV_FEATURE_ZBC) + if (!(alg & CRC32_RISCV64)) + HASH_CRC_LOG(WARNING, + "Unsupported CRC32 algorithm requested using CRC32_RISCV64"); + if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_RISCV_EXT_ZBC)) + rte_hash_crc32_alg = CRC32_RISCV64; +#endif + if (rte_hash_crc32_alg == CRC32_SW) HASH_CRC_LOG(WARNING, "Unsupported CRC32 algorithm requested using CRC32_SW"); @@ -64,6 +73,8 @@ RTE_INIT(rte_hash_crc_init_alg) rte_hash_crc_set_alg(CRC32_SSE42_x64); #elif defined(RTE_ARCH_ARM64) && defined(__ARM_FEATURE_CRC32) rte_hash_crc_set_alg(CRC32_ARM64); +#elif defined(RTE_ARCH_RISCV) && defined(RTE_RISCV_FEATURE_ZBC) + rte_hash_crc_set_alg(CRC32_RISCV64); #else rte_hash_crc_set_alg(CRC32_SW); #endif diff --git a/lib/hash/rte_hash_crc.h b/lib/hash/rte_hash_crc.h index 8ad2422ec3..034ce1f8b4 100644 --- a/lib/hash/rte_hash_crc.h +++ b/lib/hash/rte_hash_crc.h @@ -28,6 +28,7 @@ extern "C" { #define CRC32_x64 (1U << 2) #define CRC32_SSE42_x64 (CRC32_x64|CRC32_SSE42) #define CRC32_ARM64 (1U << 3) +#define CRC32_RISCV64 (1U << 4) extern uint8_t rte_hash_crc32_alg; @@ -35,12 +36,14 @@ extern uint8_t rte_hash_crc32_alg; #include "rte_crc_arm64.h" #elif defined(RTE_ARCH_X86) #include "rte_crc_x86.h" +#elif defined(RTE_ARCH_RISCV) && defined(RTE_RISCV_FEATURE_ZBC) +#include "rte_crc_riscv64.h" #else #include "rte_crc_generic.h" #endif /** - * Allow or disallow use of SSE4.2/ARMv8 intrinsics for CRC32 hash + * Allow or disallow use of SSE4.2/ARMv8/RISC-V intrinsics for CRC32 hash * calculation. * * @param alg @@ -49,6 +52,7 @@ extern uint8_t rte_hash_crc32_alg; * - (CRC32_SSE42) Use SSE4.2 intrinsics if available * - (CRC32_SSE42_x64) Use 64-bit SSE4.2 intrinsic if available (default x86) * - (CRC32_ARM64) Use ARMv8 CRC intrinsic if available (default ARMv8) + * - (CRC32_RISCV64) Use RISC-V Carry-less multiply if available (default rv64gc_zbc) */ void rte_hash_crc_set_alg(uint8_t alg); -- 2.39.2