From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id D1F9445879; Tue, 27 Aug 2024 17:32:59 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id D2B0940E43; Tue, 27 Aug 2024 17:32:48 +0200 (CEST) Received: from mail-lf1-f53.google.com (mail-lf1-f53.google.com [209.85.167.53]) by mails.dpdk.org (Postfix) with ESMTP id 6670940E48 for ; Tue, 27 Aug 2024 17:32:45 +0200 (CEST) Received: by mail-lf1-f53.google.com with SMTP id 2adb3069b0e04-53349d3071eso6893691e87.2 for ; Tue, 27 Aug 2024 08:32:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1724772765; x=1725377565; darn=dpdk.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=3f8T6qrehfwP8viYHX1d9WqFe0OOy2QZ/GYogd7cK2Q=; b=KObziGS9vNyM8Y+mU8p9dTvsb+ZmcDqp3DAYE9j3LBtTukjCSnOgX5KoZ5V+ogo1Pm aaCxm2BIWzHkvglaco5cwDM+G+wpf+KTTSws1hBgVQiM7Kn5KvtDIL5Ky/CIwfoMz9UU s9eTQ2XfYbOMXZXgOdbnheCOXINuX6AaxW9g0Cu339QT0uUMe2y/1cSoeVBYGHBI42g5 ni6FKL5xtYxYcpBGq6iYRGntMsPgNueSl+5w/vXJPmK824Co5f8PM84j1loyOlC5wSYd OPzq/icM+KdoDTTzGQeJBwTw4WAC+2g1M0lf0hP+b9lC83gdPvgiMnkUwBPUOqywanbU OuNw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1724772765; x=1725377565; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=3f8T6qrehfwP8viYHX1d9WqFe0OOy2QZ/GYogd7cK2Q=; b=mGE8Hs1AGnI9CB6bGnGMpR6b+dclYIuwGto53oaejRwnS/98mfSxriKzR8s/OL5Xtg uKpVb/glwb/NdWDi80UjnAO6CYtpGQaxJ8WB49wptOWYX6etoSt3RJYPoLKnPwRZ0lqh y1QlbtGmf/7LWrjdlNWIkjGlVfZPtr3lGVbD0C+8/+zF1b2at7zI/7tRi0pIda1ecbox W66xUHHGetHTqKVEZ/oEjgpiJCe4FmljAtc4tKiR8LMeMez+Nxjwy2ENEute4lPU0456 FnOtquWxMROQMetX3dSJqZcnO/GquCQpJy22fxdUbh7/ZfPI++T9GKfxkOKkUk7IT7di AktQ== X-Gm-Message-State: AOJu0YxjvhRKATD/tyq+kHVHDp6vGA4OeNxayx67TK34nw3NWnAEB9Dj spXMcKrbM2edP/5HUMe41kzAIJ5mM6n4tymEf5gIoc3eDnwzKDY3b0eXrmWZRHM= X-Google-Smtp-Source: AGHT+IFpWB8gaBiqOeYW92B5i5g6TarVPhM2NwlAXtttCF/j1LS6evueAFFu8Yan4a/E3rA2ZZH2tg== X-Received: by 2002:a05:6512:3a94:b0:52e:9619:e26a with SMTP id 2adb3069b0e04-53438784996mr8811656e87.26.1724772764560; Tue, 27 Aug 2024 08:32:44 -0700 (PDT) Received: from C02FF2N1MD6T.bytedance.net ([93.115.195.2]) by smtp.gmail.com with ESMTPSA id a640c23a62f3a-a86e582da41sm122323066b.109.2024.08.27.08.32.43 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 27 Aug 2024 08:32:44 -0700 (PDT) From: Daniel Gregory To: Stanislaw Kardach , Thomas Monjalon , Yipeng Wang , Sameh Gobriel , Bruce Richardson , Vladimir Medvedkin Cc: dev@dpdk.org, Punit Agrawal , Liang Ma , Pengcheng Wang , Chunsong Feng , Daniel Gregory Subject: [PATCH v3 2/9] hash: implement CRC using riscv carryless multiply Date: Tue, 27 Aug 2024 16:32:22 +0100 Message-Id: <20240827153230.52880-3-daniel.gregory@bytedance.com> X-Mailer: git-send-email 2.39.3 (Apple Git-146) In-Reply-To: <20240827153230.52880-1-daniel.gregory@bytedance.com> References: <20240712154645.80622-1-daniel.gregory@bytedance.com> <20240827153230.52880-1-daniel.gregory@bytedance.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Using carryless multiply instructions from RISC-V's Zbc extension, implement a Barrett reduction that calculates CRC-32C checksums. Based on the approach described by Intel's whitepaper on "Fast CRC Computation for Generic Polynomials Using PCLMULQDQ Instruction", which is also described here (https://web.archive.org/web/20240111232520/https://mary.rs/lab/crc32/) Add a case to the autotest_hash unit test. Signed-off-by: Daniel Gregory --- MAINTAINERS | 1 + app/test/test_hash.c | 7 +++ lib/hash/meson.build | 1 + lib/hash/rte_crc_riscv64.h | 89 ++++++++++++++++++++++++++++++++++++++ lib/hash/rte_hash_crc.c | 13 +++++- lib/hash/rte_hash_crc.h | 6 ++- 6 files changed, 115 insertions(+), 2 deletions(-) create mode 100644 lib/hash/rte_crc_riscv64.h diff --git a/MAINTAINERS b/MAINTAINERS index c5a703b5c0..fa081552c7 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -322,6 +322,7 @@ M: Stanislaw Kardach F: config/riscv/ F: doc/guides/linux_gsg/cross_build_dpdk_for_riscv.rst F: lib/eal/riscv/ +F: lib/hash/rte_crc_riscv64.h Intel x86 M: Bruce Richardson diff --git a/app/test/test_hash.c b/app/test/test_hash.c index 65b9cad93c..dd491ea4d9 100644 --- a/app/test/test_hash.c +++ b/app/test/test_hash.c @@ -231,6 +231,13 @@ test_crc32_hash_alg_equiv(void) printf("Failed checking CRC32_SW against CRC32_ARM64\n"); break; } + + /* Check against 8-byte-operand RISCV64 CRC32 if available */ + rte_hash_crc_set_alg(CRC32_RISCV64); + if (hash_val != rte_hash_crc(data64, data_len, init_val)) { + printf("Failed checking CRC32_SW against CRC32_RISC64\n"); + break; + } } /* Resetting to best available algorithm */ diff --git a/lib/hash/meson.build b/lib/hash/meson.build index 277eb9fa93..8355869a80 100644 --- a/lib/hash/meson.build +++ b/lib/hash/meson.build @@ -12,6 +12,7 @@ headers = files( indirect_headers += files( 'rte_crc_arm64.h', 'rte_crc_generic.h', + 'rte_crc_riscv64.h', 'rte_crc_sw.h', 'rte_crc_x86.h', 'rte_thash_x86_gfni.h', diff --git a/lib/hash/rte_crc_riscv64.h b/lib/hash/rte_crc_riscv64.h new file mode 100644 index 0000000000..94f6857c69 --- /dev/null +++ b/lib/hash/rte_crc_riscv64.h @@ -0,0 +1,89 @@ +/* SPDX-License_Identifier: BSD-3-Clause + * Copyright(c) ByteDance 2024 + */ + +#include +#include + +#include + +#ifndef _RTE_CRC_RISCV64_H_ +#define _RTE_CRC_RISCV64_H_ + +/* + * CRC-32C takes a reflected input (bit 7 is the lsb) and produces a reflected + * output. As reflecting the value we're checksumming is expensive, we instead + * reflect the polynomial P (0x11EDC6F41) and mu and our CRC32 algorithm. + * + * The mu constant is used for a Barrett reduction. It's 2^96 / P (0x11F91CAF6) + * reflected. Picking 2^96 rather than 2^64 means we can calculate a 64-bit crc + * using only two multiplications (https://mary.rs/lab/crc32/) + */ +static const uint64_t p = 0x105EC76F1; +static const uint64_t mu = 0x4869EC38DEA713F1UL; + +/* Calculate the CRC32C checksum using a Barrett reduction */ +static inline uint32_t +crc32c_riscv64(uint64_t data, uint32_t init_val, uint32_t bits) +{ + assert((bits == 64) || (bits == 32) || (bits == 16) || (bits == 8)); + + /* Combine data with the initial value */ + uint64_t crc = (uint64_t)(data ^ init_val) << (64 - bits); + + /* + * Multiply by mu, which is 2^96 / P. Division by 2^96 occurs by taking + * the lower 64 bits of the result (remember we're inverted) + */ + crc = __riscv_clmul_64(crc, mu); + /* Multiply by P */ + crc = __riscv_clmulh_64(crc, p); + + /* Subtract from original (only needed for smaller sizes) */ + if (bits == 16 || bits == 8) + crc ^= init_val >> bits; + + return crc; +} + +/* + * Use carryless multiply to perform hash on a value, falling back on the + * software in case the Zbc extension is not supported + */ +static inline uint32_t +rte_hash_crc_1byte(uint8_t data, uint32_t init_val) +{ + if (likely(rte_hash_crc32_alg & CRC32_RISCV64)) + return crc32c_riscv64(data, init_val, 8); + + return crc32c_1byte(data, init_val); +} + +static inline uint32_t +rte_hash_crc_2byte(uint16_t data, uint32_t init_val) +{ + if (likely(rte_hash_crc32_alg & CRC32_RISCV64)) + return crc32c_riscv64(data, init_val, 16); + + return crc32c_2bytes(data, init_val); +} + +static inline uint32_t +rte_hash_crc_4byte(uint32_t data, uint32_t init_val) +{ + if (likely(rte_hash_crc32_alg & CRC32_RISCV64)) + return crc32c_riscv64(data, init_val, 32); + + return crc32c_1word(data, init_val); +} + +static inline uint32_t +rte_hash_crc_8byte(uint64_t data, uint32_t init_val) +{ + if (likely(rte_hash_crc32_alg & CRC32_RISCV64)) + return crc32c_riscv64(data, init_val, 64); + + return crc32c_2words(data, init_val); +} + +#endif /* _RTE_CRC_RISCV64_H_ */ diff --git a/lib/hash/rte_hash_crc.c b/lib/hash/rte_hash_crc.c index c037cdb0f0..3eb696a576 100644 --- a/lib/hash/rte_hash_crc.c +++ b/lib/hash/rte_hash_crc.c @@ -15,7 +15,7 @@ RTE_LOG_REGISTER_SUFFIX(hash_crc_logtype, crc, INFO); uint8_t rte_hash_crc32_alg = CRC32_SW; /** - * Allow or disallow use of SSE4.2/ARMv8 intrinsics for CRC32 hash + * Allow or disallow use of SSE4.2/ARMv8/RISC-V intrinsics for CRC32 hash * calculation. * * @param alg @@ -24,6 +24,7 @@ uint8_t rte_hash_crc32_alg = CRC32_SW; * - (CRC32_SSE42) Use SSE4.2 intrinsics if available * - (CRC32_SSE42_x64) Use 64-bit SSE4.2 intrinsic if available (default x86) * - (CRC32_ARM64) Use ARMv8 CRC intrinsic if available (default ARMv8) + * - (CRC32_RISCV64) Use RISCV64 Zbc extension if available * */ void @@ -52,6 +53,14 @@ rte_hash_crc_set_alg(uint8_t alg) rte_hash_crc32_alg = CRC32_ARM64; #endif +#if defined(RTE_ARCH_RISCV) && defined(RTE_RISCV_FEATURE_ZBC) + if (!(alg & CRC32_RISCV64)) + HASH_CRC_LOG(WARNING, + "Unsupported CRC32 algorithm requested using CRC32_RISCV64"); + if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_RISCV_EXT_ZBC)) + rte_hash_crc32_alg = CRC32_RISCV64; +#endif + if (rte_hash_crc32_alg == CRC32_SW) HASH_CRC_LOG(WARNING, "Unsupported CRC32 algorithm requested using CRC32_SW"); @@ -64,6 +73,8 @@ RTE_INIT(rte_hash_crc_init_alg) rte_hash_crc_set_alg(CRC32_SSE42_x64); #elif defined(RTE_ARCH_ARM64) && defined(__ARM_FEATURE_CRC32) rte_hash_crc_set_alg(CRC32_ARM64); +#elif defined(RTE_ARCH_RISCV) && defined(RTE_RISCV_FEATURE_ZBC) + rte_hash_crc_set_alg(CRC32_RISCV64); #else rte_hash_crc_set_alg(CRC32_SW); #endif diff --git a/lib/hash/rte_hash_crc.h b/lib/hash/rte_hash_crc.h index 8ad2422ec3..034ce1f8b4 100644 --- a/lib/hash/rte_hash_crc.h +++ b/lib/hash/rte_hash_crc.h @@ -28,6 +28,7 @@ extern "C" { #define CRC32_x64 (1U << 2) #define CRC32_SSE42_x64 (CRC32_x64|CRC32_SSE42) #define CRC32_ARM64 (1U << 3) +#define CRC32_RISCV64 (1U << 4) extern uint8_t rte_hash_crc32_alg; @@ -35,12 +36,14 @@ extern uint8_t rte_hash_crc32_alg; #include "rte_crc_arm64.h" #elif defined(RTE_ARCH_X86) #include "rte_crc_x86.h" +#elif defined(RTE_ARCH_RISCV) && defined(RTE_RISCV_FEATURE_ZBC) +#include "rte_crc_riscv64.h" #else #include "rte_crc_generic.h" #endif /** - * Allow or disallow use of SSE4.2/ARMv8 intrinsics for CRC32 hash + * Allow or disallow use of SSE4.2/ARMv8/RISC-V intrinsics for CRC32 hash * calculation. * * @param alg @@ -49,6 +52,7 @@ extern uint8_t rte_hash_crc32_alg; * - (CRC32_SSE42) Use SSE4.2 intrinsics if available * - (CRC32_SSE42_x64) Use 64-bit SSE4.2 intrinsic if available (default x86) * - (CRC32_ARM64) Use ARMv8 CRC intrinsic if available (default ARMv8) + * - (CRC32_RISCV64) Use RISC-V Carry-less multiply if available (default rv64gc_zbc) */ void rte_hash_crc_set_alg(uint8_t alg); -- 2.39.2