From mboxrd@z Thu Jan  1 00:00:00 1970
From: Paul Szczepanek
To: dev@dpdk.org
Cc: Paul Szczepanek, Honnappa Nagarahalli, Kamalakshitha Aligeri
Subject: [RFC v2 1/2] eal: add pointer compression functions
Date: Wed, 11 Oct 2023 12:43:27 +0000
Message-Id: <20231011124328.4002766-2-paul.szczepanek@arm.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20231011124328.4002766-1-paul.szczepanek@arm.com>
References: <20230927150854.3670391-2-paul.szczepanek@arm.com>
 <20231011124328.4002766-1-paul.szczepanek@arm.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
List-Id: DPDK patches and discussions

Add a new utility header for compressing pointers. The provided
functions can store pointers as 32-bit offsets.

The compression takes advantage of the fact that pointers are usually
located in a limited memory region (like a mempool). We can compress
them by converting them to offsets from a base memory address. Offsets
can be stored in fewer bytes (dictated by the memory region size and
the alignment of the pointers). For example: an 8 byte aligned pointer
which is part of a 32 GB memory pool can be stored in 4 bytes.

Suggested-by: Honnappa Nagarahalli
Signed-off-by: Paul Szczepanek
Signed-off-by: Kamalakshitha Aligeri
Reviewed-by: Honnappa Nagarahalli
---
 .mailmap                           |   1 +
 lib/eal/include/meson.build        |   1 +
 lib/eal/include/rte_ptr_compress.h | 160 +++++++++++++++++++++++++++++
 3 files changed, 162 insertions(+)
 create mode 100644 lib/eal/include/rte_ptr_compress.h

diff --git a/.mailmap b/.mailmap
index 864d33ee46..3f0c9d32f5 100644
--- a/.mailmap
+++ b/.mailmap
@@ -1058,6 +1058,7 @@ Paul Greenwalt
 Paulis Gributs
 Paul Luse
 Paul M Stillwell Jr
+Paul Szczepanek
 Pavan Kumar Linga
 Pavan Nikhilesh
 Pavel Belous
diff --git a/lib/eal/include/meson.build b/lib/eal/include/meson.build
index a0463efac7..17d8373648 100644
--- a/lib/eal/include/meson.build
+++ b/lib/eal/include/meson.build
@@ -36,6 +36,7 @@ headers += files(
         'rte_pci_dev_features.h',
         'rte_per_lcore.h',
         'rte_pflock.h',
+        'rte_ptr_compress.h',
         'rte_random.h',
         'rte_reciprocal.h',
         'rte_seqcount.h',
diff --git a/lib/eal/include/rte_ptr_compress.h b/lib/eal/include/rte_ptr_compress.h
new file mode 100644
index 0000000000..73bde22973
--- /dev/null
+++ b/lib/eal/include/rte_ptr_compress.h
@@ -0,0 +1,160 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2023 Arm Limited
+ */
+
+#ifndef RTE_PTR_COMPRESS_H
+#define RTE_PTR_COMPRESS_H
+
+/**
+ * @file
+ * Pointer compression and decompression.
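+ *
+ * A minimal usage sketch; the object table, base address and the assumption
+ * of 8 byte alignment below are illustrative only, not part of the API:
+ *
+ * @code
+ * void *base;              // start of the region all pointers fall into
+ * void *objs[64];          // pointers to 8 byte aligned objects in that region
+ * uint32_t offsets[64];    // compressed 32-bit representation
+ *
+ * // 8 byte alignment allows 3 bits to be shifted out of every offset
+ * rte_ptr_compress_32(base, objs, offsets, 64, 3);
+ * // ... store or pass around the 32-bit offsets instead of pointers ...
+ * rte_ptr_decompress_32(base, offsets, objs, 64, 3);
+ * @endcode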
+ */
+
+#include <stdint.h>
+#include <inttypes.h>
+
+#include <rte_branch_prediction.h>
+#include <rte_common.h>
+#include <rte_compat.h>
+#include <rte_debug.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Compress pointers into 32-bit offsets from a base pointer.
+ *
+ * @note It is the programmer's responsibility to ensure the resulting offsets
+ * fit into 32 bits. Alignment of the structures pointed to by the pointers
+ * allows bits to be dropped from the offsets; this is controlled by the
+ * bit_shift parameter. It means that if the structures are aligned to 8 bytes
+ * they must be within 32 GB of the base pointer; without such an alignment
+ * guarantee they must be within 4 GB.
+ *
+ * @param ptr_base
+ *   A pointer used to calculate offsets of pointers in src_table.
+ * @param src_table
+ *   A pointer to an array of pointers.
+ * @param dest_table
+ *   A pointer to an array of compressed pointers returned by this function.
+ * @param n
+ *   The number of objects to compress, must be strictly positive.
+ * @param bit_shift
+ *   Byte alignment of the memory pointed to by the pointers allows bits
+ *   to be dropped from the offsets and hence widens the memory region that
+ *   can be covered. This controls how many bits the offsets are shifted
+ *   right by.
+ **/
+static __rte_always_inline void
+rte_ptr_compress_32(void *ptr_base, void **src_table,
+	uint32_t *dest_table, unsigned int n, unsigned int bit_shift)
+{
+	unsigned int i = 0;
+#if defined RTE_HAS_SVE_ACLE
+	svuint64_t v_src_table;
+	svuint64_t v_dest_table;
+	svbool_t pg = svwhilelt_b64(i, n);
+	do {
+		v_src_table = svld1_u64(pg, (uint64_t *)src_table + i);
+		v_dest_table = svsub_x(pg, v_src_table, (uint64_t)ptr_base);
+		v_dest_table = svlsr_x(pg, v_dest_table, bit_shift);
+		svst1w(pg, &dest_table[i], v_dest_table);
+		i += svcntd();
+		pg = svwhilelt_b64(i, n);
+	} while (svptest_any(svptrue_b64(), pg));
+#elif defined __ARM_NEON
+	uint64_t ptr_diff;
+	uint64x2_t v_src_table;
+	uint64x2_t v_dest_table;
+	/* right shift is done by left shifting by a negative int */
+	int64x2_t v_shift = vdupq_n_s64(-bit_shift);
+	uint64x2_t v_ptr_base = vdupq_n_u64((uint64_t)ptr_base);
+	for (; i < (n & ~0x1); i += 2) {
+		v_src_table = vld1q_u64((const uint64_t *)src_table + i);
+		v_dest_table = vsubq_u64(v_src_table, v_ptr_base);
+		v_dest_table = vshlq_u64(v_dest_table, v_shift);
+		vst1_u32(dest_table + i, vqmovn_u64(v_dest_table));
+	}
+	/* process the leftover single item if n is odd */
+	if (unlikely(n & 0x1)) {
+		ptr_diff = RTE_PTR_DIFF(src_table[i], ptr_base);
+		dest_table[i] = (uint32_t) (ptr_diff >> bit_shift);
+	}
+#else
+	uintptr_t ptr_diff;
+	for (; i < n; i++) {
+		ptr_diff = RTE_PTR_DIFF(src_table[i], ptr_base);
+		/* save extra bits that are redundant due to alignment */
+		ptr_diff = ptr_diff >> bit_shift;
+		/* make sure no truncation will happen when casting */
+		RTE_ASSERT(ptr_diff <= UINT32_MAX);
+		dest_table[i] = (uint32_t) ptr_diff;
+	}
+#endif
+}
+
+/**
+ * Decompress pointers from 32-bit offsets from a base pointer.
+ *
+ * @param ptr_base
+ *   A pointer which was used to calculate offsets in src_table.
+ * @param src_table
+ *   A pointer to an array of compressed pointers.
+ * @param dest_table
+ *   A pointer to an array of decompressed pointers returned by this function.
+ * @param n
+ *   The number of objects to decompress, must be strictly positive.
+ * @param bit_shift
+ *   Byte alignment of the memory pointed to by the pointers allows bits
+ *   to be dropped from the offsets and hence widens the memory region that
+ *   can be covered. This controls how many bits the offsets are shifted
+ *   left by when pointers are recovered from them.
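+ *   For example, with bit_shift == 3 (objects aligned to 8 bytes) a stored
+ *   offset of 5 recovers the pointer ptr_base + (5 << 3) = ptr_base + 40.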
+ **/
+static __rte_always_inline void
+rte_ptr_decompress_32(void *ptr_base, uint32_t *src_table,
+	void **dest_table, unsigned int n, unsigned int bit_shift)
+{
+	unsigned int i = 0;
+#if defined RTE_HAS_SVE_ACLE
+	svuint64_t v_src_table;
+	svuint64_t v_dest_table;
+	svbool_t pg = svwhilelt_b64(i, n);
+	do {
+		v_src_table = svld1uw_u64(pg, &src_table[i]);
+		v_src_table = svlsl_x(pg, v_src_table, bit_shift);
+		v_dest_table = svadd_x(pg, v_src_table, (uint64_t)ptr_base);
+		svst1(pg, (uint64_t *)dest_table + i, v_dest_table);
+		i += svcntd();
+		pg = svwhilelt_b64(i, n);
+	} while (svptest_any(svptrue_b64(), pg));
+#elif defined __ARM_NEON
+	uint64_t ptr_diff;
+	uint64x2_t v_src_table;
+	uint64x2_t v_dest_table;
+	int64x2_t v_shift = vdupq_n_s64(bit_shift);
+	uint64x2_t v_ptr_base = vdupq_n_u64((uint64_t)ptr_base);
+	for (; i < (n & ~0x1); i += 2) {
+		v_src_table = vmovl_u32(vld1_u32(src_table + i));
+		/* shift the loaded offsets back up by bit_shift */
+		v_src_table = vshlq_u64(v_src_table, v_shift);
+		v_dest_table = vaddq_u64(v_src_table, v_ptr_base);
+		vst1q_u64((uint64_t *)dest_table + i, v_dest_table);
+	}
+	/* process the leftover single item if n is odd */
+	if (unlikely(n & 0x1)) {
+		ptr_diff = ((uint64_t) src_table[i]) << bit_shift;
+		dest_table[i] = RTE_PTR_ADD(ptr_base, ptr_diff);
+	}
+#else
+	uintptr_t ptr_diff;
+	for (; i < n; i++) {
+		ptr_diff = ((uintptr_t) src_table[i]) << bit_shift;
+		dest_table[i] = RTE_PTR_ADD(ptr_base, ptr_diff);
+	}
+#endif
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_PTR_COMPRESS_H */
--
2.25.1