From: Konstantin Ananyev
To: Paul Szczepanek, "dev@dpdk.org"
CC: "mb@smartsharesystems.com", "Honnappa Nagarahalli", Kamalakshitha Aligeri, Nathan Brown, "Jack Bond-Preston"
Subject: RE: [PATCH v11 3/6] ptr_compress: add pointer compression library
Date: Thu, 6 Jun 2024 13:22:39 +0000
References: <20230927150854.3670391-2-paul.szczepanek@arm.com> <20240524083651.482541-1-paul.szczepanek@arm.com> <20240524083651.482541-4-paul.szczepanek@arm.com>
In-Reply-To: <20240524083651.482541-4-paul.szczepanek@arm.com>

> +/**
> + * Compress pointers into 32-bit offsets from base pointer.
> + *
> + * @note It is programmer's responsibility to ensure the resulting offsets fit
> + * into 32 bits. Alignment of the structures pointed to by the pointers allows
> + * us to drop bits from the offsets. This is controlled by the bit_shift
> + * parameter. This means that if structures are aligned by 8 bytes they must be
> + * within 32GB of the base pointer. If there is no such alignment guarantee they
> + * must be within 4GB.
> + *
> + * @param ptr_base
> + *   A pointer used to calculate offsets of pointers in src_table.
> + * @param src_table
> + *   A pointer to an array of pointers.
> + * @param dest_table
> + *   A pointer to an array of compressed pointers returned by this function.
> + * @param n
> + *   The number of objects to compress, must be strictly positive.
> + * @param bit_shift
> + *   Byte alignment of memory pointed to by the pointers allows for
> + *   bits to be dropped from the offset and hence widen the memory region that
> + *   can be covered. This controls how many bits are right shifted.
> + **/
> +static __rte_always_inline void
> +rte_ptr_compress_32_shift(void *ptr_base, void **src_table,
> +		uint32_t *dest_table, size_t n, uint8_t bit_shift)

Probably:
void * const *src_table

And on decompress:
const uint32_t *src_table

> +{
> +	size_t i = 0;
> +#if defined RTE_HAS_SVE_ACLE && !defined RTE_ARCH_ARMv8_AARCH32
> +	svuint64_t v_ptr_table;
> +	do {
> +		svbool_t pg = svwhilelt_b64(i, n);
> +		v_ptr_table = svld1_u64(pg, (uint64_t *)src_table + i);
> +		v_ptr_table = svsub_x(pg, v_ptr_table, (uint64_t)ptr_base);
> +		v_ptr_table = svlsr_x(pg, v_ptr_table, bit_shift);
> +		svst1w(pg, &dest_table[i], v_ptr_table);
> +		i += svcntd();
> +	} while (i < n);
> +#elif defined __ARM_NEON && !defined RTE_ARCH_ARMv8_AARCH32
> +	uint64_t ptr_diff;
> +	uint64x2_t v_ptr_table;
> +	/* right shift is done by left shifting by negative int */
> +	int64x2_t v_shift = vdupq_n_s64(-bit_shift);
> +	uint64x2_t v_ptr_base = vdupq_n_u64((uint64_t)ptr_base);
> +	const size_t n_even = n & ~0x1;
> +	for (; i < n_even; i += 2) {
> +		v_ptr_table = vld1q_u64((const uint64_t *)src_table + i);
> +		v_ptr_table = vsubq_u64(v_ptr_table, v_ptr_base);
> +		v_ptr_table = vshlq_u64(v_ptr_table, v_shift);
> +		vst1_u32(dest_table + i, vqmovn_u64(v_ptr_table));
> +	}
> +	/* process leftover single item in case of odd number of n */
> +	if (unlikely(n & 0x1)) {
> +		ptr_diff = RTE_PTR_DIFF(src_table[i], ptr_base);
> +		dest_table[i] = (uint32_t) (ptr_diff >> bit_shift);
> +	}
> +#else
> +	uintptr_t ptr_diff;
> +	for (; i < n; i++) {
> +		ptr_diff = RTE_PTR_DIFF(src_table[i], ptr_base);
> +		ptr_diff = ptr_diff >> bit_shift;
> +		RTE_ASSERT(ptr_diff <= UINT32_MAX);
> +		dest_table[i] = (uint32_t) ptr_diff;
> +	}
> +#endif
> +}
> +
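
To illustrate the const qualifiers suggested above, here is a minimal scalar sketch. It is not the patch's code. The names ptr_sketch_compress(), ptr_sketch_decompress(), ptr_sketch_max_range() and ptr_sketch_round_trip() are made up for this example, plain assert() stands in for RTE_ASSERT, only the generic (non-SVE, non-NEON) path is mirrored, and the decompress parameter order is assumed to mirror compress. ptr_sketch_max_range() only restates the arithmetic from the doc comment: a 32-bit offset shifted left by bit_shift spans 2^(32 + bit_shift) bytes, so 4GB with no shift and 32GB with bit_shift = 3.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: bytes reachable from the base for a given shift.
 * Caller is assumed to keep bit_shift below 32. */
static inline uint64_t
ptr_sketch_max_range(uint8_t bit_shift)
{
	return (uint64_t)1 << (32 + bit_shift);
}

/* Hypothetical scalar compress; src_table is only read, hence
 * "void * const *" as suggested in the review. */
static inline void
ptr_sketch_compress(void *ptr_base, void * const *src_table,
		uint32_t *dest_table, size_t n, uint8_t bit_shift)
{
	for (size_t i = 0; i < n; i++) {
		uintptr_t diff = (uintptr_t)src_table[i] - (uintptr_t)ptr_base;

		diff >>= bit_shift;
		assert(diff <= UINT32_MAX); /* stand-in for RTE_ASSERT */
		dest_table[i] = (uint32_t)diff;
	}
}

/* Hypothetical scalar decompress; the compressed table is only read,
 * hence "const uint32_t *". */
static inline void
ptr_sketch_decompress(void *ptr_base, const uint32_t *src_table,
		void **dest_table, size_t n, uint8_t bit_shift)
{
	for (size_t i = 0; i < n; i++) {
		uintptr_t off = (uintptr_t)src_table[i] << bit_shift;

		dest_table[i] = (void *)((uintptr_t)ptr_base + off);
	}
}

/* Round trip over four 8-byte elements; returns 1 on success, 0 otherwise. */
static inline int
ptr_sketch_round_trip(void)
{
	static uint64_t objs[4];
	void *ptrs[4] = { &objs[0], &objs[1], &objs[2], &objs[3] };
	uint32_t packed[4];
	void *restored[4];

	/* Elements are 8 bytes apart, so 3 low bits can be dropped. */
	ptr_sketch_compress(objs, ptrs, packed, 4, 3);
	ptr_sketch_decompress(objs, packed, restored, 4, 3);

	for (size_t i = 0; i < 4; i++) {
		if (packed[i] != (uint32_t)i || restored[i] != ptrs[i])
			return 0;
	}
	return 1;
}
```

With the const qualifiers in place a caller can pass a read-only table of pointers (or of compressed offsets) without a cast, and the compiler will flag any accidental write to the source table inside the function.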