From: Paul Szczepanek
To: dev@dpdk.org
Cc: mb@smartsharesystems.com, Paul Szczepanek
Subject: [PATCH v14 0/6] add pointer compression API
Date: Fri, 7 Jun 2024 15:09:54 +0000
Message-Id: <20240607151000.98562-1-paul.szczepanek@arm.com>
In-Reply-To: <20230927150854.3670391-1-paul.szczepanek@arm.com>
References: <20230927150854.3670391-1-paul.szczepanek@arm.com>

This patchset proposes adding a new header-only library with utility
functions that allow compression of arrays of pointers. Since this is a
header-only library, a patch was added to amend the build system to allow
libraries without source files.

When passing caches full of pointers between threads, the memory containing
the pointers is copied multiple times, which is especially costly between
cores.
A compression method allows us to shrink the amount of memory copied.

The compression takes advantage of the fact that pointers are usually
located in a limited memory region. We can compress them by converting them
to offsets from a base memory address. Offsets can be stored in fewer bytes;
how many is dictated by the size of the memory region and the alignment of
the pointers. For example, an 8-byte-aligned pointer that is part of a 32 GB
memory pool can be stored in 4 bytes, because dropping the 3 always-zero low
bits leaves a 32-bit offset.

The API is generic and does not assume mempool pointers; any pointer can be
passed in. Compression uses a small number of fast operations and, especially
when vector instructions are leveraged, adds minimal overhead. The API accepts
and returns arrays because, given that overhead, compression is only
worthwhile when done in bulk.

A test is added that shows the potential performance gain from compression.
In this test an array of pointers is passed through a ring between two cores.
The gain depends on the bulk operation size. In this synthetic test, run on
an Ampere Altra, a substantial (up to 25%) performance gain is seen for bulk
sizes larger than 32. At 32 it breaks even, and smaller sizes show a small
(less than 5%) slowdown due to overhead.

In a more realistic mock application, running the L3 forwarding DPDK example
in pipeline mode on two cores, this translated into a ~5% throughput increase
on an Ampere Altra.
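The offset scheme described above can be sketched in plain C as follows. This is only an illustration and not the rte_ptr_compress API: the function names (sketch_align_shift, sketch_compress_32, sketch_decompress_32) are hypothetical, the loops are scalar rather than vectorised, and it assumes that every pointer lies at or above the base and that its offset from the base is a multiple of the alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of the patchset): number of low bits that
 * are always zero for a power-of-two alignment, e.g. 8 -> 3. */
static inline uint8_t
sketch_align_shift(size_t align)
{
	uint8_t shift = 0;

	while (align > 1) {
		align >>= 1;
		shift++;
	}
	return shift;
}

/* Store each pointer as a 32-bit offset from base, dropping the zero bits. */
static inline void
sketch_compress_32(const void *base, void * const *src, uint32_t *dst,
		size_t n, uint8_t shift)
{
	size_t i;

	for (i = 0; i < n; i++) {
		uintptr_t off = (uintptr_t)src[i] - (uintptr_t)base;

		/* assumption: the offset is a multiple of the alignment */
		assert((off & (((uintptr_t)1 << shift) - 1)) == 0);
		/* assumption: the shifted offset fits in 32 bits */
		assert(((uint64_t)off >> shift) <= UINT32_MAX);
		dst[i] = (uint32_t)(off >> shift);
	}
}

/* Rebuild each pointer by shifting the offset back and adding the base. */
static inline void
sketch_decompress_32(void *base, const uint32_t *src, void **dst,
		size_t n, uint8_t shift)
{
	size_t i;

	for (i = 0; i < n; i++)
		dst[i] = (void *)((uintptr_t)base +
				((uintptr_t)src[i] << shift));
}
```

With a shift of 3 and 32-bit offsets, the addressable range is 2^32 * 8 bytes = 32 GB, which matches the example given above. The library in this series additionally provides 16-bit variants and NEON/SVE implementations, which this sketch does not attempt to reproduce.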
v2:
* addressed review comments (style, explanations and typos)
* lowered bulk iterations closer to original numbers to keep runtime short
* fixed pointer size warning on 32-bit arch
v3:
* added 16-bit versions of compression functions and tests
* added documentation of these new utility functions in the EAL guide
v4:
* added unit test
* fix bug in NEON implementation of 32-bit decompress
v5:
* disable NEON and SVE implementation on AARCH32 due to wrong pointer size
v6:
* added example usage to commit message of the initial commit
v7:
* rebase to remove clashing mailmap changes
v8:
* put ptr compress into its own library
* add depends-on tag
* remove copyright bumps
* typos
v9:
* added MAINTAINERS entries, release notes, doc indexes etc.
* added patch for build system to allow header only library
v10:
* fixed problem with meson build adding shared deps to static deps
v11:
* added mempool functions to get information about memory range and
  alignment
* added tests for the new mempool functions
* added macros to help find the parameters for compression functions
* minor improvement in the SVE compression code
* amended documentation to reflect these changes
v12:
* added doxygen and prefixes to macros
* use rte_bitops for clz and ctz
* added unit tests to verify macros
* fixed incorrect letter case in docs
v13:
* added contiguous parameter to rte_mempool_get_mem_range
* made rte_mempool_get_mem_range parameters optional
v14:
* encapsulated parameters of rte_mempool_get_mem_range in a struct
* added consts to function parameters where appropriate

Paul Szczepanek (6):
  lib: allow libraries with no sources
  mempool: add functions to get extra mempool info
  ptr_compress: add pointer compression library
  test: add pointer compress tests to ring perf test
  docs: add pointer compression guide
  test: add unit test for ptr compression

 MAINTAINERS                                |   6 +
 app/test/meson.build                       |  21 +-
 app/test/test_mempool.c                    |  70 ++++
 app/test/test_ptr_compress.c               | 193 +++++++++++
 app/test/test_ring.h                       |  94 ++++++
 app/test/test_ring_perf.c                  | 352 ++++++++++++++-------
 doc/api/doxy-api-index.md                  |   1 +
 doc/api/doxy-api.conf.in                   |   1 +
 doc/guides/prog_guide/index.rst            |   1 +
 doc/guides/prog_guide/ptr_compress_lib.rst | 160 ++++++++++
 doc/guides/rel_notes/release_24_07.rst     |   5 +
 lib/mempool/rte_mempool.c                  |  45 +++
 lib/mempool/rte_mempool.h                  |  47 +++
 lib/mempool/version.map                    |   3 +
 lib/meson.build                            |  17 +
 lib/ptr_compress/meson.build               |   4 +
 lib/ptr_compress/rte_ptr_compress.h        | 324 +++++++++++++++++++
 17 files changed, 1212 insertions(+), 132 deletions(-)
 create mode 100644 app/test/test_ptr_compress.c
 create mode 100644 doc/guides/prog_guide/ptr_compress_lib.rst
 create mode 100644 lib/ptr_compress/meson.build
 create mode 100644 lib/ptr_compress/rte_ptr_compress.h

--
2.25.1