From: Yoan Picchi
Cc: dev@dpdk.org, nd@arm.com, Yoan Picchi
Subject: [PATCH v10 0/4] hash: add SVE support for bulk key lookup
Date: Wed, 3 Jul 2024 17:13:11 +0000
Message-Id: <20240703171315.1470547-1-yoan.picchi@arm.com>
In-Reply-To: <20231020165159.1649282-1-yoan.picchi@arm.com>
References: <20231020165159.1649282-1-yoan.picchi@arm.com>

This patchset adds SVE support for the signature comparison in the
cuckoo hash lookup and improves the existing NEON implementation.
These optimizations required changes to the data format and to the
signatures of the relevant functions, so that they support dense
hitmasks (no padding) and keep the primary and secondary hitmasks
interleaved instead of in one array each.
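
To illustrate the layout change described above, here is a minimal scalar
sketch of a dense, interleaved hitmask. It is not the code from this
series: the function names, the 8-entry bucket size, and the choice of
placing the primary mask in the low byte and the secondary mask in the
high byte are all assumptions made for the example.

```c
#include <stdint.h>

/* Assumed number of signatures per bucket (hypothetical for this sketch). */
#define SKETCH_BUCKET_ENTRIES 8

/*
 * Dense hitmask: bit i is set when sigs[i] matches sig.
 * Every bit maps to one entry, so there are no padding bits.
 */
static inline uint8_t
sketch_dense_hitmask(const uint16_t sigs[SKETCH_BUCKET_ENTRIES], uint16_t sig)
{
	uint8_t mask = 0;

	for (unsigned int i = 0; i < SKETCH_BUCKET_ENTRIES; i++)
		mask |= (uint8_t)((sigs[i] == sig) << i);
	return mask;
}

/*
 * Interleaved layout (assumed): one 16-bit word per key, with the
 * primary bucket hitmask in the low byte and the secondary bucket
 * hitmask in the high byte, instead of two separate arrays.
 */
static inline uint16_t
sketch_packed_hitmask(const uint16_t prim[SKETCH_BUCKET_ENTRIES],
		const uint16_t sec[SKETCH_BUCKET_ENTRIES], uint16_t sig)
{
	return (uint16_t)(sketch_dense_hitmask(prim, sig) |
			(sketch_dense_hitmask(sec, sig) << 8));
}
```

With this layout a caller would recover the two masks with `packed & 0xff`
and `packed >> 8`. A vector implementation would compute both halves with
a single compare per bucket; the scalar loop here only shows the bit
layout.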

Benchmarking with the cuckoo hash perf test, I observed the following
effect on speed:
  - No significant change on Intel (run on Sapphire Rapids)
  - NEON is up to 7-10% faster (run on Ampere Altra)
  - 128b SVE is about 3-5% slower than the optimized NEON
    (run on a Graviton 3 cloud instance)
  - 256b SVE is about 0-3% slower than the optimized NEON
    (run on a Graviton 3 cloud instance)

V2->V3:
  Remove a redundant if in the test
  Change a couple of int to uint16_t in compare_signatures_dense
  Several coding-style fixes

V3->V4:
  Rebase

V4->V5:
  Commit message

V5->V6:
  Move the arch-specific code into new arch-specific files
  Isolate the data structure refactor from adding SVE

V6->V7:
  Commit message
  Move RTE_HASH_COMPARE_SVE to the last commit of the chain

V7->V8:
  Commit message
  Typos and missing spaces

V8->V9:
  Use __rte_unused instead of (void)
  Fix an indentation mistake

V9->V10:
  Fix more formatting and indentation
  Move the new compare signature file directly into hash instead of
    a new subdir
  Re-order includes
  Remove duplicated static check
  Move rte_hash_sig_compare_function's definition into a private header

Yoan Picchi (4):
  hash: pack the hitmask for hash in bulk lookup
  hash: optimize compare signature for NEON
  test/hash: check bulk lookup of keys after collision
  hash: add SVE support for bulk key lookup

 .mailmap                                  |   2 +
 app/test/test_hash.c                      |  99 ++++++++---
 lib/hash/compare_signatures_arm_pvt.h     | 117 +++++++++++++
 lib/hash/compare_signatures_generic_pvt.h |  37 ++++
 lib/hash/compare_signatures_x86_pvt.h     |  49 ++++++
 lib/hash/hash_sig_cmp_func_pvt.h          |  20 +++
 lib/hash/rte_cuckoo_hash.c                | 197 ++++++++++++----------
 lib/hash/rte_cuckoo_hash.h                |  10 +-
 8 files changed, 407 insertions(+), 124 deletions(-)
 create mode 100644 lib/hash/compare_signatures_arm_pvt.h
 create mode 100644 lib/hash/compare_signatures_generic_pvt.h
 create mode 100644 lib/hash/compare_signatures_x86_pvt.h
 create mode 100644 lib/hash/hash_sig_cmp_func_pvt.h

-- 
2.25.1