From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by dpdk.org (Postfix) with ESMTP id 6432AC6F6 for ; Fri, 26 Jun 2015 00:08:32 +0200 (CEST) Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga101.jf.intel.com with ESMTP; 25 Jun 2015 15:08:31 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.13,680,1427785200"; d="scan'208";a="514345590" Received: from msvmon001.ims.intel.com ([10.125.148.15]) by FMSMGA003.fm.intel.com with ESMTP; 25 Jun 2015 15:08:30 -0700 Received: from sivswdev02.ir.intel.com (sivswdev02.ir.intel.com [10.237.217.46]) by msvmon001.ims.intel.com (8.14.3/8.13.6/MailSET/Hub) with ESMTP id t5PM8ShG028732; Fri, 26 Jun 2015 01:08:29 +0300 Received: from sivswdev02.ir.intel.com (localhost [127.0.0.1]) by sivswdev02.ir.intel.com with ESMTP id t5PM5JvA007040; Thu, 25 Jun 2015 23:05:19 +0100 Received: (from pdelarax@localhost) by sivswdev02.ir.intel.com with id t5PM5J8X007036; Thu, 25 Jun 2015 23:05:19 +0100 From: Pablo de Lara To: dev@dpdk.org Date: Thu, 25 Jun 2015 23:05:08 +0100 Message-Id: <1435269919-7007-1-git-send-email-pablo.de.lara.guarch@intel.com> X-Mailer: git-send-email 1.7.4.1 In-Reply-To: <1433514804-7075-1-git-send-email-pablo.de.lara.guarch@intel.com> References: <1433514804-7075-1-git-send-email-pablo.de.lara.guarch@intel.com> Subject: [dpdk-dev] [PATCH v2 00/11] Cuckoo hash X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 25 Jun 2015 22:08:33 -0000 This patchset is to replace the existing hash library with a more efficient and functional approach, using the Cuckoo hash method to deal with collisions. This method is based on using two different hash functions to have two possible locations in the hash table where an entry can be. So, if a bucket is full, a new entry can push one of the items in that bucket to its alternative location, making space for itself. Advantages ~~~~~~~ - Offers the option to store more entries when the target bucket is full (unlike the previous implementation) - Memory efficient: for storing those entries, it is not necessary to request new memory, as the entries will be stored in the same table - Constant worst lookup time: in worst case scenario, it always takes the same time to look up an entry, as there are only two possible locations where an entry can be. - Storing data: user can store data in the hash table, unlike the previous implementation, but he can still use the old API This implementation tipically offers over 90% utilization. Notice that API has been extended, but old API remains. The main change in ABI is that rte_hash structure is now private and the deprecation of two macros. Changes in v2: - Fixed issue where table could not store maximum number of entries - Fixed issue where lookup burst could not be more than 32 (instead of 64) - Remove unnecessary macros and add other useful ones - Added missing library dependencies - Used directly rte_hash_secondary instead of rte_hash_alt - Renamed rte_hash.c to rte_cuckoo_hash.c to ease the view of the new library - Renamed test_hash_perf.c temporarily to ease the view of the improved unit test - Moved rte_hash, rte_bucket and rte_hash_key structures to rte_cuckoo_hash.c to make them private - Corrected copyright dates - Added an optimized function to compare keys that are multiple of 16 bytes - Improved the way to use primary/secondary signatures. Now both are stored in the bucket, so there is no need to calculate them if required. Also, there is no need to use the MSB of a signature to differenciate between an empty entry and signature 0, since we are storing both signatures, which cannot be both 0. - Removed rte_hash_rehash, as it was a very expensive operation. Therefore, the add function returns now -ENOSPC if key cannot be added because of a loop. - Prefetched new slot for new key in add function to improve performance. - Made doxygen comments more clear. - Removed unnecessary rte_hash_del_key_data and rte_hash_del_key_with_data, as we can use the lookup functions if we want to get the data before deleting. - Removed some unnecessary includes in rte_hash.h - Removed some unnecessary variables in rte_cuckoo_hash.c - Removed some unnecessary checks before creating a new hash table - Added documentation (in release notes and programmers guide) - Added new unit tests and replaced the performance one for hash tables Pablo de Lara (11): eal: add const in prefetch functions hash: move rte_hash structure to C file and make it internal test/hash: enhance hash unit tests test/hash: rename new hash perf unit test back to original name hash: replace existing hash library with cuckoo hash implementation hash: add new lookup_bulk_with_hash function hash: add new function rte_hash_reset hash: add new functionality to store data in hash table MAINTAINERS: claim responsability for hash library doc: announce ABI change of librte_hash doc: update hash documentation MAINTAINERS | 1 + app/test/test_hash.c | 189 +-- app/test/test_hash_perf.c | 906 ++++++------- doc/guides/prog_guide/hash_lib.rst | 77 +- doc/guides/rel_notes/abi.rst | 2 + .../common/include/arch/ppc_64/rte_prefetch.h | 6 +- .../common/include/arch/x86/rte_prefetch.h | 14 +- .../common/include/generic/rte_prefetch.h | 8 +- lib/librte_hash/Makefile | 8 +- lib/librte_hash/rte_cuckoo_hash.c | 1394 ++++++++++++++++++++ lib/librte_hash/rte_hash.c | 471 ------- lib/librte_hash/rte_hash.h | 274 +++- lib/librte_hash/rte_hash_version.map | 15 + 13 files changed, 2191 insertions(+), 1174 deletions(-) create mode 100644 lib/librte_hash/rte_cuckoo_hash.c delete mode 100644 lib/librte_hash/rte_hash.c -- 2.4.2