From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by dpdk.org (Postfix) with ESMTP id 322D3C3D2 for ; Fri, 10 Jul 2015 19:24:28 +0200 (CEST) Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga102.jf.intel.com with ESMTP; 10 Jul 2015 10:24:28 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.15,448,1432623600"; d="scan'208";a="726563745" Received: from pdelarag-mobl.ger.corp.intel.com ([10.237.208.60]) by orsmga001.jf.intel.com with SMTP; 10 Jul 2015 10:24:25 -0700 Received: by (sSMTP sendmail emulation); Fri, 10 Jul 2015 18:24:24 +0100 From: Pablo de Lara To: dev@dpdk.org Date: Fri, 10 Jul 2015 18:24:17 +0100 Message-Id: <1436549064-5200-1-git-send-email-pablo.de.lara.guarch@intel.com> X-Mailer: git-send-email 2.4.5 In-Reply-To: <1435530330-10132-1-git-send-email-pablo.de.lara.guarch@intel.com> References: <1435530330-10132-1-git-send-email-pablo.de.lara.guarch@intel.com> Subject: [dpdk-dev] [PATCH v4 0/7] Cuckoo hash - part 3 of Cuckoo hash X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 10 Jul 2015 17:24:28 -0000 This patchset is to replace the existing hash library with a more efficient and functional approach, using the Cuckoo hash method to deal with collisions. This method is based on using two different hash functions to have two possible locations in the hash table where an entry can be. So, if a bucket is full, a new entry can push one of the items in that bucket to its alternative location, making space for itself. Advantages ~~~~~ - Offers the option to store more entries when the target bucket is full (unlike the previous implementation) - Memory efficient: for storing those entries, it is not necessary to request new memory, as the entries will be stored in the same table - Constant worst lookup time: in worst case scenario, it always takes the same time to look up an entry, as there are only two possible locations where an entry can be. - Storing data: user can store data in the hash table, unlike the previous implementation, but he can still use the old API This implementation tipically offers over 90% utilization. Notice that API has been extended, but old API remains. Check documentation included to know more about this new implementation (including how entries are distributed as table utilization increases). Changes in v4: - Unit tests enhancements are not part of this patchset anymore. - rte_hash structure has been made internal in another patch, so it is not part of this patchset anymore. - Add function to iterate through the hash table, as rte_hash structure has been made private. - Added extra_flag parameter in rte_hash_parameter to be able to add new parameters in the future without breaking the ABI - Remove proposed lookup_bulk_with_hash function, as it is not of much use with the existing hash functions (there are no vector hash functions). - User can store 8-byte integer or pointer as data, instead of variable size data, as discussed in the mailing list. Changes in v3: - Now user can store variable size data, instead of 32 or 64-bit size data, using the new parameter "data_len" in rte_hash_parameters - Add lookup_bulk_with_hash function in performance unit tests - Add new functions that handle data in performance unit tests - Remove duplicates in performance unit tests - Fix rte_hash_reset, which was not reseting the last entry Changes in v2: - Fixed issue where table could not store maximum number of entries - Fixed issue where lookup burst could not be more than 32 (instead of 64) - Remove unnecessary macros and add other useful ones - Added missing library dependencies - Used directly rte_hash_secondary instead of rte_hash_alt - Renamed rte_hash.c to rte_cuckoo_hash.c to ease the view of the new library - Renamed test_hash_perf.c temporarily to ease the view of the improved unit test - Moved rte_hash, rte_bucket and rte_hash_key structures to rte_cuckoo_hash.c to make them private - Corrected copyright dates - Added an optimized function to compare keys that are multiple of 16 bytes - Improved the way to use primary/secondary signatures. Now both are stored in the bucket, so there is no need to calculate them if required. Also, there is no need to use the MSB of a signature to differenciate between an empty entry and signature 0, since we are storing both signatures, which cannot be both 0. - Removed rte_hash_rehash, as it was a very expensive operation. Therefore, the add function returns now -ENOSPC if key cannot be added because of a loop. - Prefetched new slot for new key in add function to improve performance. - Made doxygen comments more clear. - Removed unnecessary rte_hash_del_key_data and rte_hash_del_key_with_data, as we can use the lookup functions if we want to get the data before deleting. - Removed some unnecessary includes in rte_hash.h - Removed some unnecessary variables in rte_cuckoo_hash.c - Removed some unnecessary checks before creating a new hash table - Added documentation (in release notes and programmers guide) - Added new unit tests and replaced the performance one for hash tables Pablo de Lara (7): hash: replace existing hash library with cuckoo hash implementation hash: add new function rte_hash_reset hash: add new functionality to store data in hash table hash: add iterate function MAINTAINERS: claim responsability for hash library doc: announce ABI change of librte_hash doc: update hash documentation MAINTAINERS | 1 + app/test/test_hash.c | 189 +++--- app/test/test_hash_perf.c | 305 ++++++--- doc/guides/prog_guide/hash_lib.rst | 138 +++- doc/guides/prog_guide/index.rst | 4 + doc/guides/rel_notes/abi.rst | 1 + lib/librte_hash/Makefile | 8 +- lib/librte_hash/rte_cuckoo_hash.c | 1194 ++++++++++++++++++++++++++++++++++ lib/librte_hash/rte_hash.c | 499 -------------- lib/librte_hash/rte_hash.h | 198 +++++- lib/librte_hash/rte_hash_version.map | 15 + 11 files changed, 1812 insertions(+), 740 deletions(-) create mode 100644 lib/librte_hash/rte_cuckoo_hash.c delete mode 100644 lib/librte_hash/rte_hash.c -- 2.4.2