From: Yipeng Wang
To: bruce.richardson@intel.com
Cc: konstantin.ananyev@intel.com, dev@dpdk.org, yipeng1.wang@intel.com,
    honnappa.nagarahalli@arm.com, sameh.gobriel@intel.com
Date: Mon, 1 Oct 2018 11:34:58 -0700
Message-Id: <1538418902-154892-1-git-send-email-yipeng1.wang@intel.com>
In-Reply-To: <1537993618-92630-1-git-send-email-yipeng1.wang@intel.com>
References: <1537993618-92630-1-git-send-email-yipeng1.wang@intel.com>
Subject: [dpdk-dev] [PATCH v5 0/4] hash: add extendable bucket and partial key hashing
List-Id: DPDK patches and discussions

This patch set makes two major optimizations to the current rte_hash
library.

First, it adds the Extendable Bucket Table feature: a new structure that
can accommodate keys that failed to be inserted into the main hash table
due to the unlikely event of excessive hash collisions. The hash table
buckets are extended using a linked list to host these keys. This new
design guarantees insertion of 100% of the keys for a given hash table
size with minimal overhead. A new flag value is added so that the user
can indicate whether the extendable bucket feature should be enabled.

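As a rough illustration of the linked-list extension idea, here is a
minimal standalone sketch. It is not the code from this series: the
names toy_bucket, toy_insert, toy_lookup and TOY_BUCKET_ENTRIES are
hypothetical, the ext_enabled argument merely stands in for the new
flag, and calloc() is a simplification of however the library actually
obtains extension buckets.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_BUCKET_ENTRIES 8	/* hypothetical number of slots per bucket */

/* Toy bucket: fixed slots plus a link to an optional extension bucket. */
struct toy_bucket {
	uint32_t keys[TOY_BUCKET_ENTRIES];
	uint8_t used[TOY_BUCKET_ENTRIES];
	struct toy_bucket *next;	/* NULL unless the bucket was extended */
};

/*
 * Store key in the first free slot along the chain. When every slot is
 * taken, append a new bucket only if ext_enabled is non-zero.
 * Returns 0 on success, -1 when the key cannot be stored.
 */
static int
toy_insert(struct toy_bucket *bkt, uint32_t key, int ext_enabled)
{
	assert(bkt != NULL);
	for (;;) {
		for (int i = 0; i < TOY_BUCKET_ENTRIES; i++) {
			if (!bkt->used[i]) {
				bkt->keys[i] = key;
				bkt->used[i] = 1;
				return 0;
			}
		}
		if (bkt->next == NULL) {
			if (!ext_enabled)
				return -1;	/* table full, no extension allowed */
			bkt->next = calloc(1, sizeof(*bkt->next));
			if (bkt->next == NULL)
				return -1;	/* allocation failed */
		}
		bkt = bkt->next;
	}
}

/* Walk the bucket and its extensions; return 1 if key is present. */
static int
toy_lookup(const struct toy_bucket *bkt, uint32_t key)
{
	for (; bkt != NULL; bkt = bkt->next)
		for (int i = 0; i < TOY_BUCKET_ENTRIES; i++)
			if (bkt->used[i] && bkt->keys[i] == key)
				return 1;
	return 0;
}
```

With a zero-initialized toy_bucket, the ninth toy_insert() fails when
ext_enabled is 0 and succeeds, by chaining a second bucket, when it is 1.
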
The linked-list buckets are similar in concept to the extendable bucket
hash table in the packet framework. In detail, for insertion, the linked
buckets store the keys that fail to fit in the primary and the secondary
bucket when the cuckoo path cannot find an empty location within the
maximum path length (small probability). For lookup, the key is checked
first in the primary bucket, then in the secondary bucket; if the
secondary bucket is extended, the linked list is traversed for a
possible match.

Second, the patch set changes the current hashing algorithm to
"partial-key hashing". Partial-key hashing is a concept from Bin Fan,
et al.'s paper "MemC3: Compact and Concurrent MemCache with Dumber
Caching and Smarter Hashing". Instead of storing both a 32-bit signature
and an alternative signature in the bucket, we store only a small 16-bit
signature and calculate the alternative bucket index by XORing the
signature with the current bucket index. This doubles the hash table
memory efficiency, since one bucket now occupies only one cache line
instead of two in the original design.

v4->v5:
1. hash: for the first commit, move back the lock and read "position"
   in the while condition as Honnappa suggested.
2. hash: minor coding style change (Honnappa) and commit message typo
   fix.
3. Add Reviewed-by from Honnappa.

v3->v4:
1. hash: Revise the commit message to be clearer about "utilization"
   (Honnappa)
2. hash: in the delete key function, returning a bucket now uses
   rte_ring_sp_enqueue instead of rte_ring_mp_enqueue, since it is
   already protected by locks.
3. hash: update rte_hash_iterate comments (Honnappa)
4. hash: Add a new commit to fix a race condition in rte_hash_iterate
   (Honnappa)
5. hash/test: during the utilization test, double check that
   rte_hash_cnt returns the correct value (Honnappa)
6. hash: for the partial-key-hashing commit, break the
   get_buckets_index function into three. It may make future extension
   easier (Honnappa)
7. hash: change the comment for typedef uint32_t hash_sig_t to be
   clearer to users (Honnappa)

v2->v3:
The first four commits were separated from this patch set as another
independent patch set:
https://mails.dpdk.org/archives/dev/2018-September/113118.html
1. hash: move snprintf for the ext_ring name under the ext_table
   condition.
2. hash: fix a memory leak by freeing ext_buckets in rte_hash_free.
3. hash: after the cuckoo path fails, search not only the ext buckets
   but also, first, the secondary bucket to see if there is now an
   empty location.
4. hash: completely rewrote the key deletion logic. When the ext table
   is enabled and the deleted key was not in the last bucket of the
   linked list, the last entry in the linked list is moved into the
   slot vacated by the deleted key. The purpose is to compact the
   entries in the linked list so they sit closer to the main table.
   This ensures that few extendable buckets are wasted holding only
   one or two entries after running for some time, and it also
   benefits lookup speed.
5. Other minor coding style/comment improvements.

V1->V2:
1. hash: Rewrite rte_hash_get_last_bkt to be more concise.
2. hash: Reorder the rte_hash struct to align with cache lines better.
3. test: Minor changes in the auto test to add a key insertion failure
   check during the iteration test.
4. test: Add a new commit to fix the read-write test non-consecutive
   core issue.
5. hash: Add a new commit to remove unnecessary code introduced by
   previous patches.
6. hash: Comment and coding style improvements in multiple places.

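For readers unfamiliar with partial-key hashing as summarized in the
cover text above, the index arithmetic can be sketched as follows. This
is only an illustration under stated assumptions, not the code in the
series: the helper names are hypothetical, taking the signature from the
upper 16 bits of the hash is an assumption, and bucket_mask is assumed
to be (number of buckets - 1) with a power-of-two bucket count.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the 16-bit short signature is the upper half of the hash. */
static inline uint16_t
toy_short_sig(uint32_t hash)
{
	return (uint16_t)(hash >> 16);
}

/* Primary bucket index: low bits of the full hash. */
static inline uint32_t
toy_prim_bucket_idx(uint32_t hash, uint32_t bucket_mask)
{
	return hash & bucket_mask;
}

/*
 * Alternative bucket index: XOR the current index with the stored
 * signature. Only the signature and the current index are needed, so
 * the full hash does not have to be kept in the bucket.
 */
static inline uint32_t
toy_alt_bucket_idx(uint32_t cur_idx, uint16_t sig, uint32_t bucket_mask)
{
	assert(cur_idx <= bucket_mask);
	return (cur_idx ^ sig) & bucket_mask;
}
```

Because XOR is its own inverse, applying toy_alt_bucket_idx() to the
alternative index with the same signature returns the original index,
which is what lets an entry move between its two buckets during cuckoo
displacement. Note that in this sketch the two indexes coincide whenever
the masked signature bits are all zero.
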
Signed-off-by: Yipeng Wang

Yipeng Wang (4):
  hash: fix race condition in iterate
  hash: add extendable bucket feature
  test/hash: implement extendable bucket hash test
  hash: use partial-key hashing

 lib/librte_hash/rte_cuckoo_hash.c | 580 ++++++++++++++++++++++++++++----------
 lib/librte_hash/rte_cuckoo_hash.h |  11 +-
 lib/librte_hash/rte_hash.h        |   8 +-
 test/test/test_hash.c             | 159 ++++++++++-
 test/test/test_hash_perf.c        | 114 ++++++--
 5 files changed, 677 insertions(+), 195 deletions(-)

-- 
2.7.4