From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wi0-f173.google.com (mail-wi0-f173.google.com [209.85.212.173]) by dpdk.org (Postfix) with ESMTP id ABFD45A86 for ; Mon, 13 Jul 2015 00:48:08 +0200 (CEST) Received: by wicmz13 with SMTP id mz13so47759535wic.0 for ; Sun, 12 Jul 2015 15:48:08 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:from:to:cc:subject:date:message-id:organization :user-agent:in-reply-to:references:mime-version :content-transfer-encoding:content-type; bh=1Z9/rkhfqFsLBOkLtHBSCAUaJfrET8uXBuW/er8ljVk=; b=Igw8ZWD/+pK8n2jdYP3Y4YOKjvgzW5jH1cA0sKm297Mu0e6REw1GaltwJ9oNeZ7CeQ PrxXWi+iRqsYnEnAF2oKfjduKRwyIxsS4hbragzYlJBujU9YBwT3AovECRk97Oe1Ejaa 3y6PyHKtd1OBRM0s7/UgGHhMI7uqfizm33+JjyF9sL24QVlMsMM6vEX0Vc3C+ZmIEP9U YpHmZfXtwtCasfVu4vnsBKDQed8R9kX97CR8gPyA5p4m9MZG+lcjIOvKlYHP1xkEWsxE bXzJ+RKUDWTaw+MEWsS78ov4sWKnJ91HN9bU/XB5ARnUKM5i8He0ky9t0Zs3DiOGi6G9 pF6A== X-Gm-Message-State: ALoCoQke0MsgPiyyRmlXHmQafiBdeVXtODiZUbOZbnj6h9Fd36/C57CsG7PlBJItD6eAcYVvdAoU X-Received: by 10.194.203.138 with SMTP id kq10mr63617887wjc.124.1436741288563; Sun, 12 Jul 2015 15:48:08 -0700 (PDT) Received: from xps13.localnet (136-92-190-109.dsl.ovh.fr. [109.190.92.136]) by smtp.gmail.com with ESMTPSA id x5sm11044287wif.21.2015.07.12.15.48.07 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sun, 12 Jul 2015 15:48:07 -0700 (PDT) From: Thomas Monjalon To: Pablo de Lara Date: Mon, 13 Jul 2015 00:46:56 +0200 Message-ID: <2632377.THyBr7JfX0@xps13> Organization: 6WIND User-Agent: KMail/4.14.8 (Linux/4.0.4-2-ARCH; KDE/4.14.8; x86_64; ; ) In-Reply-To: <1436573936-15956-1-git-send-email-pablo.de.lara.guarch@intel.com> References: <1436571020-16252-1-git-send-email-pablo.de.lara.guarch@intel.com> <1436573936-15956-1-git-send-email-pablo.de.lara.guarch@intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: 7Bit Content-Type: text/plain; charset="us-ascii" Cc: dev@dpdk.org Subject: Re: [dpdk-dev] [PATCH v7 0/7] Cuckoo hash - part 3 of Cuckoo hash X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 12 Jul 2015 22:48:08 -0000 2015-07-11 01:18, Pablo de Lara: > This patchset is to replace the existing hash library with > a more efficient and functional approach, using the Cuckoo hash > method to deal with collisions. This method is based on using > two different hash functions to have two possible locations > in the hash table where an entry can be. > So, if a bucket is full, a new entry can push one of the items > in that bucket to its alternative location, making space for itself. > > Advantages > ~~ > - Offers the option to store more entries when the target bucket is full > (unlike the previous implementation) > - Memory efficient: for storing those entries, it is not necessary to > request new memory, as the entries will be stored in the same table > - Constant worst lookup time: in worst case scenario, it always takes > the same time to look up an entry, as there are only two possible locations > where an entry can be. > - Storing data: user can store data in the hash table, unlike the > previous implementation, but he can still use the old API > > This implementation typically offers over 90% utilization. > Notice that API has been extended, but old API remains. > Check documentation included to know more about this new implementation > (including how entries are distributed as table utilization increases). > > Changes in v7: > - Fix inaccurate documentation > > Changes in v6: > - Replace datatype for functions from uintptr_t to void * > > Changes in v5: > - Fix 32-bit compilation issues > > Changes in v4: > - Unit tests enhancements are not part of this patchset anymore. > - rte_hash structure has been made internal in another patch, > so it is not part of this patchset anymore. > - Add function to iterate through the hash table, as rte_hash > structure has been made private. > - Added extra_flag parameter in rte_hash_parameter to be able > to add new parameters in the future without breaking the ABI > - Remove proposed lookup_bulk_with_hash function, as it is > not of much use with the existing hash functions > (there are no vector hash functions). > - User can store 8-byte integer or pointer as data, instead > of variable size data, as discussed in the mailing list. > > Changes in v3: > > - Now user can store variable size data, instead of 32 or 64-bit size data, > using the new parameter "data_len" in rte_hash_parameters > - Add lookup_bulk_with_hash function in performance unit tests > - Add new functions that handle data in performance unit tests > - Remove duplicates in performance unit tests > - Fix rte_hash_reset, which was not resetting the last entry > > Changes in v2: > > - Fixed issue where table could not store maximum number of entries > - Fixed issue where lookup burst could not be more than 32 (instead of 64) > - Remove unnecessary macros and add other useful ones > - Added missing library dependencies > - Used directly rte_hash_secondary instead of rte_hash_alt > - Renamed rte_hash.c to rte_cuckoo_hash.c to ease the view of the new library > - Renamed test_hash_perf.c temporarily to ease the view of the improved unit test > - Moved rte_hash, rte_bucket and rte_hash_key structures to rte_cuckoo_hash.c to > make them private > - Corrected copyright dates > - Added an optimized function to compare keys that are multiple of 16 bytes > - Improved the way to use primary/secondary signatures. Now both are stored in > the bucket, so there is no need to calculate them if required. > Also, there is no need to use the MSB of a signature to differenciate between > an empty entry and signature 0, since we are storing both signatures, > which cannot be both 0. > - Removed rte_hash_rehash, as it was a very expensive operation. > Therefore, the add function returns now -ENOSPC if key cannot be added > because of a loop. > - Prefetched new slot for new key in add function to improve performance. > - Made doxygen comments more clear. > - Removed unnecessary rte_hash_del_key_data and rte_hash_del_key_with_data, > as we can use the lookup functions if we want to get the data before deleting. > - Removed some unnecessary includes in rte_hash.h > - Removed some unnecessary variables in rte_cuckoo_hash.c > - Removed some unnecessary checks before creating a new hash table > - Added documentation (in release notes and programmers guide) > - Added new unit tests and replaced the performance one for hash tables > > Series Acked-by: Bruce Richardson Applied, thanks Some bugs/formatting were fixed in fly. Some remaining comments may be addressed in further patches.