DPDK patches and discussions
 help / color / mirror / Atom feed
From: Pablo de Lara <pablo.de.lara.guarch@intel.com>
To: dev@dpdk.org
Subject: [dpdk-dev] [PATCH v5 0/7] Cuckoo hash - part 3 of Cuckoo hash
Date: Fri, 10 Jul 2015 22:57:40 +0100	[thread overview]
Message-ID: <1436565467-13328-1-git-send-email-pablo.de.lara.guarch@intel.com> (raw)
In-Reply-To: <1436549064-5200-1-git-send-email-pablo.de.lara.guarch@intel.com>

This patchset is to replace the existing hash library with
a more efficient and functional approach, using the Cuckoo hash
method to deal with collisions. This method is based on using
two different hash functions to have two possible locations
in the hash table where an entry can be.
So, if a bucket is full, a new entry can push one of the items
in that bucket to its alternative location, making space for itself.

Advantages
~~~~
- Offers the option to store more entries when the target bucket is full
  (unlike the previous implementation)
- Memory efficient: for storing those entries, it is not necessary to
  request new memory, as the entries will be stored in the same table
- Constant worst lookup time: in worst case scenario, it always takes
  the same time to look up an entry, as there are only two possible locations
  where an entry can be.
- Storing data: user can store data in the hash table, unlike the
  previous implementation, but he can still use the old API

This implementation tipically offers over 90% utilization.
Notice that API has been extended, but old API remains.
Check documentation included to know more about this new implementation
(including how entries are distributed as table utilization increases).

Changes in v5:
- Fix documentation
- Fix 32-bit compilation issues

Changes in v4:
- Unit tests enhancements are not part of this patchset anymore.
- rte_hash structure has been made internal in another patch,
  so it is not part of this patchset anymore.
- Add function to iterate through the hash table, as rte_hash
  structure has been made private.
- Added extra_flag parameter in rte_hash_parameter to be able
  to add new parameters in the future without breaking the ABI
- Remove proposed lookup_bulk_with_hash function, as it is
  not of much use with the existing hash functions
  (there are no vector hash functions).
- User can store 8-byte integer or pointer as data, instead
  of variable size data, as discussed in the mailing list.

Changes in v3:

- Now user can store variable size data, instead of 32 or 64-bit size data,
  using the new parameter "data_len" in rte_hash_parameters
- Add lookup_bulk_with_hash function in performance  unit tests
- Add new functions that handle data in performance unit tests
- Remove duplicates in performance unit tests
- Fix rte_hash_reset, which was not reseting the last entry

Changes in v2:

- Fixed issue where table could not store maximum number of entries
- Fixed issue where lookup burst could not be more than 32 (instead of 64)
- Remove unnecessary macros and add other useful ones
- Added missing library dependencies
- Used directly rte_hash_secondary instead of rte_hash_alt
- Renamed rte_hash.c to rte_cuckoo_hash.c to ease the view of the new library
- Renamed test_hash_perf.c temporarily to ease the view of the improved unit test
- Moved rte_hash, rte_bucket and rte_hash_key structures to rte_cuckoo_hash.c to
  make them private
- Corrected copyright dates
- Added an optimized function to compare keys that are multiple of 16 bytes
- Improved the way to use primary/secondary signatures. Now both are stored in
  the bucket, so there is no need to calculate them if required.
  Also, there is no need to use the MSB of a signature to differenciate between
  an empty entry and signature 0, since we are storing both signatures,
  which cannot be both 0.
- Removed rte_hash_rehash, as it was a very expensive operation.
  Therefore, the add function returns now -ENOSPC if key cannot be added
  because of a loop.
- Prefetched new slot for new key in add function to improve performance.
- Made doxygen comments more clear.
- Removed unnecessary rte_hash_del_key_data and rte_hash_del_key_with_data,
  as we can use the lookup functions if we want to get the data before deleting.
- Removed some unnecessary includes in rte_hash.h
- Removed some unnecessary variables in rte_cuckoo_hash.c
- Removed some unnecessary checks before creating a new hash table
- Added documentation (in release notes and programmers guide)
- Added new unit tests and replaced the performance one for hash tables

Pablo de Lara (7):
  hash: replace existing hash library with cuckoo hash implementation
  hash: add new function rte_hash_reset
  hash: add new functionality to store data in hash table
  hash: add iterate function
  MAINTAINERS: claim responsability for hash library
  doc: announce ABI change of librte_hash
  doc: update hash documentation

 MAINTAINERS                          |    1 +
 app/test/test_hash.c                 |  189 +++---
 app/test/test_hash_perf.c            |  305 ++++++---
 doc/guides/prog_guide/hash_lib.rst   |  138 +++-
 doc/guides/prog_guide/index.rst      |    4 +
 doc/guides/rel_notes/abi.rst         |    1 +
 lib/librte_hash/Makefile             |    8 +-
 lib/librte_hash/rte_cuckoo_hash.c    | 1194 ++++++++++++++++++++++++++++++++++
 lib/librte_hash/rte_hash.c           |  499 --------------
 lib/librte_hash/rte_hash.h           |  198 +++++-
 lib/librte_hash/rte_hash_version.map |   15 +
 11 files changed, 1812 insertions(+), 740 deletions(-)
 create mode 100644 lib/librte_hash/rte_cuckoo_hash.c
 delete mode 100644 lib/librte_hash/rte_hash.c

-- 
2.4.3

  parent reply	other threads:[~2015-07-10 21:58 UTC|newest]

Thread overview: 92+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-06-05 14:33 [dpdk-dev] [PATCH 0/6] " Pablo de Lara
2015-06-05 14:33 ` [dpdk-dev] [PATCH 1/6] eal: add const in prefetch functions Pablo de Lara
2015-06-16 15:10   ` Bruce Richardson
2015-06-05 14:33 ` [dpdk-dev] [PATCH 2/6] hash: replace existing hash library with cuckoo hash implementation Pablo de Lara
2015-06-17 15:31   ` Bruce Richardson
2015-06-18  9:50   ` Bruce Richardson
2015-06-05 14:33 ` [dpdk-dev] [PATCH 3/6] hash: add new lookup_bulk_with_hash function Pablo de Lara
2015-06-05 14:33 ` [dpdk-dev] [PATCH 4/6] hash: add new functions rte_hash_rehash and rte_hash_reset Pablo de Lara
2015-06-05 14:33 ` [dpdk-dev] [PATCH 5/6] hash: add new functionality to store data in hash table Pablo de Lara
2015-06-05 14:33 ` [dpdk-dev] [PATCH 6/6] MAINTAINERS: claim responsability for hash library Pablo de Lara
2015-06-16 13:44 ` [dpdk-dev] [PATCH 0/6] Cuckoo hash Thomas Monjalon
2015-06-16 21:44   ` De Lara Guarch, Pablo
2015-06-25 22:05 ` [dpdk-dev] [PATCH v2 00/11] " Pablo de Lara
2015-06-25 22:05   ` [dpdk-dev] [PATCH v2 01/11] eal: add const in prefetch functions Pablo de Lara
2015-06-25 22:05   ` [dpdk-dev] [PATCH v2 02/11] hash: move rte_hash structure to C file and make it internal Pablo de Lara
2015-06-25 22:05   ` [dpdk-dev] [PATCH v2 03/11] test/hash: enhance hash unit tests Pablo de Lara
2015-06-25 22:05   ` [dpdk-dev] [PATCH v2 04/11] test/hash: rename new hash perf unit test back to original name Pablo de Lara
2015-06-25 22:05   ` [dpdk-dev] [PATCH v2 05/11] hash: replace existing hash library with cuckoo hash implementation Pablo de Lara
2015-06-25 22:05   ` [dpdk-dev] [PATCH v2 06/11] hash: add new lookup_bulk_with_hash function Pablo de Lara
2015-06-25 22:05   ` [dpdk-dev] [PATCH v2 07/11] hash: add new function rte_hash_reset Pablo de Lara
2015-06-25 22:05   ` [dpdk-dev] [PATCH v2 08/11] hash: add new functionality to store data in hash table Pablo de Lara
2015-06-26 16:49     ` Stephen Hemminger
2015-06-28 22:23       ` De Lara Guarch, Pablo
2015-06-30  7:36         ` Stephen Hemminger
2015-07-10 10:39         ` Bruce Richardson
2015-06-25 22:05   ` [dpdk-dev] [PATCH v2 09/11] MAINTAINERS: claim responsability for hash library Pablo de Lara
2015-06-25 22:05   ` [dpdk-dev] [PATCH v2 10/11] doc: announce ABI change of librte_hash Pablo de Lara
2015-06-25 22:05   ` [dpdk-dev] [PATCH v2 11/11] doc: update hash documentation Pablo de Lara
2015-06-28 22:25   ` [dpdk-dev] [PATCH v3 00/11] Cuckoo hash Pablo de Lara
2015-06-28 22:25     ` [dpdk-dev] [PATCH v3 01/11] eal: add const in prefetch functions Pablo de Lara
2015-06-28 22:25     ` [dpdk-dev] [PATCH v3 02/11] hash: move rte_hash structure to C file and make it internal Pablo de Lara
2015-06-28 22:25     ` [dpdk-dev] [PATCH v3 03/11] test/hash: enhance hash unit tests Pablo de Lara
2015-06-28 22:25     ` [dpdk-dev] [PATCH v3 04/11] test/hash: rename new hash perf unit test back to original name Pablo de Lara
2015-06-28 22:25     ` [dpdk-dev] [PATCH v3 05/11] hash: replace existing hash library with cuckoo hash implementation Pablo de Lara
2015-06-28 22:25     ` [dpdk-dev] [PATCH v3 06/11] hash: add new lookup_bulk_with_hash function Pablo de Lara
2015-06-28 22:25     ` [dpdk-dev] [PATCH v3 07/11] hash: add new function rte_hash_reset Pablo de Lara
2015-06-28 22:25     ` [dpdk-dev] [PATCH v3 08/11] hash: add new functionality to store data in hash table Pablo de Lara
2015-06-28 22:25     ` [dpdk-dev] [PATCH v3 09/11] MAINTAINERS: claim responsability for hash library Pablo de Lara
2015-06-28 22:25     ` [dpdk-dev] [PATCH v3 10/11] doc: announce ABI change of librte_hash Pablo de Lara
2015-06-28 22:25     ` [dpdk-dev] [PATCH v3 11/11] doc: update hash documentation Pablo de Lara
2015-07-08 23:23     ` [dpdk-dev] [PATCH v3 00/11] Cuckoo hash Thomas Monjalon
2015-07-09  8:02       ` Bruce Richardson
2015-07-10 17:24     ` [dpdk-dev] [PATCH v4 0/7] Cuckoo hash - part 3 of " Pablo de Lara
2015-07-10 17:24       ` [dpdk-dev] [PATCH v4 1/7] hash: replace existing hash library with cuckoo hash implementation Pablo de Lara
2015-07-10 17:24       ` [dpdk-dev] [PATCH v4 2/7] hash: add new function rte_hash_reset Pablo de Lara
2015-07-10 17:24       ` [dpdk-dev] [PATCH v4 3/7] hash: add new functionality to store data in hash table Pablo de Lara
2015-07-10 17:24       ` [dpdk-dev] [PATCH v4 4/7] hash: add iterate function Pablo de Lara
2015-07-10 17:24       ` [dpdk-dev] [PATCH v4 5/7] MAINTAINERS: claim responsability for hash library Pablo de Lara
2015-07-10 17:24       ` [dpdk-dev] [PATCH v4 6/7] doc: announce ABI change of librte_hash Pablo de Lara
2015-07-10 17:24       ` [dpdk-dev] [PATCH v4 7/7] doc: update hash documentation Pablo de Lara
2015-07-10 20:52       ` [dpdk-dev] [PATCH v4 0/7] Cuckoo hash - part 3 of Cuckoo hash Bruce Richardson
2015-07-10 21:57       ` Pablo de Lara [this message]
2015-07-10 21:57         ` [dpdk-dev] [PATCH v5 1/7] hash: replace existing hash library with cuckoo hash implementation Pablo de Lara
2015-07-10 21:57         ` [dpdk-dev] [PATCH v5 2/7] hash: add new function rte_hash_reset Pablo de Lara
2015-07-10 21:57         ` [dpdk-dev] [PATCH v5 3/7] hash: add new functionality to store data in hash table Pablo de Lara
2015-07-10 21:57         ` [dpdk-dev] [PATCH v5 4/7] hash: add iterate function Pablo de Lara
2015-07-10 21:57         ` [dpdk-dev] [PATCH v5 5/7] MAINTAINERS: claim responsability for hash library Pablo de Lara
2015-07-10 21:57         ` [dpdk-dev] [PATCH v5 6/7] doc: announce ABI change of librte_hash Pablo de Lara
2015-07-10 21:57         ` [dpdk-dev] [PATCH v5 7/7] doc: update hash documentation Pablo de Lara
2015-07-10 22:52         ` [dpdk-dev] [PATCH v5 0/7] Cuckoo hash - part 3 of Cuckoo hash Bruce Richardson
2015-07-10 23:30         ` [dpdk-dev] [PATCH v6 " Pablo de Lara
2015-07-10 23:30           ` [dpdk-dev] [PATCH v6 1/7] hash: replace existing hash library with cuckoo hash implementation Pablo de Lara
2015-07-10 23:30           ` [dpdk-dev] [PATCH v6 2/7] hash: add new function rte_hash_reset Pablo de Lara
2015-07-10 23:30           ` [dpdk-dev] [PATCH v6 3/7] hash: add new functionality to store data in hash table Pablo de Lara
2015-07-10 23:30           ` [dpdk-dev] [PATCH v6 4/7] hash: add iterate function Pablo de Lara
2015-07-10 23:30           ` [dpdk-dev] [PATCH v6 5/7] MAINTAINERS: claim responsability for hash library Pablo de Lara
2015-07-10 23:30           ` [dpdk-dev] [PATCH v6 6/7] doc: announce ABI change of librte_hash Pablo de Lara
2015-07-10 23:30           ` [dpdk-dev] [PATCH v6 7/7] doc: update hash documentation Pablo de Lara
2015-07-11  0:18           ` [dpdk-dev] [PATCH v7 0/7] Cuckoo hash - part 3 of Cuckoo hash Pablo de Lara
2015-07-11  0:18             ` [dpdk-dev] [PATCH v7 1/7] hash: replace existing hash library with cuckoo hash implementation Pablo de Lara
2015-07-12 22:29               ` Thomas Monjalon
2015-07-13 16:11                 ` Bruce Richardson
2015-07-13 16:14                   ` Bruce Richardson
2015-07-13 16:20                     ` Thomas Monjalon
2015-07-13 16:26                       ` Bruce Richardson
2015-07-16  9:39               ` Tony Lu
2015-07-16 20:42                 ` De Lara Guarch, Pablo
2015-07-17  3:34                   ` Tony Lu
2015-07-17  7:34                     ` De Lara Guarch, Pablo
2015-07-17  7:58                       ` Tony Lu
2015-07-17  9:06                         ` De Lara Guarch, Pablo
2015-07-17 14:39                           ` Tony Lu
2015-07-11  0:18             ` [dpdk-dev] [PATCH v7 2/7] hash: add new function rte_hash_reset Pablo de Lara
2015-07-12 22:16               ` Thomas Monjalon
2015-07-11  0:18             ` [dpdk-dev] [PATCH v7 3/7] hash: add new functionality to store data in hash table Pablo de Lara
2015-07-12 22:00               ` Thomas Monjalon
2015-07-11  0:18             ` [dpdk-dev] [PATCH v7 4/7] hash: add iterate function Pablo de Lara
2015-07-11  0:18             ` [dpdk-dev] [PATCH v7 5/7] MAINTAINERS: claim responsability for hash library Pablo de Lara
2015-07-11  0:18             ` [dpdk-dev] [PATCH v7 6/7] doc: announce ABI change of librte_hash Pablo de Lara
2015-07-12 22:38               ` Thomas Monjalon
2015-07-11  0:18             ` [dpdk-dev] [PATCH v7 7/7] doc: update hash documentation Pablo de Lara
2015-07-12 22:46             ` [dpdk-dev] [PATCH v7 0/7] Cuckoo hash - part 3 of Cuckoo hash Thomas Monjalon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1436565467-13328-1-git-send-email-pablo.de.lara.guarch@intel.com \
    --to=pablo.de.lara.guarch@intel.com \
    --cc=dev@dpdk.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).