DPDK patches and discussions
 help / color / mirror / Atom feed
From: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
To: dev@dpdk.org
Cc: konstantin.ananyev@intel.com, yipeng1.wang@intel.com,
	sameh.gobriel@intel.com, bruce.richardson@intel.com,
	john.mcnamara@intel.com
Subject: [dpdk-dev] [PATCH v3 2/4] hash: add documentation for k32v64 hash library
Date: Wed, 15 Apr 2020 19:17:25 +0100	[thread overview]
Message-ID: <34e8fe21cbfd9cbf7d5c208564180274ee63d518.1586974411.git.vladimir.medvedkin@intel.com> (raw)
In-Reply-To: <cover.1586974408.git.vladimir.medvedkin@intel.com>
In-Reply-To: <cover.1586974408.git.vladimir.medvedkin@intel.com>

Add programmers guide and doxygen API for k32v64 hash library

Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
---
 doc/api/doxy-api-index.md                 |  1 +
 doc/guides/prog_guide/index.rst           |  1 +
 doc/guides/prog_guide/k32v64_hash_lib.rst | 66 +++++++++++++++++++++++++++++++
 3 files changed, 68 insertions(+)
 create mode 100644 doc/guides/prog_guide/k32v64_hash_lib.rst

diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index dff496b..ed3e8d7 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -121,6 +121,7 @@ The public API headers are grouped by topics:
   [jhash]              (@ref rte_jhash.h),
   [thash]              (@ref rte_thash.h),
   [FBK hash]           (@ref rte_fbk_hash.h),
+  [K32V64 hash]        (@ref rte_k32v64_hash.h),
   [CRC hash]           (@ref rte_hash_crc.h)
 
 - **classification**
diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index fb250ab..ac56da5 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -30,6 +30,7 @@ Programmer's Guide
     link_bonding_poll_mode_drv_lib
     timer_lib
     hash_lib
+    k32v64_hash_lib
     efd_lib
     member_lib
     lpm_lib
diff --git a/doc/guides/prog_guide/k32v64_hash_lib.rst b/doc/guides/prog_guide/k32v64_hash_lib.rst
new file mode 100644
index 0000000..238033a
--- /dev/null
+++ b/doc/guides/prog_guide/k32v64_hash_lib.rst
@@ -0,0 +1,66 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2020 Intel Corporation.
+
+.. _k32v64_hash_Library:
+
+K32V64 Hash Library
+===================
+
+This hash library implementation is intended to be better optimized for 32-bit keys when compared to existing Cuckoo hash-based rte_hash implementation. Current rte_fbk implementation is pretty fast but it has a number of drawbacks such as 2 byte values and limited collision resolving capabilities. rte_hash (which is based on Cuckoo hash algorithm) doesn't have these drawbacks, but it comes at a cost of lower performance compared to rte_fbk.
+
+The following flow illustrates the source of performance penalties of Cuckoo hash:
+
+*  Loading two buckets at once (extra memory consumption)
+*  Сomparing signatures first (extra step before key comparison)
+*  If signature comparison hits, get a key index, find memory location with a key itself, and get the key (memory pressure and indirection)
+*  Using indirect call to memcmp() to compare two uint32_t (function call overhead)
+
+K32V64 hash table doesn't have the drawbacks associated with rte_fbk while offering the same performance as rte_fbk. The bucket contains 4 consecutive keys which can be compared very quickly, and subsequent keys are kept in a linked list.
+
+The main disadvantage compared to rte_hash is performance degradation with high average table utilization due to chain resolving for 5th and subsequent collisions.
+
+To estimate the probability of 5th collision we can use "birthday paradox" approach. We can figure out the number of insertions (can be treated as a load factor) that will likely yield a 50% probability of 5th collision for a given number of buckets.
+
+It could be calculated with an asymptotic formula from [1]:
+
+E(n, k) ~= (k!)^(1/k)*Γ(1 + 1/k)*n^(1-1/k), n -> inf
+
+,where
+
+k - level of collision
+
+n - number of buckets
+
+Г - gamma function
+
+So, for k = 5 (5th collision), and given the fact that number of buckets is a power of 2, we can simplify formula:
+
+E(n) = 2.392 * 2^(m * 4/5) , where number of buckets n = 2^m
+
+.. note::
+
+   You can calculate it by yourself using Wolfram Alpha [2]. For example for 8k buckets:
+
+   solve ((k!)^(1/k)*Γ(1 + 1/k)*n^(1-1/k), n = 8192, k = 5)
+
+
+API Overview
+-----------------
+
+The main configuration parameters for the hash table are:
+
+*  Total number of hash entries in the table
+*  Socket id
+
+K32V64 is "hash function-less", so user must specify precalculated hash value for every key. The main methods exported by the Hash Library are:
+
+*   Add entry with key and precomputed hash: The key, precomputed hash and value are provided as input.
+*   Delete entry with key and precomputed hash: The key and precomputed hash are provided as input.
+*   Lookup entry with key and precomputed hash: The key, precomputed hash and a pointer to expected value are provided as input. If an entry with the specified key is found in the hash table (i.e. lookup hit), then the value associated with the key will be written to the memory specified by the pointer, and the function will return 0; otherwise (i.e. a lookup miss) a negative value is returned, and memory described by the pointer is not modified.
+
+References
+----------
+
+[1] M.S. Klamkin and D.J. Newman, Extensions of the Birthday Surprise
+
+[2] https://www.wolframalpha.com/
-- 
2.7.4


  parent reply	other threads:[~2020-04-15 18:17 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-16 13:38 [dpdk-dev] [PATCH 0/3] add new Double Word Key hash table Vladimir Medvedkin
2020-03-16 13:38 ` [dpdk-dev] [PATCH 1/3] hash: add dwk hash library Vladimir Medvedkin
2020-03-16 13:38 ` [dpdk-dev] [PATCH 2/3] test: add dwk hash autotests Vladimir Medvedkin
2020-03-16 13:38 ` [dpdk-dev] [PATCH 3/3] test: add dwk perf tests Vladimir Medvedkin
2020-03-16 14:39 ` [dpdk-dev] [PATCH 0/3] add new Double Word Key hash table Morten Brørup
2020-03-16 18:27   ` Medvedkin, Vladimir
2020-03-16 19:32     ` Stephen Hemminger
2020-03-17 19:52       ` Wang, Yipeng1
2020-03-26 17:28         ` Medvedkin, Vladimir
2020-03-31 19:55           ` Thomas Monjalon
2020-03-31 21:17             ` Honnappa Nagarahalli
2020-04-01 18:37               ` Medvedkin, Vladimir
2020-04-01 18:28             ` Medvedkin, Vladimir
2020-03-16 19:33     ` Morten Brørup
2020-04-08 18:19 ` [dpdk-dev] [PATCH v2 0/4] add new k32v64 " Vladimir Medvedkin
2020-04-15 18:17   ` [dpdk-dev] [PATCH v3 " Vladimir Medvedkin
2020-04-15 18:51     ` Mattias Rönnblom
2020-04-16 10:18       ` Medvedkin, Vladimir
2020-04-16 11:40         ` Mattias Rönnblom
2020-04-17  0:21           ` Wang, Yipeng1
2020-04-23 16:19             ` Ananyev, Konstantin
2020-05-08 20:08             ` Medvedkin, Vladimir
2020-04-16  9:39     ` Thomas Monjalon
2020-04-16 14:02       ` Medvedkin, Vladimir
2020-04-16 14:38         ` Thomas Monjalon
2020-05-08 19:58     ` [dpdk-dev] [PATCH v4 0/4] add new kv " Vladimir Medvedkin
2020-06-16 16:37       ` Thomas Monjalon
2021-03-24 21:28       ` Thomas Monjalon
2021-03-25 12:03         ` Medvedkin, Vladimir
2023-06-12 16:11           ` Stephen Hemminger
2020-05-08 19:58     ` [dpdk-dev] [PATCH v4 1/4] hash: add kv hash library Vladimir Medvedkin
2020-06-23 15:44       ` Ananyev, Konstantin
2020-06-23 23:06         ` Ananyev, Konstantin
2020-06-25 19:56           ` Medvedkin, Vladimir
2020-06-25 19:49         ` Medvedkin, Vladimir
2020-06-24  1:19       ` Wang, Yipeng1
2020-06-25 20:26         ` Medvedkin, Vladimir
2020-06-25  4:27       ` Honnappa Nagarahalli
2020-05-08 19:58     ` [dpdk-dev] [PATCH v4 2/4] hash: add documentation for " Vladimir Medvedkin
2020-05-08 19:58     ` [dpdk-dev] [PATCH v4 3/4] test: add kv hash autotests Vladimir Medvedkin
2020-05-08 19:58     ` [dpdk-dev] [PATCH v4 4/4] test: add kv perf tests Vladimir Medvedkin
2020-04-15 18:17   ` [dpdk-dev] [PATCH v3 1/4] hash: add k32v64 hash library Vladimir Medvedkin
2020-04-15 18:59     ` Mattias Rönnblom
2020-04-16 10:23       ` Medvedkin, Vladimir
2020-04-23 13:31     ` Ananyev, Konstantin
2020-05-08 20:14       ` Medvedkin, Vladimir
2020-04-29 21:29     ` Honnappa Nagarahalli
2020-05-08 20:38       ` Medvedkin, Vladimir
2020-04-15 18:17   ` Vladimir Medvedkin [this message]
2020-04-15 18:17   ` [dpdk-dev] [PATCH v3 3/4] test: add k32v64 hash autotests Vladimir Medvedkin
2020-04-15 18:17   ` [dpdk-dev] [PATCH v3 4/4] test: add k32v64 perf tests Vladimir Medvedkin
2020-04-08 18:19 ` [dpdk-dev] [PATCH v2 1/4] hash: add k32v64 hash library Vladimir Medvedkin
2020-04-08 23:23   ` Ananyev, Konstantin
2020-04-08 18:19 ` [dpdk-dev] [PATCH v2 2/4] hash: add documentation for " Vladimir Medvedkin
2020-04-08 18:19 ` [dpdk-dev] [PATCH v2 3/4] test: add k32v64 hash autotests Vladimir Medvedkin
2020-04-08 18:19 ` [dpdk-dev] [PATCH v2 4/4] test: add k32v64 perf tests Vladimir Medvedkin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=34e8fe21cbfd9cbf7d5c208564180274ee63d518.1586974411.git.vladimir.medvedkin@intel.com \
    --to=vladimir.medvedkin@intel.com \
    --cc=bruce.richardson@intel.com \
    --cc=dev@dpdk.org \
    --cc=john.mcnamara@intel.com \
    --cc=konstantin.ananyev@intel.com \
    --cc=sameh.gobriel@intel.com \
    --cc=yipeng1.wang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).