* [dpdk-dev] [PATCH 0/3] add new Double Word Key hash table
From: Vladimir Medvedkin @ 2020-03-16 13:38 UTC (permalink / raw)
To: dev; +Cc: konstantin.ananyev, yipeng1.wang, sameh.gobriel, bruce.richardson
Currently DPDK has a special implementation of a hash table for
4-byte keys, called FBK hash. Unfortunately, its main drawback
is that it only supports 2-byte values.
The new implementation, called DWK (Double Word Key) hash,
supports 8-byte values, which is enough to store a pointer.
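To illustrate the intended API, here is a minimal usage sketch (table
name, size and CRC seed are arbitrary; most error handling omitted).
Since the table is hash function agnostic, the caller supplies the
signature, here computed with rte_hash_crc_4byte():

    #include <assert.h>
    #include <rte_dwk_hash.h>
    #include <rte_hash_crc.h>
    #include <rte_lcore.h>

    static int
    dwk_example(void)
    {
            struct rte_dwk_hash_params params = {
                    .name = "example",
                    .entries = 1024,
                    .socket_id = rte_socket_id(),
            };
            struct rte_dwk_hash_table *h;
            uint32_t key = 5;
            uint64_t value = 20, ret_val = 0;

            h = rte_dwk_hash_create(&params);
            if (h == NULL)
                    return -1;

            /* add, lookup and delete all take the precalculated hash */
            rte_dwk_hash_add(h, key, rte_hash_crc_4byte(key, 0), value);
            if (rte_dwk_hash_lookup(h, key, rte_hash_crc_4byte(key, 0),
                            &ret_val) == 0)
                    assert(ret_val == value);
            rte_dwk_hash_delete(h, key, rte_hash_crc_4byte(key, 0));
            rte_dwk_hash_free(h);
            return 0;
    }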
It would also be nice to get feedback on whether to keep both the old
FBK and the new DWK implementations, or to deprecate the old one.
Vladimir Medvedkin (3):
hash: add dwk hash library
test: add dwk hash autotests
test: add dwk perf tests
app/test/Makefile | 1 +
app/test/meson.build | 1 +
app/test/test_dwk_hash.c | 229 +++++++++++++++++++++++++++++
app/test/test_hash_perf.c | 81 +++++++++++
lib/Makefile | 2 +-
lib/librte_hash/Makefile | 4 +-
lib/librte_hash/meson.build | 5 +-
lib/librte_hash/rte_dwk_hash.c | 271 +++++++++++++++++++++++++++++++++++
lib/librte_hash/rte_dwk_hash.h | 178 +++++++++++++++++++++++
lib/librte_hash/rte_hash_version.map | 5 +
10 files changed, 773 insertions(+), 4 deletions(-)
create mode 100644 app/test/test_dwk_hash.c
create mode 100644 lib/librte_hash/rte_dwk_hash.c
create mode 100644 lib/librte_hash/rte_dwk_hash.h
--
2.7.4
* [dpdk-dev] [PATCH 1/3] hash: add dwk hash library
From: Vladimir Medvedkin @ 2020-03-16 13:38 UTC (permalink / raw)
To: dev; +Cc: konstantin.ananyev, yipeng1.wang, sameh.gobriel, bruce.richardson
DWK hash is a Double Word Key (uint32_t) hash table with uint64_t
values.
The table is hash function agnostic, so the user must provide a
precalculated hash signature for add/delete/lookup operations.
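For example, the signature can be precalculated with the CRC32 hash
already shipped in librte_hash (a sketch; the init value is arbitrary):

    uint32_t sig = rte_hash_crc_4byte(key, 0xdeadbeef);
    rte_dwk_hash_add(table, key, sig, value);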
Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
---
lib/Makefile | 2 +-
lib/librte_hash/Makefile | 4 +-
lib/librte_hash/meson.build | 5 +-
lib/librte_hash/rte_dwk_hash.c | 271 +++++++++++++++++++++++++++++++++++
lib/librte_hash/rte_dwk_hash.h | 178 +++++++++++++++++++++++
lib/librte_hash/rte_hash_version.map | 5 +
6 files changed, 461 insertions(+), 4 deletions(-)
create mode 100644 lib/librte_hash/rte_dwk_hash.c
create mode 100644 lib/librte_hash/rte_dwk_hash.h
diff --git a/lib/Makefile b/lib/Makefile
index 46b91ae..a8c02e4 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -48,7 +48,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
DEPDIRS-librte_vhost := librte_eal librte_mempool librte_mbuf librte_ethdev \
librte_net librte_hash librte_cryptodev
DIRS-$(CONFIG_RTE_LIBRTE_HASH) += librte_hash
-DEPDIRS-librte_hash := librte_eal librte_ring
+DEPDIRS-librte_hash := librte_eal librte_ring librte_mempool
DIRS-$(CONFIG_RTE_LIBRTE_EFD) += librte_efd
DEPDIRS-librte_efd := librte_eal librte_ring librte_hash
DIRS-$(CONFIG_RTE_LIBRTE_RIB) += librte_rib
diff --git a/lib/librte_hash/Makefile b/lib/librte_hash/Makefile
index 9b36097..9944963 100644
--- a/lib/librte_hash/Makefile
+++ b/lib/librte_hash/Makefile
@@ -8,13 +8,14 @@ LIB = librte_hash.a
CFLAGS += -O3 -DALLOW_EXPERIMENTAL_API
CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
-LDLIBS += -lrte_eal -lrte_ring
+LDLIBS += -lrte_eal -lrte_ring -lrte_mempool
EXPORT_MAP := rte_hash_version.map
# all source are stored in SRCS-y
SRCS-$(CONFIG_RTE_LIBRTE_HASH) := rte_cuckoo_hash.c
SRCS-$(CONFIG_RTE_LIBRTE_HASH) += rte_fbk_hash.c
+SRCS-$(CONFIG_RTE_LIBRTE_HASH) += rte_dwk_hash.c
# install this header file
SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include := rte_hash.h
@@ -27,5 +28,6 @@ endif
SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_jhash.h
SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_thash.h
SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_fbk_hash.h
+SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_dwk_hash.h
include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_hash/meson.build b/lib/librte_hash/meson.build
index bce11ad..42c89ab 100644
--- a/lib/librte_hash/meson.build
+++ b/lib/librte_hash/meson.build
@@ -3,13 +3,14 @@
headers = files('rte_crc_arm64.h',
'rte_fbk_hash.h',
+ 'rte_dwk_hash.h',
'rte_hash_crc.h',
'rte_hash.h',
'rte_jhash.h',
'rte_thash.h')
-sources = files('rte_cuckoo_hash.c', 'rte_fbk_hash.c')
-deps += ['ring']
+sources = files('rte_cuckoo_hash.c', 'rte_fbk_hash.c', 'rte_dwk_hash.c')
+deps += ['ring', 'mempool']
# rte ring reset is not yet part of stable API
allow_experimental_apis = true
diff --git a/lib/librte_hash/rte_dwk_hash.c b/lib/librte_hash/rte_dwk_hash.c
new file mode 100644
index 0000000..8c1dadd
--- /dev/null
+++ b/lib/librte_hash/rte_dwk_hash.c
@@ -0,0 +1,271 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include <string.h>
+
+#include <rte_eal_memconfig.h>
+#include <rte_errno.h>
+#include <rte_malloc.h>
+#include <rte_memory.h>
+#include <rte_tailq.h>
+
+#include <rte_dwk_hash.h>
+
+TAILQ_HEAD(rte_dwk_hash_list, rte_tailq_entry);
+
+static struct rte_tailq_elem rte_dwk_hash_tailq = {
+ .name = "RTE_DWK_HASH",
+};
+
+EAL_REGISTER_TAILQ(rte_dwk_hash_tailq);
+
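+/* Bitmask covering all RTE_DWK_KEYS_PER_BUCKET key slots in a bucket */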
+#define VALID_KEY_MSK ((1 << RTE_DWK_KEYS_PER_BUCKET) - 1)
+
+int
+rte_dwk_hash_add(struct rte_dwk_hash_table *table, uint32_t key,
+ uint32_t hash, uint64_t value)
+{
+ uint32_t bucket;
+ int i, idx, ret;
+ uint8_t msk;
+ struct rte_dwk_ext_ent *tmp, *ent, *prev = NULL;
+
+ if (table == NULL)
+ return -EINVAL;
+
+ bucket = hash & table->bucket_msk;
+ /* Search key in table. Update value if exists */
+ for (i = 0; i < RTE_DWK_KEYS_PER_BUCKET; i++) {
+ if ((key == table->t[bucket].key[i]) &&
+ (table->t[bucket].key_mask & (1 << i))) {
+ table->t[bucket].val[i] = value;
+ return 0;
+ }
+ }
+
+ if (!SLIST_EMPTY(&table->t[bucket].head)) {
+ SLIST_FOREACH(ent, &table->t[bucket].head, next) {
+ if (ent->key == key) {
+ ent->val = value;
+ return 0;
+ }
+ }
+ }
+
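+ /* Take the first free slot in the bucket, i.e. the
+ * lowest clear bit in key_mask.
+ */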
+ msk = ~table->t[bucket].key_mask & VALID_KEY_MSK;
+ if (msk) {
+ idx = __builtin_ctz(msk);
+ table->t[bucket].key[idx] = key;
+ table->t[bucket].val[idx] = value;
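+ /* Make sure key and value are written before the
+ * key_mask bit makes the slot visible to readers.
+ */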
+ rte_smp_wmb();
+ table->t[bucket].key_mask |= 1 << idx;
+ return 0;
+ }
+
+ ret = rte_mempool_get(table->ext_ent_pool, (void **)&ent);
+ if (ret < 0)
+ return ret;
+
+ SLIST_NEXT(ent, next) = NULL;
+ ent->key = key;
+ ent->val = value;
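+ /* Make sure the new entry is fully written before
+ * it is linked into the reader-visible list.
+ */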
+ rte_smp_wmb();
+ SLIST_FOREACH(tmp, &table->t[bucket].head, next)
+ prev = tmp;
+
+ if (prev == NULL)
+ SLIST_INSERT_HEAD(&table->t[bucket].head, ent, next);
+ else
+ SLIST_INSERT_AFTER(prev, ent, next);
+
+ return 0;
+}
+
+int
+rte_dwk_hash_delete(struct rte_dwk_hash_table *table, uint32_t key,
+ uint32_t hash)
+{
+ uint32_t bucket;
+ int i;
+ struct rte_dwk_ext_ent *ent;
+
+ if (table == NULL)
+ return -EINVAL;
+
+ bucket = hash & table->bucket_msk;
+
+ for (i = 0; i < RTE_DWK_KEYS_PER_BUCKET; i++) {
+ if ((key == table->t[bucket].key[i]) &&
+ (table->t[bucket].key_mask & (1 << i))) {
+ ent = SLIST_FIRST(&table->t[bucket].head);
+ if (ent) {
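+ /* cnt is incremented before and after the update,
+ * so it is odd (write in progress) while the bucket
+ * is inconsistent and concurrent lookups retry.
+ */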
+ rte_atomic32_inc(&table->t[bucket].cnt);
+ table->t[bucket].key[i] = ent->key;
+ table->t[bucket].val[i] = ent->val;
+ SLIST_REMOVE_HEAD(&table->t[bucket].head, next);
+ rte_atomic32_inc(&table->t[bucket].cnt);
+ } else
+ table->t[bucket].key_mask &= ~(1 << i);
+ if (ent)
+ rte_mempool_put(table->ext_ent_pool, ent);
+ return 0;
+ }
+ }
+
+ SLIST_FOREACH(ent, &table->t[bucket].head, next)
+ if (ent->key == key)
+ break;
+
+ if (ent == NULL)
+ return -ENOENT;
+
+ rte_atomic32_inc(&table->t[bucket].cnt);
+ SLIST_REMOVE(&table->t[bucket].head, ent, rte_dwk_ext_ent, next);
+ rte_atomic32_inc(&table->t[bucket].cnt);
+ rte_mempool_put(table->ext_ent_pool, ent);
+
+ return 0;
+}
+
+struct rte_dwk_hash_table *
+rte_dwk_hash_find_existing(const char *name)
+{
+ char hash_name[RTE_DWK_HASH_NAMESIZE];
+ struct rte_dwk_hash_table *h = NULL;
+ struct rte_tailq_entry *te;
+ struct rte_dwk_hash_list *dwk_hash_list;
+
+ /* Table names are stored with the "DWK_" prefix added at
+ * create time, so compare against the prefixed name.
+ */
+ snprintf(hash_name, sizeof(hash_name), "DWK_%s", name);
+
+ dwk_hash_list = RTE_TAILQ_CAST(rte_dwk_hash_tailq.head,
+ rte_dwk_hash_list);
+
+ rte_mcfg_tailq_read_lock();
+ TAILQ_FOREACH(te, dwk_hash_list, next) {
+ h = (struct rte_dwk_hash_table *) te->data;
+ if (strncmp(hash_name, h->name, RTE_DWK_HASH_NAMESIZE) == 0)
+ break;
+ }
+ rte_mcfg_tailq_read_unlock();
+ if (te == NULL) {
+ rte_errno = ENOENT;
+ return NULL;
+ }
+ return h;
+}
+
+struct rte_dwk_hash_table *
+rte_dwk_hash_create(const struct rte_dwk_hash_params *params)
+{
+ char hash_name[RTE_DWK_HASH_NAMESIZE];
+ struct rte_dwk_hash_table *ht = NULL;
+ struct rte_tailq_entry *te;
+ struct rte_dwk_hash_list *dwk_hash_list;
+ uint32_t mem_size, nb_buckets, max_ent;
+ int ret;
+ struct rte_mempool *mp;
+
+ if ((params == NULL) || (params->name == NULL) ||
+ (params->entries == 0)) {
+ rte_errno = EINVAL;
+ return NULL;
+ }
+
+ dwk_hash_list = RTE_TAILQ_CAST(rte_dwk_hash_tailq.head,
+ rte_dwk_hash_list);
+
+ ret = snprintf(hash_name, sizeof(hash_name), "DWK_%s", params->name);
+ if (ret < 0 || ret >= RTE_DWK_HASH_NAMESIZE) {
+ rte_errno = ENAMETOOLONG;
+ return NULL;
+ }
+
+ max_ent = rte_align32pow2(params->entries);
+ nb_buckets = max_ent / RTE_DWK_KEYS_PER_BUCKET;
+ mem_size = sizeof(struct rte_dwk_hash_table) +
+ sizeof(struct rte_dwk_hash_bucket) * nb_buckets;
+
+ mp = rte_mempool_create(hash_name, max_ent,
+ sizeof(struct rte_dwk_ext_ent), 0, 0, NULL, NULL, NULL, NULL,
+ params->socket_id, 0);
+
+ if (mp == NULL)
+ return NULL;
+
+ rte_mcfg_tailq_write_lock();
+ TAILQ_FOREACH(te, dwk_hash_list, next) {
+ ht = (struct rte_dwk_hash_table *) te->data;
+ if (strncmp(hash_name, ht->name,
+ RTE_DWK_HASH_NAMESIZE) == 0)
+ break;
+ }
+ ht = NULL;
+ if (te != NULL) {
+ rte_errno = EEXIST;
+ rte_mempool_free(mp);
+ goto exit;
+ }
+
+ te = rte_zmalloc("DWK_HASH_TAILQ_ENTRY", sizeof(*te), 0);
+ if (te == NULL) {
+ RTE_LOG(ERR, HASH, "Failed to allocate tailq entry\n");
+ rte_mempool_free(mp);
+ goto exit;
+ }
+
+ ht = rte_zmalloc_socket(hash_name, mem_size,
+ RTE_CACHE_LINE_SIZE, params->socket_id);
+ if (ht == NULL) {
+ RTE_LOG(ERR, HASH, "Failed to allocate fbk hash table\n");
+ rte_free(te);
+ rte_mempool_free(mp);
+ goto exit;
+ }
+
+ memcpy(ht->name, hash_name, sizeof(ht->name));
+ ht->max_ent = max_ent;
+ ht->bucket_msk = nb_buckets - 1;
+ ht->ext_ent_pool = mp;
+
+ te->data = (void *)ht;
+ TAILQ_INSERT_TAIL(dwk_hash_list, te, next);
+
+exit:
+ rte_mcfg_tailq_write_unlock();
+
+ return ht;
+}
+
+void
+rte_dwk_hash_free(struct rte_dwk_hash_table *ht)
+{
+ struct rte_tailq_entry *te;
+ struct rte_dwk_hash_list *dwk_hash_list;
+
+ if (ht == NULL)
+ return;
+
+ dwk_hash_list = RTE_TAILQ_CAST(rte_dwk_hash_tailq.head,
+ rte_dwk_hash_list);
+
+ rte_mcfg_tailq_write_lock();
+
+ /* find out tailq entry */
+ TAILQ_FOREACH(te, dwk_hash_list, next) {
+ if (te->data == (void *) ht)
+ break;
+ }
+
+
+ if (te == NULL) {
+ rte_mcfg_tailq_write_unlock();
+ return;
+ }
+
+ TAILQ_REMOVE(dwk_hash_list, te, next);
+
+ rte_mcfg_tailq_write_unlock();
+
+ rte_mempool_free(ht->ext_ent_pool);
+ rte_free(ht);
+ rte_free(te);
+}
+
diff --git a/lib/librte_hash/rte_dwk_hash.h b/lib/librte_hash/rte_dwk_hash.h
new file mode 100644
index 0000000..98e21ee
--- /dev/null
+++ b/lib/librte_hash/rte_dwk_hash.h
@@ -0,0 +1,178 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _RTE_DWK_HASH_H_
+#define _RTE_DWK_HASH_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_compat.h>
+#include <rte_atomic.h>
+#include <rte_mempool.h>
+
+#define RTE_DWK_HASH_NAMESIZE 32
+#define RTE_DWK_KEYS_PER_BUCKET 4
+#define RTE_DWK_WRITE_IN_PROGRESS 1
+
+struct rte_dwk_hash_params {
+ const char *name;
+ uint32_t entries;
+ int socket_id;
+};
+
+struct rte_dwk_ext_ent {
+ SLIST_ENTRY(rte_dwk_ext_ent) next;
+ uint32_t key;
+ uint64_t val;
+};
+
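+/*
+ * A bucket holds RTE_DWK_KEYS_PER_BUCKET key/value pairs inline plus
+ * a list of extra entries for further collisions; cnt is a sequence
+ * counter that lets lookups detect concurrent bucket modification.
+ */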
+struct rte_dwk_hash_bucket {
+ uint32_t key[RTE_DWK_KEYS_PER_BUCKET];
+ uint64_t val[RTE_DWK_KEYS_PER_BUCKET];
+ uint8_t key_mask;
+ rte_atomic32_t cnt;
+ SLIST_HEAD(rte_dwk_list_head, rte_dwk_ext_ent) head;
+} __rte_cache_aligned;
+
+struct rte_dwk_hash_table {
+ char name[RTE_DWK_HASH_NAMESIZE]; /**< Name of the hash. */
+ uint32_t nb_ent;
+ uint32_t max_ent;
+ uint32_t bucket_msk;
+ struct rte_mempool *ext_ent_pool;
+ __extension__ struct rte_dwk_hash_bucket t[0];
+};
+
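+/**
+ * Lookup a key in the hash table using a precalculated hash value.
+ * Readers retry while a write to the bucket is in progress, so this
+ * operation is safe with regard to a single concurrent writer.
+ *
+ * @param table
+ * Hash table to look up the key in.
+ * @param key
+ * Key to look up.
+ * @param hash
+ * Hash value associated with key.
+ * @param value
+ * Output argument, set to the value associated with key on success.
+ * @return
+ * 0 if the key was found, -ENOENT otherwise.
+ */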
+static inline int
+rte_dwk_hash_lookup(struct rte_dwk_hash_table *table, uint32_t key,
+ uint32_t hash, uint64_t *value)
+{
+ uint64_t val = 0;
+ struct rte_dwk_ext_ent *ent;
+ int32_t cnt;
+ int i, found = 0;
+ uint32_t bucket = hash & table->bucket_msk;
+
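+ /* Retry the read if the bucket was modified concurrently:
+ * cnt has RTE_DWK_WRITE_IN_PROGRESS set while a writer
+ * updates the bucket, and differs across the loop if an
+ * update completed in the meantime.
+ */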
+ do {
+ do
+ cnt = rte_atomic32_read(&table->t[bucket].cnt);
+ while (unlikely(cnt & RTE_DWK_WRITE_IN_PROGRESS));
+
+ for (i = 0; i < RTE_DWK_KEYS_PER_BUCKET; i++) {
+ if ((key == table->t[bucket].key[i]) &&
+ (table->t[bucket].key_mask &
+ (1 << i))) {
+ val = table->t[bucket].val[i];
+ found = 1;
+ break;
+ }
+ }
+
+ if (unlikely((found == 0) &&
+ (!SLIST_EMPTY(&table->t[bucket].head)))) {
+ SLIST_FOREACH(ent, &table->t[bucket].head, next) {
+ if (ent->key == key) {
+ val = ent->val;
+ found = 1;
+ break;
+ }
+ }
+ }
+
+ } while (unlikely(cnt != rte_atomic32_read(&table->t[bucket].cnt)));
+
+ if (found == 1) {
+ *value = val;
+ return 0;
+ } else
+ return -ENOENT;
+}
+
+/**
+ * Add a key-value pair to an existing hash table using a
+ * precalculated hash value.
+ * This operation is not multi-thread safe
+ * and should only be called from one thread.
+ *
+ * @param table
+ * Hash table to add the key to.
+ * @param key
+ * Key to add to the hash table.
+ * @param hash
+ * Hash value associated with key.
+ * @param value
+ * Value to associate with key.
+ * @return
+ * 0 if ok, or negative value on error.
+ */
+__rte_experimental
+int
+rte_dwk_hash_add(struct rte_dwk_hash_table *table, uint32_t key,
+ uint32_t hash, uint64_t value);
+
+/**
+ * Remove a key with a given hash value from an existing hash table.
+ * This operation is not multi-thread
+ * safe and should only be called from one thread.
+ *
+ * @param table
+ * Hash table to remove the key from.
+ * @param key
+ * Key to remove from the hash table.
+ * @param hash
+ * Hash value associated with key.
+ * @return
+ * 0 if ok, or negative value on error.
+ */
+__rte_experimental
+int
+rte_dwk_hash_delete(struct rte_dwk_hash_table *table, uint32_t key,
+ uint32_t hash);
+
+
+/**
+ * Performs a lookup for an existing hash table, and returns a pointer to
+ * the table if found.
+ *
+ * @param name
+ * Name of the hash table to find
+ *
+ * @return
+ * pointer to hash table structure or NULL on error with rte_errno
+ * set appropriately.
+ */
+__rte_experimental
+struct rte_dwk_hash_table *
+rte_dwk_hash_find_existing(const char *name);
+
+/**
+ * Create a new hash table for use with four byte keys.
+ *
+ * @param params
+ * Parameters used in creation of hash table.
+ *
+ * @return
+ * Pointer to hash table structure that is used in future hash table
+ * operations, or NULL on error with rte_errno set appropriately.
+ */
+__rte_experimental
+struct rte_dwk_hash_table *
+rte_dwk_hash_create(const struct rte_dwk_hash_params *params);
+
+/**
+ * Free all memory used by a hash table.
+ *
+ * @param table
+ * Hash table to deallocate.
+ */
+__rte_experimental
+void
+rte_dwk_hash_free(struct rte_dwk_hash_table *table);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_DWK_HASH_H_ */
diff --git a/lib/librte_hash/rte_hash_version.map b/lib/librte_hash/rte_hash_version.map
index a8fbbc3..65d2bda 100644
--- a/lib/librte_hash/rte_hash_version.map
+++ b/lib/librte_hash/rte_hash_version.map
@@ -35,4 +35,9 @@ EXPERIMENTAL {
rte_hash_free_key_with_position;
rte_hash_max_key_id;
+ rte_dwk_hash_create;
+ rte_dwk_hash_find_existing;
+ rte_dwk_hash_free;
+ rte_dwk_hash_add;
+ rte_dwk_hash_delete;
};
--
2.7.4
* [dpdk-dev] [PATCH 2/3] test: add dwk hash autotests
From: Vladimir Medvedkin @ 2020-03-16 13:38 UTC (permalink / raw)
To: dev; +Cc: konstantin.ananyev, yipeng1.wang, sameh.gobriel, bruce.richardson
Add autotests for the rte_dwk_hash library.
Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
---
app/test/Makefile | 1 +
app/test/meson.build | 1 +
app/test/test_dwk_hash.c | 229 +++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 231 insertions(+)
create mode 100644 app/test/test_dwk_hash.c
diff --git a/app/test/Makefile b/app/test/Makefile
index 1f080d1..6f3a461 100644
--- a/app/test/Makefile
+++ b/app/test/Makefile
@@ -73,6 +73,7 @@ SRCS-y += test_bitmap.c
SRCS-y += test_reciprocal_division.c
SRCS-y += test_reciprocal_division_perf.c
SRCS-y += test_fbarray.c
+SRCS-y += test_dwk_hash.c
SRCS-y += test_external_mem.c
SRCS-y += test_rand_perf.c
diff --git a/app/test/meson.build b/app/test/meson.build
index 0a2ce71..2433be3 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -45,6 +45,7 @@ test_sources = files('commands.c',
'test_eventdev.c',
'test_external_mem.c',
'test_fbarray.c',
+ 'test_dwk_hash.c',
'test_fib.c',
'test_fib_perf.c',
'test_fib6.c',
diff --git a/app/test/test_dwk_hash.c b/app/test/test_dwk_hash.c
new file mode 100644
index 0000000..bf47331
--- /dev/null
+++ b/app/test/test_dwk_hash.c
@@ -0,0 +1,229 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include <stdio.h>
+#include <stdint.h>
+#include <stdlib.h>
+
+#include <rte_lcore.h>
+#include <rte_dwk_hash.h>
+#include <rte_hash_crc.h>
+
+#include "test.h"
+
+typedef int32_t (*rte_dwk_hash_test)(void);
+
+static int32_t test_create_invalid(void);
+static int32_t test_multiple_create(void);
+static int32_t test_free_null(void);
+static int32_t test_add_del_invalid(void);
+static int32_t test_basic(void);
+
+#define MAX_ENT (1 << 22)
+
+/*
+ * Check that rte_dwk_hash_create fails gracefully for incorrect user input
+ * arguments
+ */
+int32_t
+test_create_invalid(void)
+{
+ struct rte_dwk_hash_table *dwk_hash = NULL;
+ struct rte_dwk_hash_params config;
+
+ /* Initialize all fields so that each negative test below
+ * exercises exactly one invalid parameter.
+ */
+ config.name = "test_dwk_hash";
+ config.socket_id = rte_socket_id();
+ config.entries = MAX_ENT;
+
+ /* rte_dwk_hash_create: dwk_hash name == NULL */
+ config.name = NULL;
+ dwk_hash = rte_dwk_hash_create(&config);
+ RTE_TEST_ASSERT(dwk_hash == NULL,
+ "Call succeeded with invalid parameters\n");
+ config.name = "test_dwk_hash";
+
+ /* rte_dwk_hash_create: config == NULL */
+ dwk_hash = rte_dwk_hash_create(NULL);
+ RTE_TEST_ASSERT(dwk_hash == NULL,
+ "Call succeeded with invalid parameters\n");
+
+ /* socket_id < -1 is invalid */
+ config.socket_id = -2;
+ dwk_hash = rte_dwk_hash_create(&config);
+ RTE_TEST_ASSERT(dwk_hash == NULL,
+ "Call succeeded with invalid parameters\n");
+ config.socket_id = rte_socket_id();
+
+ /* rte_dwk_hash_create: entries = 0 */
+ config.entries = 0;
+ dwk_hash = rte_dwk_hash_create(&config);
+ RTE_TEST_ASSERT(dwk_hash == NULL,
+ "Call succeeded with invalid parameters\n");
+ config.entries = MAX_ENT;
+
+ return TEST_SUCCESS;
+}
+
+/*
+ * Create and delete a dwk_hash table 100 times.
+ * Use a slightly different table size each time.
+ */
+#include <rte_errno.h>
+int32_t
+test_multiple_create(void)
+{
+ struct rte_dwk_hash_table *dwk_hash = NULL;
+ struct rte_dwk_hash_params config;
+ int32_t i;
+
+
+ for (i = 0; i < 100; i++) {
+ config.name = "test_dwk_hash";
+ config.socket_id = -1;
+ config.entries = MAX_ENT - i;
+
+ dwk_hash = rte_dwk_hash_create(&config);
+ RTE_TEST_ASSERT(dwk_hash != NULL,
+ "Failed to create dwk hash\n");
+ rte_dwk_hash_free(dwk_hash);
+ }
+
+ return TEST_SUCCESS;
+}
+
+/*
+ * Call rte_dwk_hash_free for NULL pointer user input.
+ * Note: free has no return and therefore it is impossible
+ * to check for failure but this test is added to
+ * increase function coverage metrics and to validate that
+ * freeing null does not crash.
+ */
+int32_t
+test_free_null(void)
+{
+ struct rte_dwk_hash_table *dwk_hash = NULL;
+ struct rte_dwk_hash_params config;
+
+ config.name = "test_dwk";
+ config.socket_id = -1;
+ config.entries = MAX_ENT;
+
+ dwk_hash = rte_dwk_hash_create(&config);
+ RTE_TEST_ASSERT(dwk_hash != NULL, "Failed to create dwk hash\n");
+
+ rte_dwk_hash_free(dwk_hash);
+ rte_dwk_hash_free(NULL);
+ return TEST_SUCCESS;
+}
+
+/*
+ * Check that rte_dwk_hash_add fails gracefully for
+ * incorrect user input arguments
+ */
+int32_t
+test_add_del_invalid(void)
+{
+ uint32_t key = 10;
+ uint64_t val = 20;
+ int ret;
+
+ /* rte_dwk_hash_add: dwk_hash == NULL */
+ ret = rte_dwk_hash_add(NULL, key, rte_hash_crc_4byte(key, 0), val);
+ RTE_TEST_ASSERT(ret == -EINVAL,
+ "Call succeeded with invalid parameters\n");
+
+ /* rte_dwk_hash_delete: dwk_hash == NULL */
+ ret = rte_dwk_hash_delete(NULL, key, rte_hash_crc_4byte(key, 0));
+ RTE_TEST_ASSERT(ret == -EINVAL,
+ "Call succeeded with invalid parameters\n");
+
+ return TEST_SUCCESS;
+}
+
+/*
+ * Call add, lookup and delete for a single rule
+ */
+int32_t
+test_basic(void)
+{
+ struct rte_dwk_hash_table *dwk_hash = NULL;
+ struct rte_dwk_hash_params config;
+ uint32_t key = 10;
+ uint64_t value = 20;
+ uint64_t ret_val = 0;
+ int ret;
+
+ config.name = "test_dwk";
+ config.socket_id = -1;
+ config.entries = MAX_ENT;
+
+ dwk_hash = rte_dwk_hash_create(&config);
+ RTE_TEST_ASSERT(dwk_hash != NULL, "Failed to create dwk hash\n");
+
+ ret = rte_dwk_hash_lookup(dwk_hash, key,
+ rte_hash_crc_4byte(key, 0), &ret_val);
+ RTE_TEST_ASSERT(ret == -ENOENT, "Lookup return incorrect result\n");
+
+ ret = rte_dwk_hash_delete(dwk_hash, key,
+ rte_hash_crc_4byte(key, 0));
+ RTE_TEST_ASSERT(ret == -ENOENT, "Delete return incorrect result\n");
+
+ ret = rte_dwk_hash_add(dwk_hash, key,
+ rte_hash_crc_4byte(key, 0), value);
+ RTE_TEST_ASSERT(ret == 0, "Can not add key into the table\n");
+
+ ret = rte_dwk_hash_lookup(dwk_hash, key,
+ rte_hash_crc_4byte(key, 0), &ret_val);
+ RTE_TEST_ASSERT(((ret == 0) && (value == ret_val)),
+ "Lookup return incorrect result\n");
+
+ ret = rte_dwk_hash_delete(dwk_hash, key,
+ rte_hash_crc_4byte(key, 0));
+ RTE_TEST_ASSERT(ret == 0, "Can not delete key from table\n");
+
+ ret = rte_dwk_hash_lookup(dwk_hash, key,
+ rte_hash_crc_4byte(key, 0), &ret_val);
+ RTE_TEST_ASSERT(ret == -ENOENT, "Lookup return incorrect result\n");
+
+ rte_dwk_hash_free(dwk_hash);
+
+ return TEST_SUCCESS;
+}
+
+static struct unit_test_suite dwk_hash_tests = {
+ .suite_name = "dwk_hash autotest",
+ .setup = NULL,
+ .teardown = NULL,
+ .unit_test_cases = {
+ TEST_CASE(test_create_invalid),
+ TEST_CASE(test_free_null),
+ TEST_CASE(test_add_del_invalid),
+ TEST_CASE(test_basic),
+ TEST_CASES_END()
+ }
+};
+
+static struct unit_test_suite dwk_hash_slow_tests = {
+ .suite_name = "dwk_hash slow autotest",
+ .setup = NULL,
+ .teardown = NULL,
+ .unit_test_cases = {
+ TEST_CASE(test_multiple_create),
+ TEST_CASES_END()
+ }
+};
+
+/*
+ * Do all unit tests.
+ */
+static int
+test_dwk_hash(void)
+{
+ return unit_test_suite_runner(&dwk_hash_tests);
+}
+
+static int
+test_slow_dwk_hash(void)
+{
+ return unit_test_suite_runner(&dwk_hash_slow_tests);
+}
+
+REGISTER_TEST_COMMAND(dwk_hash_autotest, test_dwk_hash);
+REGISTER_TEST_COMMAND(dwk_hash_slow_autotest, test_slow_dwk_hash);
--
2.7.4
* [dpdk-dev] [PATCH 3/3] test: add dwk perf tests
From: Vladimir Medvedkin @ 2020-03-16 13:38 UTC (permalink / raw)
To: dev; +Cc: konstantin.ananyev, yipeng1.wang, sameh.gobriel, bruce.richardson
Add performance tests for rte_dwk_hash.
Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
---
app/test/test_hash_perf.c | 81 +++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 81 insertions(+)
diff --git a/app/test/test_hash_perf.c b/app/test/test_hash_perf.c
index a438eae..f616af1 100644
--- a/app/test/test_hash_perf.c
+++ b/app/test/test_hash_perf.c
@@ -12,8 +12,9 @@
#include <rte_hash_crc.h>
#include <rte_jhash.h>
#include <rte_fbk_hash.h>
+#include <rte_dwk_hash.h>
#include <rte_random.h>
#include <rte_string_fns.h>
#include "test.h"
@@ -29,6 +31,8 @@
#define NUM_SHUFFLES 10
#define BURST_SIZE 16
+#define CRC_INIT_VAL 0xdeadbeef
+
enum operations {
ADD = 0,
LOOKUP,
@@ -669,6 +673,80 @@ fbk_hash_perf_test(void)
}
static int
+dwk_hash_perf_test(void)
+{
+ struct rte_dwk_hash_params params = {
+ .name = "dwk_hash_test",
+ .entries = ENTRIES,
+ .socket_id = rte_socket_id(),
+ };
+ struct rte_dwk_hash_table *handle = NULL;
+ uint32_t *keys = NULL;
+ unsigned indexes[TEST_SIZE];
+ uint64_t tmp_val;
+ uint64_t lookup_time = 0;
+ unsigned added = 0;
+ uint32_t key;
+ uint64_t val;
+ unsigned i, j;
+ int ret = 0;
+
+ handle = rte_dwk_hash_create(&params);
+ if (handle == NULL) {
+ printf("Error creating table\n");
+ return -1;
+ }
+
+ keys = rte_zmalloc(NULL, ENTRIES * sizeof(*keys), 0);
+ if (keys == NULL) {
+ printf("fbk hash: memory allocation for key store failed\n");
+ return -1;
+ }
+
+ /* Generate random keys and values. */
+ for (i = 0; i < ENTRIES; i++) {
+ key = (uint32_t)rte_rand();
+ val = rte_rand();
+
+ if (rte_dwk_hash_add(handle, key, rte_hash_crc_4byte(key,
+ CRC_INIT_VAL), val) == 0) {
+ keys[added] = key;
+ added++;
+ }
+ }
+
+ for (i = 0; i < TEST_ITERATIONS; i++) {
+ uint64_t begin;
+ uint64_t end;
+
+ /* Generate random indexes into keys[] array. */
+ for (j = 0; j < TEST_SIZE; j++)
+ indexes[j] = rte_rand() % added;
+
+ begin = rte_rdtsc();
+ /* Do lookups */
+ for (j = 0; j < TEST_SIZE; j++)
+ ret += rte_dwk_hash_lookup(handle,
+ keys[indexes[j]],
+ rte_hash_crc_4byte(keys[indexes[j]],
+ CRC_INIT_VAL), &tmp_val);
+
+ end = rte_rdtsc();
+ lookup_time += end - begin;
+ }
+
+ printf("\n\n *** DWK Hash function performance test results ***\n");
+ if (ret == 0)
+ printf("Number of ticks per lookup = %g\n",
+ (double)lookup_time /
+ ((double)TEST_ITERATIONS * (double)TEST_SIZE));
+
+ rte_dwk_hash_free(handle);
+ rte_free(keys);
+
+ return 0;
+}
+
+static int
test_hash_perf(void)
{
unsigned int with_pushes, with_locks;
@@ -695,6 +773,9 @@ test_hash_perf(void)
if (fbk_hash_perf_test() < 0)
return -1;
+ if (dwk_hash_perf_test() < 0)
+ return -1;
+
return 0;
}
--
2.7.4
* Re: [dpdk-dev] [PATCH 0/3] add new Double Word Key hash table
From: Morten Brørup @ 2020-03-16 14:39 UTC (permalink / raw)
To: Vladimir Medvedkin, dev
Cc: konstantin.ananyev, yipeng1.wang, sameh.gobriel,
bruce.richardson, Suanming Mou, Olivier Matz, Xueming(Steven) Li,
Andrew Rybchenko, Asaf Penso, Ori Kam
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Vladimir Medvedkin
> Sent: Monday, March 16, 2020 2:38 PM
>
> Currently DPDK has a special implementation of a hash table for
> 4 byte keys which is called FBK hash. Unfortunately its main drawback
> is that it only supports 2 byte values.
> The new implementation called DWK (double word key) hash
> supports 8 byte values, which is enough to store a pointer.
>
> It would also be nice to get feedback on whether to leave the old FBK
> and new DWK implementations, or whether to deprecate the old one?
<rant on>
Who comes up with these names?!?
FBK (Four Byte Key) and DWK (Double Word Key) are supposed to mean the same thing. Could you use 32 somewhere in the name instead, as in int32_t, rather than a growing list of creative synonyms for the same thing? Pretty please, with a cherry on top!
And if the value size is fixed too, perhaps the name should also indicate the value size.
<rant off>
It's a shame we don't have C++ class templates available in DPDK...
In other news, Mellanox has sent an RFC for an "indexed memory pool" library [1] to conserve memory by using uintXX_t instead of pointers, so perhaps a variant of a 32-bit key hash library with 32-bit values (in addition to 16-bit values in FBK and 64-bit in DWK) would be a nice combination with that library.
[1]: http://mails.dpdk.org/archives/dev/2019-October/147513.html
Med venlig hilsen / kind regards
- Morten Brørup
* Re: [dpdk-dev] [PATCH 0/3] add new Double Word Key hash table
From: Medvedkin, Vladimir @ 2020-03-16 18:27 UTC (permalink / raw)
To: Morten Brørup, dev
Cc: konstantin.ananyev, yipeng1.wang, sameh.gobriel,
bruce.richardson, Suanming Mou, Olivier Matz, Xueming(Steven) Li,
Andrew Rybchenko, Asaf Penso, Ori Kam
Hi Morten,
On 16/03/2020 14:39, Morten Brørup wrote:
>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Vladimir Medvedkin
>> Sent: Monday, March 16, 2020 2:38 PM
>>
>> Currently DPDK has a special implementation of a hash table for
>> 4 byte keys which is called FBK hash. Unfortunately its main drawback
>> is that it only supports 2 byte values.
>> The new implementation called DWK (double word key) hash
>> supports 8 byte values, which is enough to store a pointer.
>>
>> It would also be nice to get feedback on whether to leave the old FBK
>> and new DWK implementations, or whether to deprecate the old one?
> <rant on>
>
> Who comes up with these names?!?
>
> FBK (Four Byte Key) and DWK (Double Word Key) is supposed to mean the same. Could you use 32 somewhere in the name instead, like in int32_t, instead of using a growing list of creative synonyms for the same thing? Pretty please, with a cherry on top!
That's true; at first I named it fbk2, but then it was decided to
rename it "dwk" to avoid confusion with the existing FBK
library. Naming suggestions are welcome!
>
> And if the value size is fixed too, perhaps the name should also indicate the value size.
>
> <rant off>
>
> It's a shame we don't have C++ class templates available in DPDK...
>
> In other news, Mellanox has sent an RFC for an "indexed memory pool" library [1] to conserve memory by using uintXX_t instead of pointers, so perhaps a variant of a 32 bit key hash library with 32 bit values (in addition to 16 bit values in FBK and 64 bit in DWK) would be nice combination with that library.
>
> [1]: http://mails.dpdk.org/archives/dev/2019-October/147513.html
>
>
> Med venlig hilsen / kind regards
> - Morten Brørup
>
--
Regards,
Vladimir
* Re: [dpdk-dev] [PATCH 0/3] add new Double Word Key hash table
From: Stephen Hemminger @ 2020-03-16 19:32 UTC (permalink / raw)
To: Medvedkin, Vladimir
Cc: Morten Brørup, dev, konstantin.ananyev, yipeng1.wang,
sameh.gobriel, bruce.richardson, Suanming Mou, Olivier Matz,
Xueming(Steven) Li, Andrew Rybchenko, Asaf Penso, Ori Kam
On Mon, 16 Mar 2020 18:27:40 +0000
"Medvedkin, Vladimir" <vladimir.medvedkin@intel.com> wrote:
> Hi Morten,
>
>
> On 16/03/2020 14:39, Morten Brørup wrote:
> >> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Vladimir Medvedkin
> >> Sent: Monday, March 16, 2020 2:38 PM
> >>
> >> Currently DPDK has a special implementation of a hash table for
> >> 4 byte keys which is called FBK hash. Unfortunately its main drawback
> >> is that it only supports 2 byte values.
> >> The new implementation called DWK (double word key) hash
> >> supports 8 byte values, which is enough to store a pointer.
> >>
> >> It would also be nice to get feedback on whether to leave the old FBK
> >> and new DWK implementations, or whether to deprecate the old one?
> > <rant on>
> >
> > Who comes up with these names?!?
> >
> > FBK (Four Byte Key) and DWK (Double Word Key) is supposed to mean the same. Could you use 32 somewhere in the name instead, like in int32_t, instead of using a growing list of creative synonyms for the same thing? Pretty please, with a cherry on top!
>
>
> That's true, at first I named it as fbk2, but then it was decided to
> rename it "dwk", so that there was no confusion with the existing FBK
> library. Naming suggestions are welcome!
>
> >
> > And if the value size is fixed too, perhaps the name should also indicate the value size.
> >
> > <rant off>
> >
> > It's a shame we don't have C++ class templates available in DPDK...
> >
> > In other news, Mellanox has sent an RFC for an "indexed memory pool" library [1] to conserve memory by using uintXX_t instead of pointers, so perhaps a variant of a 32 bit key hash library with 32 bit values (in addition to 16 bit values in FBK and 64 bit in DWK) would be nice combination with that library.
> >
> > [1]: http://mails.dpdk.org/archives/dev/2019-October/147513.html
> >
> >
> > Med venlig hilsen / kind regards
> > - Morten Brørup
> >
Why is this different from (or better than) the existing rte_hash?
Having more flavors is not necessarily a good thing (except in Gelato)
* Re: [dpdk-dev] [PATCH 0/3] add new Double Word Key hash table
From: Morten Brørup @ 2020-03-16 19:33 UTC (permalink / raw)
To: Medvedkin, Vladimir, dev
Cc: konstantin.ananyev, yipeng1.wang, sameh.gobriel,
bruce.richardson, Suanming Mou, Olivier Matz, Xueming(Steven) Li,
Andrew Rybchenko, Asaf Penso, Ori Kam
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Medvedkin, Vladimir
> Sent: Monday, March 16, 2020 7:28 PM
>
> Hi Morten,
>
>
> On 16/03/2020 14:39, Morten Brørup wrote:
> >> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Vladimir Medvedkin
> >> Sent: Monday, March 16, 2020 2:38 PM
> >>
> >> Currently DPDK has a special implementation of a hash table for
> >> 4 byte keys which is called FBK hash. Unfortunately its main drawback
> >> is that it only supports 2 byte values.
> >> The new implementation called DWK (double word key) hash
> >> supports 8 byte values, which is enough to store a pointer.
> >>
> >> It would also be nice to get feedback on whether to leave the old FBK
> >> and new DWK implementations, or whether to deprecate the old one?
> > <rant on>
> >
> > Who comes up with these names?!?
> >
> > FBK (Four Byte Key) and DWK (Double Word Key) is supposed to mean the
> same. Could you use 32 somewhere in the name instead, like in int32_t,
> instead of using a growing list of creative synonyms for the same thing?
> Pretty please, with a cherry on top!
>
>
> That's true, at first I named it as fbk2, but then it was decided to
> rename it "dwk", so that there was no confusion with the existing FBK
> library. Naming suggestions are welcome!
>
OK... let me suggest a prefix with both key and value sizes in the name:
Instead of rte_dwk_hash_xxx, use rte_hash_k32v64_xxx.
However, that suggests that it is a generic hash... and your hash algorithm certainly has other properties too; so perhaps a specific name, saying something about the underlying algorithm, makes more sense after all. And the documentation will show its properties anyway.
Whatever you choose, Double Word is certainly not a good name... a word is 32 bits and a double word is 64 bits in many non-Intel CPU architectures, including some supported by DPDK.
> >
> > And if the value size is fixed too, perhaps the name should also indicate
> the value size.
> >
> > <rant off>
> >
> > It's a shame we don't have C++ class templates available in DPDK...
> >
> > In other news, Mellanox has sent an RFC for an "indexed memory pool"
> library [1] to conserve memory by using uintXX_t instead of pointers, so
> perhaps a variant of a 32 bit key hash library with 32 bit values (in
> addition to 16 bit values in FBK and 64 bit in DWK) would be nice
> combination with that library.
> >
> > [1]: http://mails.dpdk.org/archives/dev/2019-October/147513.html
> >
> >
> > Med venlig hilsen / kind regards
> > - Morten Brørup
> >
> --
> Regards,
> Vladimir
>
* Re: [dpdk-dev] [PATCH 0/3] add new Double Word Key hash table
From: Wang, Yipeng1 @ 2020-03-17 19:52 UTC (permalink / raw)
To: Stephen Hemminger, Medvedkin, Vladimir
Cc: Morten Brørup, dev, Ananyev, Konstantin, Gobriel, Sameh,
Richardson, Bruce, Suanming Mou, Olivier Matz,
Xueming(Steven) Li, Andrew Rybchenko, Asaf Penso, Ori Kam
> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Monday, March 16, 2020 12:33 PM
> To: Medvedkin, Vladimir <vladimir.medvedkin@intel.com>
> Cc: Morten Brørup <mb@smartsharesystems.com>; dev@dpdk.org;
> Ananyev, Konstantin <konstantin.ananyev@intel.com>; Wang, Yipeng1
> <yipeng1.wang@intel.com>; Gobriel, Sameh <sameh.gobriel@intel.com>;
> Richardson, Bruce <bruce.richardson@intel.com>; Suanming Mou
> <suanmingm@mellanox.com>; Olivier Matz <olivier.matz@6wind.com>;
> Xueming(Steven) Li <xuemingl@mellanox.com>; Andrew Rybchenko
> <arybchenko@solarflare.com>; Asaf Penso <asafp@mellanox.com>; Ori Kam
> <orika@mellanox.com>
> Subject: Re: [dpdk-dev] [PATCH 0/3] add new Double Word Key hash table
>
> On Mon, 16 Mar 2020 18:27:40 +0000
> "Medvedkin, Vladimir" <vladimir.medvedkin@intel.com> wrote:
>
> > Hi Morten,
> >
> >
> > On 16/03/2020 14:39, Morten Brørup wrote:
> > >> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Vladimir
> > >> Medvedkin
> > >> Sent: Monday, March 16, 2020 2:38 PM
> > >>
> > >> Currently DPDK has a special implementation of a hash table for
> > >> 4 byte keys which is called FBK hash. Unfortunately its main
> > >> drawback is that it only supports 2 byte values.
> > >> The new implementation called DWK (double word key) hash supports 8
> > >> byte values, which is enough to store a pointer.
> > >>
> > >> It would also be nice to get feedback on whether to leave the old
> > >> FBK and new DWK implementations, or whether to deprecate the old
> one?
> > > <rant on>
> > >
> > > Who comes up with these names?!?
> > >
> > > FBK (Four Byte Key) and DWK (Double Word Key) is supposed to mean
> the same. Could you use 32 somewhere in the name instead, like in int32_t,
> instead of using a growing list of creative synonyms for the same thing?
> Pretty please, with a cherry on top!
> >
> >
> > That's true, at first I named it as fbk2, but then it was decided to
> > rename it "dwk", so that there was no confusion with the existing FBK
> > library. Naming suggestions are welcome!
> >
> > >
> > > And if the value size is fixed too, perhaps the name should also indicate
> the value size.
> > >
> > > <rant off>
> > >
> > > It's a shame we don't have C++ class templates available in DPDK...
> > >
> > > In other news, Mellanox has sent an RFC for an "indexed memory pool"
> library [1] to conserve memory by using uintXX_t instead of pointers, so
> perhaps a variant of a 32 bit key hash library with 32 bit values (in addition to
> 16 bit values in FBK and 64 bit in DWK) would be nice combination with that
> library.
> > >
> > > [1]: http://mails.dpdk.org/archives/dev/2019-October/147513.html
> > >
> > >
> > > Med venlig hilsen / kind regards
> > > - Morten Brørup
> > >
>
> Why is this different (or better) than existing rte_hash.
> Having more flavors is not necessarily a good thing (except in Gelato)
[Wang, Yipeng]
Hi Vladimir,
As Stephen mentioned, I think it is a good idea to explain the benefits of this new type of hash table more explicitly, such as
specific use cases, differences from the current rte_hash, and performance numbers, etc.
* Re: [dpdk-dev] [PATCH 0/3] add new Double Word Key hash table
From: Medvedkin, Vladimir @ 2020-03-26 17:28 UTC (permalink / raw)
To: Wang, Yipeng1, Stephen Hemminger
Cc: Morten Brørup, dev, Ananyev, Konstantin, Gobriel, Sameh,
Richardson, Bruce, Suanming Mou, Olivier Matz,
Xueming(Steven) Li, Andrew Rybchenko, Asaf Penso, Ori Kam
Hi Yipeng, Stephen, all,
On 17/03/2020 19:52, Wang, Yipeng1 wrote:
>> -----Original Message-----
>> From: Stephen Hemminger <stephen@networkplumber.org>
>> Sent: Monday, March 16, 2020 12:33 PM
>> To: Medvedkin, Vladimir <vladimir.medvedkin@intel.com>
>> Cc: Morten Brørup <mb@smartsharesystems.com>; dev@dpdk.org;
>> Ananyev, Konstantin <konstantin.ananyev@intel.com>; Wang, Yipeng1
>> <yipeng1.wang@intel.com>; Gobriel, Sameh <sameh.gobriel@intel.com>;
>> Richardson, Bruce <bruce.richardson@intel.com>; Suanming Mou
>> <suanmingm@mellanox.com>; Olivier Matz <olivier.matz@6wind.com>;
>> Xueming(Steven) Li <xuemingl@mellanox.com>; Andrew Rybchenko
>> <arybchenko@solarflare.com>; Asaf Penso <asafp@mellanox.com>; Ori Kam
>> <orika@mellanox.com>
>> Subject: Re: [dpdk-dev] [PATCH 0/3] add new Double Word Key hash table
>>
>> On Mon, 16 Mar 2020 18:27:40 +0000
>> "Medvedkin, Vladimir" <vladimir.medvedkin@intel.com> wrote:
>>
>>> Hi Morten,
>>>
>>>
>>> On 16/03/2020 14:39, Morten Brørup wrote:
>>>>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Vladimir
>>>>> Medvedkin
>>>>> Sent: Monday, March 16, 2020 2:38 PM
>>>>>
>>>>> Currently DPDK has a special implementation of a hash table for
>>>>> 4 byte keys which is called FBK hash. Unfortunately its main
>>>>> drawback is that it only supports 2 byte values.
>>>>> The new implementation called DWK (double word key) hash supports 8
>>>>> byte values, which is enough to store a pointer.
>>>>>
>>>>> It would also be nice to get feedback on whether to leave the old
>>>>> FBK and new DWK implementations, or whether to deprecate the old
>> one?
>>>> <rant on>
>>>>
>>>> Who comes up with these names?!?
>>>>
>>>> FBK (Four Byte Key) and DWK (Double Word Key) is supposed to mean
>> the same. Could you use 32 somewhere in the name instead, like in int32_t,
>> instead of using a growing list of creative synonyms for the same thing?
>> Pretty please, with a cherry on top!
>>>
>>> That's true, at first I named it as fbk2, but then it was decided to
>>> rename it "dwk", so that there was no confusion with the existing FBK
>>> library. Naming suggestions are welcome!
>>>
>>>> And if the value size is fixed too, perhaps the name should also indicate
>> the value size.
>>>> <rant off>
>>>>
>>>> It's a shame we don't have C++ class templates available in DPDK...
>>>>
>>>> In other news, Mellanox has sent an RFC for an "indexed memory pool"
>> library [1] to conserve memory by using uintXX_t instead of pointers, so
>> perhaps a variant of a 32 bit key hash library with 32 bit values (in addition to
>> 16 bit values in FBK and 64 bit in DWK) would be nice combination with that
>> library.
>>>> [1]: http://mails.dpdk.org/archives/dev/2019-October/147513.html
>>>>
>>>>
>>>> Med venlig hilsen / kind regards
>>>> - Morten Brørup
>>>>
>> Why is this different (or better) than existing rte_hash.
>> Having more flavors is not necessarily a good thing (except in Gelato)
> [Wang, Yipeng]
> Hi, Vladimir,
> As Stephen mentioned, I think it is good idea to explain the benefit of this new type of hash table more explicitly such as
> Specific use cases, differences with current rte_hash, and performance numbers, etc.
The main reason for this new hash library is performance. As I mentioned
earlier, the current rte_fbk implementation is pretty fast, but it has a
number of drawbacks, such as 2-byte values and limited collision
resolving capabilities. On the other hand, rte_hash (cuckoo hash)
doesn't have these drawbacks, but at the cost of lower performance
compared to rte_fbk.
If I understand correctly, the performance penalty is due to:
1. Loading two buckets.
2. Comparing signatures first.
3. On a signature hit, getting a key index, finding the memory location
with the key itself, and fetching the key.
4. Using an indirect call to memcmp() to compare two uint32_t values.
The new proposed 4-byte key hash table doesn't have the rte_fbk drawbacks,
while it offers the same performance as rte_fbk.
Regarding use cases, in rte_ipsec_sad we are using rte_hash with a 4-byte
key size. Replacing it with the new implementation gives about a 30%
performance improvement.
The main disadvantage compared to rte_hash is some performance
degradation at high average table utilization, due to chain resolution
for the 5th and subsequent collisions.
--
Regards,
Vladimir
* Re: [dpdk-dev] [PATCH 0/3] add new Double Word Key hash table
From: Thomas Monjalon @ 2020-03-31 19:55 UTC (permalink / raw)
To: Medvedkin, Vladimir
Cc: Wang, Yipeng1, Stephen Hemminger, dev, Morten Brørup, dev,
Ananyev, Konstantin, Gobriel, Sameh, Richardson, Bruce,
Suanming Mou, Olivier Matz, Xueming(Steven) Li, Andrew Rybchenko,
Asaf Penso, Ori Kam
26/03/2020 18:28, Medvedkin, Vladimir:
> Hi Yipeng, Stephen, all,
>
> On 17/03/2020 19:52, Wang, Yipeng1 wrote:
> > From: Stephen Hemminger <stephen@networkplumber.org>
> >> On Mon, 16 Mar 2020 18:27:40 +0000
> >> "Medvedkin, Vladimir" <vladimir.medvedkin@intel.com> wrote:
> >>
> >>> Hi Morten,
> >>>
> >>>
> >>> On 16/03/2020 14:39, Morten Brørup wrote:
> >>>>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Vladimir
> >>>>> Medvedkin
> >>>>> Sent: Monday, March 16, 2020 2:38 PM
> >>>>>
> >>>>> Currently DPDK has a special implementation of a hash table for
> >>>>> 4 byte keys which is called FBK hash. Unfortunately its main
> >>>>> drawback is that it only supports 2 byte values.
> >>>>> The new implementation called DWK (double word key) hash supports 8
> >>>>> byte values, which is enough to store a pointer.
> >>>>>
> >>>>> It would also be nice to get feedback on whether to leave the old
> >>>>> FBK and new DWK implementations, or whether to deprecate the old
> >> one?
> >>>> <rant on>
> >>>>
> >>>> Who comes up with these names?!?
> >>>>
> >>>> FBK (Four Byte Key) and DWK (Double Word Key) is supposed to mean
> >> the same. Could you use 32 somewhere in the name instead, like in int32_t,
> >> instead of using a growing list of creative synonyms for the same thing?
> >> Pretty please, with a cherry on top!
> >>>
> >>> That's true, at first I named it as fbk2, but then it was decided to
> >>> rename it "dwk", so that there was no confusion with the existing FBK
> >>> library. Naming suggestions are welcome!
> >>>
> >>>> And if the value size is fixed too, perhaps the name should also indicate
> >> the value size.
> >>>> <rant off>
> >>>>
> >>>> It's a shame we don't have C++ class templates available in DPDK...
> >>>>
> >>>> In other news, Mellanox has sent an RFC for an "indexed memory pool"
> >> library [1] to conserve memory by using uintXX_t instead of pointers, so
> >> perhaps a variant of a 32 bit key hash library with 32 bit values (in addition to
> >> 16 bit values in FBK and 64 bit in DWK) would be nice combination with that
> >> library.
> >>>> [1]: http://mails.dpdk.org/archives/dev/2019-October/147513.html
Yes, some work is in progress to propose a new memory allocator
for small objects of fixed size, with small memory overhead.
> >> Why is this different (or better) than existing rte_hash.
> >> Having more flavors is not necessarily a good thing (except in Gelato)
> > [Wang, Yipeng]
> > Hi, Vladimir,
> > As Stephen mentioned, I think it is good idea to explain the benefit of this new type of hash table more explicitly such as
> > Specific use cases, differences with current rte_hash, and performance numbers, etc.
>
> The main reason for this new hash library is performance. As I mentioned
> earlier, the current rte_fbk implementation is pretty fast but it has a
> number of drawbacks such as 2 byte values and limited collision
> resolving capabilities. On the other hand, rte_hash (cuckoo hash)
> doesn't have this drawbacks but at the cost of lower performance
> comparing to rte_fbk.
>
> If I understand correctly, performance penalty are due to :
>
> 1. Load two buckets
>
> 2. First compare signatures
>
> 3. If signature comparison hits get a key index and find memory location
> with a key itself and get the key
>
> 4. Using indirect call to memcmp() to compare two uint32_t.
>
> The new proposed 4 byte key hash table doesn't have rte_fbk drawbacks
> while offers the same performance as rte_fbk.
>
> Regarding use cases, in rte_ipsec_sad we are using rte_hash with 4 byte
> key size. Replacing it with a new implementation gives about 30% in
> performance.
>
> The main disadvantage comparing to rte_hash is some performance
> degradation with high average table utilization due to chain resolving
> for 5th and subsequent collision.
Thanks for explaining.
Please, such information should be added to the documentation:
doc/guides/prog_guide/hash_lib.rst
* Re: [dpdk-dev] [PATCH 0/3] add new Double Word Key hash table
From: Honnappa Nagarahalli @ 2020-03-31 21:17 UTC (permalink / raw)
To: thomas, Medvedkin, Vladimir
Cc: Wang, Yipeng1, Stephen Hemminger, dev, Morten Brørup, dev,
Ananyev, Konstantin, Gobriel, Sameh, Richardson, Bruce,
Suanming Mou, Olivier Matz, Xueming(Steven) Li, Andrew Rybchenko,
Asaf Penso, Ori Kam, nd, Honnappa Nagarahalli, nd
<snip>
>
> 26/03/2020 18:28, Medvedkin, Vladimir:
> > Hi Yipeng, Stephen, all,
> >
> > On 17/03/2020 19:52, Wang, Yipeng1 wrote:
> > > From: Stephen Hemminger <stephen@networkplumber.org>
> > >> On Mon, 16 Mar 2020 18:27:40 +0000
> > >> "Medvedkin, Vladimir" <vladimir.medvedkin@intel.com> wrote:
> > >>
> > >>> Hi Morten,
> > >>>
> > >>>
> > >>> On 16/03/2020 14:39, Morten Brørup wrote:
> > >>>>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Vladimir
> > >>>>> Medvedkin
> > >>>>> Sent: Monday, March 16, 2020 2:38 PM
> > >>>>>
> > >>>>> Currently DPDK has a special implementation of a hash table for
> > >>>>> 4 byte keys which is called FBK hash. Unfortunately its main
> > >>>>> drawback is that it only supports 2 byte values.
> > >>>>> The new implementation called DWK (double word key) hash
> > >>>>> supports 8 byte values, which is enough to store a pointer.
> > >>>>>
> > >>>>> It would also be nice to get feedback on whether to leave the
> > >>>>> old FBK and new DWK implementations, or whether to deprecate the
> > >>>>> old
> > >> one?
> > >>>> <rant on>
> > >>>>
> > >>>> Who comes up with these names?!?
> > >>>>
> > >>>> FBK (Four Byte Key) and DWK (Double Word Key) is supposed to mean
> > >> the same. Could you use 32 somewhere in the name instead, like in
> > >> int32_t, instead of using a growing list of creative synonyms for the same
> thing?
> > >> Pretty please, with a cherry on top!
> > >>>
> > >>> That's true, at first I named it as fbk2, but then it was decided
> > >>> to rename it "dwk", so that there was no confusion with the
> > >>> existing FBK library. Naming suggestions are welcome!
> > >>>
> > >>>> And if the value size is fixed too, perhaps the name should also
> > >>>> indicate
> > >> the value size.
> > >>>> <rant off>
> > >>>>
> > >>>> It's a shame we don't have C++ class templates available in DPDK...
> > >>>>
> > >>>> In other news, Mellanox has sent an RFC for an "indexed memory
> pool"
> > >> library [1] to conserve memory by using uintXX_t instead of
> > >> pointers, so perhaps a variant of a 32 bit key hash library with 32
> > >> bit values (in addition to
> > >> 16 bit values in FBK and 64 bit in DWK) would be nice combination
> > >> with that library.
> > >>>> [1]: http://mails.dpdk.org/archives/dev/2019-October/147513.html
>
> Yes some work is in progress to propose a new memory allocator for small
> objects of fixed size with small memory overhead.
>
>
> > >> Why is this different (or better) than existing rte_hash.
> > >> Having more flavors is not necessarily a good thing (except in
> > >> Gelato)
> > > [Wang, Yipeng]
> > > Hi, Vladimir,
> > > As Stephen mentioned, I think it is good idea to explain the benefit
> > > of this new type of hash table more explicitly such as Specific use cases,
> differences with current rte_hash, and performance numbers, etc.
> >
> > The main reason for this new hash library is performance. As I
> > mentioned earlier, the current rte_fbk implementation is pretty fast
> > but it has a number of drawbacks such as 2 byte values and limited
> > collision resolving capabilities. On the other hand, rte_hash (cuckoo
> > hash) doesn't have this drawbacks but at the cost of lower performance
> > comparing to rte_fbk.
> >
> > If I understand correctly, performance penalty are due to :
> >
> > 1. Load two buckets
> >
> > 2. First compare signatures
> >
> > 3. If signature comparison hits get a key index and find memory
> > location with a key itself and get the key
> >
> > 4. Using indirect call to memcmp() to compare two uint32_t.
> >
> > The new proposed 4 byte key hash table doesn't have rte_fbk drawbacks
> > while offers the same performance as rte_fbk.
> >
> > Regarding use cases, in rte_ipsec_sad we are using rte_hash with 4
> > byte key size. Replacing it with a new implementation gives about 30%
> > in performance.
> >
> > The main disadvantage comparing to rte_hash is some performance
> > degradation with high average table utilization due to chain resolving
> > for 5th and subsequent collision.
rte_hash is linearly scalable across multiple cores for lookup due to its lock-free algorithm. How does the new algorithm scale?
>
> Thanks for explaining.
> Please, such information should added in the documentation:
> doc/guides/prog_guide/hash_lib.rst
>
>
* Re: [dpdk-dev] [PATCH 0/3] add new Double Word Key hash table
From: Medvedkin, Vladimir @ 2020-04-01 18:28 UTC (permalink / raw)
To: Thomas Monjalon
Cc: Wang, Yipeng1, Stephen Hemminger, dev, Morten Brørup, dev,
Ananyev, Konstantin, Gobriel, Sameh, Richardson, Bruce,
Suanming Mou, Olivier Matz, Xueming(Steven) Li, Andrew Rybchenko,
Asaf Penso, Ori Kam
Hi Thomas,
-----Original Message-----
From: Thomas Monjalon <thomas@monjalon.net>
Sent: Tuesday, March 31, 2020 8:56 PM
To: Medvedkin, Vladimir <vladimir.medvedkin@intel.com>
Cc: Wang, Yipeng1 <yipeng1.wang@intel.com>; Stephen Hemminger <stephen@networkplumber.org>; dev@dpdk.org; Morten Brørup <mb@smartsharesystems.com>; dev@dpdk.org; Ananyev, Konstantin <konstantin.ananyev@intel.com>; Gobriel, Sameh <sameh.gobriel@intel.com>; Richardson, Bruce <bruce.richardson@intel.com>; Suanming Mou <suanmingm@mellanox.com>; Olivier Matz <olivier.matz@6wind.com>; Xueming(Steven) Li <xuemingl@mellanox.com>; Andrew Rybchenko <arybchenko@solarflare.com>; Asaf Penso <asafp@mellanox.com>; Ori Kam <orika@mellanox.com>
Subject: Re: [dpdk-dev] [PATCH 0/3] add new Double Word Key hash table
26/03/2020 18:28, Medvedkin, Vladimir:
> Hi Yipeng, Stephen, all,
>
> On 17/03/2020 19:52, Wang, Yipeng1 wrote:
> > From: Stephen Hemminger <stephen@networkplumber.org>
> >> On Mon, 16 Mar 2020 18:27:40 +0000
> >> "Medvedkin, Vladimir" <vladimir.medvedkin@intel.com> wrote:
> >>
> >>> Hi Morten,
> >>>
> >>>
> >>> On 16/03/2020 14:39, Morten Brørup wrote:
> >>>>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Vladimir
> >>>>> Medvedkin
> >>>>> Sent: Monday, March 16, 2020 2:38 PM
> >>>>>
> >>>>> Currently DPDK has a special implementation of a hash table for
> >>>>> 4 byte keys which is called FBK hash. Unfortunately its main
> >>>>> drawback is that it only supports 2 byte values.
> >>>>> The new implementation called DWK (double word key) hash
> >>>>> supports 8 byte values, which is enough to store a pointer.
> >>>>>
> >>>>> It would also be nice to get feedback on whether to leave the
> >>>>> old FBK and new DWK implementations, or whether to deprecate the
> >>>>> old
> >> one?
> >>>> <rant on>
> >>>>
> >>>> Who comes up with these names?!?
> >>>>
> >>>> FBK (Four Byte Key) and DWK (Double Word Key) are supposed to mean
> >> the same. Could you use 32 somewhere in the name instead, like in
> >> int32_t, instead of using a growing list of creative synonyms for the same thing?
> >> Pretty please, with a cherry on top!
> >>>
> >>> That's true, at first I named it fbk2, but then it was decided
> >>> to rename it "dwk", so that there was no confusion with the
> >>> existing FBK library. Naming suggestions are welcome!
> >>>
> >>>> And if the value size is fixed too, perhaps the name should also
> >>>> indicate
> >> the value size.
> >>>> <rant off>
> >>>>
> >>>> It's a shame we don't have C++ class templates available in DPDK...
> >>>>
> >>>> In other news, Mellanox has sent an RFC for an "indexed memory pool"
> >> library [1] to conserve memory by using uintXX_t instead of
> >> pointers, so perhaps a variant of a 32 bit key hash library with 32
> >> bit values (in addition to
> >> 16 bit values in FBK and 64 bit in DWK) would be a nice combination
> >> with that library.
> >>>> [1]: http://mails.dpdk.org/archives/dev/2019-October/147513.html
Yes, some work is in progress to propose a new memory allocator for small objects of fixed size with small memory overhead.
> >> Why is this different (or better) than existing rte_hash.
> >> Having more flavors is not necessarily a good thing (except in
> >> Gelato)
> > [Wang, Yipeng]
> > Hi, Vladimir,
> > As Stephen mentioned, I think it is a good idea to explain the benefit
> > of this new type of hash table more explicitly, such as specific use cases, differences with current rte_hash, and performance numbers, etc.
>
> The main reason for this new hash library is performance. As I
> mentioned earlier, the current rte_fbk implementation is pretty fast
> but it has a number of drawbacks such as 2 byte values and limited
> collision resolving capabilities. On the other hand, rte_hash (cuckoo
> hash) doesn't have these drawbacks, but at the cost of lower performance
> compared to rte_fbk.
>
> If I understand correctly, the performance penalty is due to:
>
> 1. Load two buckets
>
> 2. First compare signatures
>
> 3. If the signature comparison hits, get a key index, find the memory
> location with the key itself, and fetch the key
>
> 4. Using an indirect call to memcmp() to compare two uint32_t values.
>
> The newly proposed 4 byte key hash table doesn't have the rte_fbk drawbacks
> while offering the same performance as rte_fbk.
>
> Regarding use cases, in rte_ipsec_sad we are using rte_hash with a 4
> byte key size. Replacing it with the new implementation gives about a 30%
> performance improvement.
>
> The main disadvantage compared to rte_hash is some performance
> degradation at high average table utilization, due to chain resolution
> for the 5th and subsequent collisions.
Thanks for explaining.
Please, such information should be added to the documentation:
doc/guides/prog_guide/hash_lib.rst
I'm going to submit v2 this week and will add the documentation update.
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [dpdk-dev] [PATCH 0/3] add new Double Word Key hash table
2020-03-31 21:17 ` Honnappa Nagarahalli
@ 2020-04-01 18:37 ` Medvedkin, Vladimir
0 siblings, 0 replies; 56+ messages in thread
From: Medvedkin, Vladimir @ 2020-04-01 18:37 UTC (permalink / raw)
To: Honnappa Nagarahalli, thomas
Cc: Wang, Yipeng1, Stephen Hemminger, dev, Morten Brørup, dev,
Ananyev, Konstantin, Gobriel, Sameh, Richardson, Bruce,
Suanming Mou, Olivier Matz, Xueming(Steven) Li, Andrew Rybchenko,
Asaf Penso, Ori Kam, nd, nd
Hi Honnappa,
-----Original Message-----
From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
Sent: Tuesday, March 31, 2020 10:17 PM
To: thomas@monjalon.net; Medvedkin, Vladimir <vladimir.medvedkin@intel.com>
Cc: Wang, Yipeng1 <yipeng1.wang@intel.com>; Stephen Hemminger <stephen@networkplumber.org>; dev@dpdk.org; Morten Brørup <mb@smartsharesystems.com>; dev@dpdk.org; Ananyev, Konstantin <konstantin.ananyev@intel.com>; Gobriel, Sameh <sameh.gobriel@intel.com>; Richardson, Bruce <bruce.richardson@intel.com>; Suanming Mou <suanmingm@mellanox.com>; Olivier Matz <olivier.matz@6wind.com>; Xueming(Steven) Li <xuemingl@mellanox.com>; Andrew Rybchenko <arybchenko@solarflare.com>; Asaf Penso <asafp@mellanox.com>; Ori Kam <orika@mellanox.com>; nd <nd@arm.com>; Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; nd <nd@arm.com>
Subject: RE: [dpdk-dev] [PATCH 0/3] add new Double Word Key hash table
<snip>
>
> 26/03/2020 18:28, Medvedkin, Vladimir:
> > Hi Yipeng, Stephen, all,
> >
> > On 17/03/2020 19:52, Wang, Yipeng1 wrote:
> > > From: Stephen Hemminger <stephen@networkplumber.org>
> > >> On Mon, 16 Mar 2020 18:27:40 +0000 "Medvedkin, Vladimir"
> > >> <vladimir.medvedkin@intel.com> wrote:
> > >>
> > >>> Hi Morten,
> > >>>
> > >>>
> > >>> On 16/03/2020 14:39, Morten Brørup wrote:
> > >>>>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Vladimir
> > >>>>> Medvedkin
> > >>>>> Sent: Monday, March 16, 2020 2:38 PM
> > >>>>>
> > >>>>> Currently DPDK has a special implementation of a hash table
> > >>>>> for
> > >>>>> 4 byte keys which is called FBK hash. Unfortunately its main
> > >>>>> drawback is that it only supports 2 byte values.
> > >>>>> The new implementation called DWK (double word key) hash
> > >>>>> supports 8 byte values, which is enough to store a pointer.
> > >>>>>
> > >>>>> It would also be nice to get feedback on whether to leave the
> > >>>>> old FBK and new DWK implementations, or whether to deprecate
> > >>>>> the old
> > >> one?
> > >>>> <rant on>
> > >>>>
> > >>>> Who comes up with these names?!?
> > >>>>
> > >>>> FBK (Four Byte Key) and DWK (Double Word Key) are supposed to
> > >>>> mean
> > >> the same. Could you use 32 somewhere in the name instead, like in
> > >> int32_t, instead of using a growing list of creative synonyms for
> > >> the same
> thing?
> > >> Pretty please, with a cherry on top!
> > >>>
> > >>> That's true, at first I named it fbk2, but then it was
> > >>> decided to rename it "dwk", so that there was no confusion with
> > >>> the existing FBK library. Naming suggestions are welcome!
> > >>>
> > >>>> And if the value size is fixed too, perhaps the name should
> > >>>> also indicate
> > >> the value size.
> > >>>> <rant off>
> > >>>>
> > >>>> It's a shame we don't have C++ class templates available in DPDK...
> > >>>>
> > >>>> In other news, Mellanox has sent an RFC for an "indexed memory
> pool"
> > >> library [1] to conserve memory by using uintXX_t instead of
> > >> pointers, so perhaps a variant of a 32 bit key hash library with
> > >> 32 bit values (in addition to
> > >> 16 bit values in FBK and 64 bit in DWK) would be a nice combination
> > >> with that library.
> > >>>> [1]:
> > >>>> http://mails.dpdk.org/archives/dev/2019-October/147513.html
>
> Yes, some work is in progress to propose a new memory allocator for
> small objects of fixed size with small memory overhead.
>
>
> > >> Why is this different (or better) than existing rte_hash.
> > >> Having more flavors is not necessarily a good thing (except in
> > >> Gelato)
> > > [Wang, Yipeng]
> > > Hi, Vladimir,
> > > As Stephen mentioned, I think it is a good idea to explain the
> > > benefit of this new type of hash table more explicitly, such as
> > > specific use cases,
> differences with current rte_hash, and performance numbers, etc.
> >
> > The main reason for this new hash library is performance. As I
> > mentioned earlier, the current rte_fbk implementation is pretty fast
> > but it has a number of drawbacks such as 2 byte values and limited
> > collision resolving capabilities. On the other hand, rte_hash
> > (cuckoo
> > hash) doesn't have these drawbacks, but at the cost of lower
> > performance compared to rte_fbk.
> >
> > If I understand correctly, the performance penalty is due to:
> >
> > 1. Load two buckets
> >
> > 2. First compare signatures
> >
> > 3. If the signature comparison hits, get a key index, find the memory
> > location with the key itself, and fetch the key
> >
> > 4. Using an indirect call to memcmp() to compare two uint32_t values.
> >
> > The newly proposed 4 byte key hash table doesn't have the rte_fbk
> > drawbacks while offering the same performance as rte_fbk.
> >
> > Regarding use cases, in rte_ipsec_sad we are using rte_hash with a 4
> > byte key size. Replacing it with the new implementation gives about
> > a 30% performance improvement.
> >
> > The main disadvantage compared to rte_hash is some performance
> > degradation at high average table utilization, due to chain
> > resolution for the 5th and subsequent collisions.
rte_hash is linearly scalable across multiple cores for lookup due to its lock-free algorithm. How does the new algorithm scale?
This library is scalable as well. It uses almost the same lock-free algorithm. The only difference from cuckoo is that the lock-free cuckoo implementation uses a single global "change_counter" for the whole table, while the proposed implementation uses a fine-grained approach with a "change_counter" per bucket. So it should be more scalable under frequent concurrent updates.
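
To illustrate, a minimal sketch of the per-bucket counter read protocol
described above (simplified from rte_k32v64_hash_lookup() in the patch;
search_bucket() is a hypothetical stand-in for the in-bucket and
linked-list search):

/*
 * Sketch of the sequence-counter read side: writers bump the bucket
 * counter before and after a modification, so an odd value means a write
 * is in progress, and a value that changed during the read means the
 * bucket was modified under us.
 */
static inline int
lookup_sketch(struct rte_k32v64_hash_bucket *b, uint32_t key, uint64_t *val)
{
	int32_t cnt;
	int found;

	do {
		/* spin while a writer holds this bucket (odd counter) */
		do
			cnt = rte_atomic32_read(&b->cnt);
		while (cnt & RTE_K32V64_WRITE_IN_PROGRESS);

		found = search_bucket(b, key, val);

	/* retry if the bucket changed while we were reading it */
	} while (cnt != rte_atomic32_read(&b->cnt));

	return found ? 0 : -ENOENT;
}

Because the counter is per bucket, an update to one bucket never forces
readers of other buckets to retry, unlike a single table-wide counter.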
>
> Thanks for explaining.
> Please, such information should be added to the documentation:
> doc/guides/prog_guide/hash_lib.rst
>
>
^ permalink raw reply [flat|nested] 56+ messages in thread
* [dpdk-dev] [PATCH v2 0/4] add new k32v64 hash table
2020-03-16 13:38 [dpdk-dev] [PATCH 0/3] add new Double Word Key hash table Vladimir Medvedkin
` (3 preceding siblings ...)
2020-03-16 14:39 ` [dpdk-dev] [PATCH 0/3] add new Double Word Key hash table Morten Brørup
@ 2020-04-08 18:19 ` Vladimir Medvedkin
2020-04-15 18:17 ` [dpdk-dev] [PATCH v3 " Vladimir Medvedkin
` (4 more replies)
2020-04-08 18:19 ` [dpdk-dev] [PATCH v2 1/4] hash: add k32v64 hash library Vladimir Medvedkin
` (3 subsequent siblings)
8 siblings, 5 replies; 56+ messages in thread
From: Vladimir Medvedkin @ 2020-04-08 18:19 UTC (permalink / raw)
To: dev; +Cc: konstantin.ananyev, yipeng1.wang, sameh.gobriel, bruce.richardson
Currently DPDK has a special implementation of a hash table for
4 byte keys which is called FBK hash. Unfortunately its main drawback
is that it only supports 2 byte values.
The new implementation, called K32V64 hash,
supports 4 byte keys and 8 byte associated values,
which is enough to store a pointer.
It would also be nice to get feedback on whether to keep both the old FBK
and the new k32v64 implementations, or to deprecate the old one.
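
For reference, a minimal usage sketch of the API proposed in this series
(error handling elided; the table is hash function agnostic, so the caller
supplies a precomputed signature — here rte_hash_crc_4byte(), as the unit
tests in this series do):

#include <rte_k32v64_hash.h>
#include <rte_hash_crc.h>

static void
k32v64_example(void)
{
	struct rte_k32v64_hash_params params = {
		.name = "example",
		.entries = 1 << 16,
		.socket_id = -1,	/* any socket */
	};
	struct rte_k32v64_hash_table *ht;
	uint32_t key = 42;
	uint64_t value = 0x1234, ret_val = 0;
	int ret;

	ht = rte_k32v64_hash_create(&params);

	/* the hash signature is always computed by the caller */
	ret = rte_k32v64_hash_add(ht, key, rte_hash_crc_4byte(key, 0), value);

	ret = rte_k32v64_hash_lookup(ht, key, rte_hash_crc_4byte(key, 0),
		&ret_val);
	/* ret == 0 on a hit (ret_val == value), -ENOENT on a miss */

	rte_k32v64_hash_delete(ht, key, rte_hash_crc_4byte(key, 0));
	rte_k32v64_hash_free(ht);
}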
v2:
- renamed from rte_dwk to rte_k32v64 as was suggested
- reworked lookup function, added inlined subroutines
- added avx512 key comparison routine
- added documentation
- added statistic counters for total entries and extended entries (linked list)
Vladimir Medvedkin (4):
hash: add k32v64 hash library
hash: add documentation for k32v64 hash library
test: add k32v64 hash autotests
test: add k32v64 perf tests
app/test/Makefile | 1 +
app/test/autotest_data.py | 12 ++
app/test/meson.build | 3 +
app/test/test_hash_perf.c | 83 +++++++++
app/test/test_k32v64_hash.c | 229 ++++++++++++++++++++++++
doc/api/doxy-api-index.md | 1 +
doc/guides/prog_guide/index.rst | 1 +
doc/guides/prog_guide/k32v64_hash_lib.rst | 66 +++++++
lib/Makefile | 2 +-
lib/librte_hash/Makefile | 4 +-
lib/librte_hash/meson.build | 5 +-
lib/librte_hash/rte_hash_version.map | 6 +-
lib/librte_hash/rte_k32v64_hash.c | 279 ++++++++++++++++++++++++++++++
lib/librte_hash/rte_k32v64_hash.h | 214 +++++++++++++++++++++++
14 files changed, 901 insertions(+), 5 deletions(-)
create mode 100644 app/test/test_k32v64_hash.c
create mode 100644 doc/guides/prog_guide/k32v64_hash_lib.rst
create mode 100644 lib/librte_hash/rte_k32v64_hash.c
create mode 100644 lib/librte_hash/rte_k32v64_hash.h
--
2.7.4
^ permalink raw reply [flat|nested] 56+ messages in thread
* [dpdk-dev] [PATCH v2 1/4] hash: add k32v64 hash library
2020-03-16 13:38 [dpdk-dev] [PATCH 0/3] add new Double Word Key hash table Vladimir Medvedkin
` (4 preceding siblings ...)
2020-04-08 18:19 ` [dpdk-dev] [PATCH v2 0/4] add new k32v64 " Vladimir Medvedkin
@ 2020-04-08 18:19 ` Vladimir Medvedkin
2020-04-08 23:23 ` Ananyev, Konstantin
2020-04-08 18:19 ` [dpdk-dev] [PATCH v2 2/4] hash: add documentation for " Vladimir Medvedkin
` (2 subsequent siblings)
8 siblings, 1 reply; 56+ messages in thread
From: Vladimir Medvedkin @ 2020-04-08 18:19 UTC (permalink / raw)
To: dev; +Cc: konstantin.ananyev, yipeng1.wang, sameh.gobriel, bruce.richardson
K32V64 hash is a hash table that supports 32 bit keys and 64 bit values.
This table is hash function agnostic, so the user must provide
a precalculated hash signature for add/delete/lookup operations.
Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
---
lib/Makefile | 2 +-
lib/librte_hash/Makefile | 4 +-
lib/librte_hash/meson.build | 5 +-
lib/librte_hash/rte_hash_version.map | 6 +-
lib/librte_hash/rte_k32v64_hash.c | 279 +++++++++++++++++++++++++++++++++++
lib/librte_hash/rte_k32v64_hash.h | 214 +++++++++++++++++++++++++++
6 files changed, 505 insertions(+), 5 deletions(-)
create mode 100644 lib/librte_hash/rte_k32v64_hash.c
create mode 100644 lib/librte_hash/rte_k32v64_hash.h
diff --git a/lib/Makefile b/lib/Makefile
index 46b91ae..a8c02e4 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -48,7 +48,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
DEPDIRS-librte_vhost := librte_eal librte_mempool librte_mbuf librte_ethdev \
librte_net librte_hash librte_cryptodev
DIRS-$(CONFIG_RTE_LIBRTE_HASH) += librte_hash
-DEPDIRS-librte_hash := librte_eal librte_ring
+DEPDIRS-librte_hash := librte_eal librte_ring librte_mempool
DIRS-$(CONFIG_RTE_LIBRTE_EFD) += librte_efd
DEPDIRS-librte_efd := librte_eal librte_ring librte_hash
DIRS-$(CONFIG_RTE_LIBRTE_RIB) += librte_rib
diff --git a/lib/librte_hash/Makefile b/lib/librte_hash/Makefile
index 9b36097..8339144 100644
--- a/lib/librte_hash/Makefile
+++ b/lib/librte_hash/Makefile
@@ -8,13 +8,14 @@ LIB = librte_hash.a
CFLAGS += -O3 -DALLOW_EXPERIMENTAL_API
CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
-LDLIBS += -lrte_eal -lrte_ring
+LDLIBS += -lrte_eal -lrte_ring -lrte_mempool
EXPORT_MAP := rte_hash_version.map
# all source are stored in SRCS-y
SRCS-$(CONFIG_RTE_LIBRTE_HASH) := rte_cuckoo_hash.c
SRCS-$(CONFIG_RTE_LIBRTE_HASH) += rte_fbk_hash.c
+SRCS-$(CONFIG_RTE_LIBRTE_HASH) += rte_k32v64_hash.c
# install this header file
SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include := rte_hash.h
@@ -27,5 +28,6 @@ endif
SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_jhash.h
SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_thash.h
SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_fbk_hash.h
+SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_k32v64_hash.h
include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_hash/meson.build b/lib/librte_hash/meson.build
index bce11ad..c6e0d93 100644
--- a/lib/librte_hash/meson.build
+++ b/lib/librte_hash/meson.build
@@ -3,13 +3,14 @@
headers = files('rte_crc_arm64.h',
'rte_fbk_hash.h',
+ 'rte_k32v64_hash.h',
'rte_hash_crc.h',
'rte_hash.h',
'rte_jhash.h',
'rte_thash.h')
-sources = files('rte_cuckoo_hash.c', 'rte_fbk_hash.c')
-deps += ['ring']
+sources = files('rte_cuckoo_hash.c', 'rte_fbk_hash.c', 'rte_k32v64_hash.c')
+deps += ['ring', 'mempool']
# rte ring reset is not yet part of stable API
allow_experimental_apis = true
diff --git a/lib/librte_hash/rte_hash_version.map b/lib/librte_hash/rte_hash_version.map
index a8fbbc3..9a4f2f6 100644
--- a/lib/librte_hash/rte_hash_version.map
+++ b/lib/librte_hash/rte_hash_version.map
@@ -34,5 +34,9 @@ EXPERIMENTAL {
rte_hash_free_key_with_position;
rte_hash_max_key_id;
-
+ rte_k32v64_hash_create;
+ rte_k32v64_hash_find_existing;
+ rte_k32v64_hash_free;
+ rte_k32v64_hash_add;
+ rte_k32v64_hash_delete;
};
diff --git a/lib/librte_hash/rte_k32v64_hash.c b/lib/librte_hash/rte_k32v64_hash.c
new file mode 100644
index 0000000..b2e28c5
--- /dev/null
+++ b/lib/librte_hash/rte_k32v64_hash.c
@@ -0,0 +1,279 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include <string.h>
+
+#include <rte_eal_memconfig.h>
+#include <rte_errno.h>
+#include <rte_malloc.h>
+#include <rte_memory.h>
+#include <rte_tailq.h>
+
+#include <rte_k32v64_hash.h>
+
+TAILQ_HEAD(rte_k32v64_hash_list, rte_tailq_entry);
+
+static struct rte_tailq_elem rte_k32v64_hash_tailq = {
+ .name = "RTE_K32V64_HASH",
+};
+
+EAL_REGISTER_TAILQ(rte_k32v64_hash_tailq);
+
+#define VALID_KEY_MSK ((1 << RTE_K32V64_KEYS_PER_BUCKET) - 1)
+
+int
+rte_k32v64_hash_add(struct rte_k32v64_hash_table *table, uint32_t key,
+ uint32_t hash, uint64_t value)
+{
+ uint32_t bucket;
+ int i, idx, ret;
+ uint8_t msk;
+ struct rte_k32v64_ext_ent *tmp, *ent, *prev = NULL;
+
+ if (table == NULL)
+ return -EINVAL;
+
+ bucket = hash & table->bucket_msk;
+ /* Search key in table. Update value if exists */
+ for (i = 0; i < RTE_K32V64_KEYS_PER_BUCKET; i++) {
+ if ((key == table->t[bucket].key[i]) &&
+ (table->t[bucket].key_mask & (1 << i))) {
+ table->t[bucket].val[i] = value;
+ return 0;
+ }
+ }
+
+ if (!SLIST_EMPTY(&table->t[bucket].head)) {
+ SLIST_FOREACH(ent, &table->t[bucket].head, next) {
+ if (ent->key == key) {
+ ent->val = value;
+ return 0;
+ }
+ }
+ }
+
+ msk = ~table->t[bucket].key_mask & VALID_KEY_MSK;
+ if (msk) {
+ idx = __builtin_ctz(msk);
+ table->t[bucket].key[idx] = key;
+ table->t[bucket].val[idx] = value;
+ rte_smp_wmb();
+ table->t[bucket].key_mask |= 1 << idx;
+ table->nb_ent++;
+ return 0;
+ }
+
+ ret = rte_mempool_get(table->ext_ent_pool, (void **)&ent);
+ if (ret < 0)
+ return ret;
+
+ SLIST_NEXT(ent, next) = NULL;
+ ent->key = key;
+ ent->val = value;
+ rte_smp_wmb();
+ SLIST_FOREACH(tmp, &table->t[bucket].head, next)
+ prev = tmp;
+
+ if (prev == NULL)
+ SLIST_INSERT_HEAD(&table->t[bucket].head, ent, next);
+ else
+ SLIST_INSERT_AFTER(prev, ent, next);
+
+ table->nb_ent++;
+ table->nb_ext_ent++;
+ return 0;
+}
+
+int
+rte_k32v64_hash_delete(struct rte_k32v64_hash_table *table, uint32_t key,
+ uint32_t hash)
+{
+ uint32_t bucket;
+ int i;
+ struct rte_k32v64_ext_ent *ent;
+
+ if (table == NULL)
+ return -EINVAL;
+
+ bucket = hash & table->bucket_msk;
+
+ for (i = 0; i < RTE_K32V64_KEYS_PER_BUCKET; i++) {
+ if ((key == table->t[bucket].key[i]) &&
+ (table->t[bucket].key_mask & (1 << i))) {
+ ent = SLIST_FIRST(&table->t[bucket].head);
+ if (ent) {
+ rte_atomic32_inc(&table->t[bucket].cnt);
+ table->t[bucket].key[i] = ent->key;
+ table->t[bucket].val[i] = ent->val;
+ SLIST_REMOVE_HEAD(&table->t[bucket].head, next);
+ rte_atomic32_inc(&table->t[bucket].cnt);
+ table->nb_ext_ent--;
+ } else
+ table->t[bucket].key_mask &= ~(1 << i);
+ if (ent)
+ rte_mempool_put(table->ext_ent_pool, ent);
+ table->nb_ent--;
+ return 0;
+ }
+ }
+
+ SLIST_FOREACH(ent, &table->t[bucket].head, next)
+ if (ent->key == key)
+ break;
+
+ if (ent == NULL)
+ return -ENOENT;
+
+ rte_atomic32_inc(&table->t[bucket].cnt);
+ SLIST_REMOVE(&table->t[bucket].head, ent, rte_k32v64_ext_ent, next);
+ rte_atomic32_inc(&table->t[bucket].cnt);
+ rte_mempool_put(table->ext_ent_pool, ent);
+
+ table->nb_ext_ent--;
+ table->nb_ent--;
+
+ return 0;
+}
+
+struct rte_k32v64_hash_table *
+rte_k32v64_hash_find_existing(const char *name)
+{
+ struct rte_k32v64_hash_table *h = NULL;
+ struct rte_tailq_entry *te;
+ struct rte_k32v64_hash_list *k32v64_hash_list;
+
+ k32v64_hash_list = RTE_TAILQ_CAST(rte_k32v64_hash_tailq.head,
+ rte_k32v64_hash_list);
+
+ rte_mcfg_tailq_read_lock();
+ TAILQ_FOREACH(te, k32v64_hash_list, next) {
+ h = (struct rte_k32v64_hash_table *) te->data;
+ if (strncmp(name, h->name, RTE_K32V64_HASH_NAMESIZE) == 0)
+ break;
+ }
+ rte_mcfg_tailq_read_unlock();
+ if (te == NULL) {
+ rte_errno = ENOENT;
+ return NULL;
+ }
+ return h;
+}
+
+struct rte_k32v64_hash_table *
+rte_k32v64_hash_create(const struct rte_k32v64_hash_params *params)
+{
+ char hash_name[RTE_K32V64_HASH_NAMESIZE];
+ struct rte_k32v64_hash_table *ht = NULL;
+ struct rte_tailq_entry *te;
+ struct rte_k32v64_hash_list *k32v64_hash_list;
+ uint32_t mem_size, nb_buckets, max_ent;
+ int ret;
+ struct rte_mempool *mp;
+
+ if ((params == NULL) || (params->name == NULL) ||
+ (params->entries == 0)) {
+ rte_errno = EINVAL;
+ return NULL;
+ }
+
+ k32v64_hash_list = RTE_TAILQ_CAST(rte_k32v64_hash_tailq.head,
+ rte_k32v64_hash_list);
+
+ ret = snprintf(hash_name, sizeof(hash_name), "K32V64_%s", params->name);
+ if (ret < 0 || ret >= RTE_K32V64_HASH_NAMESIZE) {
+ rte_errno = ENAMETOOLONG;
+ return NULL;
+ }
+
+ max_ent = rte_align32pow2(params->entries);
+ nb_buckets = max_ent / RTE_K32V64_KEYS_PER_BUCKET;
+ mem_size = sizeof(struct rte_k32v64_hash_table) +
+ sizeof(struct rte_k32v64_hash_bucket) * nb_buckets;
+
+ mp = rte_mempool_create(hash_name, max_ent,
+ sizeof(struct rte_k32v64_ext_ent), 0, 0, NULL, NULL, NULL, NULL,
+ params->socket_id, 0);
+
+ if (mp == NULL)
+ return NULL;
+
+ rte_mcfg_tailq_write_lock();
+ TAILQ_FOREACH(te, k32v64_hash_list, next) {
+ ht = (struct rte_k32v64_hash_table *) te->data;
+ if (strncmp(params->name, ht->name,
+ RTE_K32V64_HASH_NAMESIZE) == 0)
+ break;
+ }
+ ht = NULL;
+ if (te != NULL) {
+ rte_errno = EEXIST;
+ rte_mempool_free(mp);
+ goto exit;
+ }
+
+ te = rte_zmalloc("K32V64_HASH_TAILQ_ENTRY", sizeof(*te), 0);
+ if (te == NULL) {
+ RTE_LOG(ERR, HASH, "Failed to allocate tailq entry\n");
+ rte_mempool_free(mp);
+ goto exit;
+ }
+
+ ht = rte_zmalloc_socket(hash_name, mem_size,
+ RTE_CACHE_LINE_SIZE, params->socket_id);
+ if (ht == NULL) {
+ RTE_LOG(ERR, HASH, "Failed to allocate fbk hash table\n");
+ rte_free(te);
+ rte_mempool_free(mp);
+ goto exit;
+ }
+
+ memcpy(ht->name, hash_name, sizeof(ht->name));
+ ht->max_ent = max_ent;
+ ht->bucket_msk = nb_buckets - 1;
+ ht->ext_ent_pool = mp;
+
+ te->data = (void *)ht;
+ TAILQ_INSERT_TAIL(k32v64_hash_list, te, next);
+
+exit:
+ rte_mcfg_tailq_write_unlock();
+
+ return ht;
+}
+
+void
+rte_k32v64_hash_free(struct rte_k32v64_hash_table *ht)
+{
+ struct rte_tailq_entry *te;
+ struct rte_k32v64_hash_list *k32v64_hash_list;
+
+ if (ht == NULL)
+ return;
+
+ k32v64_hash_list = RTE_TAILQ_CAST(rte_k32v64_hash_tailq.head,
+ rte_k32v64_hash_list);
+
+ rte_mcfg_tailq_write_lock();
+
+ /* find out tailq entry */
+ TAILQ_FOREACH(te, k32v64_hash_list, next) {
+ if (te->data == (void *) ht)
+ break;
+ }
+
+
+ if (te == NULL) {
+ rte_mcfg_tailq_write_unlock();
+ return;
+ }
+
+ TAILQ_REMOVE(k32v64_hash_list, te, next);
+
+ rte_mcfg_tailq_write_unlock();
+
+ rte_mempool_free(ht->ext_ent_pool);
+ rte_free(ht);
+ rte_free(te);
+}
+
diff --git a/lib/librte_hash/rte_k32v64_hash.h b/lib/librte_hash/rte_k32v64_hash.h
new file mode 100644
index 0000000..d25660c
--- /dev/null
+++ b/lib/librte_hash/rte_k32v64_hash.h
@@ -0,0 +1,214 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _RTE_K32V64_HASH_H_
+#define _RTE_K32V64_HASH_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_compat.h>
+#include <rte_atomic.h>
+#include <rte_mempool.h>
+
+#include <immintrin.h>
+
+#define RTE_K32V64_HASH_NAMESIZE 32
+#define RTE_K32V64_KEYS_PER_BUCKET 4
+#define RTE_K32V64_WRITE_IN_PROGRESS 1
+
+struct rte_k32v64_hash_params {
+ const char *name;
+ uint32_t entries;
+ int socket_id;
+};
+
+struct rte_k32v64_ext_ent {
+ SLIST_ENTRY(rte_k32v64_ext_ent) next;
+ uint32_t key;
+ uint64_t val;
+};
+
+struct rte_k32v64_hash_bucket {
+ uint32_t key[RTE_K32V64_KEYS_PER_BUCKET];
+ uint64_t val[RTE_K32V64_KEYS_PER_BUCKET];
+ uint8_t key_mask;
+ rte_atomic32_t cnt;
+ SLIST_HEAD(rte_k32v64_list_head, rte_k32v64_ext_ent) head;
+} __rte_cache_aligned;
+
+struct rte_k32v64_hash_table {
+ char name[RTE_K32V64_HASH_NAMESIZE]; /**< Name of the hash. */
+ uint32_t nb_ent;
+ uint32_t nb_ext_ent;
+ uint32_t max_ent;
+ uint32_t bucket_msk;
+ struct rte_mempool *ext_ent_pool;
+ __extension__ struct rte_k32v64_hash_bucket t[0];
+};
+
+static inline int
+cmp_keys(struct rte_k32v64_hash_bucket *bucket, uint32_t key,
+ uint64_t *val)
+{
+ int i;
+
+ for (i = 0; i < RTE_K32V64_KEYS_PER_BUCKET; i++) {
+ if ((key == bucket->key[i]) &&
+ (bucket->key_mask & (1 << i))) {
+ *val = bucket->val[i];
+ return 1;
+ }
+ }
+
+ return 0;
+}
+
+#ifdef __AVX512VL__
+static inline int
+cmp_keys_vec(struct rte_k32v64_hash_bucket *bucket, uint32_t key,
+ uint64_t *val)
+{
+ __m128i keys, srch_key;
+ __mmask8 msk;
+
+ keys = _mm_load_si128((void *)bucket);
+ srch_key = _mm_set1_epi32(key);
+
+ msk = _mm_mask_cmpeq_epi32_mask(bucket->key_mask, keys, srch_key);
+ if (msk) {
+ *val = bucket->val[__builtin_ctz(msk)];
+ return 1;
+ }
+
+ return 0;
+}
+#endif
+
+static inline int
+rte_k32v64_hash_lookup(struct rte_k32v64_hash_table *table, uint32_t key,
+ uint32_t hash, uint64_t *value)
+{
+ uint64_t val = 0;
+ struct rte_k32v64_ext_ent *ent;
+ int32_t cnt;
+ int i __rte_unused, found = 0;
+ uint32_t bucket = hash & table->bucket_msk;
+
+ do {
+ do
+ cnt = rte_atomic32_read(&table->t[bucket].cnt);
+ while (unlikely(cnt & RTE_K32V64_WRITE_IN_PROGRESS));
+
+#ifdef __AVX512VL__
+ found = cmp_keys_vec(&table->t[bucket], key, &val);
+#else
+ found = cmp_keys(&table->t[bucket], key, &val);
+#endif
+ if (unlikely((found == 0) &&
+ (!SLIST_EMPTY(&table->t[bucket].head)))) {
+ SLIST_FOREACH(ent, &table->t[bucket].head, next) {
+ if (ent->key == key) {
+ val = ent->val;
+ found = 1;
+ break;
+ }
+ }
+ }
+
+ } while (unlikely(cnt != rte_atomic32_read(&table->t[bucket].cnt)));
+
+ if (found == 1) {
+ *value = val;
+ return 0;
+ } else
+ return -ENOENT;
+}
+
+/**
+ * Add a key with a given hash value to an existing hash table.
+ * This operation is not multi-thread safe
+ * and should only be called from one thread.
+ *
+ * @param table
+ * Hash table to add the key to.
+ * @param key
+ * Key to add to the hash table.
+ * @param value
+ * Value to associate with key.
+ * @param hash
+ * Hash value associated with key.
+ * @return
+ * 0 if ok, or negative value on error.
+ */
+__rte_experimental
+int
+rte_k32v64_hash_add(struct rte_k32v64_hash_table *table, uint32_t key,
+ uint32_t hash, uint64_t value);
+
+/**
+ * Remove a key with a given hash value from an existing hash table.
+ * This operation is not multi-thread
+ * safe and should only be called from one thread.
+ *
+ * @param table
+ * Hash table to remove the key from.
+ * @param key
+ * Key to remove from the hash table.
+ * @param hash
+ * hash value associated with key.
+ * @return
+ * 0 if ok, or negative value on error.
+ */
+__rte_experimental
+int
+rte_k32v64_hash_delete(struct rte_k32v64_hash_table *table, uint32_t key,
+ uint32_t hash);
+
+
+/**
+ * Performs a lookup for an existing hash table, and returns a pointer to
+ * the table if found.
+ *
+ * @param name
+ * Name of the hash table to find
+ *
+ * @return
+ * pointer to hash table structure or NULL on error with rte_errno
+ * set appropriately.
+ */
+__rte_experimental
+struct rte_k32v64_hash_table *
+rte_k32v64_hash_find_existing(const char *name);
+
+/**
+ * Create a new hash table for use with four byte keys.
+ *
+ * @param params
+ * Parameters used in creation of hash table.
+ *
+ * @return
+ * Pointer to hash table structure that is used in future hash table
+ * operations, or NULL on error with rte_errno set appropriately.
+ */
+__rte_experimental
+struct rte_k32v64_hash_table *
+rte_k32v64_hash_create(const struct rte_k32v64_hash_params *params);
+
+/**
+ * Free all memory used by a hash table.
+ *
+ * @param table
+ * Hash table to deallocate.
+ */
+__rte_experimental
+void
+rte_k32v64_hash_free(struct rte_k32v64_hash_table *table);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_K32V64_HASH_H_ */
--
2.7.4
^ permalink raw reply [flat|nested] 56+ messages in thread
* [dpdk-dev] [PATCH v2 2/4] hash: add documentation for k32v64 hash library
2020-03-16 13:38 [dpdk-dev] [PATCH 0/3] add new Double Word Key hash table Vladimir Medvedkin
` (5 preceding siblings ...)
2020-04-08 18:19 ` [dpdk-dev] [PATCH v2 1/4] hash: add k32v64 hash library Vladimir Medvedkin
@ 2020-04-08 18:19 ` Vladimir Medvedkin
2020-04-08 18:19 ` [dpdk-dev] [PATCH v2 3/4] test: add k32v64 hash autotests Vladimir Medvedkin
2020-04-08 18:19 ` [dpdk-dev] [PATCH v2 4/4] test: add k32v64 perf tests Vladimir Medvedkin
8 siblings, 0 replies; 56+ messages in thread
From: Vladimir Medvedkin @ 2020-04-08 18:19 UTC (permalink / raw)
To: dev
Cc: konstantin.ananyev, yipeng1.wang, sameh.gobriel,
bruce.richardson, john.mcnamara
Add programmer's guide and doxygen API for the k32v64 hash library
Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
---
doc/api/doxy-api-index.md | 1 +
doc/guides/prog_guide/index.rst | 1 +
doc/guides/prog_guide/k32v64_hash_lib.rst | 66 +++++++++++++++++++++++++++++++
3 files changed, 68 insertions(+)
create mode 100644 doc/guides/prog_guide/k32v64_hash_lib.rst
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index dff496b..ed3e8d7 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -121,6 +121,7 @@ The public API headers are grouped by topics:
[jhash] (@ref rte_jhash.h),
[thash] (@ref rte_thash.h),
[FBK hash] (@ref rte_fbk_hash.h),
+ [K32V64 hash] (@ref rte_k32v64_hash.h),
[CRC hash] (@ref rte_hash_crc.h)
- **classification**
diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index fb250ab..ac56da5 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -30,6 +30,7 @@ Programmer's Guide
link_bonding_poll_mode_drv_lib
timer_lib
hash_lib
+ k32v64_hash_lib
efd_lib
member_lib
lpm_lib
diff --git a/doc/guides/prog_guide/k32v64_hash_lib.rst b/doc/guides/prog_guide/k32v64_hash_lib.rst
new file mode 100644
index 0000000..44864bd
--- /dev/null
+++ b/doc/guides/prog_guide/k32v64_hash_lib.rst
@@ -0,0 +1,66 @@
+.. SPDX-License-Identifier: BSD-3-Clause
+ Copyright(c) 2020 Intel Corporation.
+
+.. _k32v64_hash_Library:
+
+K32V64 Hash Library
+===================
+
+This hash library implementation is intended to be better optimized for 32-bit keys than the existing Cuckoo hash-based rte_hash implementation. The current rte_fbk implementation is pretty fast, but it has a number of drawbacks, such as 2 byte values and limited collision resolving capabilities. rte_hash (which is based on the Cuckoo hash algorithm) doesn't have these drawbacks, but that comes at the cost of lower performance compared to rte_fbk.
+
+The following flow illustrates the source of performance penalties of Cuckoo hash:
+
+* Loading two buckets at once (extra memory consumption)
+* Comparing signatures first (extra step before key comparison)
+* If signature comparison hits, get a key index, find memory location with a key itself, and get the key (memory pressure and indirection)
+* Using indirect call to memcmp() to compare two uint32_t (function call overhead)
+
+The K32V64 hash table doesn't have the drawbacks associated with rte_fbk, while offering the same performance as rte_fbk. Each bucket contains 4 consecutive keys, which can be compared very quickly, and subsequent (overflow) keys are kept in a linked list.
+
+The main disadvantage compared to rte_hash is performance degradation at high average table utilization, due to chain resolution for the 5th and subsequent collisions.
+
+To estimate the probability of a 5th collision we can use the "birthday paradox" approach: figure out the number of insertions (which can be treated as a load factor) that will likely yield a 50% probability of a 5th collision for a given number of buckets.
+
+It can be calculated with an asymptotic formula from [1]:
+
+E(n, k) ~= (k!)^(1/k)*Γ(1 + 1/k)*n^(1-1/k), n -> inf
+
+where
+
+k - level of collision
+
+n - number of buckets
+
+Γ - gamma function
+
+So, for k = 5 (a 5th collision), and given that the number of buckets is a power of 2, we can simplify the formula:
+
+E(n) = 2.392 * 2^(m * 4/5), where the number of buckets n = 2^m
+
+.. note::
+
+ You can calculate it yourself using Wolfram Alpha [2]. For example, for 8k buckets:
+
+ solve ((k!)^(1/k)*Γ(1 + 1/k)*n^(1-1/k), n = 8192, k = 5)
+
+
+API Overview
+------------
+
+The main configuration parameters for the hash table are:
+
+* Total number of hash entries in the table
+* Socket id
+
+K32V64 is "hash function-less", so user must specify precalculated hash value for every key. The main methods exported by the Hash Library are:
+
+* Add entry with key and precomputed hash: The key, precomputed hash and value are provided as input.
+* Delete entry with key and precomputed hash: The key and precomputed hash are provided as input.
+* Lookup entry with key and precomputed hash: The key, precomputed hash and a pointer to the expected value are provided as input. If an entry with the specified key is found in the hash table (i.e. a lookup hit), then the value associated with the key is written to the memory specified by the pointer and the function returns 0; otherwise (i.e. a lookup miss) a negative value is returned and the memory described by the pointer is not modified.
+
+References
+----------
+
+[1] M.S. Klamkin and D.J. Newman, Extensions of the Birthday Surprise
+
+[2] https://www.wolframalpha.com/
--
2.7.4
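
A worked computation for the 8k-bucket example in the note above: with
n = 8192 = 2^13 buckets (m = 13),

E = 2.392 * 2^(13 * 4/5) = 2.392 * 2^10.4 ~= 2.392 * 1351 ~= 3230

That is, roughly 3.2k random insertions yield a ~50% chance that some
bucket receives its 5th colliding key. With 4 keys per bucket
(8192 * 4 = 32768 key slots), this corresponds to about 10% average table
utilization, i.e. overflow into the linked lists starts to become likely
well before the table is nominally full.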
^ permalink raw reply [flat|nested] 56+ messages in thread
* [dpdk-dev] [PATCH v2 3/4] test: add k32v64 hash autotests
2020-03-16 13:38 [dpdk-dev] [PATCH 0/3] add new Double Word Key hash table Vladimir Medvedkin
` (6 preceding siblings ...)
2020-04-08 18:19 ` [dpdk-dev] [PATCH v2 2/4] hash: add documentation for " Vladimir Medvedkin
@ 2020-04-08 18:19 ` Vladimir Medvedkin
2020-04-08 18:19 ` [dpdk-dev] [PATCH v2 4/4] test: add k32v64 perf tests Vladimir Medvedkin
8 siblings, 0 replies; 56+ messages in thread
From: Vladimir Medvedkin @ 2020-04-08 18:19 UTC (permalink / raw)
To: dev; +Cc: konstantin.ananyev, yipeng1.wang, sameh.gobriel, bruce.richardson
Add autotests for rte_k32v64_hash library
Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
---
app/test/Makefile | 1 +
app/test/autotest_data.py | 12 +++
app/test/meson.build | 3 +
app/test/test_k32v64_hash.c | 229 ++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 245 insertions(+)
create mode 100644 app/test/test_k32v64_hash.c
diff --git a/app/test/Makefile b/app/test/Makefile
index 1f080d1..f132230 100644
--- a/app/test/Makefile
+++ b/app/test/Makefile
@@ -73,6 +73,7 @@ SRCS-y += test_bitmap.c
SRCS-y += test_reciprocal_division.c
SRCS-y += test_reciprocal_division_perf.c
SRCS-y += test_fbarray.c
+SRCS-y += test_k32v64_hash.c
SRCS-y += test_external_mem.c
SRCS-y += test_rand_perf.c
diff --git a/app/test/autotest_data.py b/app/test/autotest_data.py
index 7b1d013..e7ec502 100644
--- a/app/test/autotest_data.py
+++ b/app/test/autotest_data.py
@@ -99,6 +99,18 @@
"Report": None,
},
{
+ "Name": "K32V64 hash autotest",
+ "Command": "k32v64_hash_autotest",
+ "Func": default_autotest,
+ "Report": None,
+ },
+ {
+ "Name": "K32V64 hash autotest",
+ "Command": "k32v64_hash_slow_autotest",
+ "Func": default_autotest,
+ "Report": None,
+ },
+ {
"Name": "LPM autotest",
"Command": "lpm_autotest",
"Func": default_autotest,
diff --git a/app/test/meson.build b/app/test/meson.build
index 0a2ce71..cb619ba 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -45,6 +45,7 @@ test_sources = files('commands.c',
'test_eventdev.c',
'test_external_mem.c',
'test_fbarray.c',
+ 'test_k32v64_hash.c',
'test_fib.c',
'test_fib_perf.c',
'test_fib6.c',
@@ -185,6 +186,7 @@ fast_test_names = [
'flow_classify_autotest',
'hash_autotest',
'interrupt_autotest',
+ 'k32v64_hash_autotest',
'logs_autotest',
'lpm_autotest',
'lpm6_autotest',
@@ -270,6 +272,7 @@ perf_test_names = [
'rand_perf_autotest',
'hash_readwrite_perf_autotest',
'hash_readwrite_lf_perf_autotest',
+ 'k32v64_hash_slow_autotest',
]
driver_test_names = [
diff --git a/app/test/test_k32v64_hash.c b/app/test/test_k32v64_hash.c
new file mode 100644
index 0000000..3cf3b8d
--- /dev/null
+++ b/app/test/test_k32v64_hash.c
@@ -0,0 +1,229 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include <stdio.h>
+#include <stdint.h>
+#include <stdlib.h>
+
+#include <rte_lcore.h>
+#include <rte_k32v64_hash.h>
+#include <rte_hash_crc.h>
+#include <rte_errno.h>
+
+#include "test.h"
+
+typedef int32_t (*rte_k32v64_hash_test)(void);
+
+static int32_t test_create_invalid(void);
+static int32_t test_multiple_create(void);
+static int32_t test_free_null(void);
+static int32_t test_add_del_invalid(void);
+static int32_t test_basic(void);
+
+#define MAX_ENT (1 << 22)
+
+/*
+ * Check that rte_k32v64_hash_create fails gracefully for incorrect user input
+ * arguments
+ */
+int32_t
+test_create_invalid(void)
+{
+ struct rte_k32v64_hash_table *k32v64_hash = NULL;
+ struct rte_k32v64_hash_params config;
+
+ /* rte_k32v64_hash_create: k32v64_hash name == NULL */
+ config.name = NULL;
+ k32v64_hash = rte_k32v64_hash_create(&config);
+ RTE_TEST_ASSERT(k32v64_hash == NULL,
+ "Call succeeded with invalid parameters\n");
+ config.name = "test_k32v64_hash";
+
+ /* rte_k32v64_hash_create: config == NULL */
+ k32v64_hash = rte_k32v64_hash_create(NULL);
+ RTE_TEST_ASSERT(k32v64_hash == NULL,
+ "Call succeeded with invalid parameters\n");
+
+ /* socket_id < -1 is invalid */
+ config.socket_id = -2;
+ k32v64_hash = rte_k32v64_hash_create(&config);
+ RTE_TEST_ASSERT(k32v64_hash == NULL,
+ "Call succeeded with invalid parameters\n");
+ config.socket_id = rte_socket_id();
+
+ /* rte_k32v64_hash_create: entries = 0 */
+ config.entries = 0;
+ k32v64_hash = rte_k32v64_hash_create(&config);
+ RTE_TEST_ASSERT(k32v64_hash == NULL,
+ "Call succeeded with invalid parameters\n");
+ config.entries = MAX_ENT;
+
+ return TEST_SUCCESS;
+}
+
+/*
+ * Create a k32v64_hash table then free it, 100 times.
+ * Use a slightly different number of entries each time.
+ */
+int32_t
+test_multiple_create(void)
+{
+ struct rte_k32v64_hash_table *k32v64_hash = NULL;
+ struct rte_k32v64_hash_params config;
+ int32_t i;
+
+
+ for (i = 0; i < 100; i++) {
+ config.name = "test_k32v64_hash";
+ config.socket_id = -1;
+ config.entries = MAX_ENT - i;
+
+ k32v64_hash = rte_k32v64_hash_create(&config);
+ RTE_TEST_ASSERT(k32v64_hash != NULL,
+ "Failed to create k32v64 hash\n");
+ rte_k32v64_hash_free(k32v64_hash);
+ }
+
+ return TEST_SUCCESS;
+}
+
+/*
+ * Call rte_k32v64_hash_free for NULL pointer user input.
+ * Note: free has no return and therefore it is impossible
+ * to check for failure but this test is added to
+ * increase function coverage metrics and to validate that
+ * freeing null does not crash.
+ */
+int32_t
+test_free_null(void)
+{
+ struct rte_k32v64_hash_table *k32v64_hash = NULL;
+ struct rte_k32v64_hash_params config;
+
+ config.name = "test_k32v64";
+ config.socket_id = -1;
+ config.entries = MAX_ENT;
+
+ k32v64_hash = rte_k32v64_hash_create(&config);
+ RTE_TEST_ASSERT(k32v64_hash != NULL, "Failed to create k32v64 hash\n");
+
+ rte_k32v64_hash_free(k32v64_hash);
+ rte_k32v64_hash_free(NULL);
+ return TEST_SUCCESS;
+}
+
+/*
+ * Check that rte_k32v64_hash_add fails gracefully for
+ * incorrect user input arguments
+ */
+int32_t
+test_add_del_invalid(void)
+{
+ uint32_t key = 10;
+ uint64_t val = 20;
+ int ret;
+
+ /* rte_k32v64_hash_add: k32v64_hash == NULL */
+ ret = rte_k32v64_hash_add(NULL, key, rte_hash_crc_4byte(key, 0), val);
+ RTE_TEST_ASSERT(ret == -EINVAL,
+ "Call succeeded with invalid parameters\n");
+
+ /* rte_k32v64_hash_delete: k32v64_hash == NULL */
+ ret = rte_k32v64_hash_delete(NULL, key, rte_hash_crc_4byte(key, 0));
+ RTE_TEST_ASSERT(ret == -EINVAL,
+ "Call succeeded with invalid parameters\n");
+
+ return TEST_SUCCESS;
+}
+
+/*
+ * Call add, lookup and delete for a single rule
+ */
+int32_t
+test_basic(void)
+{
+ struct rte_k32v64_hash_table *k32v64_hash = NULL;
+ struct rte_k32v64_hash_params config;
+ uint32_t key = 10;
+ uint64_t value = 20;
+ uint64_t ret_val = 0;
+ int ret;
+
+ config.name = "test_k32v64";
+ config.socket_id = -1;
+ config.entries = MAX_ENT;
+
+ k32v64_hash = rte_k32v64_hash_create(&config);
+ RTE_TEST_ASSERT(k32v64_hash != NULL, "Failed to create k32v64 hash\n");
+
+ ret = rte_k32v64_hash_lookup(k32v64_hash, key,
+ rte_hash_crc_4byte(key, 0), &ret_val);
+ RTE_TEST_ASSERT(ret == -ENOENT, "Lookup return incorrect result\n");
+
+ ret = rte_k32v64_hash_delete(k32v64_hash, key,
+ rte_hash_crc_4byte(key, 0));
+ RTE_TEST_ASSERT(ret == -ENOENT, "Delete return incorrect result\n");
+
+ ret = rte_k32v64_hash_add(k32v64_hash, key,
+ rte_hash_crc_4byte(key, 0), value);
+ RTE_TEST_ASSERT(ret == 0, "Can not add key into the table\n");
+
+ ret = rte_k32v64_hash_lookup(k32v64_hash, key,
+ rte_hash_crc_4byte(key, 0), &ret_val);
+ RTE_TEST_ASSERT(((ret == 0) && (value == ret_val)),
+ "Lookup return incorrect result\n");
+
+ ret = rte_k32v64_hash_delete(k32v64_hash, key,
+ rte_hash_crc_4byte(key, 0));
+ RTE_TEST_ASSERT(ret == 0, "Can not delete key from table\n");
+
+ ret = rte_k32v64_hash_lookup(k32v64_hash, key,
+ rte_hash_crc_4byte(key, 0), &ret_val);
+ RTE_TEST_ASSERT(ret == -ENOENT, "Lookup return incorrect result\n");
+
+ rte_k32v64_hash_free(k32v64_hash);
+
+ return TEST_SUCCESS;
+}
+
+static struct unit_test_suite k32v64_hash_tests = {
+ .suite_name = "k32v64_hash autotest",
+ .setup = NULL,
+ .teardown = NULL,
+ .unit_test_cases = {
+ TEST_CASE(test_create_invalid),
+ TEST_CASE(test_free_null),
+ TEST_CASE(test_add_del_invalid),
+ TEST_CASE(test_basic),
+ TEST_CASES_END()
+ }
+};
+
+static struct unit_test_suite k32v64_hash_slow_tests = {
+ .suite_name = "k32v64_hash slow autotest",
+ .setup = NULL,
+ .teardown = NULL,
+ .unit_test_cases = {
+ TEST_CASE(test_multiple_create),
+ TEST_CASES_END()
+ }
+};
+
+/*
+ * Do all unit tests.
+ */
+static int
+test_k32v64_hash(void)
+{
+ return unit_test_suite_runner(&k32v64_hash_tests);
+}
+
+static int
+test_slow_k32v64_hash(void)
+{
+ return unit_test_suite_runner(&k32v64_hash_slow_tests);
+}
+
+REGISTER_TEST_COMMAND(k32v64_hash_autotest, test_k32v64_hash);
+REGISTER_TEST_COMMAND(k32v64_hash_slow_autotest, test_slow_k32v64_hash);
--
2.7.4
^ permalink raw reply [flat|nested] 56+ messages in thread
* [dpdk-dev] [PATCH v2 4/4] test: add k32v64 perf tests
2020-03-16 13:38 [dpdk-dev] [PATCH 0/3] add new Double Word Key hash table Vladimir Medvedkin
` (7 preceding siblings ...)
2020-04-08 18:19 ` [dpdk-dev] [PATCH v2 3/4] test: add k32v64 hash autotests Vladimir Medvedkin
@ 2020-04-08 18:19 ` Vladimir Medvedkin
8 siblings, 0 replies; 56+ messages in thread
From: Vladimir Medvedkin @ 2020-04-08 18:19 UTC (permalink / raw)
To: dev; +Cc: konstantin.ananyev, yipeng1.wang, sameh.gobriel, bruce.richardson
Add performance tests for rte_k32v64_hash
Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
---
app/test/test_hash_perf.c | 83 +++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 83 insertions(+)
diff --git a/app/test/test_hash_perf.c b/app/test/test_hash_perf.c
index a438eae..0a02445 100644
--- a/app/test/test_hash_perf.c
+++ b/app/test/test_hash_perf.c
@@ -12,8 +12,10 @@
#include <rte_hash_crc.h>
#include <rte_jhash.h>
#include <rte_fbk_hash.h>
+#include <rte_k32v64_hash.h>
#include <rte_random.h>
#include <rte_string_fns.h>
+#include <rte_hash_crc.h>
#include "test.h"
@@ -29,6 +31,8 @@
#define NUM_SHUFFLES 10
#define BURST_SIZE 16
+#define CRC_INIT_VAL 0xdeadbeef
+
enum operations {
ADD = 0,
LOOKUP,
@@ -669,6 +673,82 @@ fbk_hash_perf_test(void)
}
static int
+k32v64_hash_perf_test(void)
+{
+ struct rte_k32v64_hash_params params = {
+ .name = "k32v64_hash_test",
+ .entries = ENTRIES * 2,
+ .socket_id = rte_socket_id(),
+ };
+ struct rte_k32v64_hash_table *handle = NULL;
+ uint32_t *keys = NULL;
+ unsigned int indexes[TEST_SIZE];
+ uint64_t tmp_val;
+ uint64_t lookup_time = 0;
+ uint64_t begin;
+ uint64_t end;
+ unsigned int added = 0;
+ uint32_t key;
+ uint64_t val;
+ unsigned int i, j;
+ int ret = 0;
+
+ handle = rte_k32v64_hash_create(&params);
+ if (handle == NULL) {
+ printf("Error creating table\n");
+ return -1;
+ }
+
+ keys = rte_zmalloc(NULL, ENTRIES * sizeof(*keys), 0);
+ if (keys == NULL) {
+ printf("fbk hash: memory allocation for key store failed\n");
+ return -1;
+ }
+
+ /* Generate random keys and values. */
+ for (i = 0; i < ENTRIES; i++) {
+ key = (uint32_t)rte_rand();
+ val = rte_rand();
+
+ if (rte_k32v64_hash_add(handle, key, rte_hash_crc_4byte(key,
+ CRC_INIT_VAL), val) == 0) {
+ keys[added] = key;
+ added++;
+ }
+ }
+
+ for (i = 0; i < TEST_ITERATIONS; i++) {
+
+ /* Generate random indexes into keys[] array. */
+ for (j = 0; j < TEST_SIZE; j++)
+ indexes[j] = rte_rand() % added;
+
+ begin = rte_rdtsc();
+ /* Do lookups */
+
+ for (j = 0; j < TEST_SIZE; j++)
+ ret += rte_k32v64_hash_lookup(handle,
+ keys[indexes[j]],
+ rte_hash_crc_4byte(keys[indexes[j]],
+ CRC_INIT_VAL), &tmp_val);
+
+
+ end = rte_rdtsc();
+ lookup_time += (double)(end - begin);
+ }
+
+ printf("\n\n *** K32V64 Hash function performance test results ***\n");
+ if (ret == 0)
+ printf("Number of ticks per lookup = %g\n",
+ (double)lookup_time /
+ ((double)TEST_ITERATIONS * (double)TEST_SIZE));
+
+ rte_k32v64_hash_free(handle);
+
+ return 0;
+}
+
+static int
test_hash_perf(void)
{
unsigned int with_pushes, with_locks;
@@ -695,6 +775,9 @@ test_hash_perf(void)
if (fbk_hash_perf_test() < 0)
return -1;
+ if (k32v64_hash_perf_test() < 0)
+ return -1;
+
return 0;
}
--
2.7.4
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [dpdk-dev] [PATCH v2 1/4] hash: add k32v64 hash library
2020-04-08 18:19 ` [dpdk-dev] [PATCH v2 1/4] hash: add k32v64 hash library Vladimir Medvedkin
@ 2020-04-08 23:23 ` Ananyev, Konstantin
0 siblings, 0 replies; 56+ messages in thread
From: Ananyev, Konstantin @ 2020-04-08 23:23 UTC (permalink / raw)
To: Medvedkin, Vladimir, dev; +Cc: Wang, Yipeng1, Gobriel, Sameh, Richardson, Bruce
Hi Vladimir,
I didn't look at the actual implementation (yet), just some
compatibility comments.
> K32V64 hash is a hash table that supports 32 bit keys and 64 bit values.
> This table is hash function agnostic, so the user must provide
> a precalculated hash signature for add/delete/lookup operations.
>
> Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
> ---
> diff --git a/lib/librte_hash/rte_k32v64_hash.h b/lib/librte_hash/rte_k32v64_hash.h
> new file mode 100644
> index 0000000..d25660c
> --- /dev/null
> +++ b/lib/librte_hash/rte_k32v64_hash.h
> @@ -0,0 +1,214 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Intel Corporation
> + */
> +
> +#ifndef _RTE_K32V64_HASH_H_
> +#define _RTE_K32V64_HASH_H_
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include <rte_compat.h>
> +#include <rte_atomic.h>
> +#include <rte_mempool.h>
> +
> +#include <immintrin.h>
How is that supposed to compile on a non-x86 box?
> +
> +#define RTE_K32V64_HASH_NAMESIZE 32
> +#define RTE_K32V64_KEYS_PER_BUCKET 4
> +#define RTE_K32V64_WRITE_IN_PROGRESS 1
> +
> +struct rte_k32v64_hash_params {
> + const char *name;
> + uint32_t entries;
> + int socket_id;
> +};
> +
> +struct rte_k32v64_ext_ent {
> + SLIST_ENTRY(rte_k32v64_ext_ent) next;
> + uint32_t key;
> + uint64_t val;
> +};
> +
> +struct rte_k32v64_hash_bucket {
> + uint32_t key[RTE_K32V64_KEYS_PER_BUCKET];
> + uint64_t val[RTE_K32V64_KEYS_PER_BUCKET];
> + uint8_t key_mask;
> + rte_atomic32_t cnt;
> + SLIST_HEAD(rte_k32v64_list_head, rte_k32v64_ext_ent) head;
> +} __rte_cache_aligned;
> +
> +struct rte_k32v64_hash_table {
> + char name[RTE_K32V64_HASH_NAMESIZE]; /**< Name of the hash. */
> + uint32_t nb_ent;
> + uint32_t nb_ext_ent;
> + uint32_t max_ent;
> + uint32_t bucket_msk;
> + struct rte_mempool *ext_ent_pool;
> + __extension__ struct rte_k32v64_hash_bucket t[0];
> +};
> +
> +static inline int
> +cmp_keys(struct rte_k32v64_hash_bucket *bucket, uint32_t key,
> + uint64_t *val)
> +{
> + int i;
> +
> + for (i = 0; i < RTE_K32V64_KEYS_PER_BUCKET; i++) {
> + if ((key == bucket->key[i]) &&
> + (bucket->key_mask & (1 << i))) {
> + *val = bucket->val[i];
> + return 1;
> + }
> + }
> +
> + return 0;
> +}
> +
> +#ifdef __AVX512VL__
> +static inline int
> +cmp_keys_vec(struct rte_k32v64_hash_bucket *bucket, uint32_t key,
> + uint64_t *val)
> +{
> + __m128i keys, srch_key;
> + __mmask8 msk;
> +
> + keys = _mm_load_si128((void *)bucket);
> + srch_key = _mm_set1_epi32(key);
> +
> + msk = _mm_mask_cmpeq_epi32_mask(bucket->key_mask, keys, srch_key);
What if you run it on an IA CPU without AVX512 support?
I think you need some run-time selection there to decide which function to use,
depending on the underlying HW (see the dispatch sketch at the end of this message).
> + if (msk) {
> + *val = bucket->val[__builtin_ctz(msk)];
> + return 1;
> + }
> +
> + return 0;
> +}
> +#endif
> +
> +static inline int
> +rte_k32v64_hash_lookup(struct rte_k32v64_hash_table *table, uint32_t key,
> + uint32_t hash, uint64_t *value)
> +{
> + uint64_t val = 0;
> + struct rte_k32v64_ext_ent *ent;
> + int32_t cnt;
> + int i __rte_unused, found = 0;
> + uint32_t bucket = hash & table->bucket_msk;
> +
> + do {
> + do
> + cnt = rte_atomic32_read(&table->t[bucket].cnt);
> + while (unlikely(cnt & RTE_K32V64_WRITE_IN_PROGRESS));
> +
> +#ifdef __AVX512VL__
> + found = cmp_keys_vec(&table->t[bucket], key, &val);
> +#else
> + found = cmp_keys(&table->t[bucket], key, &val);
> +#endif
> + if (unlikely((found == 0) &&
> + (!SLIST_EMPTY(&table->t[bucket].head)))) {
> + SLIST_FOREACH(ent, &table->t[bucket].head, next) {
> + if (ent->key == key) {
> + val = ent->val;
> + found = 1;
> + break;
> + }
> + }
> + }
> +
> + } while (unlikely(cnt != rte_atomic32_read(&table->t[bucket].cnt)));
> +
> + if (found == 1) {
> + *value = val;
> + return 0;
> + } else
> + return -ENOENT;
> +}
> +
> +/**
> + * Add a key with a given hash value to an existing hash table.
> + * This operation is not multi-thread safe
> + * and should only be called from one thread.
> + *
> + * @param table
> + * Hash table to add the key to.
> + * @param key
> + * Key to add to the hash table.
> + * @param value
> + * Value to associate with key.
> + * @param hash
> + * Hash value associated with key.
> + * @return
> + * 0 if ok, or negative value on error.
> + */
> +__rte_experimental
> +int
> +rte_k32v64_hash_add(struct rte_k32v64_hash_table *table, uint32_t key,
> + uint32_t hash, uint64_t value);
> +
> +/**
> + * Remove a key with a given hash value from an existing hash table.
> + * This operation is not multi-thread
> + * safe and should only be called from one thread.
> + *
> + * @param table
> + * Hash table to remove the key from.
> + * @param key
> + * Key to remove from the hash table.
> + * @param hash
> + * hash value associated with key.
> + * @return
> + * 0 if ok, or negative value on error.
> + */
> +__rte_experimental
> +int
> +rte_k32v64_hash_delete(struct rte_k32v64_hash_table *table, uint32_t key,
> + uint32_t hash);
> +
> +
> +/**
> + * Performs a lookup for an existing hash table, and returns a pointer to
> + * the table if found.
> + *
> + * @param name
> + * Name of the hash table to find
> + *
> + * @return
> + * pointer to hash table structure or NULL on error with rte_errno
> + * set appropriately.
> + */
> +__rte_experimental
> +struct rte_k32v64_hash_table *
> +rte_k32v64_hash_find_existing(const char *name);
> +
> +/**
> + * Create a new hash table for use with four byte keys.
> + *
> + * @param params
> + * Parameters used in creation of hash table.
> + *
> + * @return
> + * Pointer to hash table structure that is used in future hash table
> + * operations, or NULL on error with rte_errno set appropriately.
> + */
> +__rte_experimental
> +struct rte_k32v64_hash_table *
> +rte_k32v64_hash_create(const struct rte_k32v64_hash_params *params);
> +
> +/**
> + * Free all memory used by a hash table.
> + *
> + * @param table
> + * Hash table to deallocate.
> + */
> +__rte_experimental
> +void
> +rte_k32v64_hash_free(struct rte_k32v64_hash_table *table);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_K32V64_HASH_H_ */
> --
> 2.7.4
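
To illustrate the run-time selection suggested in the comment above: a
sketch that keys off the existing EAL helper rte_cpu_get_flag_enabled(),
assuming the AVX512VL comparison routine is compiled in a separate object
built with -mavx512vl (as the v3 patch does). The function-pointer wiring
here is hypothetical, not the actual v3 implementation (where
k32v64_cmp_keys_avx512vl() is static inline inside its own object):

#include <rte_cpuflags.h>

typedef int (*k32v64_cmp_fn)(struct rte_k32v64_hash_bucket *bucket,
	uint32_t key, uint64_t *val);

/* pick the bucket-comparison routine once, e.g. at table creation time */
static k32v64_cmp_fn
k32v64_select_cmp_fn(void)
{
#ifdef CC_AVX512VL_SUPPORT
	/* call into the -mavx512vl object only if the running CPU has the ISA */
	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512VL))
		return k32v64_cmp_keys_avx512vl;
#endif
	return cmp_keys;	/* portable scalar fallback */
}

The table could then store the chosen pointer and the lookup path would
call through it, instead of selecting a variant at compile time.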
^ permalink raw reply [flat|nested] 56+ messages in thread
* [dpdk-dev] [PATCH v3 0/4] add new k32v64 hash table
2020-04-08 18:19 ` [dpdk-dev] [PATCH v2 0/4] add new k32v64 " Vladimir Medvedkin
@ 2020-04-15 18:17 ` Vladimir Medvedkin
2020-04-15 18:51 ` Mattias Rönnblom
` (6 more replies)
2020-04-15 18:17 ` [dpdk-dev] [PATCH v3 1/4] hash: add k32v64 hash library Vladimir Medvedkin
` (3 subsequent siblings)
4 siblings, 7 replies; 56+ messages in thread
From: Vladimir Medvedkin @ 2020-04-15 18:17 UTC (permalink / raw)
To: dev; +Cc: konstantin.ananyev, yipeng1.wang, sameh.gobriel, bruce.richardson
Currently DPDK has a special implementation of a hash table for
4 byte keys which is called FBK hash. Unfortunately its main drawback
is that it only supports 2 byte values.
The new implementation, called K32V64 hash,
supports 4 byte keys and 8 byte associated values,
which is enough to store a pointer.
It would also be nice to get feedback on whether to keep both the old FBK
and the new k32v64 implementations, or to deprecate the old one.
v3:
- added bulk lookup
- avx512 key comparison is removed from .h
v2:
- renamed from rte_dwk to rte_k32v64 as was suggested
- reworked lookup function, added inlined subroutines
- added avx512 key comparison routine
- added documentation
- added statistic counters for total entries and extended entries (linked list)
Vladimir Medvedkin (4):
hash: add k32v64 hash library
hash: add documentation for k32v64 hash library
test: add k32v64 hash autotests
test: add k32v64 perf tests
app/test/Makefile | 1 +
app/test/autotest_data.py | 12 ++
app/test/meson.build | 3 +
app/test/test_hash_perf.c | 130 ++++++++++++
app/test/test_k32v64_hash.c | 229 ++++++++++++++++++++++
doc/api/doxy-api-index.md | 1 +
doc/guides/prog_guide/index.rst | 1 +
doc/guides/prog_guide/k32v64_hash_lib.rst | 66 +++++++
lib/Makefile | 2 +-
lib/librte_hash/Makefile | 13 +-
lib/librte_hash/k32v64_hash_avx512vl.c | 56 ++++++
lib/librte_hash/meson.build | 17 +-
lib/librte_hash/rte_hash_version.map | 6 +-
lib/librte_hash/rte_k32v64_hash.c | 315 ++++++++++++++++++++++++++++++
lib/librte_hash/rte_k32v64_hash.h | 211 ++++++++++++++++++++
15 files changed, 1058 insertions(+), 5 deletions(-)
create mode 100644 app/test/test_k32v64_hash.c
create mode 100644 doc/guides/prog_guide/k32v64_hash_lib.rst
create mode 100644 lib/librte_hash/k32v64_hash_avx512vl.c
create mode 100644 lib/librte_hash/rte_k32v64_hash.c
create mode 100644 lib/librte_hash/rte_k32v64_hash.h
--
2.7.4
^ permalink raw reply [flat|nested] 56+ messages in thread
* [dpdk-dev] [PATCH v3 1/4] hash: add k32v64 hash library
2020-04-08 18:19 ` [dpdk-dev] [PATCH v2 0/4] add new k32v64 " Vladimir Medvedkin
2020-04-15 18:17 ` [dpdk-dev] [PATCH v3 " Vladimir Medvedkin
@ 2020-04-15 18:17 ` Vladimir Medvedkin
2020-04-15 18:59 ` Mattias Rönnblom
` (2 more replies)
2020-04-15 18:17 ` [dpdk-dev] [PATCH v3 2/4] hash: add documentation for " Vladimir Medvedkin
` (2 subsequent siblings)
4 siblings, 3 replies; 56+ messages in thread
From: Vladimir Medvedkin @ 2020-04-15 18:17 UTC (permalink / raw)
To: dev; +Cc: konstantin.ananyev, yipeng1.wang, sameh.gobriel, bruce.richardson
K32V64 hash is a hash table that supports 32 bit keys and 64 bit values.
This table is hash function agnostic, so the user must provide
a precalculated hash signature for add/delete/lookup operations.
Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
---
lib/Makefile | 2 +-
lib/librte_hash/Makefile | 13 +-
lib/librte_hash/k32v64_hash_avx512vl.c | 56 ++++++
lib/librte_hash/meson.build | 17 +-
lib/librte_hash/rte_hash_version.map | 6 +-
lib/librte_hash/rte_k32v64_hash.c | 315 +++++++++++++++++++++++++++++++++
lib/librte_hash/rte_k32v64_hash.h | 211 ++++++++++++++++++++++
7 files changed, 615 insertions(+), 5 deletions(-)
create mode 100644 lib/librte_hash/k32v64_hash_avx512vl.c
create mode 100644 lib/librte_hash/rte_k32v64_hash.c
create mode 100644 lib/librte_hash/rte_k32v64_hash.h
diff --git a/lib/Makefile b/lib/Makefile
index 46b91ae..a8c02e4 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -48,7 +48,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
DEPDIRS-librte_vhost := librte_eal librte_mempool librte_mbuf librte_ethdev \
librte_net librte_hash librte_cryptodev
DIRS-$(CONFIG_RTE_LIBRTE_HASH) += librte_hash
-DEPDIRS-librte_hash := librte_eal librte_ring
+DEPDIRS-librte_hash := librte_eal librte_ring librte_mempool
DIRS-$(CONFIG_RTE_LIBRTE_EFD) += librte_efd
DEPDIRS-librte_efd := librte_eal librte_ring librte_hash
DIRS-$(CONFIG_RTE_LIBRTE_RIB) += librte_rib
diff --git a/lib/librte_hash/Makefile b/lib/librte_hash/Makefile
index ec9f864..023689d 100644
--- a/lib/librte_hash/Makefile
+++ b/lib/librte_hash/Makefile
@@ -8,13 +8,14 @@ LIB = librte_hash.a
CFLAGS += -O3
CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
-LDLIBS += -lrte_eal -lrte_ring
+LDLIBS += -lrte_eal -lrte_ring -lrte_mempool
EXPORT_MAP := rte_hash_version.map
# all source are stored in SRCS-y
SRCS-$(CONFIG_RTE_LIBRTE_HASH) := rte_cuckoo_hash.c
SRCS-$(CONFIG_RTE_LIBRTE_HASH) += rte_fbk_hash.c
+SRCS-$(CONFIG_RTE_LIBRTE_HASH) += rte_k32v64_hash.c
# install this header file
SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include := rte_hash.h
@@ -27,5 +28,15 @@ endif
SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_jhash.h
SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_thash.h
SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_fbk_hash.h
+SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_k32v64_hash.h
+
+CC_AVX512VL_SUPPORT=$(shell $(CC) -mavx512vl -dM -E - </dev/null 2>&1 | \
+grep -q __AVX512VL__ && echo 1)
+
+ifeq ($(CC_AVX512VL_SUPPORT), 1)
+ SRCS-$(CONFIG_RTE_LIBRTE_HASH) += k32v64_hash_avx512vl.c
+ CFLAGS_k32v64_hash_avx512vl.o += -mavx512vl
+ CFLAGS_rte_k32v64_hash.o += -DCC_AVX512VL_SUPPORT
+endif
include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_hash/k32v64_hash_avx512vl.c b/lib/librte_hash/k32v64_hash_avx512vl.c
new file mode 100644
index 0000000..7c70dd2
--- /dev/null
+++ b/lib/librte_hash/k32v64_hash_avx512vl.c
@@ -0,0 +1,56 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include <rte_k32v64_hash.h>
+
+int
+k32v64_hash_bulk_lookup_avx512vl(struct rte_k32v64_hash_table *table,
+ uint32_t *keys, uint32_t *hashes, uint64_t *values, unsigned int n);
+
+static inline int
+k32v64_cmp_keys_avx512vl(struct rte_k32v64_hash_bucket *bucket, uint32_t key,
+ uint64_t *val)
+{
+ __m128i keys, srch_key;
+ __mmask8 msk;
+
+ keys = _mm_load_si128((void *)bucket);
+ srch_key = _mm_set1_epi32(key);
+
+ msk = _mm_mask_cmpeq_epi32_mask(bucket->key_mask, keys, srch_key);
+ if (msk) {
+ *val = bucket->val[__builtin_ctz(msk)];
+ return 1;
+ }
+
+ return 0;
+}
+
+static inline int
+k32v64_hash_lookup_avx512vl(struct rte_k32v64_hash_table *table, uint32_t key,
+ uint32_t hash, uint64_t *value)
+{
+ return __k32v64_hash_lookup(table, key, hash, value,
+ k32v64_cmp_keys_avx512vl);
+}
+
+int
+k32v64_hash_bulk_lookup_avx512vl(struct rte_k32v64_hash_table *table,
+ uint32_t *keys, uint32_t *hashes, uint64_t *values, unsigned int n)
+{
+ int ret, cnt = 0;
+ unsigned int i;
+
+ if (unlikely((table == NULL) || (keys == NULL) || (hashes == NULL) ||
+ (values == NULL)))
+ return -EINVAL;
+
+ for (i = 0; i < n; i++) {
+ ret = k32v64_hash_lookup_avx512vl(table, keys[i], hashes[i],
+ &values[i]);
+ if (ret == 0)
+ cnt++;
+ }
+ return cnt;
+}
diff --git a/lib/librte_hash/meson.build b/lib/librte_hash/meson.build
index 6ab46ae..8a37cc4 100644
--- a/lib/librte_hash/meson.build
+++ b/lib/librte_hash/meson.build
@@ -3,10 +3,23 @@
headers = files('rte_crc_arm64.h',
'rte_fbk_hash.h',
+ 'rte_k32v64_hash.h',
'rte_hash_crc.h',
'rte_hash.h',
'rte_jhash.h',
'rte_thash.h')
-sources = files('rte_cuckoo_hash.c', 'rte_fbk_hash.c')
-deps += ['ring']
+sources = files('rte_cuckoo_hash.c', 'rte_fbk_hash.c', 'rte_k32v64_hash.c')
+deps += ['ring', 'mempool']
+
+if dpdk_conf.has('RTE_ARCH_X86')
+ if cc.has_argument('-mavx512vl')
+ avx512_tmplib = static_library('avx512_tmp',
+ 'k32v64_hash_avx512vl.c',
+ dependencies: static_rte_mempool,
+ c_args: cflags + ['-mavx512vl'])
+ objs += avx512_tmplib.extract_objects('k32v64_hash_avx512vl.c')
+ cflags += '-DCC_AVX512VL_SUPPORT'
+
+ endif
+endif
diff --git a/lib/librte_hash/rte_hash_version.map b/lib/librte_hash/rte_hash_version.map
index a8fbbc3..9a4f2f6 100644
--- a/lib/librte_hash/rte_hash_version.map
+++ b/lib/librte_hash/rte_hash_version.map
@@ -34,5 +34,9 @@ EXPERIMENTAL {
rte_hash_free_key_with_position;
rte_hash_max_key_id;
-
+ rte_k32v64_hash_create;
+ rte_k32v64_hash_find_existing;
+ rte_k32v64_hash_free;
+ rte_k32v64_hash_add;
+ rte_k32v64_hash_delete;
};
diff --git a/lib/librte_hash/rte_k32v64_hash.c b/lib/librte_hash/rte_k32v64_hash.c
new file mode 100644
index 0000000..7ed94b4
--- /dev/null
+++ b/lib/librte_hash/rte_k32v64_hash.c
@@ -0,0 +1,315 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include <string.h>
+
+#include <rte_eal_memconfig.h>
+#include <rte_errno.h>
+#include <rte_malloc.h>
+#include <rte_memory.h>
+#include <rte_tailq.h>
+
+#include <rte_k32v64_hash.h>
+
+TAILQ_HEAD(rte_k32v64_hash_list, rte_tailq_entry);
+
+static struct rte_tailq_elem rte_k32v64_hash_tailq = {
+ .name = "RTE_K32V64_HASH",
+};
+
+EAL_REGISTER_TAILQ(rte_k32v64_hash_tailq);
+
+#define VALID_KEY_MSK ((1 << RTE_K32V64_KEYS_PER_BUCKET) - 1)
+
+#ifdef CC_AVX512VL_SUPPORT
+int
+k32v64_hash_bulk_lookup_avx512vl(struct rte_k32v64_hash_table *table,
+ uint32_t *keys, uint32_t *hashes, uint64_t *values, unsigned int n);
+#endif
+
+static int
+k32v64_hash_bulk_lookup(struct rte_k32v64_hash_table *table, uint32_t *keys,
+ uint32_t *hashes, uint64_t *values, unsigned int n)
+{
+ int ret, cnt = 0;
+ unsigned int i;
+
+ if (unlikely((table == NULL) || (keys == NULL) || (hashes == NULL) ||
+ (values == NULL)))
+ return -EINVAL;
+
+ for (i = 0; i < n; i++) {
+ ret = rte_k32v64_hash_lookup(table, keys[i], hashes[i],
+ &values[i]);
+ if (ret == 0)
+ cnt++;
+ }
+ return cnt;
+}
+
+static rte_k32v64_hash_bulk_lookup_t
+get_lookup_bulk_fn(void)
+{
+#ifdef CC_AVX512VL_SUPPORT
+ if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512VL))
+ return k32v64_hash_bulk_lookup_avx512vl;
+#endif
+ return k32v64_hash_bulk_lookup;
+}
+
+int
+rte_k32v64_hash_add(struct rte_k32v64_hash_table *table, uint32_t key,
+ uint32_t hash, uint64_t value)
+{
+ uint32_t bucket;
+ int i, idx, ret;
+ uint8_t msk;
+ struct rte_k32v64_ext_ent *tmp, *ent, *prev = NULL;
+
+ if (table == NULL)
+ return -EINVAL;
+
+ bucket = hash & table->bucket_msk;
+ /* Search key in table. Update value if exists */
+ for (i = 0; i < RTE_K32V64_KEYS_PER_BUCKET; i++) {
+ if ((key == table->t[bucket].key[i]) &&
+ (table->t[bucket].key_mask & (1 << i))) {
+ table->t[bucket].val[i] = value;
+ return 0;
+ }
+ }
+
+ if (!SLIST_EMPTY(&table->t[bucket].head)) {
+ SLIST_FOREACH(ent, &table->t[bucket].head, next) {
+ if (ent->key == key) {
+ ent->val = value;
+ return 0;
+ }
+ }
+ }
+
+ msk = ~table->t[bucket].key_mask & VALID_KEY_MSK;
+ if (msk) {
+ idx = __builtin_ctz(msk);
+ table->t[bucket].key[idx] = key;
+ table->t[bucket].val[idx] = value;
+ rte_smp_wmb();
+ table->t[bucket].key_mask |= 1 << idx;
+ table->nb_ent++;
+ return 0;
+ }
+
+ ret = rte_mempool_get(table->ext_ent_pool, (void **)&ent);
+ if (ret < 0)
+ return ret;
+
+ SLIST_NEXT(ent, next) = NULL;
+ ent->key = key;
+ ent->val = value;
+ rte_smp_wmb();
+ SLIST_FOREACH(tmp, &table->t[bucket].head, next)
+ prev = tmp;
+
+ if (prev == NULL)
+ SLIST_INSERT_HEAD(&table->t[bucket].head, ent, next);
+ else
+ SLIST_INSERT_AFTER(prev, ent, next);
+
+ table->nb_ent++;
+ table->nb_ext_ent++;
+ return 0;
+}
+
+int
+rte_k32v64_hash_delete(struct rte_k32v64_hash_table *table, uint32_t key,
+ uint32_t hash)
+{
+ uint32_t bucket;
+ int i;
+ struct rte_k32v64_ext_ent *ent;
+
+ if (table == NULL)
+ return -EINVAL;
+
+ bucket = hash & table->bucket_msk;
+
+ for (i = 0; i < RTE_K32V64_KEYS_PER_BUCKET; i++) {
+ if ((key == table->t[bucket].key[i]) &&
+ (table->t[bucket].key_mask & (1 << i))) {
+ ent = SLIST_FIRST(&table->t[bucket].head);
+ if (ent) {
+ rte_atomic32_inc(&table->t[bucket].cnt);
+ table->t[bucket].key[i] = ent->key;
+ table->t[bucket].val[i] = ent->val;
+ SLIST_REMOVE_HEAD(&table->t[bucket].head, next);
+ rte_atomic32_inc(&table->t[bucket].cnt);
+ table->nb_ext_ent--;
+ } else
+ table->t[bucket].key_mask &= ~(1 << i);
+ if (ent)
+ rte_mempool_put(table->ext_ent_pool, ent);
+ table->nb_ent--;
+ return 0;
+ }
+ }
+
+ SLIST_FOREACH(ent, &table->t[bucket].head, next)
+ if (ent->key == key)
+ break;
+
+ if (ent == NULL)
+ return -ENOENT;
+
+ rte_atomic32_inc(&table->t[bucket].cnt);
+ SLIST_REMOVE(&table->t[bucket].head, ent, rte_k32v64_ext_ent, next);
+ rte_atomic32_inc(&table->t[bucket].cnt);
+ rte_mempool_put(table->ext_ent_pool, ent);
+
+ table->nb_ext_ent--;
+ table->nb_ent--;
+
+ return 0;
+}
+
+struct rte_k32v64_hash_table *
+rte_k32v64_hash_find_existing(const char *name)
+{
+ struct rte_k32v64_hash_table *h = NULL;
+ struct rte_tailq_entry *te;
+ struct rte_k32v64_hash_list *k32v64_hash_list;
+
+ k32v64_hash_list = RTE_TAILQ_CAST(rte_k32v64_hash_tailq.head,
+ rte_k32v64_hash_list);
+
+ rte_mcfg_tailq_read_lock();
+ TAILQ_FOREACH(te, k32v64_hash_list, next) {
+ h = (struct rte_k32v64_hash_table *) te->data;
+ if (strncmp(name, h->name, RTE_K32V64_HASH_NAMESIZE) == 0)
+ break;
+ }
+ rte_mcfg_tailq_read_unlock();
+ if (te == NULL) {
+ rte_errno = ENOENT;
+ return NULL;
+ }
+ return h;
+}
+
+struct rte_k32v64_hash_table *
+rte_k32v64_hash_create(const struct rte_k32v64_hash_params *params)
+{
+ char hash_name[RTE_K32V64_HASH_NAMESIZE];
+ struct rte_k32v64_hash_table *ht = NULL;
+ struct rte_tailq_entry *te;
+ struct rte_k32v64_hash_list *k32v64_hash_list;
+ uint32_t mem_size, nb_buckets, max_ent;
+ int ret;
+ struct rte_mempool *mp;
+
+ if ((params == NULL) || (params->name == NULL) ||
+ (params->entries == 0)) {
+ rte_errno = EINVAL;
+ return NULL;
+ }
+
+ k32v64_hash_list = RTE_TAILQ_CAST(rte_k32v64_hash_tailq.head,
+ rte_k32v64_hash_list);
+
+ ret = snprintf(hash_name, sizeof(hash_name), "K32V64_%s", params->name);
+ if (ret < 0 || ret >= RTE_K32V64_HASH_NAMESIZE) {
+ rte_errno = ENAMETOOLONG;
+ return NULL;
+ }
+
+ max_ent = rte_align32pow2(params->entries);
+ nb_buckets = max_ent / RTE_K32V64_KEYS_PER_BUCKET;
+ mem_size = sizeof(struct rte_k32v64_hash_table) +
+ sizeof(struct rte_k32v64_hash_bucket) * nb_buckets;
+
+ mp = rte_mempool_create(hash_name, max_ent,
+ sizeof(struct rte_k32v64_ext_ent), 0, 0, NULL, NULL, NULL, NULL,
+ params->socket_id, 0);
+
+ if (mp == NULL)
+ return NULL;
+
+ rte_mcfg_tailq_write_lock();
+ TAILQ_FOREACH(te, k32v64_hash_list, next) {
+ ht = (struct rte_k32v64_hash_table *) te->data;
+ if (strncmp(params->name, ht->name,
+ RTE_K32V64_HASH_NAMESIZE) == 0)
+ break;
+ }
+ ht = NULL;
+ if (te != NULL) {
+ rte_errno = EEXIST;
+ rte_mempool_free(mp);
+ goto exit;
+ }
+
+ te = rte_zmalloc("K32V64_HASH_TAILQ_ENTRY", sizeof(*te), 0);
+ if (te == NULL) {
+ RTE_LOG(ERR, HASH, "Failed to allocate tailq entry\n");
+ rte_mempool_free(mp);
+ goto exit;
+ }
+
+ ht = rte_zmalloc_socket(hash_name, mem_size,
+ RTE_CACHE_LINE_SIZE, params->socket_id);
+ if (ht == NULL) {
+ RTE_LOG(ERR, HASH, "Failed to allocate k32v64 hash table\n");
+ rte_free(te);
+ rte_mempool_free(mp);
+ goto exit;
+ }
+
+ memcpy(ht->name, hash_name, sizeof(ht->name));
+ ht->max_ent = max_ent;
+ ht->bucket_msk = nb_buckets - 1;
+ ht->ext_ent_pool = mp;
+ ht->lookup = get_lookup_bulk_fn();
+
+ te->data = (void *)ht;
+ TAILQ_INSERT_TAIL(k32v64_hash_list, te, next);
+
+exit:
+ rte_mcfg_tailq_write_unlock();
+
+ return ht;
+}
+
+void
+rte_k32v64_hash_free(struct rte_k32v64_hash_table *ht)
+{
+ struct rte_tailq_entry *te;
+ struct rte_k32v64_hash_list *k32v64_hash_list;
+
+ if (ht == NULL)
+ return;
+
+ k32v64_hash_list = RTE_TAILQ_CAST(rte_k32v64_hash_tailq.head,
+ rte_k32v64_hash_list);
+
+ rte_mcfg_tailq_write_lock();
+
+ /* find out tailq entry */
+ TAILQ_FOREACH(te, k32v64_hash_list, next) {
+ if (te->data == (void *) ht)
+ break;
+ }
+
+
+ if (te == NULL) {
+ rte_mcfg_tailq_write_unlock();
+ return;
+ }
+
+ TAILQ_REMOVE(k32v64_hash_list, te, next);
+
+ rte_mcfg_tailq_write_unlock();
+
+ rte_mempool_free(ht->ext_ent_pool);
+ rte_free(ht);
+ rte_free(te);
+}
diff --git a/lib/librte_hash/rte_k32v64_hash.h b/lib/librte_hash/rte_k32v64_hash.h
new file mode 100644
index 0000000..b2c52e9
--- /dev/null
+++ b/lib/librte_hash/rte_k32v64_hash.h
@@ -0,0 +1,211 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _RTE_K32V64_HASH_H_
+#define _RTE_K32V64_HASH_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_compat.h>
+#include <rte_atomic.h>
+#include <rte_mempool.h>
+
+#define RTE_K32V64_HASH_NAMESIZE 32
+#define RTE_K32V64_KEYS_PER_BUCKET 4
+#define RTE_K32V64_WRITE_IN_PROGRESS 1
+
+struct rte_k32v64_hash_params {
+ const char *name;
+ uint32_t entries;
+ int socket_id;
+};
+
+struct rte_k32v64_ext_ent {
+ SLIST_ENTRY(rte_k32v64_ext_ent) next;
+ uint32_t key;
+ uint64_t val;
+};
+
+struct rte_k32v64_hash_bucket {
+ uint32_t key[RTE_K32V64_KEYS_PER_BUCKET];
+ uint64_t val[RTE_K32V64_KEYS_PER_BUCKET];
+ uint8_t key_mask;
+ rte_atomic32_t cnt;
+ SLIST_HEAD(rte_k32v64_list_head, rte_k32v64_ext_ent) head;
+} __rte_cache_aligned;
+
+struct rte_k32v64_hash_table;
+
+typedef int (*rte_k32v64_hash_bulk_lookup_t)
+(struct rte_k32v64_hash_table *table, uint32_t *keys, uint32_t *hashes,
+ uint64_t *values, unsigned int n);
+
+struct rte_k32v64_hash_table {
+ char name[RTE_K32V64_HASH_NAMESIZE]; /**< Name of the hash. */
+ uint32_t nb_ent; /**< Number of entries in the table */
+ uint32_t nb_ext_ent; /**< Number of extended entries */
+ uint32_t max_ent; /**< Maximum number of entries */
+ uint32_t bucket_msk;
+ struct rte_mempool *ext_ent_pool;
+ rte_k32v64_hash_bulk_lookup_t lookup;
+ __extension__ struct rte_k32v64_hash_bucket t[0];
+};
+
+typedef int (*rte_k32v64_cmp_fn_t)
+(struct rte_k32v64_hash_bucket *bucket, uint32_t key, uint64_t *val);
+
+static inline int
+__k32v64_cmp_keys(struct rte_k32v64_hash_bucket *bucket, uint32_t key,
+ uint64_t *val)
+{
+ int i;
+
+ for (i = 0; i < RTE_K32V64_KEYS_PER_BUCKET; i++) {
+ if ((key == bucket->key[i]) &&
+ (bucket->key_mask & (1 << i))) {
+ *val = bucket->val[i];
+ return 1;
+ }
+ }
+
+ return 0;
+}
+
+static inline int
+__k32v64_hash_lookup(struct rte_k32v64_hash_table *table, uint32_t key,
+ uint32_t hash, uint64_t *value, rte_k32v64_cmp_fn_t cmp_f)
+{
+ uint64_t val = 0;
+ struct rte_k32v64_ext_ent *ent;
+ int32_t cnt;
+ int found = 0;
+ uint32_t bucket = hash & table->bucket_msk;
+
+ do {
+ do
+ cnt = rte_atomic32_read(&table->t[bucket].cnt);
+ while (unlikely(cnt & RTE_K32V64_WRITE_IN_PROGRESS));
+
+ found = cmp_f(&table->t[bucket], key, &val);
+ if (unlikely((found == 0) &&
+ (!SLIST_EMPTY(&table->t[bucket].head)))) {
+ SLIST_FOREACH(ent, &table->t[bucket].head, next) {
+ if (ent->key == key) {
+ val = ent->val;
+ found = 1;
+ break;
+ }
+ }
+ }
+
+ } while (unlikely(cnt != rte_atomic32_read(&table->t[bucket].cnt)));
+
+ if (found == 1) {
+ *value = val;
+ return 0;
+ } else
+ return -ENOENT;
+}
+
+static inline int
+rte_k32v64_hash_lookup(struct rte_k32v64_hash_table *table, uint32_t key,
+ uint32_t hash, uint64_t *value)
+{
+ return __k32v64_hash_lookup(table, key, hash, value, __k32v64_cmp_keys);
+}
+
+static inline int
+rte_k32v64_hash_bulk_lookup(struct rte_k32v64_hash_table *table,
+ uint32_t *keys, uint32_t *hashes, uint64_t *values, unsigned int n)
+{
+ return table->lookup(table, keys, hashes, values, n);
+}
+
+/**
+ * Add a key to an existing hash table with hash value.
+ * This operation is not multi-thread safe
+ * and should only be called from one thread.
+ *
+ * @param table
+ * Hash table to add the key to.
+ * @param key
+ * Key to add to the hash table.
+ * @param value
+ * Value to associate with key.
+ * @param hash
+ * Hash value associated with key.
+ * @return
+ * 0 if ok, or negative value on error.
+ */
+__rte_experimental
+int
+rte_k32v64_hash_add(struct rte_k32v64_hash_table *table, uint32_t key,
+ uint32_t hash, uint64_t value);
+
+/**
+ * Remove a key with a given hash value from an existing hash table.
+ * This operation is not multi-thread
+ * safe and should only be called from one thread.
+ *
+ * @param table
+ * Hash table to remove the key from.
+ * @param key
+ * Key to remove from the hash table.
+ * @param hash
+ * hash value associated with key.
+ * @return
+ * 0 if ok, or negative value on error.
+ */
+__rte_experimental
+int
+rte_k32v64_hash_delete(struct rte_k32v64_hash_table *table, uint32_t key,
+ uint32_t hash);
+
+
+/**
+ * Performs a lookup for an existing hash table, and returns a pointer to
+ * the table if found.
+ *
+ * @param name
+ * Name of the hash table to find
+ *
+ * @return
+ * pointer to hash table structure or NULL on error with rte_errno
+ * set appropriately.
+ */
+__rte_experimental
+struct rte_k32v64_hash_table *
+rte_k32v64_hash_find_existing(const char *name);
+
+/**
+ * Create a new hash table for use with four byte keys.
+ *
+ * @param params
+ * Parameters used in creation of hash table.
+ *
+ * @return
+ * Pointer to hash table structure that is used in future hash table
+ * operations, or NULL on error with rte_errno set appropriately.
+ */
+__rte_experimental
+struct rte_k32v64_hash_table *
+rte_k32v64_hash_create(const struct rte_k32v64_hash_params *params);
+
+/**
+ * Free all memory used by a hash table.
+ *
+ * @param table
+ * Hash table to deallocate.
+ */
+__rte_experimental
+void
+rte_k32v64_hash_free(struct rte_k32v64_hash_table *table);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_K32V64_HASH_H_ */
--
2.7.4
^ permalink raw reply [flat|nested] 56+ messages in thread
* [dpdk-dev] [PATCH v3 2/4] hash: add documentation for k32v64 hash library
2020-04-08 18:19 ` [dpdk-dev] [PATCH v2 0/4] add new k32v64 " Vladimir Medvedkin
2020-04-15 18:17 ` [dpdk-dev] [PATCH v3 " Vladimir Medvedkin
2020-04-15 18:17 ` [dpdk-dev] [PATCH v3 1/4] hash: add k32v64 hash library Vladimir Medvedkin
@ 2020-04-15 18:17 ` Vladimir Medvedkin
2020-04-15 18:17 ` [dpdk-dev] [PATCH v3 3/4] test: add k32v64 hash autotests Vladimir Medvedkin
2020-04-15 18:17 ` [dpdk-dev] [PATCH v3 4/4] test: add k32v64 perf tests Vladimir Medvedkin
4 siblings, 0 replies; 56+ messages in thread
From: Vladimir Medvedkin @ 2020-04-15 18:17 UTC (permalink / raw)
To: dev
Cc: konstantin.ananyev, yipeng1.wang, sameh.gobriel,
bruce.richardson, john.mcnamara
Add programmer's guide and doxygen API documentation for the k32v64 hash library
Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
---
doc/api/doxy-api-index.md | 1 +
doc/guides/prog_guide/index.rst | 1 +
doc/guides/prog_guide/k32v64_hash_lib.rst | 66 +++++++++++++++++++++++++++++++
3 files changed, 68 insertions(+)
create mode 100644 doc/guides/prog_guide/k32v64_hash_lib.rst
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index dff496b..ed3e8d7 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -121,6 +121,7 @@ The public API headers are grouped by topics:
[jhash] (@ref rte_jhash.h),
[thash] (@ref rte_thash.h),
[FBK hash] (@ref rte_fbk_hash.h),
+ [K32V64 hash] (@ref rte_k32v64_hash.h),
[CRC hash] (@ref rte_hash_crc.h)
- **classification**
diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index fb250ab..ac56da5 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -30,6 +30,7 @@ Programmer's Guide
link_bonding_poll_mode_drv_lib
timer_lib
hash_lib
+ k32v64_hash_lib
efd_lib
member_lib
lpm_lib
diff --git a/doc/guides/prog_guide/k32v64_hash_lib.rst b/doc/guides/prog_guide/k32v64_hash_lib.rst
new file mode 100644
index 0000000..238033a
--- /dev/null
+++ b/doc/guides/prog_guide/k32v64_hash_lib.rst
@@ -0,0 +1,66 @@
+.. SPDX-License-Identifier: BSD-3-Clause
+ Copyright(c) 2020 Intel Corporation.
+
+.. _k32v64_hash_Library:
+
+K32V64 Hash Library
+===================
+
+This hash library implementation is intended to be better optimized for 32-bit keys than the existing Cuckoo-hash-based rte_hash implementation. The current rte_fbk implementation is fast but has a number of drawbacks, such as 2-byte values and limited collision-resolution capabilities. rte_hash (which is based on the Cuckoo hash algorithm) doesn't have these drawbacks, but it comes at the cost of lower performance compared to rte_fbk.
+
+The following list illustrates the sources of the Cuckoo hash performance penalties:
+
+* Loading two buckets at once (extra memory consumption)
+* Comparing signatures first (extra step before key comparison)
+* If the signature comparison hits, get a key index, find the memory location of the key itself, and fetch the key (memory pressure and indirection)
+* Using an indirect call to memcmp() to compare two uint32_t values (function call overhead)
+
+The K32V64 hash table doesn't have the drawbacks associated with rte_fbk while offering the same performance. Each bucket contains 4 consecutive keys, which can be compared very quickly, and subsequent keys are kept in a linked list.
+
+The main disadvantage compared to rte_hash is performance degradation at high average table utilization, due to chain resolution for the 5th and subsequent collisions.
+
+To estimate the probability of a 5th collision, we can use the "birthday paradox" approach: figure out the number of insertions (which can be treated as a load factor) that will likely yield a 50% probability of a 5th collision for a given number of buckets.
+
+It can be calculated with an asymptotic formula from [1]:
+
+E(n, k) ~= (k!)^(1/k)*Γ(1 + 1/k)*n^(1-1/k), n -> inf
+
+where:
+
+k - level of collision
+
+n - number of buckets
+
+Γ - the gamma function
+
+So, for k = 5 (a 5th collision), and given that the number of buckets is a power of 2, we can simplify the formula:
+
+E(n) = 2.392 * 2^(m * 4/5), where the number of buckets n = 2^m
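+
+As a rough sanity check of this estimate: for n = 8192 buckets, m = 13, so E(n) ~= 2.392 * 2^10.4, i.e. roughly 3200 insertions before a 5th collision becomes more likely than not.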
+
+.. note::
+
+ You can calculate it yourself using Wolfram Alpha [2]. For example, for 8k buckets:
+
+ solve ((k!)^(1/k)*Γ(1 + 1/k)*n^(1-1/k), n = 8192, k = 5)
+
+
+API Overview
+-----------------
+
+The main configuration parameters for the hash table are:
+
+* Total number of hash entries in the table
+* Socket id
+
+K32V64 is "hash function-less", so the user must supply a precalculated hash value for every key. The main methods exported by the library are:
+
+* Add entry with key and precomputed hash: The key, precomputed hash and value are provided as input.
+* Delete entry with key and precomputed hash: The key and precomputed hash are provided as input.
+* Lookup entry with key and precomputed hash: The key, precomputed hash, and a pointer for the returned value are provided as input. If an entry with the specified key is found in the hash table (i.e. a lookup hit), the value associated with the key is written to the memory the pointer refers to and the function returns 0; otherwise (i.e. a lookup miss) a negative value is returned and the memory the pointer refers to is not modified.
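+
+Bulk lookup can be sketched as follows (a minimal sketch: ``ht`` is assumed to be a table created with rte_k32v64_hash_create(), and the batch size and hash function are arbitrary choices):
+
+.. code-block:: c
+
+    uint32_t keys[64], hashes[64];
+    uint64_t values[64];
+    int hits, i;
+
+    /* keys[] filled in by the application; the table is hash function
+     * agnostic, CRC32 is used here purely as an example
+     */
+    for (i = 0; i < 64; i++)
+        hashes[i] = rte_hash_crc_4byte(keys[i], 0);
+
+    /* returns the number of lookup hits, or negative on bad arguments */
+    hits = rte_k32v64_hash_bulk_lookup(ht, keys, hashes, values, 64);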
+
+References
+----------
+
+[1] M.S. Klamkin and D.J. Newman, Extensions of the Birthday Surprise
+
+[2] https://www.wolframalpha.com/
--
2.7.4
^ permalink raw reply [flat|nested] 56+ messages in thread
* [dpdk-dev] [PATCH v3 3/4] test: add k32v64 hash autotests
2020-04-08 18:19 ` [dpdk-dev] [PATCH v2 0/4] add new k32v64 " Vladimir Medvedkin
` (2 preceding siblings ...)
2020-04-15 18:17 ` [dpdk-dev] [PATCH v3 2/4] hash: add documentation for " Vladimir Medvedkin
@ 2020-04-15 18:17 ` Vladimir Medvedkin
2020-04-15 18:17 ` [dpdk-dev] [PATCH v3 4/4] test: add k32v64 perf tests Vladimir Medvedkin
4 siblings, 0 replies; 56+ messages in thread
From: Vladimir Medvedkin @ 2020-04-15 18:17 UTC (permalink / raw)
To: dev; +Cc: konstantin.ananyev, yipeng1.wang, sameh.gobriel, bruce.richardson
Add autotests for rte_k32v64_hash library
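The suites are registered as "k32v64_hash_autotest" and
"k32v64_hash_slow_autotest" (the create/free stress test), so they can
be run from the test binary prompt, e.g.:

    RTE>> k32v64_hash_autotest
    RTE>> k32v64_hash_slow_autotest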
Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
---
app/test/Makefile | 1 +
app/test/autotest_data.py | 12 +++
app/test/meson.build | 3 +
app/test/test_k32v64_hash.c | 229 ++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 245 insertions(+)
create mode 100644 app/test/test_k32v64_hash.c
diff --git a/app/test/Makefile b/app/test/Makefile
index be53d33..c0e2a28 100644
--- a/app/test/Makefile
+++ b/app/test/Makefile
@@ -73,6 +73,7 @@ SRCS-y += test_bitmap.c
SRCS-y += test_reciprocal_division.c
SRCS-y += test_reciprocal_division_perf.c
SRCS-y += test_fbarray.c
+SRCS-y += test_k32v64_hash.c
SRCS-y += test_external_mem.c
SRCS-y += test_rand_perf.c
diff --git a/app/test/autotest_data.py b/app/test/autotest_data.py
index 7b1d013..e7ec502 100644
--- a/app/test/autotest_data.py
+++ b/app/test/autotest_data.py
@@ -99,6 +99,18 @@
"Report": None,
},
{
+ "Name": "K32V64 hash autotest",
+ "Command": "k32v64_hash_autotest",
+ "Func": default_autotest,
+ "Report": None,
+ },
+ {
+ "Name": "K32V64 hash autotest",
+ "Command": "k32v64_hash_slow_autotest",
+ "Func": default_autotest,
+ "Report": None,
+ },
+ {
"Name": "LPM autotest",
"Command": "lpm_autotest",
"Func": default_autotest,
diff --git a/app/test/meson.build b/app/test/meson.build
index 04b59cf..9d1c965 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -45,6 +45,7 @@ test_sources = files('commands.c',
'test_eventdev.c',
'test_external_mem.c',
'test_fbarray.c',
+ 'test_k32v64_hash.c',
'test_fib.c',
'test_fib_perf.c',
'test_fib6.c',
@@ -187,6 +188,7 @@ fast_tests = [
['flow_classify_autotest', false],
['hash_autotest', true],
['interrupt_autotest', true],
+ ['k32v64_hash_autotest', true],
['logs_autotest', true],
['lpm_autotest', true],
['lpm6_autotest', true],
@@ -272,6 +274,7 @@ perf_test_names = [
'rand_perf_autotest',
'hash_readwrite_perf_autotest',
'hash_readwrite_lf_perf_autotest',
+ 'k32v64_hash_slow_autotest',
]
driver_test_names = [
diff --git a/app/test/test_k32v64_hash.c b/app/test/test_k32v64_hash.c
new file mode 100644
index 0000000..3cf3b8d
--- /dev/null
+++ b/app/test/test_k32v64_hash.c
@@ -0,0 +1,229 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include <stdio.h>
+#include <stdint.h>
+#include <stdlib.h>
+
+#include <rte_lcore.h>
+#include <rte_k32v64_hash.h>
+#include <rte_hash_crc.h>
+#include <rte_errno.h>
+
+#include "test.h"
+
+typedef int32_t (*rte_k32v64_hash_test)(void);
+
+static int32_t test_create_invalid(void);
+static int32_t test_multiple_create(void);
+static int32_t test_free_null(void);
+static int32_t test_add_del_invalid(void);
+static int32_t test_basic(void);
+
+#define MAX_ENT (1 << 22)
+
+/*
+ * Check that rte_k32v64_hash_create fails gracefully for incorrect user input
+ * arguments
+ */
+int32_t
+test_create_invalid(void)
+{
+ struct rte_k32v64_hash_table *k32v64_hash = NULL;
+ struct rte_k32v64_hash_params config;
+
+ /* rte_k32v64_hash_create: k32v64_hash name == NULL */
+ config.name = NULL;
+ k32v64_hash = rte_k32v64_hash_create(&config);
+ RTE_TEST_ASSERT(k32v64_hash == NULL,
+ "Call succeeded with invalid parameters\n");
+ config.name = "test_k32v64_hash";
+
+ /* rte_k32v64_hash_create: config == NULL */
+ k32v64_hash = rte_k32v64_hash_create(NULL);
+ RTE_TEST_ASSERT(k32v64_hash == NULL,
+ "Call succeeded with invalid parameters\n");
+
+ /* socket_id < -1 is invalid */
+ config.socket_id = -2;
+ k32v64_hash = rte_k32v64_hash_create(&config);
+ RTE_TEST_ASSERT(k32v64_hash == NULL,
+ "Call succeeded with invalid parameters\n");
+ config.socket_id = rte_socket_id();
+
+ /* rte_k32v64_hash_create: entries = 0 */
+ config.entries = 0;
+ k32v64_hash = rte_k32v64_hash_create(&config);
+ RTE_TEST_ASSERT(k32v64_hash == NULL,
+ "Call succeeded with invalid parameters\n");
+ config.entries = MAX_ENT;
+
+ return TEST_SUCCESS;
+}
+
+/*
+ * Create and free a k32v64_hash table 100 times.
+ * Use a slightly different number of entries each time
+ */
+int32_t
+test_multiple_create(void)
+{
+ struct rte_k32v64_hash_table *k32v64_hash = NULL;
+ struct rte_k32v64_hash_params config;
+ int32_t i;
+
+
+ for (i = 0; i < 100; i++) {
+ config.name = "test_k32v64_hash";
+ config.socket_id = -1;
+ config.entries = MAX_ENT - i;
+
+ k32v64_hash = rte_k32v64_hash_create(&config);
+ RTE_TEST_ASSERT(k32v64_hash != NULL,
+ "Failed to create k32v64 hash\n");
+ rte_k32v64_hash_free(k32v64_hash);
+ }
+
+ return TEST_SUCCESS;
+}
+
+/*
+ * Call rte_k32v64_hash_free for NULL pointer user input.
+ * Note: free has no return and therefore it is impossible
+ * to check for failure but this test is added to
+ * increase function coverage metrics and to validate that
+ * freeing null does not crash.
+ */
+int32_t
+test_free_null(void)
+{
+ struct rte_k32v64_hash_table *k32v64_hash = NULL;
+ struct rte_k32v64_hash_params config;
+
+ config.name = "test_k32v64";
+ config.socket_id = -1;
+ config.entries = MAX_ENT;
+
+ k32v64_hash = rte_k32v64_hash_create(&config);
+ RTE_TEST_ASSERT(k32v64_hash != NULL, "Failed to create k32v64 hash\n");
+
+ rte_k32v64_hash_free(k32v64_hash);
+ rte_k32v64_hash_free(NULL);
+ return TEST_SUCCESS;
+}
+
+/*
+ * Check that rte_k32v64_hash_add fails gracefully for
+ * incorrect user input arguments
+ */
+int32_t
+test_add_del_invalid(void)
+{
+ uint32_t key = 10;
+ uint64_t val = 20;
+ int ret;
+
+ /* rte_k32v64_hash_add: k32v64_hash == NULL */
+ ret = rte_k32v64_hash_add(NULL, key, rte_hash_crc_4byte(key, 0), val);
+ RTE_TEST_ASSERT(ret == -EINVAL,
+ "Call succeeded with invalid parameters\n");
+
+ /* rte_k32v64_hash_delete: k32v64_hash == NULL */
+ ret = rte_k32v64_hash_delete(NULL, key, rte_hash_crc_4byte(key, 0));
+ RTE_TEST_ASSERT(ret == -EINVAL,
+ "Call succeeded with invalid parameters\n");
+
+ return TEST_SUCCESS;
+}
+
+/*
+ * Call add, lookup and delete for a single rule
+ */
+int32_t
+test_basic(void)
+{
+ struct rte_k32v64_hash_table *k32v64_hash = NULL;
+ struct rte_k32v64_hash_params config;
+ uint32_t key = 10;
+ uint64_t value = 20;
+ uint64_t ret_val = 0;
+ int ret;
+
+ config.name = "test_k32v64";
+ config.socket_id = -1;
+ config.entries = MAX_ENT;
+
+ k32v64_hash = rte_k32v64_hash_create(&config);
+ RTE_TEST_ASSERT(k32v64_hash != NULL, "Failed to create k32v64 hash\n");
+
+ ret = rte_k32v64_hash_lookup(k32v64_hash, key,
+ rte_hash_crc_4byte(key, 0), &ret_val);
+ RTE_TEST_ASSERT(ret == -ENOENT, "Lookup returned an incorrect result\n");
+
+ ret = rte_k32v64_hash_delete(k32v64_hash, key,
+ rte_hash_crc_4byte(key, 0));
+ RTE_TEST_ASSERT(ret == -ENOENT, "Delete returned an incorrect result\n");
+
+ ret = rte_k32v64_hash_add(k32v64_hash, key,
+ rte_hash_crc_4byte(key, 0), value);
+ RTE_TEST_ASSERT(ret == 0, "Cannot add key to the table\n");
+
+ ret = rte_k32v64_hash_lookup(k32v64_hash, key,
+ rte_hash_crc_4byte(key, 0), &ret_val);
+ RTE_TEST_ASSERT(((ret == 0) && (value == ret_val)),
+ "Lookup return incorrect result\n");
+
+ ret = rte_k32v64_hash_delete(k32v64_hash, key,
+ rte_hash_crc_4byte(key, 0));
+ RTE_TEST_ASSERT(ret == 0, "Cannot delete key from the table\n");
+
+ ret = rte_k32v64_hash_lookup(k32v64_hash, key,
+ rte_hash_crc_4byte(key, 0), &ret_val);
+ RTE_TEST_ASSERT(ret == -ENOENT, "Lookup returned an incorrect result\n");
+
+ rte_k32v64_hash_free(k32v64_hash);
+
+ return TEST_SUCCESS;
+}
+
+static struct unit_test_suite k32v64_hash_tests = {
+ .suite_name = "k32v64_hash autotest",
+ .setup = NULL,
+ .teardown = NULL,
+ .unit_test_cases = {
+ TEST_CASE(test_create_invalid),
+ TEST_CASE(test_free_null),
+ TEST_CASE(test_add_del_invalid),
+ TEST_CASE(test_basic),
+ TEST_CASES_END()
+ }
+};
+
+static struct unit_test_suite k32v64_hash_slow_tests = {
+ .suite_name = "k32v64_hash slow autotest",
+ .setup = NULL,
+ .teardown = NULL,
+ .unit_test_cases = {
+ TEST_CASE(test_multiple_create),
+ TEST_CASES_END()
+ }
+};
+
+/*
+ * Do all unit tests.
+ */
+static int
+test_k32v64_hash(void)
+{
+ return unit_test_suite_runner(&k32v64_hash_tests);
+}
+
+static int
+test_slow_k32v64_hash(void)
+{
+ return unit_test_suite_runner(&k32v64_hash_slow_tests);
+}
+
+REGISTER_TEST_COMMAND(k32v64_hash_autotest, test_k32v64_hash);
+REGISTER_TEST_COMMAND(k32v64_hash_slow_autotest, test_slow_k32v64_hash);
--
2.7.4
^ permalink raw reply [flat|nested] 56+ messages in thread
* [dpdk-dev] [PATCH v3 4/4] test: add k32v64 perf tests
2020-04-08 18:19 ` [dpdk-dev] [PATCH v2 0/4] add new k32v64 " Vladimir Medvedkin
` (3 preceding siblings ...)
2020-04-15 18:17 ` [dpdk-dev] [PATCH v3 3/4] test: add k32v64 hash autotests Vladimir Medvedkin
@ 2020-04-15 18:17 ` Vladimir Medvedkin
4 siblings, 0 replies; 56+ messages in thread
From: Vladimir Medvedkin @ 2020-04-15 18:17 UTC (permalink / raw)
To: dev; +Cc: konstantin.ananyev, yipeng1.wang, sameh.gobriel, bruce.richardson
Add performance tests for rte_k32v64_hash
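The test times TEST_ITERATIONS rounds of TEST_SIZE lookups with
rte_rdtsc() and reports, roughly:

    ticks per lookup = sum(end - begin) / (TEST_ITERATIONS * TEST_SIZE)

once for the scalar lookup path and once for the bulk path (64 keys per
call). The lookup keys are shuffled copies of the inserted keys, so
every probe is a hit.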
Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
---
app/test/test_hash_perf.c | 130 ++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 130 insertions(+)
diff --git a/app/test/test_hash_perf.c b/app/test/test_hash_perf.c
index a438eae..f45a8d9 100644
--- a/app/test/test_hash_perf.c
+++ b/app/test/test_hash_perf.c
@@ -12,8 +12,10 @@
#include <rte_hash_crc.h>
#include <rte_jhash.h>
#include <rte_fbk_hash.h>
+#include <rte_k32v64_hash.h>
#include <rte_random.h>
#include <rte_string_fns.h>
#include "test.h"
@@ -29,6 +31,8 @@
#define NUM_SHUFFLES 10
#define BURST_SIZE 16
+#define CRC_INIT_VAL 0xdeadbeef
+
enum operations {
ADD = 0,
LOOKUP,
@@ -668,6 +672,129 @@ fbk_hash_perf_test(void)
return 0;
}
+static uint32_t *
+shuf_arr(uint32_t *arr, int n, int l)
+{
+ int i, j;
+ uint32_t tmp;
+ uint32_t *ret_arr;
+
+ ret_arr = rte_zmalloc(NULL, l * sizeof(uint32_t), 0);
+ for (i = 0; i < n; i++) {
+ j = rte_rand() % n;
+ tmp = arr[j];
+ arr[j] = arr[i];
+ arr[i] = tmp;
+ }
+ for (i = 0; i < l; i++)
+ ret_arr[i] = arr[i % n];
+
+ return ret_arr;
+}
+
+static int
+k32v64_hash_perf_test(void)
+{
+ struct rte_k32v64_hash_params params = {
+ .name = "k32v64_hash_test",
+ .entries = ENTRIES * 2,
+ .socket_id = rte_socket_id(),
+ };
+ struct rte_k32v64_hash_table *handle = NULL;
+ uint32_t *keys;
+ uint32_t *lookup_keys;
+ uint64_t tmp_val;
+ uint64_t lookup_time = 0;
+ uint64_t begin;
+ uint64_t end;
+ unsigned int added = 0;
+ uint32_t key;
+ uint16_t val;
+ unsigned int i, j, k;
+ int ret = 0;
+ uint32_t hashes[64];
+ uint64_t vals[64];
+
+ handle = rte_k32v64_hash_create(&params);
+ if (handle == NULL) {
+ printf("Error creating table\n");
+ return -1;
+ }
+
+ keys = rte_zmalloc(NULL, ENTRIES * sizeof(*keys), 0);
+ if (keys == NULL) {
+ printf("fbk hash: memory allocation for key store failed\n");
+ return -1;
+ }
+
+ /* Generate random keys and values. */
+ for (i = 0; i < ENTRIES; i++) {
+ key = (uint32_t)rte_rand();
+ val = rte_rand();
+
+ if (rte_k32v64_hash_add(handle, key, rte_hash_crc_4byte(key,
+ CRC_INIT_VAL), val) == 0) {
+ keys[added] = key;
+ added++;
+ }
+ }
+
+ lookup_keys = shuf_arr(keys, added, TEST_SIZE);
+
+ for (i = 0; i < TEST_ITERATIONS; i++) {
+
+ begin = rte_rdtsc();
+ /* Do lookups */
+
+ for (j = 0; j < TEST_SIZE; j++)
+ ret += rte_k32v64_hash_lookup(handle, lookup_keys[j],
+ rte_hash_crc_4byte(lookup_keys[j],
+ CRC_INIT_VAL), &tmp_val);
+
+ end = rte_rdtsc();
+ lookup_time += (double)(end - begin);
+ }
+
+ printf("\n\n *** K32V64 Hash function performance test results ***\n");
+ if (ret == 0)
+ printf("Number of ticks per lookup = %g\n",
+ (double)lookup_time /
+ ((double)TEST_ITERATIONS * (double)TEST_SIZE));
+
+ lookup_time = 0;
+ for (i = 0; i < TEST_ITERATIONS; i++) {
+
+ begin = rte_rdtsc();
+ /* Do lookups */
+
+ for (j = 0; j < TEST_SIZE; j += 64) {
+ for (k = 0; k < 64; k++) {
+ hashes[k] =
+ rte_hash_crc_4byte(lookup_keys[j + k],
+ CRC_INIT_VAL);
+ }
+
+ ret += rte_k32v64_hash_bulk_lookup(handle,
+ &lookup_keys[j], hashes, vals, 64);
+ }
+
+ end = rte_rdtsc();
+ lookup_time += (double)(end - begin);
+ }
+
+ printf("\n\n *** K32V64 Hash function performance test results ***\n");
+ if (ret != 0)
+ printf("Number of ticks per bulk lookup = %g\n",
+ (double)lookup_time /
+ ((double)TEST_ITERATIONS * (double)TEST_SIZE));
+
+ rte_free(keys);
+ rte_free(lookup_keys);
+ rte_k32v64_hash_free(handle);
+
+ return 0;
+}
+
static int
test_hash_perf(void)
{
@@ -695,6 +822,9 @@ test_hash_perf(void)
if (fbk_hash_perf_test() < 0)
return -1;
+ if (k32v64_hash_perf_test() < 0)
+ return -1;
+
return 0;
}
--
2.7.4
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [dpdk-dev] [PATCH v3 0/4] add new k32v64 hash table
2020-04-15 18:17 ` [dpdk-dev] [PATCH v3 " Vladimir Medvedkin
@ 2020-04-15 18:51 ` Mattias Rönnblom
2020-04-16 10:18 ` Medvedkin, Vladimir
2020-04-16 9:39 ` Thomas Monjalon
` (5 subsequent siblings)
6 siblings, 1 reply; 56+ messages in thread
From: Mattias Rönnblom @ 2020-04-15 18:51 UTC (permalink / raw)
To: Vladimir Medvedkin, dev
Cc: konstantin.ananyev, yipeng1.wang, sameh.gobriel, bruce.richardson
On 2020-04-15 20:17, Vladimir Medvedkin wrote:
> Currently DPDK has a special implementation of a hash table for
> 4 byte keys which is called FBK hash. Unfortunately its main drawback
> is that it only supports 2 byte values.
> The new implementation called K32V64 hash
> supports 4 byte keys and 8 byte associated values,
> which is enough to store a pointer.
>
> It would also be nice to get feedback on whether to leave the old FBK
> and new k32v64 implementations or deprecate the old one?
Do you think it would be feasible to support custom-sized values and
remain efficient, in a similar manner to how rte_ring_elem.h does things?
> v3:
> - added bulk lookup
> - avx512 key comparison is removed from .h
>
> v2:
> - renamed from rte_dwk to rte_k32v64 as was suggested
> - reworked lookup function, added inlined subroutines
> - added avx512 key comparison routine
> - added documentation
> - added statistics counters for total entries and extended entries (linked list)
>
> Vladimir Medvedkin (4):
> hash: add k32v64 hash library
> hash: add documentation for k32v64 hash library
> test: add k32v64 hash autotests
> test: add k32v64 perf tests
>
> app/test/Makefile | 1 +
> app/test/autotest_data.py | 12 ++
> app/test/meson.build | 3 +
> app/test/test_hash_perf.c | 130 ++++++++++++
> app/test/test_k32v64_hash.c | 229 ++++++++++++++++++++++
> doc/api/doxy-api-index.md | 1 +
> doc/guides/prog_guide/index.rst | 1 +
> doc/guides/prog_guide/k32v64_hash_lib.rst | 66 +++++++
> lib/Makefile | 2 +-
> lib/librte_hash/Makefile | 13 +-
> lib/librte_hash/k32v64_hash_avx512vl.c | 56 ++++++
> lib/librte_hash/meson.build | 17 +-
> lib/librte_hash/rte_hash_version.map | 6 +-
> lib/librte_hash/rte_k32v64_hash.c | 315 ++++++++++++++++++++++++++++++
> lib/librte_hash/rte_k32v64_hash.h | 211 ++++++++++++++++++++
> 15 files changed, 1058 insertions(+), 5 deletions(-)
> create mode 100644 app/test/test_k32v64_hash.c
> create mode 100644 doc/guides/prog_guide/k32v64_hash_lib.rst
> create mode 100644 lib/librte_hash/k32v64_hash_avx512vl.c
> create mode 100644 lib/librte_hash/rte_k32v64_hash.c
> create mode 100644 lib/librte_hash/rte_k32v64_hash.h
>
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [dpdk-dev] [PATCH v3 1/4] hash: add k32v64 hash library
2020-04-15 18:17 ` [dpdk-dev] [PATCH v3 1/4] hash: add k32v64 hash library Vladimir Medvedkin
@ 2020-04-15 18:59 ` Mattias Rönnblom
2020-04-16 10:23 ` Medvedkin, Vladimir
2020-04-23 13:31 ` Ananyev, Konstantin
2020-04-29 21:29 ` Honnappa Nagarahalli
2 siblings, 1 reply; 56+ messages in thread
From: Mattias Rönnblom @ 2020-04-15 18:59 UTC (permalink / raw)
To: Vladimir Medvedkin, dev
Cc: konstantin.ananyev, yipeng1.wang, sameh.gobriel, bruce.richardson
On 2020-04-15 20:17, Vladimir Medvedkin wrote:
> K32V64 hash is a hash table that supports 32-bit keys and 64-bit values.
> This table is hash function agnostic, so the user must provide a
> precalculated hash signature for add/delete/lookup operations.
>
> Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
> ---
> lib/Makefile | 2 +-
> lib/librte_hash/Makefile | 13 +-
> lib/librte_hash/k32v64_hash_avx512vl.c | 56 ++++++
> lib/librte_hash/meson.build | 17 +-
> lib/librte_hash/rte_hash_version.map | 6 +-
> lib/librte_hash/rte_k32v64_hash.c | 315 +++++++++++++++++++++++++++++++++
> lib/librte_hash/rte_k32v64_hash.h | 211 ++++++++++++++++++++++
> 7 files changed, 615 insertions(+), 5 deletions(-)
> create mode 100644 lib/librte_hash/k32v64_hash_avx512vl.c
> create mode 100644 lib/librte_hash/rte_k32v64_hash.c
> create mode 100644 lib/librte_hash/rte_k32v64_hash.h
>
> diff --git a/lib/Makefile b/lib/Makefile
> index 46b91ae..a8c02e4 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -48,7 +48,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
> DEPDIRS-librte_vhost := librte_eal librte_mempool librte_mbuf librte_ethdev \
> librte_net librte_hash librte_cryptodev
> DIRS-$(CONFIG_RTE_LIBRTE_HASH) += librte_hash
> -DEPDIRS-librte_hash := librte_eal librte_ring
> +DEPDIRS-librte_hash := librte_eal librte_ring librte_mempool
> DIRS-$(CONFIG_RTE_LIBRTE_EFD) += librte_efd
> DEPDIRS-librte_efd := librte_eal librte_ring librte_hash
> DIRS-$(CONFIG_RTE_LIBRTE_RIB) += librte_rib
> diff --git a/lib/librte_hash/Makefile b/lib/librte_hash/Makefile
> index ec9f864..023689d 100644
> --- a/lib/librte_hash/Makefile
> +++ b/lib/librte_hash/Makefile
> @@ -8,13 +8,14 @@ LIB = librte_hash.a
>
> CFLAGS += -O3
> CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
> -LDLIBS += -lrte_eal -lrte_ring
> +LDLIBS += -lrte_eal -lrte_ring -lrte_mempool
>
> EXPORT_MAP := rte_hash_version.map
>
> # all source are stored in SRCS-y
> SRCS-$(CONFIG_RTE_LIBRTE_HASH) := rte_cuckoo_hash.c
> SRCS-$(CONFIG_RTE_LIBRTE_HASH) += rte_fbk_hash.c
> +SRCS-$(CONFIG_RTE_LIBRTE_HASH) += rte_k32v64_hash.c
>
> # install this header file
> SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include := rte_hash.h
> @@ -27,5 +28,15 @@ endif
> SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_jhash.h
> SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_thash.h
> SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_fbk_hash.h
> +SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_k32v64_hash.h
> +
> +CC_AVX512VL_SUPPORT=$(shell $(CC) -mavx512vl -dM -E - </dev/null 2>&1 | \
> +grep -q __AVX512VL__ && echo 1)
> +
> +ifeq ($(CC_AVX512VL_SUPPORT), 1)
> + SRCS-$(CONFIG_RTE_LIBRTE_HASH) += k32v64_hash_avx512vl.c
> + CFLAGS_k32v64_hash_avx512vl.o += -mavx512vl
> + CFLAGS_rte_k32v64_hash.o += -DCC_AVX512VL_SUPPORT
> +endif
>
> include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/lib/librte_hash/k32v64_hash_avx512vl.c b/lib/librte_hash/k32v64_hash_avx512vl.c
> new file mode 100644
> index 0000000..7c70dd2
> --- /dev/null
> +++ b/lib/librte_hash/k32v64_hash_avx512vl.c
> @@ -0,0 +1,56 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Intel Corporation
> + */
> +
> +#include <rte_k32v64_hash.h>
> +
> +int
> +k32v64_hash_bulk_lookup_avx512vl(struct rte_k32v64_hash_table *table,
> + uint32_t *keys, uint32_t *hashes, uint64_t *values, unsigned int n);
> +
> +static inline int
> +k32v64_cmp_keys_avx512vl(struct rte_k32v64_hash_bucket *bucket, uint32_t key,
> + uint64_t *val)
> +{
> + __m128i keys, srch_key;
> + __mmask8 msk;
> +
> + keys = _mm_load_si128((void *)bucket);
> + srch_key = _mm_set1_epi32(key);
> +
> + msk = _mm_mask_cmpeq_epi32_mask(bucket->key_mask, keys, srch_key);
> + if (msk) {
> + *val = bucket->val[__builtin_ctz(msk)];
> + return 1;
> + }
> +
> + return 0;
> +}
> +
> +static inline int
> +k32v64_hash_lookup_avx512vl(struct rte_k32v64_hash_table *table, uint32_t key,
> + uint32_t hash, uint64_t *value)
> +{
> + return __k32v64_hash_lookup(table, key, hash, value,
> + k32v64_cmp_keys_avx512vl);
> +}
> +
> +int
> +k32v64_hash_bulk_lookup_avx512vl(struct rte_k32v64_hash_table *table,
> + uint32_t *keys, uint32_t *hashes, uint64_t *values, unsigned int n)
> +{
> + int ret, cnt = 0;
> + unsigned int i;
> +
> + if (unlikely((table == NULL) || (keys == NULL) || (hashes == NULL) ||
> + (values == NULL)))
> + return -EINVAL;
> +
> + for (i = 0; i < n; i++) {
> + ret = k32v64_hash_lookup_avx512vl(table, keys[i], hashes[i],
> + &values[i]);
> + if (ret == 0)
> + cnt++;
> + }
> + return cnt;
> +}
> diff --git a/lib/librte_hash/meson.build b/lib/librte_hash/meson.build
> index 6ab46ae..8a37cc4 100644
> --- a/lib/librte_hash/meson.build
> +++ b/lib/librte_hash/meson.build
> @@ -3,10 +3,23 @@
>
> headers = files('rte_crc_arm64.h',
> 'rte_fbk_hash.h',
> + 'rte_k32v64_hash.h',
> 'rte_hash_crc.h',
> 'rte_hash.h',
> 'rte_jhash.h',
> 'rte_thash.h')
>
> -sources = files('rte_cuckoo_hash.c', 'rte_fbk_hash.c')
> -deps += ['ring']
> +sources = files('rte_cuckoo_hash.c', 'rte_fbk_hash.c', 'rte_k32v64_hash.c')
> +deps += ['ring', 'mempool']
> +
> +if dpdk_conf.has('RTE_ARCH_X86')
> + if cc.has_argument('-mavx512vl')
> + avx512_tmplib = static_library('avx512_tmp',
> + 'k32v64_hash_avx512vl.c',
> + dependencies: static_rte_mempool,
> + c_args: cflags + ['-mavx512vl'])
> + objs += avx512_tmplib.extract_objects('k32v64_hash_avx512vl.c')
> + cflags += '-DCC_AVX512VL_SUPPORT'
> +
> + endif
> +endif
> diff --git a/lib/librte_hash/rte_hash_version.map b/lib/librte_hash/rte_hash_version.map
> index a8fbbc3..9a4f2f6 100644
> --- a/lib/librte_hash/rte_hash_version.map
> +++ b/lib/librte_hash/rte_hash_version.map
> @@ -34,5 +34,9 @@ EXPERIMENTAL {
>
> rte_hash_free_key_with_position;
> rte_hash_max_key_id;
> -
> + rte_k32v64_hash_create;
> + rte_k32v64_hash_find_existing;
> + rte_k32v64_hash_free;
> + rte_k32v64_hash_add;
> + rte_k32v64_hash_delete;
> };
> diff --git a/lib/librte_hash/rte_k32v64_hash.c b/lib/librte_hash/rte_k32v64_hash.c
> new file mode 100644
> index 0000000..7ed94b4
> --- /dev/null
> +++ b/lib/librte_hash/rte_k32v64_hash.c
> @@ -0,0 +1,315 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Intel Corporation
> + */
> +
> +#include <string.h>
> +
> +#include <rte_eal_memconfig.h>
> +#include <rte_errno.h>
> +#include <rte_malloc.h>
> +#include <rte_memory.h>
> +#include <rte_tailq.h>
> +
> +#include <rte_k32v64_hash.h>
> +
> +TAILQ_HEAD(rte_k32v64_hash_list, rte_tailq_entry);
> +
> +static struct rte_tailq_elem rte_k32v64_hash_tailq = {
> + .name = "RTE_K32V64_HASH",
> +};
> +
> +EAL_REGISTER_TAILQ(rte_k32v64_hash_tailq);
> +
> +#define VALID_KEY_MSK ((1 << RTE_K32V64_KEYS_PER_BUCKET) - 1)
> +
> +#ifdef CC_AVX512VL_SUPPORT
> +int
> +k32v64_hash_bulk_lookup_avx512vl(struct rte_k32v64_hash_table *table,
> + uint32_t *keys, uint32_t *hashes, uint64_t *values, unsigned int n);
> +#endif
> +
> +static int
> +k32v64_hash_bulk_lookup(struct rte_k32v64_hash_table *table, uint32_t *keys,
> + uint32_t *hashes, uint64_t *values, unsigned int n)
> +{
> + int ret, cnt = 0;
> + unsigned int i;
> +
> + if (unlikely((table == NULL) || (keys == NULL) || (hashes == NULL) ||
> + (values == NULL)))
> + return -EINVAL;
> +
> + for (i = 0; i < n; i++) {
> + ret = rte_k32v64_hash_lookup(table, keys[i], hashes[i],
> + &values[i]);
> + if (ret == 0)
> + cnt++;
> + }
> + return cnt;
> +}
> +
> +static rte_k32v64_hash_bulk_lookup_t
> +get_lookup_bulk_fn(void)
> +{
> +#ifdef CC_AVX512VL_SUPPORT
> + if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512VL))
> + return k32v64_hash_bulk_lookup_avx512vl;
> +#endif
> + return k32v64_hash_bulk_lookup;
> +}
> +
> +int
> +rte_k32v64_hash_add(struct rte_k32v64_hash_table *table, uint32_t key,
> + uint32_t hash, uint64_t value)
> +{
> + uint32_t bucket;
> + int i, idx, ret;
> + uint8_t msk;
> + struct rte_k32v64_ext_ent *tmp, *ent, *prev = NULL;
> +
> + if (table == NULL)
> + return -EINVAL;
> +
> + bucket = hash & table->bucket_msk;
> + /* Search key in table. Update value if exists */
> + for (i = 0; i < RTE_K32V64_KEYS_PER_BUCKET; i++) {
> + if ((key == table->t[bucket].key[i]) &&
> + (table->t[bucket].key_mask & (1 << i))) {
> + table->t[bucket].val[i] = value;
> + return 0;
> + }
> + }
> +
> + if (!SLIST_EMPTY(&table->t[bucket].head)) {
> + SLIST_FOREACH(ent, &table->t[bucket].head, next) {
> + if (ent->key == key) {
> + ent->val = value;
> + return 0;
> + }
> + }
> + }
> +
> + msk = ~table->t[bucket].key_mask & VALID_KEY_MSK;
> + if (msk) {
> + idx = __builtin_ctz(msk);
> + table->t[bucket].key[idx] = key;
> + table->t[bucket].val[idx] = value;
> + rte_smp_wmb();
> + table->t[bucket].key_mask |= 1 << idx;
> + table->nb_ent++;
> + return 0;
> + }
> +
> + ret = rte_mempool_get(table->ext_ent_pool, (void **)&ent);
> + if (ret < 0)
> + return ret;
> +
> + SLIST_NEXT(ent, next) = NULL;
> + ent->key = key;
> + ent->val = value;
> + rte_smp_wmb();
> + SLIST_FOREACH(tmp, &table->t[bucket].head, next)
> + prev = tmp;
> +
> + if (prev == NULL)
> + SLIST_INSERT_HEAD(&table->t[bucket].head, ent, next);
> + else
> + SLIST_INSERT_AFTER(prev, ent, next);
> +
> + table->nb_ent++;
> + table->nb_ext_ent++;
> + return 0;
> +}
> +
> +int
> +rte_k32v64_hash_delete(struct rte_k32v64_hash_table *table, uint32_t key,
> + uint32_t hash)
> +{
> + uint32_t bucket;
> + int i;
> + struct rte_k32v64_ext_ent *ent;
> +
> + if (table == NULL)
> + return -EINVAL;
> +
> + bucket = hash & table->bucket_msk;
> +
> + for (i = 0; i < RTE_K32V64_KEYS_PER_BUCKET; i++) {
> + if ((key == table->t[bucket].key[i]) &&
> + (table->t[bucket].key_mask & (1 << i))) {
> + ent = SLIST_FIRST(&table->t[bucket].head);
> + if (ent) {
> + rte_atomic32_inc(&table->t[bucket].cnt);
> + table->t[bucket].key[i] = ent->key;
> + table->t[bucket].val[i] = ent->val;
> + SLIST_REMOVE_HEAD(&table->t[bucket].head, next);
> + rte_atomic32_inc(&table->t[bucket].cnt);
> + table->nb_ext_ent--;
> + } else
> + table->t[bucket].key_mask &= ~(1 << i);
> + if (ent)
> + rte_mempool_put(table->ext_ent_pool, ent);
> + table->nb_ent--;
> + return 0;
> + }
> + }
> +
> + SLIST_FOREACH(ent, &table->t[bucket].head, next)
> + if (ent->key == key)
> + break;
> +
> + if (ent == NULL)
> + return -ENOENT;
> +
> + rte_atomic32_inc(&table->t[bucket].cnt);
> + SLIST_REMOVE(&table->t[bucket].head, ent, rte_k32v64_ext_ent, next);
> + rte_atomic32_inc(&table->t[bucket].cnt);
> + rte_mempool_put(table->ext_ent_pool, ent);
> +
> + table->nb_ext_ent--;
> + table->nb_ent--;
> +
> + return 0;
> +}
> +
> +struct rte_k32v64_hash_table *
> +rte_k32v64_hash_find_existing(const char *name)
> +{
> + struct rte_k32v64_hash_table *h = NULL;
> + struct rte_tailq_entry *te;
> + struct rte_k32v64_hash_list *k32v64_hash_list;
> +
> + k32v64_hash_list = RTE_TAILQ_CAST(rte_k32v64_hash_tailq.head,
> + rte_k32v64_hash_list);
> +
> + rte_mcfg_tailq_read_lock();
> + TAILQ_FOREACH(te, k32v64_hash_list, next) {
> + h = (struct rte_k32v64_hash_table *) te->data;
> + if (strncmp(name, h->name, RTE_K32V64_HASH_NAMESIZE) == 0)
> + break;
> + }
> + rte_mcfg_tailq_read_unlock();
> + if (te == NULL) {
> + rte_errno = ENOENT;
> + return NULL;
> + }
> + return h;
> +}
> +
> +struct rte_k32v64_hash_table *
> +rte_k32v64_hash_create(const struct rte_k32v64_hash_params *params)
> +{
> + char hash_name[RTE_K32V64_HASH_NAMESIZE];
> + struct rte_k32v64_hash_table *ht = NULL;
> + struct rte_tailq_entry *te;
> + struct rte_k32v64_hash_list *k32v64_hash_list;
> + uint32_t mem_size, nb_buckets, max_ent;
> + int ret;
> + struct rte_mempool *mp;
> +
> + if ((params == NULL) || (params->name == NULL) ||
> + (params->entries == 0)) {
> + rte_errno = EINVAL;
> + return NULL;
> + }
> +
> + k32v64_hash_list = RTE_TAILQ_CAST(rte_k32v64_hash_tailq.head,
> + rte_k32v64_hash_list);
> +
> + ret = snprintf(hash_name, sizeof(hash_name), "K32V64_%s", params->name);
> + if (ret < 0 || ret >= RTE_K32V64_HASH_NAMESIZE) {
> + rte_errno = ENAMETOOLONG;
> + return NULL;
> + }
> +
> + max_ent = rte_align32pow2(params->entries);
> + nb_buckets = max_ent / RTE_K32V64_KEYS_PER_BUCKET;
> + mem_size = sizeof(struct rte_k32v64_hash_table) +
> + sizeof(struct rte_k32v64_hash_bucket) * nb_buckets;
> +
> + mp = rte_mempool_create(hash_name, max_ent,
> + sizeof(struct rte_k32v64_ext_ent), 0, 0, NULL, NULL, NULL, NULL,
> + params->socket_id, 0);
> +
> + if (mp == NULL)
> + return NULL;
> +
> + rte_mcfg_tailq_write_lock();
> + TAILQ_FOREACH(te, k32v64_hash_list, next) {
> + ht = (struct rte_k32v64_hash_table *) te->data;
> + if (strncmp(params->name, ht->name,
> + RTE_K32V64_HASH_NAMESIZE) == 0)
> + break;
> + }
> + ht = NULL;
> + if (te != NULL) {
> + rte_errno = EEXIST;
> + rte_mempool_free(mp);
> + goto exit;
> + }
> +
> + te = rte_zmalloc("K32V64_HASH_TAILQ_ENTRY", sizeof(*te), 0);
> + if (te == NULL) {
> + RTE_LOG(ERR, HASH, "Failed to allocate tailq entry\n");
> + rte_mempool_free(mp);
> + goto exit;
> + }
> +
> + ht = rte_zmalloc_socket(hash_name, mem_size,
> + RTE_CACHE_LINE_SIZE, params->socket_id);
> + if (ht == NULL) {
> + RTE_LOG(ERR, HASH, "Failed to allocate k32v64 hash table\n");
> + rte_free(te);
> + rte_mempool_free(mp);
> + goto exit;
> + }
> +
> + memcpy(ht->name, hash_name, sizeof(ht->name));
> + ht->max_ent = max_ent;
> + ht->bucket_msk = nb_buckets - 1;
> + ht->ext_ent_pool = mp;
> + ht->lookup = get_lookup_bulk_fn();
> +
> + te->data = (void *)ht;
> + TAILQ_INSERT_TAIL(k32v64_hash_list, te, next);
> +
> +exit:
> + rte_mcfg_tailq_write_unlock();
> +
> + return ht;
> +}
> +
> +void
> +rte_k32v64_hash_free(struct rte_k32v64_hash_table *ht)
> +{
> + struct rte_tailq_entry *te;
> + struct rte_k32v64_hash_list *k32v64_hash_list;
> +
> + if (ht == NULL)
> + return;
> +
> + k32v64_hash_list = RTE_TAILQ_CAST(rte_k32v64_hash_tailq.head,
> + rte_k32v64_hash_list);
> +
> + rte_mcfg_tailq_write_lock();
> +
> + /* find out tailq entry */
> + TAILQ_FOREACH(te, k32v64_hash_list, next) {
> + if (te->data == (void *) ht)
> + break;
> + }
> +
> +
> + if (te == NULL) {
> + rte_mcfg_tailq_write_unlock();
> + return;
> + }
> +
> + TAILQ_REMOVE(k32v64_hash_list, te, next);
> +
> + rte_mcfg_tailq_write_unlock();
> +
> + rte_mempool_free(ht->ext_ent_pool);
> + rte_free(ht);
> + rte_free(te);
> +}
> diff --git a/lib/librte_hash/rte_k32v64_hash.h b/lib/librte_hash/rte_k32v64_hash.h
> new file mode 100644
> index 0000000..b2c52e9
> --- /dev/null
> +++ b/lib/librte_hash/rte_k32v64_hash.h
> @@ -0,0 +1,211 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Intel Corporation
> + */
> +
> +#ifndef _RTE_K32V64_HASH_H_
> +#define _RTE_K32V64_HASH_H_
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include <rte_compat.h>
> +#include <rte_atomic.h>
> +#include <rte_mempool.h>
> +
> +#define RTE_K32V64_HASH_NAMESIZE 32
> +#define RTE_K32V64_KEYS_PER_BUCKET 4
> +#define RTE_K32V64_WRITE_IN_PROGRESS 1
> +
> +struct rte_k32v64_hash_params {
> + const char *name;
> + uint32_t entries;
> + int socket_id;
> +};
> +
> +struct rte_k32v64_ext_ent {
> + SLIST_ENTRY(rte_k32v64_ext_ent) next;
> + uint32_t key;
> + uint64_t val;
> +};
> +
> +struct rte_k32v64_hash_bucket {
> + uint32_t key[RTE_K32V64_KEYS_PER_BUCKET];
> + uint64_t val[RTE_K32V64_KEYS_PER_BUCKET];
> + uint8_t key_mask;
> + rte_atomic32_t cnt;
> + SLIST_HEAD(rte_k32v64_list_head, rte_k32v64_ext_ent) head;
> +} __rte_cache_aligned;
> +
> +struct rte_k32v64_hash_table;
> +
> +typedef int (*rte_k32v64_hash_bulk_lookup_t)
> +(struct rte_k32v64_hash_table *table, uint32_t *keys, uint32_t *hashes,
> + uint64_t *values, unsigned int n);
> +
> +struct rte_k32v64_hash_table {
> + char name[RTE_K32V64_HASH_NAMESIZE]; /**< Name of the hash. */
> + uint32_t nb_ent; /**< Number of entries in the table */
> + uint32_t nb_ext_ent; /**< Number of extended entries */
> + uint32_t max_ent; /**< Maximum number of entries */
> + uint32_t bucket_msk;
> + struct rte_mempool *ext_ent_pool;
> + rte_k32v64_hash_bulk_lookup_t lookup;
> + __extension__ struct rte_k32v64_hash_bucket t[0];
> +};
> +
> +typedef int (*rte_k32v64_cmp_fn_t)
> +(struct rte_k32v64_hash_bucket *bucket, uint32_t key, uint64_t *val);
> +
> +static inline int
> +__k32v64_cmp_keys(struct rte_k32v64_hash_bucket *bucket, uint32_t key,
> + uint64_t *val)
> +{
> + int i;
> +
> + for (i = 0; i < RTE_K32V64_KEYS_PER_BUCKET; i++) {
> + if ((key == bucket->key[i]) &&
> + (bucket->key_mask & (1 << i))) {
> + *val = bucket->val[i];
> + return 1;
> + }
> + }
> +
> + return 0;
> +}
> +
> +static inline int
> +__k32v64_hash_lookup(struct rte_k32v64_hash_table *table, uint32_t key,
> + uint32_t hash, uint64_t *value, rte_k32v64_cmp_fn_t cmp_f)
> +{
> + uint64_t val = 0;
> + struct rte_k32v64_ext_ent *ent;
> + int32_t cnt;
> + int found = 0;
> + uint32_t bucket = hash & table->bucket_msk;
> +
> + do {
> + do
> + cnt = rte_atomic32_read(&table->t[bucket].cnt);
> + while (unlikely(cnt & RTE_K32V64_WRITE_IN_PROGRESS));
> +
> + found = cmp_f(&table->t[bucket], key, &val);
> + if (unlikely((found == 0) &&
> + (!SLIST_EMPTY(&table->t[bucket].head)))) {
> + SLIST_FOREACH(ent, &table->t[bucket].head, next) {
> + if (ent->key == key) {
> + val = ent->val;
> + found = 1;
> + break;
> + }
> + }
> + }
> +
> + } while (unlikely(cnt != rte_atomic32_read(&table->t[bucket].cnt)));
> +
> + if (found == 1) {
> + *value = val;
> + return 0;
> + } else
> + return -ENOENT;
> +}
> +
> +static inline int
> +rte_k32v64_hash_lookup(struct rte_k32v64_hash_table *table, uint32_t key,
> + uint32_t hash, uint64_t *value)
> +{
> + return __k32v64_hash_lookup(table, key, hash, value, __k32v64_cmp_keys);
> +}
> +
> +static inline int
> +rte_k32v64_hash_bulk_lookup(struct rte_k32v64_hash_table *table,
> + uint32_t *keys, uint32_t *hashes, uint64_t *values, unsigned int n)
> +{
> + return table->lookup(table, keys, hashes, values, n);
> +}
> +
> +/**
> + * Add a key to an existing hash table with hash value.
> + * This operation is not multi-thread safe
> + * and should only be called from one thread.
> + *
Does this hash allow multiple readers, and at most one writer, for a
particular instance? If that's the case, it should probably be mentioned
somewhere.
Just specifying that a particular function is not MT safe doesn't say
anything about whether it's safe to call it in parallel with other,
non-MT-safe functions. I assume you can't add and delete in parallel?
> + * @param ht
> + * Hash table to add the key to.
> + * @param key
> + * Key to add to the hash table.
> + * @param value
> + * Value to associate with key.
> + * @param hash
> + * Hash value associated with key.
> + * @return
> + * 0 if ok, or negative value on error.
> + */
> +__rte_experimental
> +int
> +rte_k32v64_hash_add(struct rte_k32v64_hash_table *table, uint32_t key,
> + uint32_t hash, uint64_t value);
> +
> +/**
> + * Remove a key with a given hash value from an existing hash table.
> + * This operation is not multi-thread
> + * safe and should only be called from one thread.
> + *
> + * @param ht
> + * Hash table to remove the key from.
> + * @param key
> + * Key to remove from the hash table.
> + * @param hash
> + * hash value associated with key.
> + * @return
> + * 0 if ok, or negative value on error.
> + */
> +__rte_experimental
> +int
> +rte_k32v64_hash_delete(struct rte_k32v64_hash_table *table, uint32_t key,
> + uint32_t hash);
> +
> +
> +/**
> + * Performs a lookup for an existing hash table, and returns a pointer to
> + * the table if found.
> + *
> + * @param name
> + * Name of the hash table to find
> + *
> + * @return
> + * pointer to hash table structure or NULL on error with rte_errno
> + * set appropriately.
> + */
> +__rte_experimental
> +struct rte_k32v64_hash_table *
> +rte_k32v64_hash_find_existing(const char *name);
> +
> +/**
> + * Create a new hash table for use with four byte keys.
> + *
> + * @param params
> + * Parameters used in creation of hash table.
> + *
> + * @return
> + * Pointer to hash table structure that is used in future hash table
> + * operations, or NULL on error with rte_errno set appropriately.
> + */
> +__rte_experimental
> +struct rte_k32v64_hash_table *
> +rte_k32v64_hash_create(const struct rte_k32v64_hash_params *params);
> +
> +/**
> + * Free all memory used by a hash table.
> + *
> + * @param table
> + * Hash table to deallocate.
> + */
> +__rte_experimental
> +void
> +rte_k32v64_hash_free(struct rte_k32v64_hash_table *table);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_K32V64_HASH_H_ */
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [dpdk-dev] [PATCH v3 0/4] add new k32v64 hash table
2020-04-15 18:17 ` [dpdk-dev] [PATCH v3 " Vladimir Medvedkin
2020-04-15 18:51 ` Mattias Rönnblom
@ 2020-04-16 9:39 ` Thomas Monjalon
2020-04-16 14:02 ` Medvedkin, Vladimir
2020-05-08 19:58 ` [dpdk-dev] [PATCH v4 0/4] add new kv " Vladimir Medvedkin
` (4 subsequent siblings)
6 siblings, 1 reply; 56+ messages in thread
From: Thomas Monjalon @ 2020-04-16 9:39 UTC (permalink / raw)
To: Vladimir Medvedkin
Cc: dev, konstantin.ananyev, yipeng1.wang, sameh.gobriel, bruce.richardson
15/04/2020 20:17, Vladimir Medvedkin:
> Currently DPDK has a special implementation of a hash table for
> 4 byte keys which is called FBK hash. Unfortunately its main drawback
> is that it only supports 2 byte values.
> The new implementation called K32V64 hash
> supports 4 byte keys and 8 byte associated values,
> which is enough to store a pointer.
>
> It would also be nice to get feedback on whether to leave the old FBK
> and new k32v64 implementations or deprecate the old one?
>
> v3:
> - added bulk lookup
> - avx512 key comparison is removed from .h
>
> v2:
> - renamed from rte_dwk to rte_k32v64 as was suggested
> - reworked lookup function, added inlined subroutines
> - added avx512 key comparison routine
> - added documentation
> - added statistic counters for total entries and extended entries (linked list)
Please use --in-reply-to so we can follow version changes
in the same email thread.
Also I am changing the states in patchwork to superseded.
Please remember to update the status of old patches.
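For example, with git send-email (the message-id here is the one quoted later in this thread; the patch filenames are illustrative):

git send-email --in-reply-to='<cover.1586369591.git.vladimir.medvedkin@intel.com>' v4-*.patch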
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [dpdk-dev] [PATCH v3 0/4] add new k32v64 hash table
2020-04-15 18:51 ` Mattias Rönnblom
@ 2020-04-16 10:18 ` Medvedkin, Vladimir
2020-04-16 11:40 ` Mattias Rönnblom
0 siblings, 1 reply; 56+ messages in thread
From: Medvedkin, Vladimir @ 2020-04-16 10:18 UTC (permalink / raw)
To: Mattias Rönnblom, dev
Cc: Ananyev, Konstantin, Wang, Yipeng1, Gobriel, Sameh, Richardson, Bruce
Hi Mattias,
-----Original Message-----
From: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Sent: Wednesday, April 15, 2020 7:52 PM
To: Medvedkin, Vladimir <vladimir.medvedkin@intel.com>; dev@dpdk.org
Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Wang, Yipeng1 <yipeng1.wang@intel.com>; Gobriel, Sameh <sameh.gobriel@intel.com>; Richardson, Bruce <bruce.richardson@intel.com>
Subject: Re: [dpdk-dev] [PATCH v3 0/4] add new k32v64 hash table
On 2020-04-15 20:17, Vladimir Medvedkin wrote:
> Currently DPDK has a special implementation of a hash table for
> 4 byte keys which is called FBK hash. Unfortunately its main drawback
> is that it only supports 2 byte values.
> The new implementation called K32V64 hash supports 4 byte keys and 8
> byte associated values, which is enough to store a pointer.
>
> It would also be nice to get feedback on whether to leave the old FBK
> and new k32v64 implementations or deprecate the old one?
Do you think it would be feasible to support custom-sized values and remain efficient, in a similar manner to how rte_ring_elem.h does things?
I'm afraid it is not feasible. For performance reasons, keys and corresponding values reside in a single cache line, so there is no extra memory for bigger values, such as 16B.
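For reference, a rough sketch of the arithmetic behind that, based on the bucket layout in this patch (the padding figure assumes a 64B cache line and is my estimate):

/* 4 keys * 4B + 4 vals * 8B = 48B, plus key_mask (1B), cnt (4B) and the
 * SLIST head pointer (8B) ~= 61B, padded to 64B by __rte_cache_aligned.
 * A 16B value type would already need 4*4B + 4*16B = 80B for keys+values
 * alone, i.e. a second cache line per bucket.
 */
_Static_assert(sizeof(struct rte_k32v64_hash_bucket) == 64,
	"bucket expected to occupy exactly one 64B cache line");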
> v3:
> - added bulk lookup
> - avx512 key comparison is removed from .h
>
> v2:
> - renamed from rte_dwk to rte_k32v64 as was suggested
> - reworked lookup function, added inlined subroutines
> - added avx512 key comparison routine
> - added documentation
> - added statistic counters for total entries and extended entries (linked list)
>
> Vladimir Medvedkin (4):
> hash: add k32v64 hash library
> hash: add documentation for k32v64 hash library
> test: add k32v64 hash autotests
> test: add k32v64 perf tests
>
> app/test/Makefile | 1 +
> app/test/autotest_data.py | 12 ++
> app/test/meson.build | 3 +
> app/test/test_hash_perf.c | 130 ++++++++++++
> app/test/test_k32v64_hash.c | 229 ++++++++++++++++++++++
> doc/api/doxy-api-index.md | 1 +
> doc/guides/prog_guide/index.rst | 1 +
> doc/guides/prog_guide/k32v64_hash_lib.rst | 66 +++++++
> lib/Makefile | 2 +-
> lib/librte_hash/Makefile | 13 +-
> lib/librte_hash/k32v64_hash_avx512vl.c | 56 ++++++
> lib/librte_hash/meson.build | 17 +-
> lib/librte_hash/rte_hash_version.map | 6 +-
> lib/librte_hash/rte_k32v64_hash.c | 315 ++++++++++++++++++++++++++++++
> lib/librte_hash/rte_k32v64_hash.h | 211 ++++++++++++++++++++
> 15 files changed, 1058 insertions(+), 5 deletions(-)
> create mode 100644 app/test/test_k32v64_hash.c
> create mode 100644 doc/guides/prog_guide/k32v64_hash_lib.rst
> create mode 100644 lib/librte_hash/k32v64_hash_avx512vl.c
> create mode 100644 lib/librte_hash/rte_k32v64_hash.c
> create mode 100644 lib/librte_hash/rte_k32v64_hash.h
>
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [dpdk-dev] [PATCH v3 1/4] hash: add k32v64 hash library
2020-04-15 18:59 ` Mattias Rönnblom
@ 2020-04-16 10:23 ` Medvedkin, Vladimir
0 siblings, 0 replies; 56+ messages in thread
From: Medvedkin, Vladimir @ 2020-04-16 10:23 UTC (permalink / raw)
To: Mattias Rönnblom, dev
Cc: Ananyev, Konstantin, Wang, Yipeng1, Gobriel, Sameh, Richardson, Bruce
Hi Mattias,
-----Original Message-----
From: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Sent: Wednesday, April 15, 2020 8:00 PM
To: Medvedkin, Vladimir <vladimir.medvedkin@intel.com>; dev@dpdk.org
Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Wang, Yipeng1 <yipeng1.wang@intel.com>; Gobriel, Sameh <sameh.gobriel@intel.com>; Richardson, Bruce <bruce.richardson@intel.com>
Subject: Re: [dpdk-dev] [PATCH v3 1/4] hash: add k32v64 hash library
On 2020-04-15 20:17, Vladimir Medvedkin wrote:
> K32V64 hash is a hash table that supports 32 bit keys and 64 bit values.
> This table is hash function agnostic so user must provide
> precalculated hash signature for add/delete/lookup operations.
>
> Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
> ---
> lib/Makefile | 2 +-
> lib/librte_hash/Makefile | 13 +-
> lib/librte_hash/k32v64_hash_avx512vl.c | 56 ++++++
> lib/librte_hash/meson.build | 17 +-
> lib/librte_hash/rte_hash_version.map | 6 +-
> lib/librte_hash/rte_k32v64_hash.c | 315 +++++++++++++++++++++++++++++++++
> lib/librte_hash/rte_k32v64_hash.h | 211 ++++++++++++++++++++++
> 7 files changed, 615 insertions(+), 5 deletions(-)
> create mode 100644 lib/librte_hash/k32v64_hash_avx512vl.c
> create mode 100644 lib/librte_hash/rte_k32v64_hash.c
> create mode 100644 lib/librte_hash/rte_k32v64_hash.h
>
> diff --git a/lib/Makefile b/lib/Makefile index 46b91ae..a8c02e4 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -48,7 +48,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
> DEPDIRS-librte_vhost := librte_eal librte_mempool librte_mbuf librte_ethdev \
> librte_net librte_hash librte_cryptodev
> DIRS-$(CONFIG_RTE_LIBRTE_HASH) += librte_hash
> -DEPDIRS-librte_hash := librte_eal librte_ring
> +DEPDIRS-librte_hash := librte_eal librte_ring librte_mempool
> DIRS-$(CONFIG_RTE_LIBRTE_EFD) += librte_efd
> DEPDIRS-librte_efd := librte_eal librte_ring librte_hash
> DIRS-$(CONFIG_RTE_LIBRTE_RIB) += librte_rib
> diff --git a/lib/librte_hash/Makefile b/lib/librte_hash/Makefile
> index ec9f864..023689d 100644
> --- a/lib/librte_hash/Makefile
> +++ b/lib/librte_hash/Makefile
> @@ -8,13 +8,14 @@ LIB = librte_hash.a
>
> CFLAGS += -O3
> CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
> -LDLIBS += -lrte_eal -lrte_ring
> +LDLIBS += -lrte_eal -lrte_ring -lrte_mempool
>
> EXPORT_MAP := rte_hash_version.map
>
> # all source are stored in SRCS-y
> SRCS-$(CONFIG_RTE_LIBRTE_HASH) := rte_cuckoo_hash.c
> SRCS-$(CONFIG_RTE_LIBRTE_HASH) += rte_fbk_hash.c
> +SRCS-$(CONFIG_RTE_LIBRTE_HASH) += rte_k32v64_hash.c
>
> # install this header file
> SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include := rte_hash.h
> @@ -27,5 +28,15 @@ endif
> SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_jhash.h
> SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_thash.h
> SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_fbk_hash.h
> +SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_k32v64_hash.h
> +
> +CC_AVX512VL_SUPPORT=$(shell $(CC) -mavx512vl -dM -E - </dev/null 2>&1 | \
> +grep -q __AVX512VL__ && echo 1)
> +
> +ifeq ($(CC_AVX512VL_SUPPORT), 1)
> + SRCS-$(CONFIG_RTE_LIBRTE_HASH) += k32v64_hash_avx512vl.c
> + CFLAGS_k32v64_hash_avx512vl.o += -mavx512vl
> + CFLAGS_rte_k32v64_hash.o += -DCC_AVX512VL_SUPPORT
> +endif
>
> include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/lib/librte_hash/k32v64_hash_avx512vl.c
> b/lib/librte_hash/k32v64_hash_avx512vl.c
> new file mode 100644
> index 0000000..7c70dd2
> --- /dev/null
> +++ b/lib/librte_hash/k32v64_hash_avx512vl.c
> @@ -0,0 +1,56 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Intel Corporation */
> +
> +#include <rte_k32v64_hash.h>
> +
> +int
> +k32v64_hash_bulk_lookup_avx512vl(struct rte_k32v64_hash_table *table,
> + uint32_t *keys, uint32_t *hashes, uint64_t *values, unsigned int n);
> +
> +static inline int
> +k32v64_cmp_keys_avx512vl(struct rte_k32v64_hash_bucket *bucket, uint32_t key,
> + uint64_t *val)
> +{
> + __m128i keys, srch_key;
> + __mmask8 msk;
> +
> + keys = _mm_load_si128((void *)bucket);
> + srch_key = _mm_set1_epi32(key);
> +
> + msk = _mm_mask_cmpeq_epi32_mask(bucket->key_mask, keys, srch_key);
> + if (msk) {
> + *val = bucket->val[__builtin_ctz(msk)];
> + return 1;
> + }
> +
> + return 0;
> +}
> +
> +static inline int
> +k32v64_hash_lookup_avx512vl(struct rte_k32v64_hash_table *table, uint32_t key,
> + uint32_t hash, uint64_t *value)
> +{
> + return __k32v64_hash_lookup(table, key, hash, value,
> + k32v64_cmp_keys_avx512vl);
> +}
> +
> +int
> +k32v64_hash_bulk_lookup_avx512vl(struct rte_k32v64_hash_table *table,
> + uint32_t *keys, uint32_t *hashes, uint64_t *values, unsigned int n)
> +{
> + int ret, cnt = 0;
> + unsigned int i;
> +
> + if (unlikely((table == NULL) || (keys == NULL) || (hashes == NULL) ||
> + (values == NULL)))
> + return -EINVAL;
> +
> + for (i = 0; i < n; i++) {
> + ret = k32v64_hash_lookup_avx512vl(table, keys[i], hashes[i],
> + &values[i]);
> + if (ret == 0)
> + cnt++;
> + }
> + return cnt;
> +}
> diff --git a/lib/librte_hash/meson.build b/lib/librte_hash/meson.build
> index 6ab46ae..8a37cc4 100644
> --- a/lib/librte_hash/meson.build
> +++ b/lib/librte_hash/meson.build
> @@ -3,10 +3,23 @@
>
> headers = files('rte_crc_arm64.h',
> 'rte_fbk_hash.h',
> + 'rte_k32v64_hash.h',
> 'rte_hash_crc.h',
> 'rte_hash.h',
> 'rte_jhash.h',
> 'rte_thash.h')
>
> -sources = files('rte_cuckoo_hash.c', 'rte_fbk_hash.c')
> -deps += ['ring']
> +sources = files('rte_cuckoo_hash.c', 'rte_fbk_hash.c', 'rte_k32v64_hash.c')
> +deps += ['ring', 'mempool']
> +
> +if dpdk_conf.has('RTE_ARCH_X86')
> + if cc.has_argument('-mavx512vl')
> + avx512_tmplib = static_library('avx512_tmp',
> + 'k32v64_hash_avx512vl.c',
> + dependencies: static_rte_mempool,
> + c_args: cflags + ['-mavx512vl'])
> + objs += avx512_tmplib.extract_objects('k32v64_hash_avx512vl.c')
> + cflags += '-DCC_AVX512VL_SUPPORT'
> +
> + endif
> +endif
> diff --git a/lib/librte_hash/rte_hash_version.map
> b/lib/librte_hash/rte_hash_version.map
> index a8fbbc3..9a4f2f6 100644
> --- a/lib/librte_hash/rte_hash_version.map
> +++ b/lib/librte_hash/rte_hash_version.map
> @@ -34,5 +34,9 @@ EXPERIMENTAL {
>
> rte_hash_free_key_with_position;
> rte_hash_max_key_id;
> -
> + rte_k32v64_hash_create;
> + rte_k32v64_hash_find_existing;
> + rte_k32v64_hash_free;
> + rte_k32v64_hash_add;
> + rte_k32v64_hash_delete;
> };
> diff --git a/lib/librte_hash/rte_k32v64_hash.c
> b/lib/librte_hash/rte_k32v64_hash.c
> new file mode 100644
> index 0000000..7ed94b4
> --- /dev/null
> +++ b/lib/librte_hash/rte_k32v64_hash.c
> @@ -0,0 +1,315 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Intel Corporation */
> +
> +#include <string.h>
> +
> +#include <rte_eal_memconfig.h>
> +#include <rte_errno.h>
> +#include <rte_malloc.h>
> +#include <rte_memory.h>
> +#include <rte_tailq.h>
> +
> +#include <rte_k32v64_hash.h>
> +
> +TAILQ_HEAD(rte_k32v64_hash_list, rte_tailq_entry);
> +
> +static struct rte_tailq_elem rte_k32v64_hash_tailq = {
> + .name = "RTE_K32V64_HASH",
> +};
> +
> +EAL_REGISTER_TAILQ(rte_k32v64_hash_tailq);
> +
> +#define VALID_KEY_MSK ((1 << RTE_K32V64_KEYS_PER_BUCKET) - 1)
> +
> +#ifdef CC_AVX512VL_SUPPORT
> +int
> +k32v64_hash_bulk_lookup_avx512vl(struct rte_k32v64_hash_table *table,
> + uint32_t *keys, uint32_t *hashes, uint64_t *values, unsigned int n);
> +#endif
> +
> +static int
> +k32v64_hash_bulk_lookup(struct rte_k32v64_hash_table *table, uint32_t *keys,
> + uint32_t *hashes, uint64_t *values, unsigned int n)
> +{
> + int ret, cnt = 0;
> + unsigned int i;
> +
> + if (unlikely((table == NULL) || (keys == NULL) || (hashes == NULL) ||
> + (values == NULL)))
> + return -EINVAL;
> +
> + for (i = 0; i < n; i++) {
> + ret = rte_k32v64_hash_lookup(table, keys[i], hashes[i],
> + &values[i]);
> + if (ret == 0)
> + cnt++;
> + }
> + return cnt;
> +}
> +
> +static rte_k32v64_hash_bulk_lookup_t
> +get_lookup_bulk_fn(void)
> +{
> +#ifdef CC_AVX512VL_SUPPORT
> + if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
> + return k32v64_hash_bulk_lookup_avx512vl;
> +#endif
> + return k32v64_hash_bulk_lookup;
> +}
> +
> +int
> +rte_k32v64_hash_add(struct rte_k32v64_hash_table *table, uint32_t key,
> + uint32_t hash, uint64_t value)
> +{
> + uint32_t bucket;
> + int i, idx, ret;
> + uint8_t msk;
> + struct rte_k32v64_ext_ent *tmp, *ent, *prev = NULL;
> +
> + if (table == NULL)
> + return -EINVAL;
> +
> + bucket = hash & table->bucket_msk;
> + /* Search key in table. Update value if exists */
> + for (i = 0; i < RTE_K32V64_KEYS_PER_BUCKET; i++) {
> + if ((key == table->t[bucket].key[i]) &&
> + (table->t[bucket].key_mask & (1 << i))) {
> + table->t[bucket].val[i] = value;
> + return 0;
> + }
> + }
> +
> + if (!SLIST_EMPTY(&table->t[bucket].head)) {
> + SLIST_FOREACH(ent, &table->t[bucket].head, next) {
> + if (ent->key == key) {
> + ent->val = value;
> + return 0;
> + }
> + }
> + }
> +
> + msk = ~table->t[bucket].key_mask & VALID_KEY_MSK;
> + if (msk) {
> + idx = __builtin_ctz(msk);
> + table->t[bucket].key[idx] = key;
> + table->t[bucket].val[idx] = value;
> + rte_smp_wmb();
> + table->t[bucket].key_mask |= 1 << idx;
> + table->nb_ent++;
> + return 0;
> + }
> +
> + ret = rte_mempool_get(table->ext_ent_pool, (void **)&ent);
> + if (ret < 0)
> + return ret;
> +
> + SLIST_NEXT(ent, next) = NULL;
> + ent->key = key;
> + ent->val = value;
> + rte_smp_wmb();
> + SLIST_FOREACH(tmp, &table->t[bucket].head, next)
> + prev = tmp;
> +
> + if (prev == NULL)
> + SLIST_INSERT_HEAD(&table->t[bucket].head, ent, next);
> + else
> + SLIST_INSERT_AFTER(prev, ent, next);
> +
> + table->nb_ent++;
> + table->nb_ext_ent++;
> + return 0;
> +}
> +
> +int
> +rte_k32v64_hash_delete(struct rte_k32v64_hash_table *table, uint32_t key,
> + uint32_t hash)
> +{
> + uint32_t bucket;
> + int i;
> + struct rte_k32v64_ext_ent *ent;
> +
> + if (table == NULL)
> + return -EINVAL;
> +
> + bucket = hash & table->bucket_msk;
> +
> + for (i = 0; i < RTE_K32V64_KEYS_PER_BUCKET; i++) {
> + if ((key == table->t[bucket].key[i]) &&
> + (table->t[bucket].key_mask & (1 << i))) {
> + ent = SLIST_FIRST(&table->t[bucket].head);
> + if (ent) {
> + rte_atomic32_inc(&table->t[bucket].cnt);
> + table->t[bucket].key[i] = ent->key;
> + table->t[bucket].val[i] = ent->val;
> + SLIST_REMOVE_HEAD(&table->t[bucket].head, next);
> + rte_atomic32_inc(&table->t[bucket].cnt);
> + table->nb_ext_ent--;
> + } else
> + table->t[bucket].key_mask &= ~(1 << i);
> + if (ent)
> + rte_mempool_put(table->ext_ent_pool, ent);
> + table->nb_ent--;
> + return 0;
> + }
> + }
> +
> + SLIST_FOREACH(ent, &table->t[bucket].head, next)
> + if (ent->key == key)
> + break;
> +
> + if (ent == NULL)
> + return -ENOENT;
> +
> + rte_atomic32_inc(&table->t[bucket].cnt);
> + SLIST_REMOVE(&table->t[bucket].head, ent, rte_k32v64_ext_ent, next);
> + rte_atomic32_inc(&table->t[bucket].cnt);
> + rte_mempool_put(table->ext_ent_pool, ent);
> +
> + table->nb_ext_ent--;
> + table->nb_ent--;
> +
> + return 0;
> +}
> +
> +struct rte_k32v64_hash_table *
> +rte_k32v64_hash_find_existing(const char *name)
> +{
> + struct rte_k32v64_hash_table *h = NULL;
> + struct rte_tailq_entry *te;
> + struct rte_k32v64_hash_list *k32v64_hash_list;
> +
> + k32v64_hash_list = RTE_TAILQ_CAST(rte_k32v64_hash_tailq.head,
> + rte_k32v64_hash_list);
> +
> + rte_mcfg_tailq_read_lock();
> + TAILQ_FOREACH(te, k32v64_hash_list, next) {
> + h = (struct rte_k32v64_hash_table *) te->data;
> + if (strncmp(name, h->name, RTE_K32V64_HASH_NAMESIZE) == 0)
> + break;
> + }
> + rte_mcfg_tailq_read_unlock();
> + if (te == NULL) {
> + rte_errno = ENOENT;
> + return NULL;
> + }
> + return h;
> +}
> +
> +struct rte_k32v64_hash_table *
> +rte_k32v64_hash_create(const struct rte_k32v64_hash_params *params)
> +{
> + char hash_name[RTE_K32V64_HASH_NAMESIZE];
> + struct rte_k32v64_hash_table *ht = NULL;
> + struct rte_tailq_entry *te;
> + struct rte_k32v64_hash_list *k32v64_hash_list;
> + uint32_t mem_size, nb_buckets, max_ent;
> + int ret;
> + struct rte_mempool *mp;
> +
> + if ((params == NULL) || (params->name == NULL) ||
> + (params->entries == 0)) {
> + rte_errno = EINVAL;
> + return NULL;
> + }
> +
> + k32v64_hash_list = RTE_TAILQ_CAST(rte_k32v64_hash_tailq.head,
> + rte_k32v64_hash_list);
> +
> + ret = snprintf(hash_name, sizeof(hash_name), "K32V64_%s", params->name);
> + if (ret < 0 || ret >= RTE_K32V64_HASH_NAMESIZE) {
> + rte_errno = ENAMETOOLONG;
> + return NULL;
> + }
> +
> + max_ent = rte_align32pow2(params->entries);
> + nb_buckets = max_ent / RTE_K32V64_KEYS_PER_BUCKET;
> + mem_size = sizeof(struct rte_k32v64_hash_table) +
> + sizeof(struct rte_k32v64_hash_bucket) * nb_buckets;
> +
> + mp = rte_mempool_create(hash_name, max_ent,
> + sizeof(struct rte_k32v64_ext_ent), 0, 0, NULL, NULL, NULL, NULL,
> + params->socket_id, 0);
> +
> + if (mp == NULL)
> + return NULL;
> +
> + rte_mcfg_tailq_write_lock();
> + TAILQ_FOREACH(te, k32v64_hash_list, next) {
> + ht = (struct rte_k32v64_hash_table *) te->data;
> + if (strncmp(params->name, ht->name,
> + RTE_K32V64_HASH_NAMESIZE) == 0)
> + break;
> + }
> + ht = NULL;
> + if (te != NULL) {
> + rte_errno = EEXIST;
> + rte_mempool_free(mp);
> + goto exit;
> + }
> +
> + te = rte_zmalloc("K32V64_HASH_TAILQ_ENTRY", sizeof(*te), 0);
> + if (te == NULL) {
> + RTE_LOG(ERR, HASH, "Failed to allocate tailq entry\n");
> + rte_mempool_free(mp);
> + goto exit;
> + }
> +
> + ht = rte_zmalloc_socket(hash_name, mem_size,
> + RTE_CACHE_LINE_SIZE, params->socket_id);
> + if (ht == NULL) {
> + RTE_LOG(ERR, HASH, "Failed to allocate fbk hash table\n");
> + rte_free(te);
> + rte_mempool_free(mp);
> + goto exit;
> + }
> +
> + memcpy(ht->name, hash_name, sizeof(ht->name));
> + ht->max_ent = max_ent;
> + ht->bucket_msk = nb_buckets - 1;
> + ht->ext_ent_pool = mp;
> + ht->lookup = get_lookup_bulk_fn();
> +
> + te->data = (void *)ht;
> + TAILQ_INSERT_TAIL(k32v64_hash_list, te, next);
> +
> +exit:
> + rte_mcfg_tailq_write_unlock();
> +
> + return ht;
> +}
> +
> +void
> +rte_k32v64_hash_free(struct rte_k32v64_hash_table *ht)
> +{
> + struct rte_tailq_entry *te;
> + struct rte_k32v64_hash_list *k32v64_hash_list;
> +
> + if (ht == NULL)
> + return;
> +
> + k32v64_hash_list = RTE_TAILQ_CAST(rte_k32v64_hash_tailq.head,
> + rte_k32v64_hash_list);
> +
> + rte_mcfg_tailq_write_lock();
> +
> + /* find out tailq entry */
> + TAILQ_FOREACH(te, k32v64_hash_list, next) {
> + if (te->data == (void *) ht)
> + break;
> + }
> +
> +
> + if (te == NULL) {
> + rte_mcfg_tailq_write_unlock();
> + return;
> + }
> +
> + TAILQ_REMOVE(k32v64_hash_list, te, next);
> +
> + rte_mcfg_tailq_write_unlock();
> +
> + rte_mempool_free(ht->ext_ent_pool);
> + rte_free(ht);
> + rte_free(te);
> +}
> diff --git a/lib/librte_hash/rte_k32v64_hash.h
> b/lib/librte_hash/rte_k32v64_hash.h
> new file mode 100644
> index 0000000..b2c52e9
> --- /dev/null
> +++ b/lib/librte_hash/rte_k32v64_hash.h
> @@ -0,0 +1,211 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Intel Corporation */
> +
> +#ifndef _RTE_K32V64_HASH_H_
> +#define _RTE_K32V64_HASH_H_
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include <rte_compat.h>
> +#include <rte_atomic.h>
> +#include <rte_mempool.h>
> +
> +#define RTE_K32V64_HASH_NAMESIZE 32
> +#define RTE_K32V64_KEYS_PER_BUCKET 4
> +#define RTE_K32V64_WRITE_IN_PROGRESS 1
> +
> +struct rte_k32v64_hash_params {
> + const char *name;
> + uint32_t entries;
> + int socket_id;
> +};
> +
> +struct rte_k32v64_ext_ent {
> + SLIST_ENTRY(rte_k32v64_ext_ent) next;
> + uint32_t key;
> + uint64_t val;
> +};
> +
> +struct rte_k32v64_hash_bucket {
> + uint32_t key[RTE_K32V64_KEYS_PER_BUCKET];
> + uint64_t val[RTE_K32V64_KEYS_PER_BUCKET];
> + uint8_t key_mask;
> + rte_atomic32_t cnt;
> + SLIST_HEAD(rte_k32v64_list_head, rte_k32v64_ext_ent) head;
> +} __rte_cache_aligned;
> +
> +struct rte_k32v64_hash_table;
> +
> +typedef int (*rte_k32v64_hash_bulk_lookup_t)
> +(struct rte_k32v64_hash_table *table, uint32_t *keys, uint32_t *hashes,
> + uint64_t *values, unsigned int n);
> +
> +struct rte_k32v64_hash_table {
> + char name[RTE_K32V64_HASH_NAMESIZE]; /**< Name of the hash. */
> + uint32_t nb_ent; /**< Number of entities in the table*/
> + uint32_t nb_ext_ent; /**< Number of extended entities */
> + uint32_t max_ent; /**< Maximum number of entities */
> + uint32_t bucket_msk;
> + struct rte_mempool *ext_ent_pool;
> + rte_k32v64_hash_bulk_lookup_t lookup;
> + __extension__ struct rte_k32v64_hash_bucket t[0];
> +};
> +
> +typedef int (*rte_k32v64_cmp_fn_t)
> +(struct rte_k32v64_hash_bucket *bucket, uint32_t key, uint64_t *val);
> +
> +static inline int
> +__k32v64_cmp_keys(struct rte_k32v64_hash_bucket *bucket, uint32_t key,
> + uint64_t *val)
> +{
> + int i;
> +
> + for (i = 0; i < RTE_K32V64_KEYS_PER_BUCKET; i++) {
> + if ((key == bucket->key[i]) &&
> + (bucket->key_mask & (1 << i))) {
> + *val = bucket->val[i];
> + return 1;
> + }
> + }
> +
> + return 0;
> +}
> +
> +static inline int
> +__k32v64_hash_lookup(struct rte_k32v64_hash_table *table, uint32_t key,
> + uint32_t hash, uint64_t *value, rte_k32v64_cmp_fn_t cmp_f)
> +{
> + uint64_t val = 0;
> + struct rte_k32v64_ext_ent *ent;
> + int32_t cnt;
> + int found = 0;
> + uint32_t bucket = hash & table->bucket_msk;
> +
> + do {
> + do
> + cnt = rte_atomic32_read(&table->t[bucket].cnt);
> + while (unlikely(cnt & RTE_K32V64_WRITE_IN_PROGRESS));
> +
> + found = cmp_f(&table->t[bucket], key, &val);
> + if (unlikely((found == 0) &&
> + (!SLIST_EMPTY(&table->t[bucket].head)))) {
> + SLIST_FOREACH(ent, &table->t[bucket].head, next) {
> + if (ent->key == key) {
> + val = ent->val;
> + found = 1;
> + break;
> + }
> + }
> + }
> +
> + } while (unlikely(cnt != rte_atomic32_read(&table->t[bucket].cnt)));
> +
> + if (found == 1) {
> + *value = val;
> + return 0;
> + } else
> + return -ENOENT;
> +}
> +
> +static inline int
> +rte_k32v64_hash_lookup(struct rte_k32v64_hash_table *table, uint32_t key,
> + uint32_t hash, uint64_t *value)
> +{
> + return __k32v64_hash_lookup(table, key, hash, value, __k32v64_cmp_keys);
> +}
> +
> +static inline int
> +rte_k32v64_hash_bulk_lookup(struct rte_k32v64_hash_table *table,
> + uint32_t *keys, uint32_t *hashes, uint64_t *values, unsigned int n)
> +{
> + return table->lookup(table, keys, hashes, values, n);
> +}
> +
> +/**
> + * Add a key to an existing hash table with hash value.
> + * This operation is not multi-thread safe
> + * and should only be called from one thread.
> + *
Does this hash allow multiple readers, and at most one writer, for a particular instance? If that's the case, it should probably be mentioned somewhere.
Just specifying that a particular function is not MT safe doesn't say anything about whether it's safe to call it in parallel with other, non-MT-safe functions. I assume you can't add and delete in parallel?
Yes, this hash allows multiple readers and one writer. Writers must be serialized. I will add this information. Thanks!
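In other words, the intended usage model is roughly the following (a sketch only; the lock serializing writers and the app_* helper names are the application's, not the library's):

#include <rte_spinlock.h>
#include <rte_hash_crc.h>
#include <rte_k32v64_hash.h>

static rte_spinlock_t wr_lock = RTE_SPINLOCK_INITIALIZER;

/* writer threads: add/delete must be serialized by the application */
static int
app_add(struct rte_k32v64_hash_table *h, uint32_t key, uint64_t val)
{
	int ret;

	rte_spinlock_lock(&wr_lock);
	ret = rte_k32v64_hash_add(h, key, rte_hash_crc_4byte(key, 0), val);
	rte_spinlock_unlock(&wr_lock);
	return ret;
}

/* reader threads: lookups may run concurrently with the single active writer */
static int
app_lookup(struct rte_k32v64_hash_table *h, uint32_t key, uint64_t *val)
{
	return rte_k32v64_hash_lookup(h, key, rte_hash_crc_4byte(key, 0), val);
}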
> + * @param ht
> + * Hash table to add the key to.
> + * @param key
> + * Key to add to the hash table.
> + * @param value
> + * Value to associate with key.
> + * @param hash
> + * Hash value associated with key.
> + * @return
> + * 0 if ok, or negative value on error.
> + */
> +__rte_experimental
> +int
> +rte_k32v64_hash_add(struct rte_k32v64_hash_table *table, uint32_t key,
> + uint32_t hash, uint64_t value);
> +
> +/**
> + * Remove a key with a given hash value from an existing hash table.
> + * This operation is not multi-thread
> + * safe and should only be called from one thread.
> + *
> + * @param ht
> + * Hash table to remove the key from.
> + * @param key
> + * Key to remove from the hash table.
> + * @param hash
> + * hash value associated with key.
> + * @return
> + * 0 if ok, or negative value on error.
> + */
> +__rte_experimental
> +int
> +rte_k32v64_hash_delete(struct rte_k32v64_hash_table *table, uint32_t key,
> + uint32_t hash);
> +
> +
> +/**
> + * Performs a lookup for an existing hash table, and returns a pointer to
> + * the table if found.
> + *
> + * @param name
> + * Name of the hash table to find
> + *
> + * @return
> + * pointer to hash table structure or NULL on error with rte_errno
> + * set appropriately.
> + */
> +__rte_experimental
> +struct rte_k32v64_hash_table *
> +rte_k32v64_hash_find_existing(const char *name);
> +
> +/**
> + * Create a new hash table for use with four byte keys.
> + *
> + * @param params
> + * Parameters used in creation of hash table.
> + *
> + * @return
> + * Pointer to hash table structure that is used in future hash table
> + * operations, or NULL on error with rte_errno set appropriately.
> + */
> +__rte_experimental
> +struct rte_k32v64_hash_table *
> +rte_k32v64_hash_create(const struct rte_k32v64_hash_params *params);
> +
> +/**
> + * Free all memory used by a hash table.
> + *
> + * @param table
> + * Hash table to deallocate.
> + */
> +__rte_experimental
> +void
> +rte_k32v64_hash_free(struct rte_k32v64_hash_table *table);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_K32V64_HASH_H_ */
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [dpdk-dev] [PATCH v3 0/4] add new k32v64 hash table
2020-04-16 10:18 ` Medvedkin, Vladimir
@ 2020-04-16 11:40 ` Mattias Rönnblom
2020-04-17 0:21 ` Wang, Yipeng1
0 siblings, 1 reply; 56+ messages in thread
From: Mattias Rönnblom @ 2020-04-16 11:40 UTC (permalink / raw)
To: Medvedkin, Vladimir, dev
Cc: Ananyev, Konstantin, Wang, Yipeng1, Gobriel, Sameh, Richardson, Bruce
On 2020-04-16 12:18, Medvedkin, Vladimir wrote:
> Hi Mattias,
>
> -----Original Message-----
> From: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> Sent: Wednesday, April 15, 2020 7:52 PM
> To: Medvedkin, Vladimir <vladimir.medvedkin@intel.com>; dev@dpdk.org
> Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Wang, Yipeng1 <yipeng1.wang@intel.com>; Gobriel, Sameh <sameh.gobriel@intel.com>; Richardson, Bruce <bruce.richardson@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v3 0/4] add new k32v64 hash table
>
> On 2020-04-15 20:17, Vladimir Medvedkin wrote:
>> Currently DPDK has a special implementation of a hash table for
>> 4 byte keys which is called FBK hash. Unfortunately its main drawback
>> is that it only supports 2 byte values.
>> The new implementation called K32V64 hash supports 4 byte keys and 8
>> byte associated values, which is enough to store a pointer.
>>
>> It would also be nice to get feedback on whether to leave the old FBK
>> and new k32v64 implementations or deprecate the old one?
>
> Do you think it would be feasible to support custom-sized values and remain efficient, in a similar manner to how rte_ring_elem.h does things?
>
> > I'm afraid it is not feasible. For performance reasons, keys and corresponding values reside in a single cache line, so there is no extra memory for bigger values, such as 16B.
Well, if you have a smaller value type (or key type) you would fit into
something less-than-a-cache line, and thus reduce your memory working
set further.
>> v3:
>> - added bulk lookup
>> - avx512 key comparison is removed from .h
>>
>> v2:
>> - renamed from rte_dwk to rte_k32v64 as was suggested
>> - reworked lookup function, added inlined subroutines
>> - added avx512 key comparison routine
>> - added documentation
>> - added statistic counters for total entries and extended entries (linked list)
>>
>> Vladimir Medvedkin (4):
>> hash: add k32v64 hash library
>> hash: add documentation for k32v64 hash library
>> test: add k32v64 hash autotests
>> test: add k32v64 perf tests
>>
>> app/test/Makefile | 1 +
>> app/test/autotest_data.py | 12 ++
>> app/test/meson.build | 3 +
>> app/test/test_hash_perf.c | 130 ++++++++++++
>> app/test/test_k32v64_hash.c | 229 ++++++++++++++++++++++
>> doc/api/doxy-api-index.md | 1 +
>> doc/guides/prog_guide/index.rst | 1 +
>> doc/guides/prog_guide/k32v64_hash_lib.rst | 66 +++++++
>> lib/Makefile | 2 +-
>> lib/librte_hash/Makefile | 13 +-
>> lib/librte_hash/k32v64_hash_avx512vl.c | 56 ++++++
>> lib/librte_hash/meson.build | 17 +-
>> lib/librte_hash/rte_hash_version.map | 6 +-
>> lib/librte_hash/rte_k32v64_hash.c | 315 ++++++++++++++++++++++++++++++
>> lib/librte_hash/rte_k32v64_hash.h | 211 ++++++++++++++++++++
>> 15 files changed, 1058 insertions(+), 5 deletions(-)
>> create mode 100644 app/test/test_k32v64_hash.c
>> create mode 100644 doc/guides/prog_guide/k32v64_hash_lib.rst
>> create mode 100644 lib/librte_hash/k32v64_hash_avx512vl.c
>> create mode 100644 lib/librte_hash/rte_k32v64_hash.c
>> create mode 100644 lib/librte_hash/rte_k32v64_hash.h
>>
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [dpdk-dev] [PATCH v3 0/4] add new k32v64 hash table
2020-04-16 9:39 ` Thomas Monjalon
@ 2020-04-16 14:02 ` Medvedkin, Vladimir
2020-04-16 14:38 ` Thomas Monjalon
0 siblings, 1 reply; 56+ messages in thread
From: Medvedkin, Vladimir @ 2020-04-16 14:02 UTC (permalink / raw)
To: Thomas Monjalon
Cc: dev, konstantin.ananyev, yipeng1.wang, sameh.gobriel, bruce.richardson
Hi Thomas,
On 16/04/2020 10:39, Thomas Monjalon wrote:
> 15/04/2020 20:17, Vladimir Medvedkin:
>> Currently DPDK has a special implementation of a hash table for
>> 4 byte keys which is called FBK hash. Unfortunately its main drawback
>> is that it only supports 2 byte values.
>> The new implementation called K32V64 hash
>> supports 4 byte keys and 8 byte associated values,
>> which is enough to store a pointer.
>>
>> It would also be nice to get feedback on whether to leave the old FBK
>> and new k32v64 implementations or deprecate the old one?
>>
>> v3:
>> - added bulk lookup
>> - avx512 key comparison is removed from .h
>>
>> v2:
>> - renamed from rte_dwk to rte_k32v64 as was suggested
>> - reworked lookup function, added inlined subroutines
>> - added avx512 key comparison routine
>> - added documentation
>> - added statistic counters for total entries and extended entries (linked list)
> Please use --in-reply-to so we can follow version changes
> in the same email thread.
> Also I am changing the states in patchwork to superseded.
> Please remember to update the status of old patches.
>
Hmm, strange, I used --in-reply-to. Also in patchwork I can see
In-Reply-To: <cover.1586369591.git.vladimir.medvedkin@intel.com>
--
Regards,
Vladimir
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [dpdk-dev] [PATCH v3 0/4] add new k32v64 hash table
2020-04-16 14:02 ` Medvedkin, Vladimir
@ 2020-04-16 14:38 ` Thomas Monjalon
0 siblings, 0 replies; 56+ messages in thread
From: Thomas Monjalon @ 2020-04-16 14:38 UTC (permalink / raw)
To: Medvedkin, Vladimir
Cc: dev, konstantin.ananyev, yipeng1.wang, sameh.gobriel, bruce.richardson
16/04/2020 16:02, Medvedkin, Vladimir:
> Hi Thomas,
>
> On 16/04/2020 10:39, Thomas Monjalon wrote:
> > 15/04/2020 20:17, Vladimir Medvedkin:
> >> Currently DPDK has a special implementation of a hash table for
> >> 4 byte keys which is called FBK hash. Unfortunately its main drawback
> >> is that it only supports 2 byte values.
> >> The new implementation called K32V64 hash
> >> supports 4 byte keys and 8 byte associated values,
> >> which is enough to store a pointer.
> >>
> >> It would also be nice to get feedback on whether to leave the old FBK
> >> and new k32v64 implementations or deprecate the old one?
> >>
> >> v3:
> >> - added bulk lookup
> >> - avx512 key comparison is removed from .h
> >>
> >> v2:
> >> - renamed from rte_dwk to rte_k32v64 as was suggested
> >> - reworked lookup function, added inlined subroutines
> >> - added avx512 key comparison routine
> >> - added documentation
> >> - added statistic counters for total entries and extended entries (linked list)
> > Please use --in-reply-to so we can follow version changes
> > in the same email thread.
> > Also I am changing the states in patchwork to superseded.
> > Please remember to update the status of old patches.
> >
>
> Hmm, strange, I used --in-reply-to. Also in patchwork I can see
>
> In-Reply-To: <cover.1586369591.git.vladimir.medvedkin@intel.com>
Indeed, sorry my bad.
I can see correct threading:
http://inbox.dpdk.org/dev/26233198-907c-cf1c-f5ef-154d54c2ed7f@intel.com/
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [dpdk-dev] [PATCH v3 0/4] add new k32v64 hash table
2020-04-16 11:40 ` Mattias Rönnblom
@ 2020-04-17 0:21 ` Wang, Yipeng1
2020-04-23 16:19 ` Ananyev, Konstantin
2020-05-08 20:08 ` Medvedkin, Vladimir
0 siblings, 2 replies; 56+ messages in thread
From: Wang, Yipeng1 @ 2020-04-17 0:21 UTC (permalink / raw)
To: Mattias Rönnblom, Medvedkin, Vladimir, dev, Dumitrescu, Cristian
Cc: Ananyev, Konstantin, Gobriel, Sameh, Richardson, Bruce
> -----Original Message-----
> From: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> Sent: Thursday, April 16, 2020 4:41 AM
> To: Medvedkin, Vladimir <vladimir.medvedkin@intel.com>; dev@dpdk.org
> Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Wang, Yipeng1
> <yipeng1.wang@intel.com>; Gobriel, Sameh <sameh.gobriel@intel.com>;
> Richardson, Bruce <bruce.richardson@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v3 0/4] add new k32v64 hash table
>
> On 2020-04-16 12:18, Medvedkin, Vladimir wrote:
> > Hi Mattias,
> >
> > -----Original Message-----
> > From: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> > Sent: Wednesday, April 15, 2020 7:52 PM
> > To: Medvedkin, Vladimir <vladimir.medvedkin@intel.com>; dev@dpdk.org
> > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Wang, Yipeng1
> > <yipeng1.wang@intel.com>; Gobriel, Sameh <sameh.gobriel@intel.com>;
> > Richardson, Bruce <bruce.richardson@intel.com>
> > Subject: Re: [dpdk-dev] [PATCH v3 0/4] add new k32v64 hash table
> >
> > On 2020-04-15 20:17, Vladimir Medvedkin wrote:
> >> Currently DPDK has a special implementation of a hash table for
> >> 4 byte keys which is called FBK hash. Unfortunately its main drawback
> >> is that it only supports 2 byte values.
> >> The new implementation called K32V64 hash supports 4 byte keys and 8
> >> byte associated values, which is enough to store a pointer.
> >>
> >> It would also be nice to get feedback on whether to leave the old FBK
> >> and new k32v64 implementations or deprecate the old one?
> >
> > Do you think it would be feasible to support custom-sized values and remain
> efficient, in a similar manner to how rte_ring_elem.h does things?
> >
> > I'm afraid it is not feasible. For performance reasons, keys and
> corresponding values reside in a single cache line, so there is no extra
> memory for bigger values, such as 16B.
>
>
> Well, if you have a smaller value type (or key type) you would fit into
> something less-than-a-cache line, and thus reduce your memory working set
> further.
>
>
> >> v3:
> >> - added bulk lookup
> >> - avx512 key comparison is removed from .h
> >>
> >> v2:
> >> - renamed from rte_dwk to rte_k32v64 as was suggested
> >> - reworked lookup function, added inlined subroutines
> >> - added avx512 key comparison routine
> >> - added documentation
> >> - added statistic counters for total entries and extended
> >> entries (linked list)
> >>
> >> Vladimir Medvedkin (4):
> >> hash: add k32v64 hash library
> >> hash: add documentation for k32v64 hash library
> >> test: add k32v64 hash autotests
> >> test: add k32v64 perf tests
> >>
> >> app/test/Makefile | 1 +
> >> app/test/autotest_data.py | 12 ++
> >> app/test/meson.build | 3 +
> >> app/test/test_hash_perf.c | 130 ++++++++++++
> >> app/test/test_k32v64_hash.c | 229 ++++++++++++++++++++++
> >> doc/api/doxy-api-index.md | 1 +
> >> doc/guides/prog_guide/index.rst | 1 +
> >> doc/guides/prog_guide/k32v64_hash_lib.rst | 66 +++++++
> >> lib/Makefile | 2 +-
> >> lib/librte_hash/Makefile | 13 +-
> >> lib/librte_hash/k32v64_hash_avx512vl.c | 56 ++++++
> >> lib/librte_hash/meson.build | 17 +-
> >> lib/librte_hash/rte_hash_version.map | 6 +-
> >> lib/librte_hash/rte_k32v64_hash.c | 315
> ++++++++++++++++++++++++++++++
> >> lib/librte_hash/rte_k32v64_hash.h | 211 ++++++++++++++++++++
> >> 15 files changed, 1058 insertions(+), 5 deletions(-)
> >> create mode 100644 app/test/test_k32v64_hash.c
> >> create mode 100644 doc/guides/prog_guide/k32v64_hash_lib.rst
> >> create mode 100644 lib/librte_hash/k32v64_hash_avx512vl.c
> >> create mode 100644 lib/librte_hash/rte_k32v64_hash.c
> >> create mode 100644 lib/librte_hash/rte_k32v64_hash.h
> >>
[Wang, Yipeng]
Hi, Vladimir,
Thanks for responding with the use cases earlier.
I discussed with Sameh offline, here are some comments.
1. Since the proposed hash table also has some similarities to rte_table library used by packet framework,
have you tried it yet? Although it is mainly for packet framework, I believe you can use it independently as well.
It has implementations for special key value sizes.
I added Cristian for his comment.
2. We tend to agree with Mattias that it would be better if we have a more generic API name and with the same
API we can do multiple key/value size implementations.
This is to avoid adding new APIs in future to again handle different key/value
use cases. For example, we call it rte_kv_hash, and through the parameter struct we pass in a key-value size pair
we want to use.
Implementation-wise, we may only provide implementations for certain popular use cases (like the one you provided).
For other general use cases, people should go with the more flexible and generic cuckoo hash.
Then we should also merge the FBK under the new API.
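To illustrate the idea (the rte_kv_hash name and the fields below are hypothetical, just a sketch of the proposal, not existing code):

enum rte_kv_hash_type {
	RTE_KV_HASH_K32V64,	/* the implementation from this series */
	RTE_KV_HASH_K32V16,	/* could subsume the legacy FBK table */
};

struct rte_kv_hash_params {
	const char *name;
	uint32_t entries;
	int socket_id;
	enum rte_kv_hash_type type;	/* selects the key/value size pair */
};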
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [dpdk-dev] [PATCH v3 1/4] hash: add k32v64 hash library
2020-04-15 18:17 ` [dpdk-dev] [PATCH v3 1/4] hash: add k32v64 hash library Vladimir Medvedkin
2020-04-15 18:59 ` Mattias Rönnblom
@ 2020-04-23 13:31 ` Ananyev, Konstantin
2020-05-08 20:14 ` Medvedkin, Vladimir
2020-04-29 21:29 ` Honnappa Nagarahalli
2 siblings, 1 reply; 56+ messages in thread
From: Ananyev, Konstantin @ 2020-04-23 13:31 UTC (permalink / raw)
To: Medvedkin, Vladimir, dev; +Cc: Wang, Yipeng1, Gobriel, Sameh, Richardson, Bruce
Hi Vladimir,
Apologies for the late review.
My comments below.
> K32V64 hash is a hash table that supports 32 bit keys and 64 bit values.
> This table is hash function agnostic so user must provide
> precalculated hash signature for add/delete/lookup operations.
>
> Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
> ---
>
> --- /dev/null
> +++ b/lib/librte_hash/rte_k32v64_hash.c
> @@ -0,0 +1,315 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Intel Corporation
> + */
> +
> +#include <string.h>
> +
> +#include <rte_eal_memconfig.h>
> +#include <rte_errno.h>
> +#include <rte_malloc.h>
> +#include <rte_memory.h>
> +#include <rte_tailq.h>
> +
> +#include <rte_k32v64_hash.h>
> +
> +TAILQ_HEAD(rte_k32v64_hash_list, rte_tailq_entry);
> +
> +static struct rte_tailq_elem rte_k32v64_hash_tailq = {
> + .name = "RTE_K32V64_HASH",
> +};
> +
> +EAL_REGISTER_TAILQ(rte_k32v64_hash_tailq);
> +
> +#define VALID_KEY_MSK ((1 << RTE_K32V64_KEYS_PER_BUCKET) - 1)
> +
> +#ifdef CC_AVX512VL_SUPPORT
> +int
> +k32v64_hash_bulk_lookup_avx512vl(struct rte_k32v64_hash_table *table,
> + uint32_t *keys, uint32_t *hashes, uint64_t *values, unsigned int n);
> +#endif
> +
> +static int
> +k32v64_hash_bulk_lookup(struct rte_k32v64_hash_table *table, uint32_t *keys,
> + uint32_t *hashes, uint64_t *values, unsigned int n)
> +{
> + int ret, cnt = 0;
> + unsigned int i;
> +
> + if (unlikely((table == NULL) || (keys == NULL) || (hashes == NULL) ||
> + (values == NULL)))
> + return -EINVAL;
> +
> + for (i = 0; i < n; i++) {
> + ret = rte_k32v64_hash_lookup(table, keys[i], hashes[i],
> + &values[i]);
> + if (ret == 0)
> + cnt++;
> + }
> + return cnt;
> +}
> +
> +static rte_k32v64_hash_bulk_lookup_t
> +get_lookup_bulk_fn(void)
> +{
> +#ifdef CC_AVX512VL_SUPPORT
> + if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
> + return k32v64_hash_bulk_lookup_avx512vl;
> +#endif
> + return k32v64_hash_bulk_lookup;
> +}
> +
> +int
> +rte_k32v64_hash_add(struct rte_k32v64_hash_table *table, uint32_t key,
> + uint32_t hash, uint64_t value)
> +{
> + uint32_t bucket;
> + int i, idx, ret;
> + uint8_t msk;
> + struct rte_k32v64_ext_ent *tmp, *ent, *prev = NULL;
> +
> + if (table == NULL)
> + return -EINVAL;
> +
I think for add you also need to update bucket.cnt
at the start/end of updates (as you do for del).
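I.e. something along these lines, mirroring what delete already does (a sketch only):

	rte_atomic32_inc(&table->t[bucket].cnt);	/* odd cnt: write in progress */
	table->t[bucket].key[idx] = key;
	table->t[bucket].val[idx] = value;
	table->t[bucket].key_mask |= 1 << idx;
	rte_atomic32_inc(&table->t[bucket].cnt);	/* even cnt: write complete */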
> + bucket = hash & table->bucket_msk;
> + /* Search key in table. Update value if exists */
> + for (i = 0; i < RTE_K32V64_KEYS_PER_BUCKET; i++) {
> + if ((key == table->t[bucket].key[i]) &&
> + (table->t[bucket].key_mask & (1 << i))) {
> + table->t[bucket].val[i] = value;
> + return 0;
> + }
> + }
> +
> + if (!SLIST_EMPTY(&table->t[bucket].head)) {
> + SLIST_FOREACH(ent, &table->t[bucket].head, next) {
> + if (ent->key == key) {
> + ent->val = value;
> + return 0;
> + }
> + }
> + }
> +
> + msk = ~table->t[bucket].key_mask & VALID_KEY_MSK;
> + if (msk) {
> + idx = __builtin_ctz(msk);
> + table->t[bucket].key[idx] = key;
> + table->t[bucket].val[idx] = value;
> + rte_smp_wmb();
> + table->t[bucket].key_mask |= 1 << idx;
> + table->nb_ent++;
> + return 0;
> + }
> +
> + ret = rte_mempool_get(table->ext_ent_pool, (void **)&ent);
> + if (ret < 0)
> + return ret;
> +
> + SLIST_NEXT(ent, next) = NULL;
> + ent->key = key;
> + ent->val = value;
> + rte_smp_wmb();
> + SLIST_FOREACH(tmp, &table->t[bucket].head, next)
> + prev = tmp;
> +
> + if (prev == NULL)
> + SLIST_INSERT_HEAD(&table->t[bucket].head, ent, next);
> + else
> + SLIST_INSERT_AFTER(prev, ent, next);
> +
> + table->nb_ent++;
> + table->nb_ext_ent++;
> + return 0;
> +}
> +
> +int
> +rte_k32v64_hash_delete(struct rte_k32v64_hash_table *table, uint32_t key,
> + uint32_t hash)
> +{
> + uint32_t bucket;
> + int i;
> + struct rte_k32v64_ext_ent *ent;
> +
> + if (table == NULL)
> + return -EINVAL;
> +
> + bucket = hash & table->bucket_msk;
> +
> + for (i = 0; i < RTE_K32V64_KEYS_PER_BUCKET; i++) {
> + if ((key == table->t[bucket].key[i]) &&
> + (table->t[bucket].key_mask & (1 << i))) {
> + ent = SLIST_FIRST(&table->t[bucket].head);
> + if (ent) {
> + rte_atomic32_inc(&table->t[bucket].cnt);
I know that right now rte_atomic32 uses _sync gcc builtins underneath,
so it should be safe.
But I think the proper way would be:
table->t[bucket].cnt++;
rte_smp_wmb();
or, as an alternative, use C11 atomic ACQUIRE/RELEASE built-ins.
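E.g. something like this on the write side (a sketch; the memory orders are only a first guess and would need review):

	/* odd cnt: marks the bucket write as in progress */
	__atomic_add_fetch(&table->t[bucket].cnt, 1, __ATOMIC_ACQUIRE);
	/* ... modify bucket keys/values ... */
	/* even cnt: publishes the completed write */
	__atomic_add_fetch(&table->t[bucket].cnt, 1, __ATOMIC_RELEASE);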
> + table->t[bucket].key[i] = ent->key;
> + table->t[bucket].val[i] = ent->val;
> + SLIST_REMOVE_HEAD(&table->t[bucket].head, next);
> + rte_atomic32_inc(&table->t[bucket].cnt);
> + table->nb_ext_ent--;
> + } else
> + table->t[bucket].key_mask &= ~(1 << i);
I think you should protect that update with bucket.cnt.
From my perspective, as a rule of thumb, any update to the bucket/list
should be within that transaction-start/transaction-end.
> + if (ent)
> + rte_mempool_put(table->ext_ent_pool, ent);
> + table->nb_ent--;
> + return 0;
> + }
> + }
> +
> + SLIST_FOREACH(ent, &table->t[bucket].head, next)
> + if (ent->key == key)
> + break;
> +
> + if (ent == NULL)
> + return -ENOENT;
> +
> + rte_atomic32_inc(&table->t[bucket].cnt);
> + SLIST_REMOVE(&table->t[bucket].head, ent, rte_k32v64_ext_ent, next);
> + rte_atomic32_inc(&table->t[bucket].cnt);
> + rte_mempool_put(table->ext_ent_pool, ent);
> +
> + table->nb_ext_ent--;
> + table->nb_ent--;
> +
> + return 0;
> +}
> +
> +struct rte_k32v64_hash_table *
> +rte_k32v64_hash_find_existing(const char *name)
> +{
> + struct rte_k32v64_hash_table *h = NULL;
> + struct rte_tailq_entry *te;
> + struct rte_k32v64_hash_list *k32v64_hash_list;
> +
> + k32v64_hash_list = RTE_TAILQ_CAST(rte_k32v64_hash_tailq.head,
> + rte_k32v64_hash_list);
> +
> + rte_mcfg_tailq_read_lock();
> + TAILQ_FOREACH(te, k32v64_hash_list, next) {
> + h = (struct rte_k32v64_hash_table *) te->data;
> + if (strncmp(name, h->name, RTE_K32V64_HASH_NAMESIZE) == 0)
> + break;
> + }
> + rte_mcfg_tailq_read_unlock();
> + if (te == NULL) {
> + rte_errno = ENOENT;
> + return NULL;
> + }
> + return h;
> +}
> +
> +struct rte_k32v64_hash_table *
> +rte_k32v64_hash_create(const struct rte_k32v64_hash_params *params)
> +{
> + char hash_name[RTE_K32V64_HASH_NAMESIZE];
> + struct rte_k32v64_hash_table *ht = NULL;
> + struct rte_tailq_entry *te;
> + struct rte_k32v64_hash_list *k32v64_hash_list;
> + uint32_t mem_size, nb_buckets, max_ent;
> + int ret;
> + struct rte_mempool *mp;
> +
> + if ((params == NULL) || (params->name == NULL) ||
> + (params->entries == 0)) {
> + rte_errno = EINVAL;
> + return NULL;
> + }
> +
> + k32v64_hash_list = RTE_TAILQ_CAST(rte_k32v64_hash_tailq.head,
> + rte_k32v64_hash_list);
> +
> + ret = snprintf(hash_name, sizeof(hash_name), "K32V64_%s", params->name);
> + if (ret < 0 || ret >= RTE_K32V64_HASH_NAMESIZE) {
> + rte_errno = ENAMETOOLONG;
> + return NULL;
> + }
> +
> + max_ent = rte_align32pow2(params->entries);
> + nb_buckets = max_ent / RTE_K32V64_KEYS_PER_BUCKET;
> + mem_size = sizeof(struct rte_k32v64_hash_table) +
> + sizeof(struct rte_k32v64_hash_bucket) * nb_buckets;
> +
> + mp = rte_mempool_create(hash_name, max_ent,
> + sizeof(struct rte_k32v64_ext_ent), 0, 0, NULL, NULL, NULL, NULL,
> + params->socket_id, 0);
> +
> + if (mp == NULL)
> + return NULL;
> +
> + rte_mcfg_tailq_write_lock();
> + TAILQ_FOREACH(te, k32v64_hash_list, next) {
> + ht = (struct rte_k32v64_hash_table *) te->data;
> + if (strncmp(params->name, ht->name,
> + RTE_K32V64_HASH_NAMESIZE) == 0)
> + break;
> + }
> + ht = NULL;
> + if (te != NULL) {
> + rte_errno = EEXIST;
> + rte_mempool_free(mp);
> + goto exit;
> + }
> +
> + te = rte_zmalloc("K32V64_HASH_TAILQ_ENTRY", sizeof(*te), 0);
> + if (te == NULL) {
> + RTE_LOG(ERR, HASH, "Failed to allocate tailq entry\n");
> + rte_mempool_free(mp);
> + goto exit;
> + }
> +
> + ht = rte_zmalloc_socket(hash_name, mem_size,
> + RTE_CACHE_LINE_SIZE, params->socket_id);
> + if (ht == NULL) {
> + RTE_LOG(ERR, HASH, "Failed to allocate fbk hash table\n");
> + rte_free(te);
> + rte_mempool_free(mp);
> + goto exit;
> + }
> +
> + memcpy(ht->name, hash_name, sizeof(ht->name));
> + ht->max_ent = max_ent;
> + ht->bucket_msk = nb_buckets - 1;
> + ht->ext_ent_pool = mp;
> + ht->lookup = get_lookup_bulk_fn();
> +
> + te->data = (void *)ht;
> + TAILQ_INSERT_TAIL(k32v64_hash_list, te, next);
> +
> +exit:
> + rte_mcfg_tailq_write_unlock();
> +
> + return ht;
> +}
> +
> +void
> +rte_k32v64_hash_free(struct rte_k32v64_hash_table *ht)
> +{
> + struct rte_tailq_entry *te;
> + struct rte_k32v64_hash_list *k32v64_hash_list;
> +
> + if (ht == NULL)
> + return;
> +
> + k32v64_hash_list = RTE_TAILQ_CAST(rte_k32v64_hash_tailq.head,
> + rte_k32v64_hash_list);
> +
> + rte_mcfg_tailq_write_lock();
> +
> + /* find out tailq entry */
> + TAILQ_FOREACH(te, k32v64_hash_list, next) {
> + if (te->data == (void *) ht)
> + break;
> + }
> +
> +
> + if (te == NULL) {
> + rte_mcfg_tailq_write_unlock();
> + return;
> + }
> +
> + TAILQ_REMOVE(k32v64_hash_list, te, next);
> +
> + rte_mcfg_tailq_write_unlock();
> +
> + rte_mempool_free(ht->ext_ent_pool);
> + rte_free(ht);
> + rte_free(te);
> +}
> diff --git a/lib/librte_hash/rte_k32v64_hash.h b/lib/librte_hash/rte_k32v64_hash.h
> new file mode 100644
> index 0000000..b2c52e9
> --- /dev/null
> +++ b/lib/librte_hash/rte_k32v64_hash.h
> @@ -0,0 +1,211 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Intel Corporation
> + */
> +
> +#ifndef _RTE_K32V64_HASH_H_
> +#define _RTE_K32V64_HASH_H_
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include <rte_compat.h>
> +#include <rte_atomic.h>
> +#include <rte_mempool.h>
> +
> +#define RTE_K32V64_HASH_NAMESIZE 32
> +#define RTE_K32V64_KEYS_PER_BUCKET 4
> +#define RTE_K32V64_WRITE_IN_PROGRESS 1
> +
> +struct rte_k32v64_hash_params {
> + const char *name;
> + uint32_t entries;
> + int socket_id;
> +};
> +
> +struct rte_k32v64_ext_ent {
> + SLIST_ENTRY(rte_k32v64_ext_ent) next;
> + uint32_t key;
> + uint64_t val;
> +};
> +
> +struct rte_k32v64_hash_bucket {
> + uint32_t key[RTE_K32V64_KEYS_PER_BUCKET];
> + uint64_t val[RTE_K32V64_KEYS_PER_BUCKET];
> + uint8_t key_mask;
> + rte_atomic32_t cnt;
> + SLIST_HEAD(rte_k32v64_list_head, rte_k32v64_ext_ent) head;
> +} __rte_cache_aligned;
> +
> +struct rte_k32v64_hash_table;
> +
> +typedef int (*rte_k32v64_hash_bulk_lookup_t)
> +(struct rte_k32v64_hash_table *table, uint32_t *keys, uint32_t *hashes,
> + uint64_t *values, unsigned int n);
> +
> +struct rte_k32v64_hash_table {
> + char name[RTE_K32V64_HASH_NAMESIZE]; /**< Name of the hash. */
> + uint32_t nb_ent; /**< Number of entities in the table*/
> + uint32_t nb_ext_ent; /**< Number of extended entities */
> + uint32_t max_ent; /**< Maximum number of entities */
> + uint32_t bucket_msk;
> + struct rte_mempool *ext_ent_pool;
> + rte_k32v64_hash_bulk_lookup_t lookup;
> + __extension__ struct rte_k32v64_hash_bucket t[0];
> +};
> +
> +typedef int (*rte_k32v64_cmp_fn_t)
> +(struct rte_k32v64_hash_bucket *bucket, uint32_t key, uint64_t *val);
> +
> +static inline int
> +__k32v64_cmp_keys(struct rte_k32v64_hash_bucket *bucket, uint32_t key,
> + uint64_t *val)
> +{
> + int i;
> +
> + for (i = 0; i < RTE_K32V64_KEYS_PER_BUCKET; i++) {
> + if ((key == bucket->key[i]) &&
> + (bucket->key_mask & (1 << i))) {
> + *val = bucket->val[i];
> + return 1;
> + }
> + }
> +
> + return 0;
> +}
> +
> +static inline int
> +__k32v64_hash_lookup(struct rte_k32v64_hash_table *table, uint32_t key,
> + uint32_t hash, uint64_t *value, rte_k32v64_cmp_fn_t cmp_f)
> +{
> + uint64_t val = 0;
> + struct rte_k32v64_ext_ent *ent;
> + int32_t cnt;
> + int found = 0;
> + uint32_t bucket = hash & table->bucket_msk;
> +
> + do {
> + do
> + cnt = rte_atomic32_read(&table->t[bucket].cnt);
> + while (unlikely(cnt & RTE_K32V64_WRITE_IN_PROGRESS));
> +
> + found = cmp_f(&table->t[bucket], key, &val);
> + if (unlikely((found == 0) &&
> + (!SLIST_EMPTY(&table->t[bucket].head)))) {
> + SLIST_FOREACH(ent, &table->t[bucket].head, next) {
> + if (ent->key == key) {
> + val = ent->val;
> + found = 1;
> + break;
> + }
> + }
> + }
> +
> + } while (unlikely(cnt != rte_atomic32_read(&table->t[bucket].cnt)));
AFAIK atomic32_read is just a normal read op, so it can be reordered with other ops.
So this construction doesn't protect you from races.
What you probably need here:
do {
	cnt1 = table->t[bucket].cnt;
	rte_smp_rmb();
	....
	rte_smp_rmb();
	cnt2 = table->t[bucket].cnt;
} while (cnt1 != cnt2 || (cnt1 & RTE_K32V64_WRITE_IN_PROGRESS) != 0);
> +
> + if (found == 1) {
> + *value = val;
> + return 0;
> + } else
> + return -ENOENT;
> +}
> +
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [dpdk-dev] [PATCH v3 0/4] add new k32v64 hash table
2020-04-17 0:21 ` Wang, Yipeng1
@ 2020-04-23 16:19 ` Ananyev, Konstantin
2020-05-08 20:08 ` Medvedkin, Vladimir
1 sibling, 0 replies; 56+ messages in thread
From: Ananyev, Konstantin @ 2020-04-23 16:19 UTC (permalink / raw)
To: Wang, Yipeng1, Mattias Rönnblom, Medvedkin, Vladimir, dev,
Dumitrescu, Cristian
Cc: Gobriel, Sameh, Richardson, Bruce
Hi everyone,
> >
> > On 2020-04-16 12:18, Medvedkin, Vladimir wrote:
> > > Hi Mattias,
> > >
> > > -----Original Message-----
> > > From: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> > > Sent: Wednesday, April 15, 2020 7:52 PM
> > > To: Medvedkin, Vladimir <vladimir.medvedkin@intel.com>; dev@dpdk.org
> > > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Wang, Yipeng1
> > > <yipeng1.wang@intel.com>; Gobriel, Sameh <sameh.gobriel@intel.com>;
> > > Richardson, Bruce <bruce.richardson@intel.com>
> > > Subject: Re: [dpdk-dev] [PATCH v3 0/4] add new k32v64 hash table
> > >
> > > On 2020-04-15 20:17, Vladimir Medvedkin wrote:
> > >> Currently DPDK has a special implementation of a hash table for
> > >> 4 byte keys which is called FBK hash. Unfortunately its main drawback
> > >> is that it only supports 2 byte values.
> > >> The new implementation called K32V64 hash supports 4 byte keys and 8
> > >> byte associated values, which is enough to store a pointer.
> > >>
> > >> It would also be nice to get feedback on whether to leave the old FBK
> > >> and new k32v64 implementations or deprecate the old one?
> > >
> > > Do you think it would be feasible to support custom-sized values and remain
> > efficient, in a similar manner to how rte_ring_elem.h does things?
> > >
> > > I'm afraid it is not feasible. For performance reasons, keys and
> > corresponding values reside in a single cache line, so there is no extra
> > memory for bigger values, such as 16B.
> >
> >
> > Well, if you have a smaller value type (or key type) you would fit into
> > something less-than-a-cache line, and thus reduce your memory working set
> > further.
> >
> >
> > >> v3:
> > >> - added bulk lookup
> > >> - avx512 key comparison is removed from .h
> > >>
> > >> v2:
> > >> - renamed from rte_dwk to rte_k32v64 as was suggested
> > >> - reworked lookup function, added inlined subroutines
> > >> - added avx512 key comparison routine
> > >> - added documentation
> > >> - added statistic counters for total entries and extended
> > >> entries (linked list)
> > >>
> > >> Vladimir Medvedkin (4):
> > >> hash: add k32v64 hash library
> > >> hash: add documentation for k32v64 hash library
> > >> test: add k32v64 hash autotests
> > >> test: add k32v64 perf tests
> > >>
> > >> app/test/Makefile | 1 +
> > >> app/test/autotest_data.py | 12 ++
> > >> app/test/meson.build | 3 +
> > >> app/test/test_hash_perf.c | 130 ++++++++++++
> > >> app/test/test_k32v64_hash.c | 229 ++++++++++++++++++++++
> > >> doc/api/doxy-api-index.md | 1 +
> > >> doc/guides/prog_guide/index.rst | 1 +
> > >> doc/guides/prog_guide/k32v64_hash_lib.rst | 66 +++++++
> > >> lib/Makefile | 2 +-
> > >> lib/librte_hash/Makefile | 13 +-
> > >> lib/librte_hash/k32v64_hash_avx512vl.c | 56 ++++++
> > >> lib/librte_hash/meson.build | 17 +-
> > >> lib/librte_hash/rte_hash_version.map | 6 +-
> > >> lib/librte_hash/rte_k32v64_hash.c | 315
> > ++++++++++++++++++++++++++++++
> > >> lib/librte_hash/rte_k32v64_hash.h | 211 ++++++++++++++++++++
> > >> 15 files changed, 1058 insertions(+), 5 deletions(-)
> > >> create mode 100644 app/test/test_k32v64_hash.c
> > >> create mode 100644 doc/guides/prog_guide/k32v64_hash_lib.rst
> > >> create mode 100644 lib/librte_hash/k32v64_hash_avx512vl.c
> > >> create mode 100644 lib/librte_hash/rte_k32v64_hash.c
> > >> create mode 100644 lib/librte_hash/rte_k32v64_hash.h
> > >>
> [Wang, Yipeng]
> Hi, Vladimir,
> Thanks for responding with the use cases earlier.
> I discussed this with Sameh offline; here are some comments.
>
> 1. Since the proposed hash table also has some similarities to the rte_table library used by the packet framework,
> have you tried it yet? Although it is mainly for the packet framework, I believe you can use it independently as well.
> It has implementations for special key/value sizes.
> I added Cristian for his comment.
>
> 2. We tend to agree with Mattias that it would be better if we had a more generic API name, so that with the same
> API we can do multiple key/value size implementations.
> This is to avoid adding new APIs in the future to again handle different key/value
> use cases. For example, we call it rte_kv_hash, and through the parameter struct we pass in the key-value size pair
> we want to use.
> Implementation-wise, we may only provide implementations for certain popular use cases (like the one you provided).
> For other general use cases, people should go with the more flexible and generic cuckoo hash.
> Then we should also merge the FBK under the new API.
From my perspective, Vladimir's work is not an attempt to introduce a new API -
but to fix (extend) the existing FBK one.
Right now there is a contradictory situation:
on one side, hash tables with 4B keys are quite common; on the other side,
because of its limitations (2B values, no mechanism to resolve collisions), fbk_hash
is hardly usable in the majority of real-world scenarios.
About making it more generic - we already have one generic rte_hash API,
why should we introduce a new one?
My take is we should fix FBK hash and make it usable.
If that is not an option for some reason, then I think we
should deprecate and remove FBK hash altogether,
and concentrate on making the generic one work faster for particular scenarios.
Let's say we can try to move what Vladimir did under the rte_hash API umbrella
for a special config (4B keys).
Konstantin
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [dpdk-dev] [PATCH v3 1/4] hash: add k32v64 hash library
2020-04-15 18:17 ` [dpdk-dev] [PATCH v3 1/4] hash: add k32v64 hash library Vladimir Medvedkin
2020-04-15 18:59 ` Mattias Rönnblom
2020-04-23 13:31 ` Ananyev, Konstantin
@ 2020-04-29 21:29 ` Honnappa Nagarahalli
2020-05-08 20:38 ` Medvedkin, Vladimir
2 siblings, 1 reply; 56+ messages in thread
From: Honnappa Nagarahalli @ 2020-04-29 21:29 UTC (permalink / raw)
To: Vladimir Medvedkin, dev
Cc: konstantin.ananyev, yipeng1.wang, sameh.gobriel,
bruce.richardson, nd, Honnappa Nagarahalli, nd
Hi Vladimir,
I am not sure which way the decision will go, but a few comments inline. Please use C11 built-ins for synchronization.
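As a hedged illustration of what that change looks like (not prescribing the final code), a pair like

	rte_atomic32_inc(&table->t[bucket].cnt);
	...
	rte_atomic32_inc(&table->t[bucket].cnt);

would become

	__atomic_fetch_add(&table->t[bucket].cnt, 1, __ATOMIC_ACQUIRE);
	...
	__atomic_fetch_add(&table->t[bucket].cnt, 1, __ATOMIC_RELEASE);

with the acquire/release pair bracketing the bucket update; this is the form the v4 revision of the patch adopts.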
> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Vladimir Medvedkin
> Sent: Wednesday, April 15, 2020 1:17 PM
> To: dev@dpdk.org
> Cc: konstantin.ananyev@intel.com; yipeng1.wang@intel.com;
> sameh.gobriel@intel.com; bruce.richardson@intel.com
> Subject: [dpdk-dev] [PATCH v3 1/4] hash: add k32v64 hash library
>
> K32V64 hash is a hash table that supports 32 bit keys and 64 bit values.
> This table is hash function agnostic, so the user must provide a precalculated
> hash signature for add/delete/lookup operations.
>
> Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
> ---
> lib/Makefile | 2 +-
> lib/librte_hash/Makefile | 13 +-
> lib/librte_hash/k32v64_hash_avx512vl.c | 56 ++++++
> lib/librte_hash/meson.build | 17 +-
> lib/librte_hash/rte_hash_version.map | 6 +-
> lib/librte_hash/rte_k32v64_hash.c | 315 +++++++++++++++++++++++++++++++++
> lib/librte_hash/rte_k32v64_hash.h | 211 ++++++++++++++++++++
> 7 files changed, 615 insertions(+), 5 deletions(-)
> create mode 100644 lib/librte_hash/k32v64_hash_avx512vl.c
> create mode 100644 lib/librte_hash/rte_k32v64_hash.c
> create mode 100644 lib/librte_hash/rte_k32v64_hash.h
>
> diff --git a/lib/Makefile b/lib/Makefile
> index 46b91ae..a8c02e4 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -48,7 +48,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
> DEPDIRS-librte_vhost := librte_eal librte_mempool librte_mbuf librte_ethdev \
> 	librte_net librte_hash librte_cryptodev
> DIRS-$(CONFIG_RTE_LIBRTE_HASH) += librte_hash
> -DEPDIRS-librte_hash := librte_eal librte_ring
> +DEPDIRS-librte_hash := librte_eal librte_ring librte_mempool
> DIRS-$(CONFIG_RTE_LIBRTE_EFD) += librte_efd
> DEPDIRS-librte_efd := librte_eal librte_ring librte_hash
> DIRS-$(CONFIG_RTE_LIBRTE_RIB) += librte_rib
> diff --git a/lib/librte_hash/Makefile b/lib/librte_hash/Makefile
> index ec9f864..023689d 100644
> --- a/lib/librte_hash/Makefile
> +++ b/lib/librte_hash/Makefile
> @@ -8,13 +8,14 @@ LIB = librte_hash.a
>
> CFLAGS += -O3
> CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
> -LDLIBS += -lrte_eal -lrte_ring
> +LDLIBS += -lrte_eal -lrte_ring -lrte_mempool
>
> EXPORT_MAP := rte_hash_version.map
>
> # all source are stored in SRCS-y
> SRCS-$(CONFIG_RTE_LIBRTE_HASH) := rte_cuckoo_hash.c
> SRCS-$(CONFIG_RTE_LIBRTE_HASH) += rte_fbk_hash.c
> +SRCS-$(CONFIG_RTE_LIBRTE_HASH) += rte_k32v64_hash.c
>
> # install this header file
> SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include := rte_hash.h
> @@ -27,5 +28,15 @@ endif
> SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_jhash.h
> SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_thash.h
> SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_fbk_hash.h
> +SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_k32v64_hash.h
> +
> +CC_AVX512VL_SUPPORT=$(shell $(CC) -mavx512vl -dM -E - </dev/null 2>&1 | \
> +grep -q __AVX512VL__ && echo 1)
> +
> +ifeq ($(CC_AVX512VL_SUPPORT), 1)
> + SRCS-$(CONFIG_RTE_LIBRTE_HASH) += k32v64_hash_avx512vl.c
> + CFLAGS_k32v64_hash_avx512vl.o += -mavx512vl
> + CFLAGS_rte_k32v64_hash.o += -DCC_AVX512VL_SUPPORT
> +endif
>
> include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/lib/librte_hash/k32v64_hash_avx512vl.c b/lib/librte_hash/k32v64_hash_avx512vl.c
> new file mode 100644
> index 0000000..7c70dd2
> --- /dev/null
> +++ b/lib/librte_hash/k32v64_hash_avx512vl.c
> @@ -0,0 +1,56 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Intel Corporation
> + */
> +
> +#include <rte_k32v64_hash.h>
> +
> +int
> +k32v64_hash_bulk_lookup_avx512vl(struct rte_k32v64_hash_table *table,
> + uint32_t *keys, uint32_t *hashes, uint64_t *values, unsigned int n);
> +
> +static inline int
> +k32v64_cmp_keys_avx512vl(struct rte_k32v64_hash_bucket *bucket, uint32_t key,
> + uint64_t *val)
> +{
> + __m128i keys, srch_key;
> + __mmask8 msk;
> +
> + keys = _mm_load_si128((void *)bucket);
> + srch_key = _mm_set1_epi32(key);
> +
> + msk = _mm_mask_cmpeq_epi32_mask(bucket->key_mask, keys, srch_key);
> + if (msk) {
> + *val = bucket->val[__builtin_ctz(msk)];
> + return 1;
> + }
> +
> + return 0;
> +}
> +
> +static inline int
> +k32v64_hash_lookup_avx512vl(struct rte_k32v64_hash_table *table, uint32_t key,
> + uint32_t hash, uint64_t *value)
> +{
> + return __k32v64_hash_lookup(table, key, hash, value,
> + k32v64_cmp_keys_avx512vl);
> +}
> +
> +int
> +k32v64_hash_bulk_lookup_avx512vl(struct rte_k32v64_hash_table *table,
> + uint32_t *keys, uint32_t *hashes, uint64_t *values, unsigned int n) {
> + int ret, cnt = 0;
> + unsigned int i;
> +
> + if (unlikely((table == NULL) || (keys == NULL) || (hashes == NULL) ||
> + (values == NULL)))
> + return -EINVAL;
> +
> + for (i = 0; i < n; i++) {
> + ret = k32v64_hash_lookup_avx512vl(table, keys[i], hashes[i],
> + &values[i]);
> + if (ret == 0)
> + cnt++;
> + }
> + return cnt;
> +}
> diff --git a/lib/librte_hash/meson.build b/lib/librte_hash/meson.build
> index 6ab46ae..8a37cc4 100644
> --- a/lib/librte_hash/meson.build
> +++ b/lib/librte_hash/meson.build
> @@ -3,10 +3,23 @@
>
> headers = files('rte_crc_arm64.h',
> 'rte_fbk_hash.h',
> + 'rte_k32v64_hash.h',
> 'rte_hash_crc.h',
> 'rte_hash.h',
> 'rte_jhash.h',
> 'rte_thash.h')
>
> -sources = files('rte_cuckoo_hash.c', 'rte_fbk_hash.c')
> -deps += ['ring']
> +sources = files('rte_cuckoo_hash.c', 'rte_fbk_hash.c', 'rte_k32v64_hash.c')
> +deps += ['ring', 'mempool']
> +
> +if dpdk_conf.has('RTE_ARCH_X86')
> + if cc.has_argument('-mavx512vl')
> + avx512_tmplib = static_library('avx512_tmp',
> + 'k32v64_hash_avx512vl.c',
> + dependencies: static_rte_mempool,
> + c_args: cflags + ['-mavx512vl'])
> + objs += avx512_tmplib.extract_objects('k32v64_hash_avx512vl.c')
> + cflags += '-DCC_AVX512VL_SUPPORT'
> +
> + endif
> +endif
> diff --git a/lib/librte_hash/rte_hash_version.map b/lib/librte_hash/rte_hash_version.map
> index a8fbbc3..9a4f2f6 100644
> --- a/lib/librte_hash/rte_hash_version.map
> +++ b/lib/librte_hash/rte_hash_version.map
> @@ -34,5 +34,9 @@ EXPERIMENTAL {
>
> rte_hash_free_key_with_position;
> rte_hash_max_key_id;
> -
> + rte_k32v64_hash_create;
> + rte_k32v64_hash_find_existing;
> + rte_k32v64_hash_free;
> + rte_k32v64_hash_add;
> + rte_k32v64_hash_delete;
> };
> diff --git a/lib/librte_hash/rte_k32v64_hash.c b/lib/librte_hash/rte_k32v64_hash.c
> new file mode 100644
> index 0000000..7ed94b4
> --- /dev/null
> +++ b/lib/librte_hash/rte_k32v64_hash.c
> @@ -0,0 +1,315 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Intel Corporation
> + */
> +
> +#include <string.h>
> +
> +#include <rte_eal_memconfig.h>
> +#include <rte_errno.h>
> +#include <rte_malloc.h>
> +#include <rte_memory.h>
> +#include <rte_tailq.h>
> +
> +#include <rte_k32v64_hash.h>
> +
> +TAILQ_HEAD(rte_k32v64_hash_list, rte_tailq_entry);
> +
> +static struct rte_tailq_elem rte_k32v64_hash_tailq = {
> + .name = "RTE_K32V64_HASH",
> +};
> +
> +EAL_REGISTER_TAILQ(rte_k32v64_hash_tailq);
> +
> +#define VALID_KEY_MSK ((1 << RTE_K32V64_KEYS_PER_BUCKET) - 1)
> +
> +#ifdef CC_AVX512VL_SUPPORT
> +int
> +k32v64_hash_bulk_lookup_avx512vl(struct rte_k32v64_hash_table *table,
> + uint32_t *keys, uint32_t *hashes, uint64_t *values, unsigned int n);
> +#endif
> +
> +static int
> +k32v64_hash_bulk_lookup(struct rte_k32v64_hash_table *table, uint32_t *keys,
> + uint32_t *hashes, uint64_t *values, unsigned int n)
> +{
> + int ret, cnt = 0;
> + unsigned int i;
> +
> + if (unlikely((table == NULL) || (keys == NULL) || (hashes == NULL) ||
> + (values == NULL)))
> + return -EINVAL;
> +
> + for (i = 0; i < n; i++) {
> + ret = rte_k32v64_hash_lookup(table, keys[i], hashes[i],
> + &values[i]);
> + if (ret == 0)
> + cnt++;
> + }
> + return cnt;
> +}
> +
> +static rte_k32v64_hash_bulk_lookup_t
> +get_lookup_bulk_fn(void)
> +{
> +#ifdef CC_AVX512VL_SUPPORT
> + if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
> + return k32v64_hash_bulk_lookup_avx512vl;
> +#endif
> + return k32v64_hash_bulk_lookup;
> +}
> +
> +int
> +rte_k32v64_hash_add(struct rte_k32v64_hash_table *table, uint32_t key,
> + uint32_t hash, uint64_t value)
> +{
> + uint32_t bucket;
> + int i, idx, ret;
> + uint8_t msk;
> + struct rte_k32v64_ext_ent *tmp, *ent, *prev = NULL;
> +
> + if (table == NULL)
> + return -EINVAL;
> +
> + bucket = hash & table->bucket_msk;
> + /* Search key in table. Update value if exists */
> + for (i = 0; i < RTE_K32V64_KEYS_PER_BUCKET; i++) {
> + if ((key == table->t[bucket].key[i]) &&
> + (table->t[bucket].key_mask & (1 << i))) {
> + table->t[bucket].val[i] = value;
The old value of val[i] might still be in use by a reader. It needs to be returned to the caller so that it can be put on an RCU defer queue and freed later.
You might also want to use an atomic store to ensure the store is not torn.
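A hedged sketch of such a non-torn store (relaxed ordering shown for brevity; the exact ordering needed depends on the surrounding protocol):

	__atomic_store_n(&table->t[bucket].val[i], value, __ATOMIC_RELAXED);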
> + return 0;
> + }
> + }
> +
> + if (!SLIST_EMPTY(&table->t[bucket].head)) {
> + SLIST_FOREACH(ent, &table->t[bucket].head, next) {
> + if (ent->key == key) {
> + ent->val = value;
Same here, need to return the old value.
> + return 0;
> + }
> + }
> + }
> +
> + msk = ~table->t[bucket].key_mask & VALID_KEY_MSK;
> + if (msk) {
> + idx = __builtin_ctz(msk);
> + table->t[bucket].key[idx] = key;
> + table->t[bucket].val[idx] = value;
> + rte_smp_wmb();
> + table->t[bucket].key_mask |= 1 << idx;
The barrier and the store can be replaced with a store-release in C11.
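A hedged illustration of that replacement (this is, in fact, what the v4 revision of the patch ends up doing):

	table->t[bucket].key[idx] = key;
	table->t[bucket].val[idx] = value;
	__atomic_or_fetch(&table->t[bucket].key_mask, 1 << idx,
		__ATOMIC_RELEASE);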
> + table->nb_ent++;
> + return 0;
> + }
> +
> + ret = rte_mempool_get(table->ext_ent_pool, (void **)&ent);
> + if (ret < 0)
> + return ret;
> +
> + SLIST_NEXT(ent, next) = NULL;
> + ent->key = key;
> + ent->val = value;
> + rte_smp_wmb();
> + SLIST_FOREACH(tmp, &table->t[bucket].head, next)
> + prev = tmp;
> +
> + if (prev == NULL)
> + SLIST_INSERT_HEAD(&table->t[bucket].head, ent, next);
> + else
> + SLIST_INSERT_AFTER(prev, ent, next);
Need a C11-atomics-aware (release-ordered) SLIST insert here.
> +
> + table->nb_ent++;
> + table->nb_ext_ent++;
> + return 0;
> +}
> +
> +int
> +rte_k32v64_hash_delete(struct rte_k32v64_hash_table *table, uint32_t key,
> + uint32_t hash)
This should return the value corresponding to the deleted key
> +{
> + uint32_t bucket;
> + int i;
> + struct rte_k32v64_ext_ent *ent;
> +
> + if (table == NULL)
> + return -EINVAL;
> +
> + bucket = hash & table->bucket_msk;
> +
> + for (i = 0; i < RTE_K32V64_KEYS_PER_BUCKET; i++) {
> + if ((key == table->t[bucket].key[i]) &&
> + (table->t[bucket].key_mask & (1 << i))) {
> + ent = SLIST_FIRST(&table->t[bucket].head);
> + if (ent) {
> + rte_atomic32_inc(&table->t[bucket].cnt);
> + table->t[bucket].key[i] = ent->key;
In this case, both key and value are changing and it is not atomic. There is a possibility that the lookup function will receive an incorrect value of 'val[i]'. Suggest following the method described below.
> + table->t[bucket].val[i] = ent->val;
> + SLIST_REMOVE_HEAD(&table->t[bucket].head, next);
> + rte_atomic32_inc(&table->t[bucket].cnt);
> + table->nb_ext_ent--;
> + } else
Suggest changing this into a 2-step process (a sketch follows below):
1) Delete the entry from the fixed bucket (update key_mask)
2) Move the head of the extended bucket to the fixed bucket
2a) Insert the key/value from the head into the deleted index and update the key_mask (this ensures the reader still has the entry available while it is being removed from the extended bucket)
2b) Increment the counter, indicating that the extended bucket is changing
2c) Remove the head
2d) Increment the counter, indicating that the extended bucket change is done
The above procedure removes the spinning (resulting from step 2b) in the lookup function (see the comment there); readers will not be blocked if the writer is scheduled out.
Similar logic is implemented in rte_hash (cuckoo hash); suggest taking a look there for the required memory orderings.
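A rough sketch of the suggested delete sequence (illustrative only; names follow the patch, error handling and entry bookkeeping omitted):

	/* step 1: hide the fixed-bucket slot from readers */
	__atomic_and_fetch(&table->t[bucket].key_mask, ~(1 << i),
		__ATOMIC_RELEASE);
	ent = SLIST_FIRST(&table->t[bucket].head);
	if (ent) {
		/* step 2a: republish the head entry in the freed slot */
		table->t[bucket].key[i] = ent->key;
		table->t[bucket].val[i] = ent->val;
		__atomic_or_fetch(&table->t[bucket].key_mask, 1 << i,
			__ATOMIC_RELEASE);
		/* steps 2b-2d: bracket the list change with the counter */
		__atomic_fetch_add(&table->t[bucket].cnt, 1, __ATOMIC_ACQUIRE);
		SLIST_REMOVE_HEAD(&table->t[bucket].head, next);
		__atomic_fetch_add(&table->t[bucket].cnt, 1, __ATOMIC_RELEASE);
	}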
> + table->t[bucket].key_mask &= ~(1 << i);
> + if (ent)
> + rte_mempool_put(table->ext_ent_pool, ent);
The entry cannot be put back on the free list immediately, as readers might still be using it.
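For example, a hypothetical sketch using the librte_rcu QSBR API (rte_rcu_qsbr.h), assuming a struct rte_rcu_qsbr *qsv shared with the reader threads - the writer would wait for readers to quiesce before recycling the entry:

	rte_rcu_qsbr_synchronize(qsv, RTE_QSBR_THRID_INVALID);
	rte_mempool_put(table->ext_ent_pool, ent);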
> + table->nb_ent--;
> + return 0;
> + }
> + }
> +
> + SLIST_FOREACH(ent, &table->t[bucket].head, next)
> + if (ent->key == key)
> + break;
> +
> + if (ent == NULL)
> + return -ENOENT;
> +
> + rte_atomic32_inc(&table->t[bucket].cnt);
> + SLIST_REMOVE(&table->t[bucket].head, ent, rte_k32v64_ext_ent, next);
> + rte_atomic32_inc(&table->t[bucket].cnt);
> + rte_mempool_put(table->ext_ent_pool, ent);
The entry might still be in use by readers.
> +
> + table->nb_ext_ent--;
> + table->nb_ent--;
> +
> + return 0;
> +}
> +
> +struct rte_k32v64_hash_table *
> +rte_k32v64_hash_find_existing(const char *name)
> +{
> + struct rte_k32v64_hash_table *h = NULL;
> + struct rte_tailq_entry *te;
> + struct rte_k32v64_hash_list *k32v64_hash_list;
> +
> + k32v64_hash_list = RTE_TAILQ_CAST(rte_k32v64_hash_tailq.head,
> + rte_k32v64_hash_list);
> +
> + rte_mcfg_tailq_read_lock();
> + TAILQ_FOREACH(te, k32v64_hash_list, next) {
> + h = (struct rte_k32v64_hash_table *) te->data;
> + if (strncmp(name, h->name, RTE_K32V64_HASH_NAMESIZE) == 0)
> + break;
> + }
> + rte_mcfg_tailq_read_unlock();
> + if (te == NULL) {
> + rte_errno = ENOENT;
> + return NULL;
> + }
> + return h;
> +}
> +
> +struct rte_k32v64_hash_table *
> +rte_k32v64_hash_create(const struct rte_k32v64_hash_params *params)
> +{
> + char hash_name[RTE_K32V64_HASH_NAMESIZE];
> + struct rte_k32v64_hash_table *ht = NULL;
> + struct rte_tailq_entry *te;
> + struct rte_k32v64_hash_list *k32v64_hash_list;
> + uint32_t mem_size, nb_buckets, max_ent;
> + int ret;
> + struct rte_mempool *mp;
> +
> + if ((params == NULL) || (params->name == NULL) ||
> + (params->entries == 0)) {
> + rte_errno = EINVAL;
> + return NULL;
> + }
> +
> + k32v64_hash_list = RTE_TAILQ_CAST(rte_k32v64_hash_tailq.head,
> + rte_k32v64_hash_list);
> +
> + ret = snprintf(hash_name, sizeof(hash_name), "K32V64_%s", params->name);
> + if (ret < 0 || ret >= RTE_K32V64_HASH_NAMESIZE) {
> + rte_errno = ENAMETOOLONG;
> + return NULL;
> + }
> +
> + max_ent = rte_align32pow2(params->entries);
> + nb_buckets = max_ent / RTE_K32V64_KEYS_PER_BUCKET;
> + mem_size = sizeof(struct rte_k32v64_hash_table) +
> + sizeof(struct rte_k32v64_hash_bucket) * nb_buckets;
> +
> + mp = rte_mempool_create(hash_name, max_ent,
> + sizeof(struct rte_k32v64_ext_ent), 0, 0, NULL, NULL, NULL, NULL,
> + params->socket_id, 0);
> +
> + if (mp == NULL)
> + return NULL;
> +
> + rte_mcfg_tailq_write_lock();
> + TAILQ_FOREACH(te, k32v64_hash_list, next) {
> + ht = (struct rte_k32v64_hash_table *) te->data;
> + if (strncmp(params->name, ht->name,
> + RTE_K32V64_HASH_NAMESIZE) == 0)
> + break;
> + }
> + ht = NULL;
> + if (te != NULL) {
> + rte_errno = EEXIST;
> + rte_mempool_free(mp);
> + goto exit;
> + }
> +
> + te = rte_zmalloc("K32V64_HASH_TAILQ_ENTRY", sizeof(*te), 0);
> + if (te == NULL) {
> + RTE_LOG(ERR, HASH, "Failed to allocate tailq entry\n");
> + rte_mempool_free(mp);
> + goto exit;
> + }
> +
> + ht = rte_zmalloc_socket(hash_name, mem_size,
> + RTE_CACHE_LINE_SIZE, params->socket_id);
> + if (ht == NULL) {
> + RTE_LOG(ERR, HASH, "Failed to allocate fbk hash table\n");
> + rte_free(te);
> + rte_mempool_free(mp);
> + goto exit;
> + }
> +
> + memcpy(ht->name, hash_name, sizeof(ht->name));
> + ht->max_ent = max_ent;
> + ht->bucket_msk = nb_buckets - 1;
> + ht->ext_ent_pool = mp;
> + ht->lookup = get_lookup_bulk_fn();
> +
> + te->data = (void *)ht;
> + TAILQ_INSERT_TAIL(k32v64_hash_list, te, next);
> +
> +exit:
> + rte_mcfg_tailq_write_unlock();
> +
> + return ht;
> +}
> +
> +void
> +rte_k32v64_hash_free(struct rte_k32v64_hash_table *ht)
> +{
> + struct rte_tailq_entry *te;
> + struct rte_k32v64_hash_list *k32v64_hash_list;
> +
> + if (ht == NULL)
> + return;
> +
> + k32v64_hash_list = RTE_TAILQ_CAST(rte_k32v64_hash_tailq.head,
> + rte_k32v64_hash_list);
> +
> + rte_mcfg_tailq_write_lock();
> +
> + /* find out tailq entry */
> + TAILQ_FOREACH(te, k32v64_hash_list, next) {
> + if (te->data == (void *) ht)
> + break;
> + }
> +
> +
> + if (te == NULL) {
> + rte_mcfg_tailq_write_unlock();
> + return;
> + }
> +
> + TAILQ_REMOVE(k32v64_hash_list, te, next);
> +
> + rte_mcfg_tailq_write_unlock();
> +
> + rte_mempool_free(ht->ext_ent_pool);
> + rte_free(ht);
> + rte_free(te);
> +}
> diff --git a/lib/librte_hash/rte_k32v64_hash.h b/lib/librte_hash/rte_k32v64_hash.h
> new file mode 100644
> index 0000000..b2c52e9
> --- /dev/null
> +++ b/lib/librte_hash/rte_k32v64_hash.h
> @@ -0,0 +1,211 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Intel Corporation
> + */
> +
> +#ifndef _RTE_K32V64_HASH_H_
> +#define _RTE_K32V64_HASH_H_
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include <rte_compat.h>
> +#include <rte_atomic.h>
> +#include <rte_mempool.h>
> +
> +#define RTE_K32V64_HASH_NAMESIZE 32
> +#define RTE_K32V64_KEYS_PER_BUCKET 4
> +#define RTE_K32V64_WRITE_IN_PROGRESS 1
> +
> +struct rte_k32v64_hash_params {
> + const char *name;
> + uint32_t entries;
> + int socket_id;
> +};
> +
> +struct rte_k32v64_ext_ent {
> + SLIST_ENTRY(rte_k32v64_ext_ent) next;
> + uint32_t key;
> + uint64_t val;
> +};
> +
> +struct rte_k32v64_hash_bucket {
> + uint32_t key[RTE_K32V64_KEYS_PER_BUCKET];
> + uint64_t val[RTE_K32V64_KEYS_PER_BUCKET];
> + uint8_t key_mask;
> + rte_atomic32_t cnt;
> + SLIST_HEAD(rte_k32v64_list_head, rte_k32v64_ext_ent) head;
> +} __rte_cache_aligned;
> +
> +struct rte_k32v64_hash_table;
> +
> +typedef int (*rte_k32v64_hash_bulk_lookup_t) (struct
> +rte_k32v64_hash_table *table, uint32_t *keys, uint32_t *hashes,
> + uint64_t *values, unsigned int n);
> +
> +struct rte_k32v64_hash_table {
> + char name[RTE_K32V64_HASH_NAMESIZE]; /**< Name of the
> hash. */
> + uint32_t nb_ent; /**< Number of entities in the table*/
> + uint32_t nb_ext_ent; /**< Number of extended entities */
> + uint32_t max_ent; /**< Maximum number of entities */
> + uint32_t bucket_msk;
> + struct rte_mempool *ext_ent_pool;
> + rte_k32v64_hash_bulk_lookup_t lookup;
> + __extension__ struct rte_k32v64_hash_bucket t[0];
> +};
> +
> +typedef int (*rte_k32v64_cmp_fn_t)
> +(struct rte_k32v64_hash_bucket *bucket, uint32_t key, uint64_t *val);
> +
> +static inline int
> +__k32v64_cmp_keys(struct rte_k32v64_hash_bucket *bucket, uint32_t key,
> + uint64_t *val)
> +{
> + int i;
> +
> + for (i = 0; i < RTE_K32V64_KEYS_PER_BUCKET; i++) {
> + if ((key == bucket->key[i]) &&
> + (bucket->key_mask & (1 << i))) {
> + *val = bucket->val[i];
This load can happen speculatively.
You have to use the guard-variable-and-payload concept here for reader-writer concurrency.
On the writer:
store -> key
store -> val
store_release -> key_mask (or a rte_smp_wmb() before this store); 'key_mask' acts as the guard variable
On the reader:
load_acquire -> key_mask (or a rte_smp_rmb() after this load)
then the if statement, etc.
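A hedged sketch of that pattern applied to this bucket layout (names follow the patch; illustrative only):

Writer:
	bucket->key[i] = key;					/* payload */
	bucket->val[i] = value;					/* payload */
	__atomic_or_fetch(&bucket->key_mask, 1 << i,		/* guard */
		__ATOMIC_RELEASE);

Reader (inside the key-compare loop):
	msk = __atomic_load_n(&bucket->key_mask, __ATOMIC_ACQUIRE);
	if ((msk & (1 << i)) && (key == bucket->key[i])) {
		*val = bucket->val[i];
		return 1;
	}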
> + return 1;
> + }
> + }
> +
> + return 0;
> +}
> +
> +static inline int
> +__k32v64_hash_lookup(struct rte_k32v64_hash_table *table, uint32_t key,
> + uint32_t hash, uint64_t *value, rte_k32v64_cmp_fn_t cmp_f)
> +{
> + uint64_t val = 0;
> + struct rte_k32v64_ext_ent *ent;
> + int32_t cnt;
> + int found = 0;
> + uint32_t bucket = hash & table->bucket_msk;
> +
> + do {
> + do
> + cnt = rte_atomic32_read(&table->t[bucket].cnt);
> + while (unlikely(cnt & RTE_K32V64_WRITE_IN_PROGRESS));
The reader can hang here if the writer increments the counter and gets scheduled out. Using the method suggested above, you can remove this code.
> +
> + found = cmp_f(&table->t[bucket], key, &val);
> + if (unlikely((found == 0) &&
> + (!SLIST_EMPTY(&table->t[bucket].head)))) {
> + SLIST_FOREACH(ent, &table->t[bucket].head, next) {
> + if (ent->key == key) {
> + val = ent->val;
> + found = 1;
> + break;
> + }
> + }
> + }
> +
> + } while (unlikely(cnt != rte_atomic32_read(&table->t[bucket].cnt)));
With the logic mentioned in the delete function, the counter needs to be read at the beginning of the loop. Suggest looking at rte_hash (cuckoo hash) for the memory ordering required for the counter.
> +
> + if (found == 1) {
> + *value = val;
> + return 0;
> + } else
> + return -ENOENT;
> +}
> +
> +static inline int
> +rte_k32v64_hash_lookup(struct rte_k32v64_hash_table *table, uint32_t key,
> + uint32_t hash, uint64_t *value)
> +{
> + return __k32v64_hash_lookup(table, key, hash, value,
> + __k32v64_cmp_keys);
> +}
> +
> +static inline int
> +rte_k32v64_hash_bulk_lookup(struct rte_k32v64_hash_table *table,
> + uint32_t *keys, uint32_t *hashes, uint64_t *values, unsigned int n)
> +{
> + return table->lookup(table, keys, hashes, values, n);
> +}
> +
> +/**
> + * Add a key to an existing hash table with hash value.
> + * This operation is not multi-thread safe
> + * and should only be called from one thread.
> + *
> + * @param ht
> + * Hash table to add the key to.
> + * @param key
> + * Key to add to the hash table.
> + * @param value
> + * Value to associate with key.
> + * @param hash
> + * Hash value associated with key.
> + * @return
> + * 0 if ok, or negative value on error.
> + */
> +__rte_experimental
> +int
> +rte_k32v64_hash_add(struct rte_k32v64_hash_table *table, uint32_t key,
> + uint32_t hash, uint64_t value);
> +
> +/**
> + * Remove a key with a given hash value from an existing hash table.
> + * This operation is not multi-thread
> + * safe and should only be called from one thread.
> + *
> + * @param ht
> + * Hash table to remove the key from.
> + * @param key
> + * Key to remove from the hash table.
> + * @param hash
> + * hash value associated with key.
> + * @return
> + * 0 if ok, or negative value on error.
> + */
> +__rte_experimental
> +int
> +rte_k32v64_hash_delete(struct rte_k32v64_hash_table *table, uint32_t key,
> + uint32_t hash);
> +
> +
> +/**
> + * Performs a lookup for an existing hash table, and returns a pointer to
> + * the table if found.
> + *
> + * @param name
> + * Name of the hash table to find
> + *
> + * @return
> + * pointer to hash table structure or NULL on error with rte_errno
> + * set appropriately.
> + */
> +__rte_experimental
> +struct rte_k32v64_hash_table *
> +rte_k32v64_hash_find_existing(const char *name);
> +
> +/**
> + * Create a new hash table for use with four byte keys.
> + *
> + * @param params
> + * Parameters used in creation of hash table.
> + *
> + * @return
> + * Pointer to hash table structure that is used in future hash table
> + * operations, or NULL on error with rte_errno set appropriately.
> + */
> +__rte_experimental
> +struct rte_k32v64_hash_table *
> +rte_k32v64_hash_create(const struct rte_k32v64_hash_params *params);
> +
> +/**
> + * Free all memory used by a hash table.
> + *
> + * @param table
> + * Hash table to deallocate.
> + */
> +__rte_experimental
> +void
> +rte_k32v64_hash_free(struct rte_k32v64_hash_table *table);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_K32V64_HASH_H_ */
> --
> 2.7.4
^ permalink raw reply [flat|nested] 56+ messages in thread
* [dpdk-dev] [PATCH v4 0/4] add new kv hash table
2020-04-15 18:17 ` [dpdk-dev] [PATCH v3 " Vladimir Medvedkin
2020-04-15 18:51 ` Mattias Rönnblom
2020-04-16 9:39 ` Thomas Monjalon
@ 2020-05-08 19:58 ` Vladimir Medvedkin
2020-06-16 16:37 ` Thomas Monjalon
2021-03-24 21:28 ` Thomas Monjalon
2020-05-08 19:58 ` [dpdk-dev] [PATCH v4 1/4] hash: add kv hash library Vladimir Medvedkin
` (3 subsequent siblings)
6 siblings, 2 replies; 56+ messages in thread
From: Vladimir Medvedkin @ 2020-05-08 19:58 UTC (permalink / raw)
To: dev; +Cc: konstantin.ananyev, yipeng1.wang, sameh.gobriel, bruce.richardson
Currently DPDK has a special implementation of a hash table for
4 byte keys which is called FBK hash. Unfortunately its main drawback
is that it only supports 2 byte values.
The new implementation called KV hash
supports 4 byte keys and 8 byte associated values,
which is enough to store a pointer.
v4:
- internal implementation is hidden under a universal API for any key and value sizes
- add and delete API now return the old value
- added transaction counter modification to _add()
- transaction counter is now modified with C11 atomics
v3:
- added bulk lookup
- avx512 key comparison is removed from .h
v2:
- renamed from rte_dwk to rte_k32v64 as was suggested
- reworked lookup function, added inlined subroutines
- added avx512 key comparison routine
- added documentation
- added statistic counters for total entries and extended entries (linked list)
Vladimir Medvedkin (4):
hash: add kv hash library
hash: add documentation for kv hash library
test: add kv hash autotests
test: add kv perf tests
app/test/Makefile | 1 +
app/test/autotest_data.py | 12 ++
app/test/meson.build | 3 +
app/test/test_hash_perf.c | 111 +++++++++++++
app/test/test_kv_hash.c | 242 ++++++++++++++++++++++++++++
doc/api/doxy-api-index.md | 1 +
doc/guides/prog_guide/index.rst | 1 +
doc/guides/prog_guide/kv_hash_lib.rst | 66 ++++++++
lib/Makefile | 2 +-
lib/librte_hash/Makefile | 14 +-
lib/librte_hash/k32v64_hash.c | 277 +++++++++++++++++++++++++++++++++
lib/librte_hash/k32v64_hash.h | 98 ++++++++++++
lib/librte_hash/k32v64_hash_avx512vl.c | 59 +++++++
lib/librte_hash/meson.build | 17 +-
lib/librte_hash/rte_hash_version.map | 6 +-
lib/librte_hash/rte_kv_hash.c | 184 ++++++++++++++++++++++
lib/librte_hash/rte_kv_hash.h | 169 ++++++++++++++++++++
17 files changed, 1258 insertions(+), 5 deletions(-)
create mode 100644 app/test/test_kv_hash.c
create mode 100644 doc/guides/prog_guide/kv_hash_lib.rst
create mode 100644 lib/librte_hash/k32v64_hash.c
create mode 100644 lib/librte_hash/k32v64_hash.h
create mode 100644 lib/librte_hash/k32v64_hash_avx512vl.c
create mode 100644 lib/librte_hash/rte_kv_hash.c
create mode 100644 lib/librte_hash/rte_kv_hash.h
--
2.7.4
^ permalink raw reply [flat|nested] 56+ messages in thread
* [dpdk-dev] [PATCH v4 1/4] hash: add kv hash library
2020-04-15 18:17 ` [dpdk-dev] [PATCH v3 " Vladimir Medvedkin
` (2 preceding siblings ...)
2020-05-08 19:58 ` [dpdk-dev] [PATCH v4 0/4] add new kv " Vladimir Medvedkin
@ 2020-05-08 19:58 ` Vladimir Medvedkin
2020-06-23 15:44 ` Ananyev, Konstantin
` (2 more replies)
2020-05-08 19:58 ` [dpdk-dev] [PATCH v4 2/4] hash: add documentation for " Vladimir Medvedkin
` (2 subsequent siblings)
6 siblings, 3 replies; 56+ messages in thread
From: Vladimir Medvedkin @ 2020-05-08 19:58 UTC (permalink / raw)
To: dev; +Cc: konstantin.ananyev, yipeng1.wang, sameh.gobriel, bruce.richardson
KV hash is a specially optimized key-value storage for fixed
key and value sizes. At the moment it supports 32-bit keys
and 64-bit values. This table is hash function agnostic, so
the user must provide a precalculated hash signature for
add/delete/lookup operations.
Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
---
lib/Makefile | 2 +-
lib/librte_hash/Makefile | 14 +-
lib/librte_hash/k32v64_hash.c | 277 +++++++++++++++++++++++++++++++++
lib/librte_hash/k32v64_hash.h | 98 ++++++++++++
lib/librte_hash/k32v64_hash_avx512vl.c | 59 +++++++
lib/librte_hash/meson.build | 17 +-
lib/librte_hash/rte_hash_version.map | 6 +-
lib/librte_hash/rte_kv_hash.c | 184 ++++++++++++++++++++++
lib/librte_hash/rte_kv_hash.h | 169 ++++++++++++++++++++
9 files changed, 821 insertions(+), 5 deletions(-)
create mode 100644 lib/librte_hash/k32v64_hash.c
create mode 100644 lib/librte_hash/k32v64_hash.h
create mode 100644 lib/librte_hash/k32v64_hash_avx512vl.c
create mode 100644 lib/librte_hash/rte_kv_hash.c
create mode 100644 lib/librte_hash/rte_kv_hash.h
diff --git a/lib/Makefile b/lib/Makefile
index 9d24609..42769e9 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -48,7 +48,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
DEPDIRS-librte_vhost := librte_eal librte_mempool librte_mbuf librte_ethdev \
librte_net librte_hash librte_cryptodev
DIRS-$(CONFIG_RTE_LIBRTE_HASH) += librte_hash
-DEPDIRS-librte_hash := librte_eal librte_ring
+DEPDIRS-librte_hash := librte_eal librte_ring librte_mempool
DIRS-$(CONFIG_RTE_LIBRTE_EFD) += librte_efd
DEPDIRS-librte_efd := librte_eal librte_ring librte_hash
DIRS-$(CONFIG_RTE_LIBRTE_RIB) += librte_rib
diff --git a/lib/librte_hash/Makefile b/lib/librte_hash/Makefile
index ec9f864..a0cdee9 100644
--- a/lib/librte_hash/Makefile
+++ b/lib/librte_hash/Makefile
@@ -8,13 +8,15 @@ LIB = librte_hash.a
CFLAGS += -O3
CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
-LDLIBS += -lrte_eal -lrte_ring
+LDLIBS += -lrte_eal -lrte_ring -lrte_mempool
EXPORT_MAP := rte_hash_version.map
# all source are stored in SRCS-y
SRCS-$(CONFIG_RTE_LIBRTE_HASH) := rte_cuckoo_hash.c
SRCS-$(CONFIG_RTE_LIBRTE_HASH) += rte_fbk_hash.c
+SRCS-$(CONFIG_RTE_LIBRTE_HASH) += rte_kv_hash.c
+SRCS-$(CONFIG_RTE_LIBRTE_HASH) += k32v64_hash.c
# install this header file
SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include := rte_hash.h
@@ -27,5 +29,15 @@ endif
SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_jhash.h
SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_thash.h
SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_fbk_hash.h
+SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_kv_hash.h
+
+CC_AVX512VL_SUPPORT=$(shell $(CC) -mavx512vl -dM -E - </dev/null 2>&1 | \
+grep -q __AVX512VL__ && echo 1)
+
+ifeq ($(CC_AVX512VL_SUPPORT), 1)
+ SRCS-$(CONFIG_RTE_LIBRTE_HASH) += k32v64_hash_avx512vl.c
+ CFLAGS_k32v64_hash_avx512vl.o += -mavx512vl
+ CFLAGS_k32v64_hash.o += -DCC_AVX512VL_SUPPORT
+endif
include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_hash/k32v64_hash.c b/lib/librte_hash/k32v64_hash.c
new file mode 100644
index 0000000..24cd63a
--- /dev/null
+++ b/lib/librte_hash/k32v64_hash.c
@@ -0,0 +1,277 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include <string.h>
+
+#include <rte_errno.h>
+#include <rte_malloc.h>
+#include <rte_memory.h>
+
+#include "k32v64_hash.h"
+
+static inline int
+k32v64_hash_lookup(struct k32v64_hash_table *table, uint32_t key,
+ uint32_t hash, uint64_t *value)
+{
+ return __k32v64_hash_lookup(table, key, hash, value, __kv_cmp_keys);
+}
+
+static int
+k32v64_hash_bulk_lookup(struct rte_kv_hash_table *ht, void *keys_p,
+ uint32_t *hashes, void *values_p, unsigned int n)
+{
+ struct k32v64_hash_table *table = (struct k32v64_hash_table *)ht;
+ uint32_t *keys = keys_p;
+ uint64_t *values = values_p;
+ int ret, cnt = 0;
+ unsigned int i;
+
+ if (unlikely((table == NULL) || (keys == NULL) || (hashes == NULL) ||
+ (values == NULL)))
+ return -EINVAL;
+
+ for (i = 0; i < n; i++) {
+ ret = k32v64_hash_lookup(table, keys[i], hashes[i],
+ &values[i]);
+ if (ret == 0)
+ cnt++;
+ }
+ return cnt;
+}
+
+#ifdef CC_AVX512VL_SUPPORT
+int
+k32v64_hash_bulk_lookup_avx512vl(struct rte_kv_hash_table *ht,
+ void *keys_p, uint32_t *hashes, void *values_p, unsigned int n);
+#endif
+
+static rte_kv_hash_bulk_lookup_t
+get_lookup_bulk_fn(void)
+{
+#ifdef CC_AVX512VL_SUPPORT
+ if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512VL))
+ return k32v64_hash_bulk_lookup_avx512vl;
+#endif
+ return k32v64_hash_bulk_lookup;
+}
+
+static int
+k32v64_hash_add(struct k32v64_hash_table *table, uint32_t key,
+ uint32_t hash, uint64_t value, uint64_t *old_value, int *found)
+{
+ uint32_t bucket;
+ int i, idx, ret;
+ uint8_t msk;
+ struct k32v64_ext_ent *tmp, *ent, *prev = NULL;
+
+ if (table == NULL)
+ return -EINVAL;
+
+ bucket = hash & table->bucket_msk;
+ /* Search key in table. Update value if exists */
+ for (i = 0; i < K32V64_KEYS_PER_BUCKET; i++) {
+ if ((key == table->t[bucket].key[i]) &&
+ (table->t[bucket].key_mask & (1 << i))) {
+ *old_value = table->t[bucket].val[i];
+ *found = 1;
+ __atomic_fetch_add(&table->t[bucket].cnt, 1,
+ __ATOMIC_ACQUIRE);
+ table->t[bucket].val[i] = value;
+ __atomic_fetch_add(&table->t[bucket].cnt, 1,
+ __ATOMIC_RELEASE);
+ return 0;
+ }
+ }
+
+ if (!SLIST_EMPTY(&table->t[bucket].head)) {
+ SLIST_FOREACH(ent, &table->t[bucket].head, next) {
+ if (ent->key == key) {
+ *old_value = ent->val;
+ *found = 1;
+ __atomic_fetch_add(&table->t[bucket].cnt, 1,
+ __ATOMIC_ACQUIRE);
+ ent->val = value;
+ __atomic_fetch_add(&table->t[bucket].cnt, 1,
+ __ATOMIC_RELEASE);
+ return 0;
+ }
+ }
+ }
+
+ msk = ~table->t[bucket].key_mask & VALID_KEY_MSK;
+ if (msk) {
+ idx = __builtin_ctz(msk);
+ table->t[bucket].key[idx] = key;
+ table->t[bucket].val[idx] = value;
+ __atomic_or_fetch(&table->t[bucket].key_mask, 1 << idx,
+ __ATOMIC_RELEASE);
+ table->nb_ent++;
+ *found = 0;
+ return 0;
+ }
+
+ ret = rte_mempool_get(table->ext_ent_pool, (void **)&ent);
+ if (ret < 0)
+ return ret;
+
+ SLIST_NEXT(ent, next) = NULL;
+ ent->key = key;
+ ent->val = value;
+ rte_smp_wmb();
+ SLIST_FOREACH(tmp, &table->t[bucket].head, next)
+ prev = tmp;
+
+ if (prev == NULL)
+ SLIST_INSERT_HEAD(&table->t[bucket].head, ent, next);
+ else
+ SLIST_INSERT_AFTER(prev, ent, next);
+
+ table->nb_ent++;
+ table->nb_ext_ent++;
+ *found = 0;
+ return 0;
+}
+
+static int
+k32v64_hash_delete(struct k32v64_hash_table *table, uint32_t key,
+ uint32_t hash, uint64_t *old_value)
+{
+ uint32_t bucket;
+ int i;
+ struct k32v64_ext_ent *ent;
+
+ if (table == NULL)
+ return -EINVAL;
+
+ bucket = hash & table->bucket_msk;
+
+ for (i = 0; i < K32V64_KEYS_PER_BUCKET; i++) {
+ if ((key == table->t[bucket].key[i]) &&
+ (table->t[bucket].key_mask & (1 << i))) {
+ *old_value = table->t[bucket].val[i];
+ ent = SLIST_FIRST(&table->t[bucket].head);
+ if (ent) {
+ __atomic_fetch_add(&table->t[bucket].cnt, 1,
+ __ATOMIC_ACQUIRE);
+ table->t[bucket].key[i] = ent->key;
+ table->t[bucket].val[i] = ent->val;
+ SLIST_REMOVE_HEAD(&table->t[bucket].head, next);
+ __atomic_fetch_add(&table->t[bucket].cnt, 1,
+ __ATOMIC_RELEASE);
+ table->nb_ext_ent--;
+ } else
+ __atomic_and_fetch(&table->t[bucket].key_mask,
+ ~(1 << i), __ATOMIC_RELEASE);
+ if (ent)
+ rte_mempool_put(table->ext_ent_pool, ent);
+ table->nb_ent--;
+ return 0;
+ }
+ }
+
+ SLIST_FOREACH(ent, &table->t[bucket].head, next)
+ if (ent->key == key)
+ break;
+
+ if (ent == NULL)
+ return -ENOENT;
+
+ *old_value = ent->val;
+
+ __atomic_fetch_add(&table->t[bucket].cnt, 1, __ATOMIC_ACQUIRE);
+ SLIST_REMOVE(&table->t[bucket].head, ent, k32v64_ext_ent, next);
+ __atomic_fetch_add(&table->t[bucket].cnt, 1, __ATOMIC_RELEASE);
+ rte_mempool_put(table->ext_ent_pool, ent);
+
+ table->nb_ext_ent--;
+ table->nb_ent--;
+
+ return 0;
+}
+
+static int
+k32v64_modify(struct rte_kv_hash_table *table, void *key_p, uint32_t hash,
+ enum rte_kv_modify_op op, void *value_p, int *found)
+{
+ struct k32v64_hash_table *ht = (struct k32v64_hash_table *)table;
+ uint32_t *key = key_p;
+ uint64_t value;
+
+ if ((ht == NULL) || (key == NULL) || (value_p == NULL) ||
+ (found == NULL) || (op >= RTE_KV_MODIFY_OP_MAX)) {
+ return -EINVAL;
+ }
+
+ value = *(uint64_t *)value_p;
+ switch (op) {
+ case RTE_KV_MODIFY_ADD:
+ return k32v64_hash_add(ht, *key, hash, value, value_p, found);
+ case RTE_KV_MODIFY_DEL:
+ return k32v64_hash_delete(ht, *key, hash, value_p);
+ default:
+ break;
+ }
+
+ return -EINVAL;
+}
+
+struct rte_kv_hash_table *
+k32v64_hash_create(const struct rte_kv_hash_params *params)
+{
+ char hash_name[RTE_KV_HASH_NAMESIZE];
+ struct k32v64_hash_table *ht = NULL;
+ uint32_t mem_size, nb_buckets, max_ent;
+ int ret;
+ struct rte_mempool *mp;
+
+ if ((params == NULL) || (params->name == NULL) ||
+ (params->entries == 0)) {
+ rte_errno = EINVAL;
+ return NULL;
+ }
+
+ ret = snprintf(hash_name, sizeof(hash_name), "KV_%s", params->name);
+ if (ret < 0 || ret >= RTE_KV_HASH_NAMESIZE) {
+ rte_errno = ENAMETOOLONG;
+ return NULL;
+ }
+
+ max_ent = rte_align32pow2(params->entries);
+ nb_buckets = max_ent / K32V64_KEYS_PER_BUCKET;
+ mem_size = sizeof(struct k32v64_hash_table) +
+ sizeof(struct k32v64_hash_bucket) * nb_buckets;
+
+ mp = rte_mempool_create(hash_name, max_ent,
+ sizeof(struct k32v64_ext_ent), 0, 0, NULL, NULL, NULL, NULL,
+ params->socket_id, 0);
+
+ if (mp == NULL)
+ return NULL;
+
+ ht = rte_zmalloc_socket(hash_name, mem_size,
+ RTE_CACHE_LINE_SIZE, params->socket_id);
+ if (ht == NULL) {
+ rte_mempool_free(mp);
+ return NULL;
+ }
+
+ memcpy(ht->pub.name, hash_name, sizeof(ht->pub.name));
+ ht->max_ent = max_ent;
+ ht->bucket_msk = nb_buckets - 1;
+ ht->ext_ent_pool = mp;
+ ht->pub.lookup = get_lookup_bulk_fn();
+ ht->pub.modify = k32v64_modify;
+
+ return (struct rte_kv_hash_table *)ht;
+}
+
+void
+k32v64_hash_free(struct rte_kv_hash_table *ht)
+{
+ if (ht == NULL)
+ return;
+
+ rte_mempool_free(((struct k32v64_hash_table *)ht)->ext_ent_pool);
+ rte_free(ht);
+}
diff --git a/lib/librte_hash/k32v64_hash.h b/lib/librte_hash/k32v64_hash.h
new file mode 100644
index 0000000..10061a5
--- /dev/null
+++ b/lib/librte_hash/k32v64_hash.h
@@ -0,0 +1,98 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include <rte_kv_hash.h>
+
+#define K32V64_KEYS_PER_BUCKET 4
+#define K32V64_WRITE_IN_PROGRESS 1
+#define VALID_KEY_MSK ((1 << K32V64_KEYS_PER_BUCKET) - 1)
+
+struct k32v64_ext_ent {
+ SLIST_ENTRY(k32v64_ext_ent) next;
+ uint32_t key;
+ uint64_t val;
+};
+
+struct k32v64_hash_bucket {
+ uint32_t key[K32V64_KEYS_PER_BUCKET];
+ uint64_t val[K32V64_KEYS_PER_BUCKET];
+ uint8_t key_mask;
+ uint32_t cnt;
+ SLIST_HEAD(k32v64_list_head, k32v64_ext_ent) head;
+} __rte_cache_aligned;
+
+struct k32v64_hash_table {
+ struct rte_kv_hash_table pub; /**< Public part */
+ uint32_t nb_ent; /**< Number of entities in the table*/
+ uint32_t nb_ext_ent; /**< Number of extended entities */
+ uint32_t max_ent; /**< Maximum number of entities */
+ uint32_t bucket_msk;
+ struct rte_mempool *ext_ent_pool;
+ __extension__ struct k32v64_hash_bucket t[0];
+};
+
+typedef int (*k32v64_cmp_fn_t)
+(struct k32v64_hash_bucket *bucket, uint32_t key, uint64_t *val);
+
+static inline int
+__kv_cmp_keys(struct k32v64_hash_bucket *bucket, uint32_t key,
+ uint64_t *val)
+{
+ int i;
+
+ for (i = 0; i < K32V64_KEYS_PER_BUCKET; i++) {
+ if ((key == bucket->key[i]) &&
+ (bucket->key_mask & (1 << i))) {
+ *val = bucket->val[i];
+ return 1;
+ }
+ }
+
+ return 0;
+}
+
+static inline int
+__k32v64_hash_lookup(struct k32v64_hash_table *table, uint32_t key,
+ uint32_t hash, uint64_t *value, k32v64_cmp_fn_t cmp_f)
+{
+ uint64_t val = 0;
+ struct k32v64_ext_ent *ent;
+ uint32_t cnt;
+ int found = 0;
+ uint32_t bucket = hash & table->bucket_msk;
+
+ do {
+
+ do {
+ cnt = __atomic_load_n(&table->t[bucket].cnt,
+ __ATOMIC_ACQUIRE);
+ } while (unlikely(cnt & K32V64_WRITE_IN_PROGRESS));
+
+ found = cmp_f(&table->t[bucket], key, &val);
+ if (unlikely((found == 0) &&
+ (!SLIST_EMPTY(&table->t[bucket].head)))) {
+ SLIST_FOREACH(ent, &table->t[bucket].head, next) {
+ if (ent->key == key) {
+ val = ent->val;
+ found = 1;
+ break;
+ }
+ }
+ }
+ __atomic_thread_fence(__ATOMIC_RELEASE);
+ } while (unlikely(cnt != __atomic_load_n(&table->t[bucket].cnt,
+ __ATOMIC_RELAXED)));
+
+ if (found == 1) {
+ *value = val;
+ return 0;
+ } else
+ return -ENOENT;
+}
+
+struct rte_kv_hash_table *
+k32v64_hash_create(const struct rte_kv_hash_params *params);
+
+void
+k32v64_hash_free(struct rte_kv_hash_table *ht);
diff --git a/lib/librte_hash/k32v64_hash_avx512vl.c b/lib/librte_hash/k32v64_hash_avx512vl.c
new file mode 100644
index 0000000..75cede5
--- /dev/null
+++ b/lib/librte_hash/k32v64_hash_avx512vl.c
@@ -0,0 +1,59 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include "k32v64_hash.h"
+
+int
+k32v64_hash_bulk_lookup_avx512vl(struct rte_kv_hash_table *ht, void *keys_p,
+ uint32_t *hashes, void *values_p, unsigned int n);
+
+static inline int
+k32v64_cmp_keys_avx512vl(struct k32v64_hash_bucket *bucket, uint32_t key,
+ uint64_t *val)
+{
+ __m128i keys, srch_key;
+ __mmask8 msk;
+
+ keys = _mm_load_si128((void *)bucket);
+ srch_key = _mm_set1_epi32(key);
+
+ msk = _mm_mask_cmpeq_epi32_mask(bucket->key_mask, keys, srch_key);
+ if (msk) {
+ *val = bucket->val[__builtin_ctz(msk)];
+ return 1;
+ }
+
+ return 0;
+}
+
+static inline int
+k32v64_hash_lookup_avx512vl(struct k32v64_hash_table *table, uint32_t key,
+ uint32_t hash, uint64_t *value)
+{
+ return __k32v64_hash_lookup(table, key, hash, value,
+ k32v64_cmp_keys_avx512vl);
+}
+
+int
+k32v64_hash_bulk_lookup_avx512vl(struct rte_kv_hash_table *ht, void *keys_p,
+ uint32_t *hashes, void *values_p, unsigned int n)
+{
+ struct k32v64_hash_table *table = (struct k32v64_hash_table *)ht;
+ uint32_t *keys = keys_p;
+ uint64_t *values = values_p;
+ int ret, cnt = 0;
+ unsigned int i;
+
+ if (unlikely((table == NULL) || (keys == NULL) || (hashes == NULL) ||
+ (values == NULL)))
+ return -EINVAL;
+
+ for (i = 0; i < n; i++) {
+ ret = k32v64_hash_lookup_avx512vl(table, keys[i], hashes[i],
+ &values[i]);
+ if (ret == 0)
+ cnt++;
+ }
+ return cnt;
+}
diff --git a/lib/librte_hash/meson.build b/lib/librte_hash/meson.build
index 6ab46ae..0d014ea 100644
--- a/lib/librte_hash/meson.build
+++ b/lib/librte_hash/meson.build
@@ -3,10 +3,23 @@
headers = files('rte_crc_arm64.h',
'rte_fbk_hash.h',
+ 'rte_kv_hash.h',
'rte_hash_crc.h',
'rte_hash.h',
'rte_jhash.h',
'rte_thash.h')
-sources = files('rte_cuckoo_hash.c', 'rte_fbk_hash.c')
-deps += ['ring']
+sources = files('rte_cuckoo_hash.c', 'rte_fbk_hash.c', 'rte_kv_hash.c', 'k32v64_hash.c')
+deps += ['ring', 'mempool']
+
+if dpdk_conf.has('RTE_ARCH_X86')
+ if cc.has_argument('-mavx512vl')
+ avx512_tmplib = static_library('avx512_tmp',
+ 'k32v64_hash_avx512vl.c',
+ dependencies: static_rte_mempool,
+ c_args: cflags + ['-mavx512vl'])
+ objs += avx512_tmplib.extract_objects('k32v64_hash_avx512vl.c')
+ cflags += '-DCC_AVX512VL_SUPPORT'
+
+ endif
+endif
diff --git a/lib/librte_hash/rte_hash_version.map b/lib/librte_hash/rte_hash_version.map
index c2a9094..614e0a5 100644
--- a/lib/librte_hash/rte_hash_version.map
+++ b/lib/librte_hash/rte_hash_version.map
@@ -36,5 +36,9 @@ EXPERIMENTAL {
rte_hash_lookup_with_hash_bulk;
rte_hash_lookup_with_hash_bulk_data;
rte_hash_max_key_id;
-
+ rte_kv_hash_create;
+ rte_kv_hash_find_existing;
+ rte_kv_hash_free;
+ rte_kv_hash_add;
+ rte_kv_hash_delete;
};
diff --git a/lib/librte_hash/rte_kv_hash.c b/lib/librte_hash/rte_kv_hash.c
new file mode 100644
index 0000000..03df8db
--- /dev/null
+++ b/lib/librte_hash/rte_kv_hash.c
@@ -0,0 +1,184 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include <string.h>
+
+#include <rte_eal_memconfig.h>
+#include <rte_errno.h>
+#include <rte_malloc.h>
+#include <rte_memory.h>
+#include <rte_tailq.h>
+
+#include <rte_kv_hash.h>
+#include "k32v64_hash.h"
+
+TAILQ_HEAD(rte_kv_hash_list, rte_tailq_entry);
+
+static struct rte_tailq_elem rte_kv_hash_tailq = {
+ .name = "RTE_KV_HASH",
+};
+
+EAL_REGISTER_TAILQ(rte_kv_hash_tailq);
+
+int
+rte_kv_hash_add(struct rte_kv_hash_table *table, void *key,
+ uint32_t hash, void *value, int *found)
+{
+ if (table == NULL)
+ return -EINVAL;
+
+ return table->modify(table, key, hash, RTE_KV_MODIFY_ADD,
+ value, found);
+}
+
+int
+rte_kv_hash_delete(struct rte_kv_hash_table *table, void *key,
+ uint32_t hash, void *value)
+{
+ int found;
+
+ if (table == NULL)
+ return -EINVAL;
+
+ return table->modify(table, key, hash, RTE_KV_MODIFY_DEL,
+ value, &found);
+}
+
+struct rte_kv_hash_table *
+rte_kv_hash_find_existing(const char *name)
+{
+ struct rte_kv_hash_table *h = NULL;
+ struct rte_tailq_entry *te;
+ struct rte_kv_hash_list *kv_hash_list;
+
+ kv_hash_list = RTE_TAILQ_CAST(rte_kv_hash_tailq.head,
+ rte_kv_hash_list);
+
+ rte_mcfg_tailq_read_lock();
+ TAILQ_FOREACH(te, kv_hash_list, next) {
+ h = (struct rte_kv_hash_table *) te->data;
+ if (strncmp(name, h->name, RTE_KV_HASH_NAMESIZE) == 0)
+ break;
+ }
+ rte_mcfg_tailq_read_unlock();
+ if (te == NULL) {
+ rte_errno = ENOENT;
+ return NULL;
+ }
+ return h;
+}
+
+struct rte_kv_hash_table *
+rte_kv_hash_create(const struct rte_kv_hash_params *params)
+{
+ char hash_name[RTE_KV_HASH_NAMESIZE];
+ struct rte_kv_hash_table *ht, *tmp_ht = NULL;
+ struct rte_tailq_entry *te;
+ struct rte_kv_hash_list *kv_hash_list;
+ int ret;
+
+ if ((params == NULL) || (params->name == NULL) ||
+ (params->entries == 0) ||
+ (params->type >= RTE_KV_HASH_MAX)) {
+ rte_errno = EINVAL;
+ return NULL;
+ }
+
+ kv_hash_list = RTE_TAILQ_CAST(rte_kv_hash_tailq.head,
+ rte_kv_hash_list);
+
+ ret = snprintf(hash_name, sizeof(hash_name), "KV_%s", params->name);
+ if (ret < 0 || ret >= RTE_KV_HASH_NAMESIZE) {
+ rte_errno = ENAMETOOLONG;
+ return NULL;
+ }
+
+ switch (params->type) {
+ case RTE_KV_HASH_K32V64:
+ ht = k32v64_hash_create(params);
+ break;
+ default:
+ rte_errno = EINVAL;
+ return NULL;
+ }
+ if (ht == NULL)
+ return ht;
+
+ rte_mcfg_tailq_write_lock();
+ TAILQ_FOREACH(te, kv_hash_list, next) {
+ tmp_ht = (struct rte_kv_hash_table *) te->data;
+ if (strncmp(params->name, tmp_ht->name,
+ RTE_KV_HASH_NAMESIZE) == 0)
+ break;
+ }
+ if (te != NULL) {
+ rte_errno = EEXIST;
+ goto exit;
+ }
+
+ te = rte_zmalloc("KV_HASH_TAILQ_ENTRY", sizeof(*te), 0);
+ if (te == NULL) {
+ RTE_LOG(ERR, HASH, "Failed to allocate tailq entry\n");
+ goto exit;
+ }
+
+ ht->type = params->type;
+ te->data = (void *)ht;
+ TAILQ_INSERT_TAIL(kv_hash_list, te, next);
+
+ rte_mcfg_tailq_write_unlock();
+
+ return ht;
+
+exit:
+ rte_mcfg_tailq_write_unlock();
+ switch (params->type) {
+ case RTE_KV_HASH_K32V64:
+ k32v64_hash_free(ht);
+ break;
+ default:
+ break;
+ }
+ return NULL;
+}
+
+void
+rte_kv_hash_free(struct rte_kv_hash_table *ht)
+{
+ struct rte_tailq_entry *te;
+ struct rte_kv_hash_list *kv_hash_list;
+
+ if (ht == NULL)
+ return;
+
+ kv_hash_list = RTE_TAILQ_CAST(rte_kv_hash_tailq.head,
+ rte_kv_hash_list);
+
+ rte_mcfg_tailq_write_lock();
+
+ /* find out tailq entry */
+ TAILQ_FOREACH(te, kv_hash_list, next) {
+ if (te->data == (void *) ht)
+ break;
+ }
+
+
+ if (te == NULL) {
+ rte_mcfg_tailq_write_unlock();
+ return;
+ }
+
+ TAILQ_REMOVE(kv_hash_list, te, next);
+
+ rte_mcfg_tailq_write_unlock();
+
+ switch (ht->type) {
+ case RTE_KV_HASH_K32V64:
+ k32v64_hash_free(ht);
+ break;
+ default:
+ break;
+ }
+ rte_free(te);
+}
diff --git a/lib/librte_hash/rte_kv_hash.h b/lib/librte_hash/rte_kv_hash.h
new file mode 100644
index 0000000..c0375d1
--- /dev/null
+++ b/lib/librte_hash/rte_kv_hash.h
@@ -0,0 +1,169 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _RTE_KV_HASH_H_
+#define _RTE_KV_HASH_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_compat.h>
+#include <rte_atomic.h>
+#include <rte_mempool.h>
+
+#define RTE_KV_HASH_NAMESIZE 32
+
+enum rte_kv_hash_type {
+ RTE_KV_HASH_K32V64,
+ RTE_KV_HASH_MAX
+};
+
+enum rte_kv_modify_op {
+ RTE_KV_MODIFY_ADD,
+ RTE_KV_MODIFY_DEL,
+ RTE_KV_MODIFY_OP_MAX
+};
+
+struct rte_kv_hash_params {
+ const char *name;
+ uint32_t entries;
+ int socket_id;
+ enum rte_kv_hash_type type;
+};
+
+struct rte_kv_hash_table;
+
+typedef int (*rte_kv_hash_bulk_lookup_t)
+(struct rte_kv_hash_table *table, void *keys, uint32_t *hashes,
+ void *values, unsigned int n);
+
+typedef int (*rte_kv_hash_modify_t)
+(struct rte_kv_hash_table *table, void *key, uint32_t hash,
+ enum rte_kv_modify_op op, void *value, int *found);
+
+struct rte_kv_hash_table {
+ char name[RTE_KV_HASH_NAMESIZE]; /**< Name of the hash. */
+ rte_kv_hash_bulk_lookup_t lookup;
+ rte_kv_hash_modify_t modify;
+ enum rte_kv_hash_type type;
+};
+
+/**
+ * Lookup bulk of keys.
+ * This function is multi-thread safe with regard to other lookup threads.
+ *
+ * @param table
+ * Hash table to add the key to.
+ * @param keys
+ * Pointer to array of keys
+ * @param hashes
+ * Pointer to array of hash values associated with keys.
+ * @param values
+ * Pointer to array of values corresponding to keys.
+ * If the key was not found the corresponding value remains intact.
+ * @param n
+ * Number of keys to lookup in batch.
+ * @return
+ * -EINVAL if there's an error, otherwise number of successful lookups.
+ */
+static inline int
+rte_kv_hash_bulk_lookup(struct rte_kv_hash_table *table,
+ void *keys, uint32_t *hashes, void *values, unsigned int n)
+{
+ return table->lookup(table, keys, hashes, values, n);
+}
+
+/**
+ * Add a key to an existing hash table with hash value.
+ * This operation is not multi-thread safe with regard to add/delete functions
+ * and should only be called from one thread.
+ * However, it is safe to call it concurrently with lookups.
+ *
+ * @param table
+ * Hash table to add the key to.
+ * @param key
+ * Key to add to the hash table.
+ * @param value
+ * Value to associate with key.
+ * @param hash
+ * Hash value associated with key.
+ * @param found
+ * 0 if no previously added key was found
+ * 1 if a previously added key was found; the old value associated with the key
+ * was written to *value
+ * @return
+ * 0 if ok, or negative value on error.
+ */
+__rte_experimental
+int
+rte_kv_hash_add(struct rte_kv_hash_table *table, void *key,
+ uint32_t hash, void *value, int *found);
+
+/**
+ * Remove a key with a given hash value from an existing hash table.
+ * This operation is not multi-thread safe with regard to add/delete functions
+ * and should only be called from one thread.
+ * However, it is safe to call it concurrently with lookups.
+ *
+ * @param table
+ * Hash table to remove the key from.
+ * @param key
+ * Key to remove from the hash table.
+ * @param hash
+ * hash value associated with key.
+ * @param value
+ * pointer to memory where the old value will be written to on success
+ * @return
+ * 0 if ok, or negative value on error.
+ */
+__rte_experimental
+int
+rte_kv_hash_delete(struct rte_kv_hash_table *table, void *key,
+ uint32_t hash, void *value);
+
+/**
+ * Performs a lookup for an existing hash table, and returns a pointer to
+ * the table if found.
+ *
+ * @param name
+ * Name of the hash table to find
+ *
+ * @return
+ * pointer to hash table structure or NULL on error with rte_errno
+ * set appropriately.
+ */
+__rte_experimental
+struct rte_kv_hash_table *
+rte_kv_hash_find_existing(const char *name);
+
+/**
+ * Create a new KV hash table.
+ *
+ * @param params
+ * Parameters used in creation of hash table.
+ *
+ * @return
+ * Pointer to hash table structure that is used in future hash table
+ * operations, or NULL on error with rte_errno set appropriately.
+ */
+__rte_experimental
+struct rte_kv_hash_table *
+rte_kv_hash_create(const struct rte_kv_hash_params *params);
+
+/**
+ * Free all memory used by a hash table.
+ *
+ * @param table
+ * Hash table to deallocate.
+ */
+__rte_experimental
+void
+rte_kv_hash_free(struct rte_kv_hash_table *table);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_KV_HASH_H_ */
--
2.7.4
^ permalink raw reply [flat|nested] 56+ messages in thread
* [dpdk-dev] [PATCH v4 2/4] hash: add documentation for kv hash library
2020-04-15 18:17 ` [dpdk-dev] [PATCH v3 " Vladimir Medvedkin
` (3 preceding siblings ...)
2020-05-08 19:58 ` [dpdk-dev] [PATCH v4 1/4] hash: add kv hash library Vladimir Medvedkin
@ 2020-05-08 19:58 ` Vladimir Medvedkin
2020-05-08 19:58 ` [dpdk-dev] [PATCH v4 3/4] test: add kv hash autotests Vladimir Medvedkin
2020-05-08 19:58 ` [dpdk-dev] [PATCH v4 4/4] test: add kv perf tests Vladimir Medvedkin
6 siblings, 0 replies; 56+ messages in thread
From: Vladimir Medvedkin @ 2020-05-08 19:58 UTC (permalink / raw)
To: dev; +Cc: konstantin.ananyev, yipeng1.wang, sameh.gobriel, bruce.richardson
Add programmers guide and doxygen API for kv hash library
Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
---
doc/api/doxy-api-index.md | 1 +
doc/guides/prog_guide/index.rst | 1 +
doc/guides/prog_guide/kv_hash_lib.rst | 66 +++++++++++++++++++++++++++++++++++
3 files changed, 68 insertions(+)
create mode 100644 doc/guides/prog_guide/kv_hash_lib.rst
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index 93f0d93..eade5f5 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -121,6 +121,7 @@ The public API headers are grouped by topics:
[jhash] (@ref rte_jhash.h),
[thash] (@ref rte_thash.h),
[FBK hash] (@ref rte_fbk_hash.h),
+ [KV hash] (@ref rte_kv_hash.h),
[CRC hash] (@ref rte_hash_crc.h)
- **classification**
diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index f0ae3c1..28ddb2b 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -31,6 +31,7 @@ Programmer's Guide
link_bonding_poll_mode_drv_lib
timer_lib
hash_lib
+ kv_hash_lib
efd_lib
member_lib
lpm_lib
diff --git a/doc/guides/prog_guide/kv_hash_lib.rst b/doc/guides/prog_guide/kv_hash_lib.rst
new file mode 100644
index 0000000..44ed99a
--- /dev/null
+++ b/doc/guides/prog_guide/kv_hash_lib.rst
@@ -0,0 +1,66 @@
+.. SPDX-License-Identifier: BSD-3-Clause
+ Copyright(c) 2020 Intel Corporation.
+
+.. _kv_hash_Library:
+
+KV Hash Library
+===================
+
+This hash library implementation is intended to be better optimized for some fixed key-value sizes when compared to the existing Cuckoo hash-based rte_hash implementation. At the moment it supports 32-bit keys and 64-bit values. The current rte_fbk implementation is pretty fast but it has a number of drawbacks, such as 2-byte values and limited collision-resolving capabilities. rte_hash (which is based on the Cuckoo hash algorithm) doesn't have these drawbacks, but it comes at the cost of lower performance compared to rte_fbk.
+
+The following flow illustrates the source of performance penalties of Cuckoo hash:
+
+* Loading two buckets at once (extra memory consumption)
+* Comparing signatures first (extra step before key comparison)
+* If signature comparison hits, get a key index, find memory location with a key itself, and get the key (memory pressure and indirection)
+* Using indirect call to memcmp() to compare two uint32_t (function call overhead)
+
+The KV hash table doesn't have the drawbacks associated with rte_fbk while offering the same performance. A bucket contains 4 consecutive keys which can be compared very quickly, and subsequent keys are kept in a linked list.
+
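+A minimal sketch of the resulting bucket layout (mirroring struct k32v64_hash_bucket in the implementation):
+
+.. code-block:: c
+
+   struct k32v64_hash_bucket {
+       uint32_t key[4];   /* 4 consecutive keys, compared in one pass */
+       uint64_t val[4];   /* values stored next to the keys */
+       uint8_t key_mask;  /* which of the 4 slots hold valid keys */
+       uint32_t cnt;      /* transaction counter for reader/writer sync */
+       SLIST_HEAD(, k32v64_ext_ent) head; /* 5th and later collisions */
+   } __rte_cache_aligned;
+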
+The main disadvantage compared to rte_hash is performance degradation with high average table utilization, due to chain resolution for the 5th and subsequent collisions.
+
+To estimate the probability of a 5th collision we can use the "birthday paradox" approach: we can figure out the number of insertions (which can be treated as a load factor) that will likely yield a 50% probability of a 5th collision for a given number of buckets.
+
+It could be calculated with an asymptotic formula from [1]:
+
+E(n, k) ~= (k!)^(1/k)*Γ(1 + 1/k)*n^(1-1/k), n -> inf
+
+where:
+
+k - level of collision
+
+n - number of buckets
+
+Γ - the gamma function
+
+So, for k = 5 (the 5th collision), and given that the number of buckets is a power of two, we can simplify the formula:
+
+E(n) = 2.392 * 2^(m * 4/5), where the number of buckets n = 2^m
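+
+For example, for m = 13 (8192 buckets) this gives E = 2.392 * 2^10.4, i.e. roughly 3200 insertions before a 5th collision becomes more likely than not.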
+
+.. note::
+
+ You can calculate it yourself using Wolfram Alpha [2]. For example, for 8k buckets:
+
+ solve ((k!)^(1/k)*Γ(1 + 1/k)*n^(1-1/k), n = 8192, k = 5)
+
+
+API Overview
+-----------------
+
+The main configuration parameters for the hash table are:
+
+* Total number of hash entries in the table
+* Socket id
+
+KV hash is "hash function-less", so the user must specify a precalculated hash value for every key. The main methods exported by the library are (a usage sketch follows the list):
+
+* Add entry with key and precomputed hash: The key, precomputed hash and value are provided as input.
+* Delete entry with key and precomputed hash: The key and precomputed hash are provided as input.
+* Lookup entry with key and precomputed hash: The key, precomputed hash and a pointer to the expected value are provided as input. If an entry with the specified key is found in the hash table (i.e. a lookup hit), then the value associated with the key is written to the memory specified by the pointer and the function returns 0; otherwise (i.e. a lookup miss) a negative value is returned and the memory described by the pointer is not modified.
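+
+A minimal usage sketch (error handling trimmed; the parameter and function
+names follow the v4 patches in this series):
+
+.. code-block:: c
+
+   struct rte_kv_hash_params params = {
+       .name = "example",
+       .entries = 1 << 16,
+       .socket_id = -1,    /* allocate on any socket */
+       .type = RTE_KV_HASH_K32V64,
+   };
+   struct rte_kv_hash_table *h = rte_kv_hash_create(&params);
+   uint32_t key = 10;
+   uint32_t hash = rte_hash_crc_4byte(key, 0);
+   uint64_t value = 20, out, old;
+   int found;
+
+   rte_kv_hash_add(h, &key, hash, &value, &found);
+   rte_kv_hash_bulk_lookup(h, &key, &hash, &out, 1);
+   rte_kv_hash_delete(h, &key, hash, &old);
+   rte_kv_hash_free(h);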
+
+References
+----------
+
+[1] M.S. Klamkin and D.J. Newman, Extensions of the Birthday Surprise
+
+[2] https://www.wolframalpha.com/
--
2.7.4
^ permalink raw reply [flat|nested] 56+ messages in thread
* [dpdk-dev] [PATCH v4 3/4] test: add kv hash autotests
2020-04-15 18:17 ` [dpdk-dev] [PATCH v3 " Vladimir Medvedkin
` (4 preceding siblings ...)
2020-05-08 19:58 ` [dpdk-dev] [PATCH v4 2/4] hash: add documentation for " Vladimir Medvedkin
@ 2020-05-08 19:58 ` Vladimir Medvedkin
2020-05-08 19:58 ` [dpdk-dev] [PATCH v4 4/4] test: add kv perf tests Vladimir Medvedkin
6 siblings, 0 replies; 56+ messages in thread
From: Vladimir Medvedkin @ 2020-05-08 19:58 UTC (permalink / raw)
To: dev; +Cc: konstantin.ananyev, yipeng1.wang, sameh.gobriel, bruce.richardson
Add autotests for rte_kv_hash library
Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
---
app/test/Makefile | 1 +
app/test/autotest_data.py | 12 +++
app/test/meson.build | 3 +
app/test/test_kv_hash.c | 242 ++++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 258 insertions(+)
create mode 100644 app/test/test_kv_hash.c
diff --git a/app/test/Makefile b/app/test/Makefile
index 4e8ecab..298650f 100644
--- a/app/test/Makefile
+++ b/app/test/Makefile
@@ -73,6 +73,7 @@ SRCS-y += test_bitmap.c
SRCS-y += test_reciprocal_division.c
SRCS-y += test_reciprocal_division_perf.c
SRCS-y += test_fbarray.c
+SRCS-y += test_kv_hash.c
SRCS-y += test_external_mem.c
SRCS-y += test_rand_perf.c
diff --git a/app/test/autotest_data.py b/app/test/autotest_data.py
index 7b1d013..a1351b7 100644
--- a/app/test/autotest_data.py
+++ b/app/test/autotest_data.py
@@ -99,6 +99,18 @@
"Report": None,
},
{
+ "Name": "KV hash autotest",
+ "Command": "kv_hash_autotest",
+ "Func": default_autotest,
+ "Report": None,
+ },
+ {
+ "Name": "KV hash autotest",
+ "Command": "kv_hash_slow_autotest",
+ "Func": default_autotest,
+ "Report": None,
+ },
+ {
"Name": "LPM autotest",
"Command": "lpm_autotest",
"Func": default_autotest,
diff --git a/app/test/meson.build b/app/test/meson.build
index f8ed9d8..8740255 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -45,6 +45,7 @@ test_sources = files('commands.c',
'test_eventdev.c',
'test_external_mem.c',
'test_fbarray.c',
+ 'test_kv_hash.c',
'test_fib.c',
'test_fib_perf.c',
'test_fib6.c',
@@ -203,6 +204,7 @@ fast_tests = [
['hash_autotest', true],
['interrupt_autotest', true],
['ipfrag_autotest', false],
+ ['kv_hash_autotest', true],
['logs_autotest', true],
['lpm_autotest', true],
['lpm6_autotest', true],
@@ -291,6 +293,7 @@ perf_test_names = [
'hash_readwrite_perf_autotest',
'hash_readwrite_lf_perf_autotest',
'trace_perf_autotest',
+ 'kv_hash_slow_autotest',
]
driver_test_names = [
diff --git a/app/test/test_kv_hash.c b/app/test/test_kv_hash.c
new file mode 100644
index 0000000..646bead
--- /dev/null
+++ b/app/test/test_kv_hash.c
@@ -0,0 +1,242 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include <stdio.h>
+#include <stdint.h>
+#include <stdlib.h>
+
+#include <rte_errno.h>
+#include <rte_lcore.h>
+#include <rte_kv_hash.h>
+#include <rte_hash_crc.h>
+
+#include "test.h"
+
+typedef int32_t (*rte_kv_hash_test)(void);
+
+static int32_t test_create_invalid(void);
+static int32_t test_multiple_create(void);
+static int32_t test_free_null(void);
+static int32_t test_add_del_invalid(void);
+static int32_t test_basic(void);
+
+#define MAX_ENT (1 << 22)
+
+/*
+ * Check that rte_kv_hash_create fails gracefully for incorrect user input
+ * arguments
+ */
+int32_t
+test_create_invalid(void)
+{
+ struct rte_kv_hash_table *kv_hash = NULL;
+ struct rte_kv_hash_params config;
+
+ config.name = "test_kv_hash";
+ config.socket_id = rte_socket_id();
+ config.entries = MAX_ENT;
+ config.type = RTE_KV_HASH_K32V64;
+
+ /* rte_kv_hash_create: kv_hash name == NULL */
+ config.name = NULL;
+ kv_hash = rte_kv_hash_create(&config);
+ RTE_TEST_ASSERT(kv_hash == NULL,
+ "Call succeeded with invalid parameters\n");
+ config.name = "test_kv_hash";
+
+ /* rte_kv_hash_create: config == NULL */
+ kv_hash = rte_kv_hash_create(NULL);
+ RTE_TEST_ASSERT(kv_hash == NULL,
+ "Call succeeded with invalid parameters\n");
+
+ /* socket_id < -1 is invalid */
+ config.socket_id = -2;
+ kv_hash = rte_kv_hash_create(&config);
+ RTE_TEST_ASSERT(kv_hash == NULL,
+ "Call succeeded with invalid parameters\n");
+ config.socket_id = rte_socket_id();
+
+ /* rte_kv_hash_create: entries = 0 */
+ config.entries = 0;
+ kv_hash = rte_kv_hash_create(&config);
+ RTE_TEST_ASSERT(kv_hash == NULL,
+ "Call succeeded with invalid parameters\n");
+ config.entries = MAX_ENT;
+
+ /* rte_kv_hash_create: invalid type */
+ config.type = RTE_KV_HASH_MAX;
+ kv_hash = rte_kv_hash_create(&config);
+ RTE_TEST_ASSERT(kv_hash == NULL,
+ "Call succeeded with invalid parameters\n");
+
+ return TEST_SUCCESS;
+}
+
+/*
+ * Create and free a kv_hash table 100 times,
+ * using a slightly different table size each time
+ */
+int32_t
+test_multiple_create(void)
+{
+ struct rte_kv_hash_table *kv_hash = NULL;
+ struct rte_kv_hash_params config;
+ int32_t i;
+
+ for (i = 0; i < 100; i++) {
+ config.name = "test_kv_hash";
+ config.socket_id = -1;
+ config.entries = MAX_ENT - i;
+ config.type = RTE_KV_HASH_K32V64;
+
+ kv_hash = rte_kv_hash_create(&config);
+ RTE_TEST_ASSERT(kv_hash != NULL,
+ "Failed to create kv hash\n");
+ rte_kv_hash_free(kv_hash);
+ }
+
+ return TEST_SUCCESS;
+}
+
+/*
+ * Call rte_kv_hash_free for NULL pointer user input.
+ * Note: free has no return value and therefore it is impossible
+ * to check for failure, but this test is added to
+ * increase function coverage metrics and to validate that
+ * freeing null does not crash.
+ */
+int32_t
+test_free_null(void)
+{
+ struct rte_kv_hash_table *kv_hash = NULL;
+ struct rte_kv_hash_params config;
+
+ config.name = "test_kv";
+ config.socket_id = -1;
+ config.entries = MAX_ENT;
+ config.type = RTE_KV_HASH_K32V64;
+
+ kv_hash = rte_kv_hash_create(&config);
+ RTE_TEST_ASSERT(kv_hash != NULL, "Failed to create kv hash\n");
+
+ rte_kv_hash_free(kv_hash);
+ rte_kv_hash_free(NULL);
+ return TEST_SUCCESS;
+}
+
+/*
+ * Check that rte_kv_hash_add fails gracefully for
+ * incorrect user input arguments
+ */
+int32_t
+test_add_del_invalid(void)
+{
+ uint32_t key = 10;
+ uint64_t val = 20;
+ int ret, found;
+
+ /* rte_kv_hash_add: kv_hash == NULL */
+ ret = rte_kv_hash_add(NULL, &key, rte_hash_crc_4byte(key, 0),
+ &val, &found);
+ RTE_TEST_ASSERT(ret == -EINVAL,
+ "Call succeeded with invalid parameters\n");
+
+ /* rte_kv_hash_delete: kv_hash == NULL */
+ ret = rte_kv_hash_delete(NULL, &key, rte_hash_crc_4byte(key, 0), &val);
+ RTE_TEST_ASSERT(ret == -EINVAL,
+ "Call succeeded with invalid parameters\n");
+
+ return TEST_SUCCESS;
+}
+
+/*
+ * Call add, lookup and delete for a single key
+ */
+int32_t
+test_basic(void)
+{
+ struct rte_kv_hash_table *kv_hash = NULL;
+ struct rte_kv_hash_params config;
+ uint32_t key = 10;
+ uint64_t value = 20;
+ uint64_t ret_val = 0;
+ int ret, found;
+ uint32_t hash_sig;
+
+ config.name = "test_kv";
+ config.socket_id = -1;
+ config.entries = MAX_ENT;
+ config.type = RTE_KV_HASH_K32V64;
+
+ kv_hash = rte_kv_hash_create(&config);
+ RTE_TEST_ASSERT(kv_hash != NULL, "Failed to create kv hash\n");
+
+ hash_sig = rte_hash_crc_4byte(key, 0);
+ ret = rte_kv_hash_bulk_lookup(kv_hash, &key,
+ &hash_sig, &ret_val, 1);
+ RTE_TEST_ASSERT(ret == 0, "Lookup return incorrect result\n");
+
+ ret = rte_kv_hash_delete(kv_hash, &key, hash_sig, &ret_val);
+ RTE_TEST_ASSERT(ret == -ENOENT, "Delete returned incorrect result\n");
+
+ ret = rte_kv_hash_add(kv_hash, &key, hash_sig, &value, &found);
+ RTE_TEST_ASSERT(ret == 0, "Can not add key into the table\n");
+
+ ret = rte_kv_hash_bulk_lookup(kv_hash, &key,
+ &hash_sig, &ret_val, 1);
+ RTE_TEST_ASSERT(((ret == 1) && (value == ret_val)),
+ "Lookup return incorrect result\n");
+
+ ret = rte_kv_hash_delete(kv_hash, &key, hash_sig, &ret_val);
+ RTE_TEST_ASSERT(ret == 0, "Can not delete key from table\n");
+
+ ret = rte_kv_hash_bulk_lookup(kv_hash, &key,
+ &hash_sig, &ret_val, 1);
+ RTE_TEST_ASSERT(ret == 0, "Lookup return incorrect result\n");
+
+ rte_kv_hash_free(kv_hash);
+
+ return TEST_SUCCESS;
+}
+
+static struct unit_test_suite kv_hash_tests = {
+ .suite_name = "kv_hash autotest",
+ .setup = NULL,
+ .teardown = NULL,
+ .unit_test_cases = {
+ TEST_CASE(test_create_invalid),
+ TEST_CASE(test_free_null),
+ TEST_CASE(test_add_del_invalid),
+ TEST_CASE(test_basic),
+ TEST_CASES_END()
+ }
+};
+
+static struct unit_test_suite kv_hash_slow_tests = {
+ .suite_name = "kv_hash slow autotest",
+ .setup = NULL,
+ .teardown = NULL,
+ .unit_test_cases = {
+ TEST_CASE(test_multiple_create),
+ TEST_CASES_END()
+ }
+};
+
+/*
+ * Do all unit tests.
+ */
+static int
+test_kv_hash(void)
+{
+ return unit_test_suite_runner(&kv_hash_tests);
+}
+
+static int
+test_slow_kv_hash(void)
+{
+ return unit_test_suite_runner(&kv_hash_slow_tests);
+}
+
+REGISTER_TEST_COMMAND(kv_hash_autotest, test_kv_hash);
+REGISTER_TEST_COMMAND(kv_hash_slow_autotest, test_slow_kv_hash);
--
2.7.4
^ permalink raw reply [flat|nested] 56+ messages in thread
* [dpdk-dev] [PATCH v4 4/4] test: add kv perf tests
2020-04-15 18:17 ` [dpdk-dev] [PATCH v3 " Vladimir Medvedkin
` (5 preceding siblings ...)
2020-05-08 19:58 ` [dpdk-dev] [PATCH v4 3/4] test: add kv hash autotests Vladimir Medvedkin
@ 2020-05-08 19:58 ` Vladimir Medvedkin
6 siblings, 0 replies; 56+ messages in thread
From: Vladimir Medvedkin @ 2020-05-08 19:58 UTC (permalink / raw)
To: dev; +Cc: konstantin.ananyev, yipeng1.wang, sameh.gobriel, bruce.richardson
Add performance tests for rte_kv_hash
Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
---
app/test/test_hash_perf.c | 110 +++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 110 insertions(+)
diff --git a/app/test/test_hash_perf.c b/app/test/test_hash_perf.c
index 76cdac5..3d4c13d 100644
--- a/app/test/test_hash_perf.c
+++ b/app/test/test_hash_perf.c
@@ -12,8 +12,9 @@
#include <rte_hash_crc.h>
#include <rte_jhash.h>
#include <rte_fbk_hash.h>
+#include <rte_kv_hash.h>
#include <rte_random.h>
#include <rte_string_fns.h>
#include "test.h"
@@ -29,6 +30,8 @@
#define NUM_SHUFFLES 10
#define BURST_SIZE 16
+#define CRC_INIT_VAL 0xdeadbeef
+
enum operations {
ADD = 0,
LOOKUP,
@@ -719,6 +722,110 @@ fbk_hash_perf_test(void)
return 0;
}
+static uint32_t *
+shuf_arr(uint32_t *arr, int n, int l)
+{
+ int i, j;
+ uint32_t tmp;
+ uint32_t *ret_arr;
+
+ ret_arr = rte_zmalloc(NULL, l * sizeof(uint32_t), 0);
+ for (i = 0; i < n; i++) {
+ j = rte_rand() % n;
+ tmp = arr[j];
+ arr[j] = arr[i];
+ arr[i] = tmp;
+ }
+ for (i = 0; i < l; i++)
+ ret_arr[i] = arr[i % n];
+
+ return ret_arr;
+}
+
+static int
+kv_hash_perf_test(void)
+{
+ struct rte_kv_hash_params params = {
+ .name = "kv_hash_test",
+ .entries = ENTRIES * 2,
+ .socket_id = rte_socket_id(),
+ .type = RTE_KV_HASH_K32V64,
+ };
+ struct rte_kv_hash_table *handle = NULL;
+ uint32_t *keys;
+ uint32_t *lookup_keys;
+ uint64_t lookup_time = 0;
+ uint64_t begin;
+ uint64_t end;
+ unsigned int added = 0;
+ uint32_t key, hash_sig;
+ uint64_t val;
+ unsigned int i, j, k;
+ int found, ret = 0;
+ uint32_t hashes[64];
+ uint64_t vals[64];
+
+ handle = rte_kv_hash_create(¶ms);
+ if (handle == NULL) {
+ printf("Error creating table\n");
+ return -1;
+ }
+
+ keys = rte_zmalloc(NULL, ENTRIES * sizeof(*keys), 0);
+ if (keys == NULL) {
+ printf("fbk hash: memory allocation for key store failed\n");
+ return -1;
+ }
+
+ /* Generate random keys and values. */
+ for (i = 0; i < ENTRIES; i++) {
+ key = (uint32_t)rte_rand();
+ val = rte_rand();
+ hash_sig = rte_hash_crc_4byte(key, CRC_INIT_VAL);
+
+ if (rte_kv_hash_add(handle, &key, hash_sig, &val,
+ &found) == 0) {
+ keys[added] = key;
+ added++;
+ }
+ }
+
+ lookup_keys = shuf_arr(keys, added, TEST_SIZE);
+
+ lookup_time = 0;
+ for (i = 0; i < TEST_ITERATIONS; i++) {
+
+ begin = rte_rdtsc();
+ /* Do lookups */
+
+ for (j = 0; j < TEST_SIZE; j += 64) {
+ for (k = 0; k < 64; k++) {
+ hashes[k] =
+ rte_hash_crc_4byte(lookup_keys[j + k],
+ CRC_INIT_VAL);
+ }
+
+ ret += rte_kv_hash_bulk_lookup(handle,
+ &lookup_keys[j], hashes, vals, 64);
+ }
+
+ end = rte_rdtsc();
+ lookup_time += end - begin;
+ }
+
+ printf("\n\n *** KV Hash function performance test results ***\n");
+ if (ret != 0)
+ printf("Number of ticks per bulk lookup = %g\n",
+ (double)lookup_time /
+ ((double)TEST_ITERATIONS * (double)TEST_SIZE));
+
+ rte_free(keys);
+ rte_free(lookup_keys);
+ rte_kv_hash_free(handle);
+
+ return 0;
+}
+
static int
test_hash_perf(void)
{
@@ -746,6 +853,9 @@ test_hash_perf(void)
if (fbk_hash_perf_test() < 0)
return -1;
+ if (kv_hash_perf_test() < 0)
+ return -1;
+
return 0;
}
--
2.7.4
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [dpdk-dev] [PATCH v3 0/4] add new k32v64 hash table
2020-04-17 0:21 ` Wang, Yipeng1
2020-04-23 16:19 ` Ananyev, Konstantin
@ 2020-05-08 20:08 ` Medvedkin, Vladimir
1 sibling, 0 replies; 56+ messages in thread
From: Medvedkin, Vladimir @ 2020-05-08 20:08 UTC (permalink / raw)
To: Wang, Yipeng1, Mattias Rönnblom, dev, Dumitrescu, Cristian
Cc: Ananyev, Konstantin, Gobriel, Sameh, Richardson, Bruce
Hi Yipeng,
Sorry for the late reply
On 17/04/2020 01:21, Wang, Yipeng1 wrote:
>> -----Original Message-----
>> From: Mattias Rönnblom<mattias.ronnblom@ericsson.com>
>> Sent: Thursday, April 16, 2020 4:41 AM
>> To: Medvedkin, Vladimir<vladimir.medvedkin@intel.com>;dev@dpdk.org
>> Cc: Ananyev, Konstantin<konstantin.ananyev@intel.com>; Wang, Yipeng1
>> <yipeng1.wang@intel.com>; Gobriel, Sameh<sameh.gobriel@intel.com>;
>> Richardson, Bruce<bruce.richardson@intel.com>
>> Subject: Re: [dpdk-dev] [PATCH v3 0/4] add new k32v64 hash table
>>
>> On 2020-04-16 12:18, Medvedkin, Vladimir wrote:
>>> Hi Mattias,
>>>
>>> -----Original Message-----
>>> From: Mattias Rönnblom<mattias.ronnblom@ericsson.com>
>>> Sent: Wednesday, April 15, 2020 7:52 PM
>>> To: Medvedkin, Vladimir<vladimir.medvedkin@intel.com>;dev@dpdk.org
>>> Cc: Ananyev, Konstantin<konstantin.ananyev@intel.com>; Wang, Yipeng1
>>> <yipeng1.wang@intel.com>; Gobriel, Sameh<sameh.gobriel@intel.com>;
>>> Richardson, Bruce<bruce.richardson@intel.com>
>>> Subject: Re: [dpdk-dev] [PATCH v3 0/4] add new k32v64 hash table
>>>
>>> On 2020-04-15 20:17, Vladimir Medvedkin wrote:
>>>> Currently DPDK has a special implementation of a hash table for
>>>> 4 byte keys which is called FBK hash. Unfortunately its main drawback
>>>> is that it only supports 2 byte values.
>>>> The new implementation called K32V64 hash supports 4 byte keys and 8
>>>> byte associated values, which is enough to store a pointer.
>>>>
>>>> It would also be nice to get feedback on whether to leave the old FBK
>>>> and new k32v64 implementations or deprecate the old one?
>>> Do you think it would be feasible to support custom-sized values and remain
>> efficient, in a similar manner to how rte_ring_elem.h does things?
>>> I'm afraid it is not feasible. For performance reasons keys and
>> corresponding values reside in a single cache line, so there is no extra
>> memory for bigger values, such as 16B.
>>
>>
>> Well, if you have a smaller value type (or key type) you would fit into
>> something less-than-a-cache line, and thus reduce your memory working set
>> further.
>>
>>
>>>> v3:
>>>> - added bulk lookup
>>>> - avx512 key comparison is removed from .h
>>>>
>>>> v2:
>>>> - renamed from rte_dwk to rte_k32v64 as was suggested
>>>> - reworked lookup function, added inlined subroutines
>>>> - added avx512 key comparison routine
>>>> - added documentation
>>>> - added statistic counters for total entries and extended
>>>> entries (linked list)
>>>>
>>>> Vladimir Medvedkin (4):
>>>> hash: add k32v64 hash library
>>>> hash: add documentation for k32v64 hash library
>>>> test: add k32v64 hash autotests
>>>> test: add k32v64 perf tests
>>>>
>>>> app/test/Makefile | 1 +
>>>> app/test/autotest_data.py | 12 ++
>>>> app/test/meson.build | 3 +
>>>> app/test/test_hash_perf.c | 130 ++++++++++++
>>>> app/test/test_k32v64_hash.c | 229 ++++++++++++++++++++++
>>>> doc/api/doxy-api-index.md | 1 +
>>>> doc/guides/prog_guide/index.rst | 1 +
>>>> doc/guides/prog_guide/k32v64_hash_lib.rst | 66 +++++++
>>>> lib/Makefile | 2 +-
>>>> lib/librte_hash/Makefile | 13 +-
>>>> lib/librte_hash/k32v64_hash_avx512vl.c | 56 ++++++
>>>> lib/librte_hash/meson.build | 17 +-
>>>> lib/librte_hash/rte_hash_version.map | 6 +-
>>>> lib/librte_hash/rte_k32v64_hash.c | 315
>> ++++++++++++++++++++++++++++++
>>>> lib/librte_hash/rte_k32v64_hash.h | 211 ++++++++++++++++++++
>>>> 15 files changed, 1058 insertions(+), 5 deletions(-)
>>>> create mode 100644 app/test/test_k32v64_hash.c
>>>> create mode 100644 doc/guides/prog_guide/k32v64_hash_lib.rst
>>>> create mode 100644 lib/librte_hash/k32v64_hash_avx512vl.c
>>>> create mode 100644 lib/librte_hash/rte_k32v64_hash.c
>>>> create mode 100644 lib/librte_hash/rte_k32v64_hash.h
>>>>
> [Wang, Yipeng]
> Hi, Vladimir,
> Thanks for responding with the use cases earlier.
> I discussed with Sameh offline, here are some comments.
>
> 1. Since the proposed hash table also has some similarities to rte_table library used by packet framework,
> have you tried it yet? Although it is mainly for packet framework, I believe you can use it independently as well.
> It has implementations for special key value sizes.
> I added Cristian for his comment.
I looked at rte_table_hash. I'm afraid it doesn't fit our requirements
due to its design.
First, its API uses mbufs as a key container.
Second, as far as I can see from the source code, it is not safe in a
multi-threaded environment with regard to read-write concurrency, by design.
There is also some information about it at
https://doc.dpdk.org/guides/prog_guide/packet_framework.html#shared-data-structures
and in Cristian's comment
http://mails.dpdk.org/archives/dev/2015-September/024121.html
> 2. We tend to agree with Mattias that it would be better if we have a more generic API name and with the same
> API we can do multiple key/value size implementations.
> This is to avoid adding new APIs in future to again handle different key/value
> use cases. For example, we call it rte_kv_hash, and through the parameter struct we pass in a key-value size pair
> we want to use.
> Implementation-wise, we may only provide implementations for certain popular use cases (like the one you provided).
> For other general use cases, people should go with the more flexible and generic cuckoo hash.
> Then we should also merge the FBK under the new API.
>
Agree. As was discussed offline, I've made the API more generic
with regard to key and value sizes.
--
Regards,
Vladimir
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [dpdk-dev] [PATCH v3 1/4] hash: add k32v64 hash library
2020-04-23 13:31 ` Ananyev, Konstantin
@ 2020-05-08 20:14 ` Medvedkin, Vladimir
0 siblings, 0 replies; 56+ messages in thread
From: Medvedkin, Vladimir @ 2020-05-08 20:14 UTC (permalink / raw)
To: Ananyev, Konstantin, dev; +Cc: Wang, Yipeng1, Gobriel, Sameh, Richardson, Bruce
Hi Konstantin,
Thanks for review,
On 23/04/2020 14:31, Ananyev, Konstantin wrote:
> Hi Vladimir,
>
> Apologies for late review.
> My comments below.
>
>> K32V64 hash is a hash table that supports 32 bit keys and 64 bit values.
>> This table is hash function agnostic so user must provide
>> precalculated hash signature for add/delete/lookup operations.
>>
>> Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
>> ---
>>
>> --- /dev/null
>> +++ b/lib/librte_hash/rte_k32v64_hash.c
>> @@ -0,0 +1,315 @@
>> +/* SPDX-License-Identifier: BSD-3-Clause
>> + * Copyright(c) 2020 Intel Corporation
>> + */
>> +
>> +#include <string.h>
>> +
>> +#include <rte_eal_memconfig.h>
>> +#include <rte_errno.h>
>> +#include <rte_malloc.h>
>> +#include <rte_memory.h>
>> +#include <rte_tailq.h>
>> +
>> +#include <rte_k32v64_hash.h>
>> +
>> +TAILQ_HEAD(rte_k32v64_hash_list, rte_tailq_entry);
>> +
>> +static struct rte_tailq_elem rte_k32v64_hash_tailq = {
>> +.name = "RTE_K32V64_HASH",
>> +};
>> +
>> +EAL_REGISTER_TAILQ(rte_k32v64_hash_tailq);
>> +
>> +#define VALID_KEY_MSK ((1 << RTE_K32V64_KEYS_PER_BUCKET) - 1)
>> +
>> +#ifdef CC_AVX512VL_SUPPORT
>> +int
>> +k32v64_hash_bulk_lookup_avx512vl(struct rte_k32v64_hash_table *table,
>> +uint32_t *keys, uint32_t *hashes, uint64_t *values, unsigned int n);
>> +#endif
>> +
>> +static int
>> +k32v64_hash_bulk_lookup(struct rte_k32v64_hash_table *table, uint32_t *keys,
>> +uint32_t *hashes, uint64_t *values, unsigned int n)
>> +{
>> +int ret, cnt = 0;
>> +unsigned int i;
>> +
>> +if (unlikely((table == NULL) || (keys == NULL) || (hashes == NULL) ||
>> +(values == NULL)))
>> +return -EINVAL;
>> +
>> +for (i = 0; i < n; i++) {
>> +ret = rte_k32v64_hash_lookup(table, keys[i], hashes[i],
>> +&values[i]);
>> +if (ret == 0)
>> +cnt++;
>> +}
>> +return cnt;
>> +}
>> +
>> +static rte_k32v64_hash_bulk_lookup_t
>> +get_lookup_bulk_fn(void)
>> +{
>> +#ifdef CC_AVX512VL_SUPPORT
>> +if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
>> +return k32v64_hash_bulk_lookup_avx512vl;
>> +#endif
>> +return k32v64_hash_bulk_lookup;
>> +}
>> +
>> +int
>> +rte_k32v64_hash_add(struct rte_k32v64_hash_table *table, uint32_t key,
>> +uint32_t hash, uint64_t value)
>> +{
>> +uint32_t bucket;
>> +int i, idx, ret;
>> +uint8_t msk;
>> +struct rte_k32v64_ext_ent *tmp, *ent, *prev = NULL;
>> +
>> +if (table == NULL)
>> +return -EINVAL;
>> +
> I think for add you also need to do update bucket.cnt
> at the start/end of updates (as you do for del).
Agree. We cannot guarantee an atomic update of a 64-bit value on a 32-bit
arch. But I think it is better to make the transaction as small as possible,
so I update bucket.cnt not at the start/end but right before and after the
key/value rewrite.
>
>> +bucket = hash & table->bucket_msk;
>> +/* Search key in table. Update value if exists */
>> +for (i = 0; i < RTE_K32V64_KEYS_PER_BUCKET; i++) {
>> +if ((key == table->t[bucket].key[i]) &&
>> +(table->t[bucket].key_mask & (1 << i))) {
>> +table->t[bucket].val[i] = value;
>> +return 0;
>> +}
>> +}
>> +
>> +if (!SLIST_EMPTY(&table->t[bucket].head)) {
>> +SLIST_FOREACH(ent, &table->t[bucket].head, next) {
>> +if (ent->key == key) {
>> +ent->val = value;
>> +return 0;
>> +}
>> +}
>> +}
>> +
>> +msk = ~table->t[bucket].key_mask & VALID_KEY_MSK;
>> +if (msk) {
>> +idx = __builtin_ctz(msk);
>> +table->t[bucket].key[idx] = key;
>> +table->t[bucket].val[idx] = value;
>> +rte_smp_wmb();
>> +table->t[bucket].key_mask |= 1 << idx;
>> +table->nb_ent++;
>> +return 0;
>> +}
>> +
>> +ret = rte_mempool_get(table->ext_ent_pool, (void **)&ent);
>> +if (ret < 0)
>> +return ret;
>> +
>> +SLIST_NEXT(ent, next) = NULL;
>> +ent->key = key;
>> +ent->val = value;
>> +rte_smp_wmb();
>> +SLIST_FOREACH(tmp, &table->t[bucket].head, next)
>> +prev = tmp;
>> +
>> +if (prev == NULL)
>> +SLIST_INSERT_HEAD(&table->t[bucket].head, ent, next);
>> +else
>> +SLIST_INSERT_AFTER(prev, ent, next);
>> +
>> +table->nb_ent++;
>> +table->nb_ext_ent++;
>> +return 0;
>> +}
>> +
>> +int
>> +rte_k32v64_hash_delete(struct rte_k32v64_hash_table *table, uint32_t key,
>> +uint32_t hash)
>> +{
>> +uint32_t bucket;
>> +int i;
>> +struct rte_k32v64_ext_ent *ent;
>> +
>> +if (table == NULL)
>> +return -EINVAL;
>> +
>> +bucket = hash & table->bucket_msk;
>> +
>> +for (i = 0; i < RTE_K32V64_KEYS_PER_BUCKET; i++) {
>> +if ((key == table->t[bucket].key[i]) &&
>> +(table->t[bucket].key_mask & (1 << i))) {
>> +ent = SLIST_FIRST(&table->t[bucket].head);
>> +if (ent) {
>> +rte_atomic32_inc(&table->t[bucket].cnt);
> I know that right now rte_atomic32 uses _sync gcc builtins underneath,
> so it should be safe.
> But I think the proper way would be:
> table->t[bucket].cnt++;
> rte_smp_wmb();
> or as alternative probably use C11 atomic ACQUIRE/RELEASE
Agree.
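E.g. with C11 builtins it could look like this (untested sketch, assuming
bucket.cnt becomes a plain uint32_t):

	uint32_t cnt = table->t[bucket].cnt;

	/* odd cnt marks the write in progress */
	__atomic_store_n(&table->t[bucket].cnt, cnt + 1, __ATOMIC_RELAXED);
	__atomic_thread_fence(__ATOMIC_RELEASE);

	table->t[bucket].key[i] = ent->key;
	table->t[bucket].val[i] = ent->val;

	/* publish: a reader seeing an even cnt sees a consistent key/val */
	__atomic_store_n(&table->t[bucket].cnt, cnt + 2, __ATOMIC_RELEASE);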
>
>> +table->t[bucket].key[i] = ent->key;
>> +table->t[bucket].val[i] = ent->val;
>> +SLIST_REMOVE_HEAD(&table->t[bucket].head, next);
>> +rte_atomic32_inc(&table->t[bucket].cnt);
>> +table->nb_ext_ent--;
>> +} else
>> +table->t[bucket].key_mask &= ~(1 << i);
> I think you protect that update with bucket.cnt.
> From my perspective -a s a rule of thumb any update to the bucket/list
> Should be within that transaction-start/transaction-end.
I think it is possible to update key_mask with C11 atomics.
>> +if (ent)
>> +rte_mempool_put(table->ext_ent_pool, ent);
>> +table->nb_ent--;
>> +return 0;
>> +}
>> +}
>> +
>> +SLIST_FOREACH(ent, &table->t[bucket].head, next)
>> +if (ent->key == key)
>> +break;
>> +
>> +if (ent == NULL)
>> +return -ENOENT;
>> +
>> +rte_atomic32_inc(&table->t[bucket].cnt);
>> +SLIST_REMOVE(&table->t[bucket].head, ent, rte_k32v64_ext_ent, next);
>> +rte_atomic32_inc(&table->t[bucket].cnt);
>> +rte_mempool_put(table->ext_ent_pool, ent);
>> +
>> +table->nb_ext_ent--;
>> +table->nb_ent--;
>> +
>> +return 0;
>> +}
>> +
>> +struct rte_k32v64_hash_table *
>> +rte_k32v64_hash_find_existing(const char *name)
>> +{
>> +struct rte_k32v64_hash_table *h = NULL;
>> +struct rte_tailq_entry *te;
>> +struct rte_k32v64_hash_list *k32v64_hash_list;
>> +
>> +k32v64_hash_list = RTE_TAILQ_CAST(rte_k32v64_hash_tailq.head,
>> +rte_k32v64_hash_list);
>> +
>> +rte_mcfg_tailq_read_lock();
>> +TAILQ_FOREACH(te, k32v64_hash_list, next) {
>> +h = (struct rte_k32v64_hash_table *) te->data;
>> +if (strncmp(name, h->name, RTE_K32V64_HASH_NAMESIZE) == 0)
>> +break;
>> +}
>> +rte_mcfg_tailq_read_unlock();
>> +if (te == NULL) {
>> +rte_errno = ENOENT;
>> +return NULL;
>> +}
>> +return h;
>> +}
>> +
>> +struct rte_k32v64_hash_table *
>> +rte_k32v64_hash_create(const struct rte_k32v64_hash_params *params)
>> +{
>> +char hash_name[RTE_K32V64_HASH_NAMESIZE];
>> +struct rte_k32v64_hash_table *ht = NULL;
>> +struct rte_tailq_entry *te;
>> +struct rte_k32v64_hash_list *k32v64_hash_list;
>> +uint32_t mem_size, nb_buckets, max_ent;
>> +int ret;
>> +struct rte_mempool *mp;
>> +
>> +if ((params == NULL) || (params->name == NULL) ||
>> +(params->entries == 0)) {
>> +rte_errno = EINVAL;
>> +return NULL;
>> +}
>> +
>> +k32v64_hash_list = RTE_TAILQ_CAST(rte_k32v64_hash_tailq.head,
>> +rte_k32v64_hash_list);
>> +
>> +ret = snprintf(hash_name, sizeof(hash_name), "K32V64_%s", params->name);
>> +if (ret < 0 || ret >= RTE_K32V64_HASH_NAMESIZE) {
>> +rte_errno = ENAMETOOLONG;
>> +return NULL;
>> +}
>> +
>> +max_ent = rte_align32pow2(params->entries);
>> +nb_buckets = max_ent / RTE_K32V64_KEYS_PER_BUCKET;
>> +mem_size = sizeof(struct rte_k32v64_hash_table) +
>> +sizeof(struct rte_k32v64_hash_bucket) * nb_buckets;
>> +
>> +mp = rte_mempool_create(hash_name, max_ent,
>> +sizeof(struct rte_k32v64_ext_ent), 0, 0, NULL, NULL, NULL, NULL,
>> +params->socket_id, 0);
>> +
>> +if (mp == NULL)
>> +return NULL;
>> +
>> +rte_mcfg_tailq_write_lock();
>> +TAILQ_FOREACH(te, k32v64_hash_list, next) {
>> +ht = (struct rte_k32v64_hash_table *) te->data;
>> +if (strncmp(params->name, ht->name,
>> +RTE_K32V64_HASH_NAMESIZE) == 0)
>> +break;
>> +}
>> +ht = NULL;
>> +if (te != NULL) {
>> +rte_errno = EEXIST;
>> +rte_mempool_free(mp);
>> +goto exit;
>> +}
>> +
>> +te = rte_zmalloc("K32V64_HASH_TAILQ_ENTRY", sizeof(*te), 0);
>> +if (te == NULL) {
>> +RTE_LOG(ERR, HASH, "Failed to allocate tailq entry\n");
>> +rte_mempool_free(mp);
>> +goto exit;
>> +}
>> +
>> +ht = rte_zmalloc_socket(hash_name, mem_size,
>> +RTE_CACHE_LINE_SIZE, params->socket_id);
>> +if (ht == NULL) {
>> +RTE_LOG(ERR, HASH, "Failed to allocate fbk hash table\n");
>> +rte_free(te);
>> +rte_mempool_free(mp);
>> +goto exit;
>> +}
>> +
>> +memcpy(ht->name, hash_name, sizeof(ht->name));
>> +ht->max_ent = max_ent;
>> +ht->bucket_msk = nb_buckets - 1;
>> +ht->ext_ent_pool = mp;
>> +ht->lookup = get_lookup_bulk_fn();
>> +
>> +te->data = (void *)ht;
>> +TAILQ_INSERT_TAIL(k32v64_hash_list, te, next);
>> +
>> +exit:
>> +rte_mcfg_tailq_write_unlock();
>> +
>> +return ht;
>> +}
>> +
>> +void
>> +rte_k32v64_hash_free(struct rte_k32v64_hash_table *ht)
>> +{
>> +struct rte_tailq_entry *te;
>> +struct rte_k32v64_hash_list *k32v64_hash_list;
>> +
>> +if (ht == NULL)
>> +return;
>> +
>> +k32v64_hash_list = RTE_TAILQ_CAST(rte_k32v64_hash_tailq.head,
>> +rte_k32v64_hash_list);
>> +
>> +rte_mcfg_tailq_write_lock();
>> +
>> +/* find out tailq entry */
>> +TAILQ_FOREACH(te, k32v64_hash_list, next) {
>> +if (te->data == (void *) ht)
>> +break;
>> +}
>> +
>> +
>> +if (te == NULL) {
>> +rte_mcfg_tailq_write_unlock();
>> +return;
>> +}
>> +
>> +TAILQ_REMOVE(k32v64_hash_list, te, next);
>> +
>> +rte_mcfg_tailq_write_unlock();
>> +
>> +rte_mempool_free(ht->ext_ent_pool);
>> +rte_free(ht);
>> +rte_free(te);
>> +}
>> diff --git a/lib/librte_hash/rte_k32v64_hash.h b/lib/librte_hash/rte_k32v64_hash.h
>> new file mode 100644
>> index 0000000..b2c52e9
>> --- /dev/null
>> +++ b/lib/librte_hash/rte_k32v64_hash.h
>> @@ -0,0 +1,211 @@
>> +/* SPDX-License-Identifier: BSD-3-Clause
>> + * Copyright(c) 2020 Intel Corporation
>> + */
>> +
>> +#ifndef _RTE_K32V64_HASH_H_
>> +#define _RTE_K32V64_HASH_H_
>> +
>> +#ifdef __cplusplus
>> +extern "C" {
>> +#endif
>> +
>> +#include <rte_compat.h>
>> +#include <rte_atomic.h>
>> +#include <rte_mempool.h>
>> +
>> +#define RTE_K32V64_HASH_NAMESIZE32
>> +#define RTE_K32V64_KEYS_PER_BUCKET4
>> +#define RTE_K32V64_WRITE_IN_PROGRESS1
>> +
>> +struct rte_k32v64_hash_params {
>> +const char *name;
>> +uint32_t entries;
>> +int socket_id;
>> +};
>> +
>> +struct rte_k32v64_ext_ent {
>> +SLIST_ENTRY(rte_k32v64_ext_ent) next;
>> +uint32_tkey;
>> +uint64_tval;
>> +};
>> +
>> +struct rte_k32v64_hash_bucket {
>> +uint32_tkey[RTE_K32V64_KEYS_PER_BUCKET];
>> +uint64_tval[RTE_K32V64_KEYS_PER_BUCKET];
>> +uint8_tkey_mask;
>> +rte_atomic32_tcnt;
>> +SLIST_HEAD(rte_k32v64_list_head, rte_k32v64_ext_ent) head;
>> +} __rte_cache_aligned;
>> +
>> +struct rte_k32v64_hash_table;
>> +
>> +typedef int (*rte_k32v64_hash_bulk_lookup_t)
>> +(struct rte_k32v64_hash_table *table, uint32_t *keys, uint32_t *hashes,
>> +uint64_t *values, unsigned int n);
>> +
>> +struct rte_k32v64_hash_table {
>> +char name[RTE_K32V64_HASH_NAMESIZE];/**< Name of the hash. */
>> +uint32_tnb_ent;/**< Number of entities in the table*/
>> +uint32_tnb_ext_ent;/**< Number of extended entities */
>> +uint32_tmax_ent;/**< Maximum number of entities */
>> +uint32_tbucket_msk;
>> +struct rte_mempool*ext_ent_pool;
>> +rte_k32v64_hash_bulk_lookup_tlookup;
>> +__extension__ struct rte_k32v64_hash_buckett[0];
>> +};
>> +
>> +typedef int (*rte_k32v64_cmp_fn_t)
>> +(struct rte_k32v64_hash_bucket *bucket, uint32_t key, uint64_t *val);
>> +
>> +static inline int
>> +__k32v64_cmp_keys(struct rte_k32v64_hash_bucket *bucket, uint32_t key,
>> +uint64_t *val)
>> +{
>> +int i;
>> +
>> +for (i = 0; i < RTE_K32V64_KEYS_PER_BUCKET; i++) {
>> +if ((key == bucket->key[i]) &&
>> +(bucket->key_mask & (1 << i))) {
>> +*val = bucket->val[i];
>> +return 1;
>> +}
>> +}
>> +
>> +return 0;
>> +}
>> +
>> +static inline int
>> +__k32v64_hash_lookup(struct rte_k32v64_hash_table *table, uint32_t key,
>> +uint32_t hash, uint64_t *value, rte_k32v64_cmp_fn_t cmp_f)
>> +{
>> +uint64_tval = 0;
>> +struct rte_k32v64_ext_ent *ent;
>> +int32_tcnt;
>> +int found = 0;
>> +uint32_t bucket = hash & table->bucket_msk;
>> +
>> +do {
>> +do
>> +cnt = rte_atomic32_read(&table->t[bucket].cnt);
>> +while (unlikely(cnt & RTE_K32V64_WRITE_IN_PROGRESS));
>> +
>> +found = cmp_f(&table->t[bucket], key, &val);
>> +if (unlikely((found == 0) &&
>> +(!SLIST_EMPTY(&table->t[bucket].head)))) {
>> +SLIST_FOREACH(ent, &table->t[bucket].head, next) {
>> +if (ent->key == key) {
>> +val = ent->val;
>> +found = 1;
>> +break;
>> +}
>> +}
>> +}
>> +
>> +} while (unlikely(cnt != rte_atomic32_read(&table->t[bucket].cnt)));
> AFAIK atomic32_read is just a normal read op, so it can be reordered with other ops.
> So this construction doesn't protect you from races.
> What you probably need here:
>
> do {
> cnt1 = table->t[bucket].cnt;
> rte_smp_rmb();
> ....
> rte_smp_rmb();
> cnt2 = table->t[bucket].cnt;
> while (cnt1 != cnt2 || (cnt1 & RTE_K32V64_WRITE_IN_PROGRESS) != 0)
Agree, these reads could be reordered. I'll replace them with C11 atomics.
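Something like this on the reader side (untested sketch):

	uint32_t cnt1, cnt2;

	do {
		cnt1 = __atomic_load_n(&table->t[bucket].cnt,
			__ATOMIC_ACQUIRE);
		if (unlikely(cnt1 & RTE_K32V64_WRITE_IN_PROGRESS))
			continue;
		found = cmp_f(&table->t[bucket], key, &val);
		/* walk the extended list here, as before */

		/* data reads must complete before cnt is re-read */
		__atomic_thread_fence(__ATOMIC_ACQUIRE);
		cnt2 = __atomic_load_n(&table->t[bucket].cnt,
			__ATOMIC_RELAXED);
	} while (unlikely(cnt1 != cnt2));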
>
>> +
>> +if (found == 1) {
>> +*value = val;
>> +return 0;
>> +} else
>> +return -ENOENT;
>> +}
>> +
--
Regards,
Vladimir
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [dpdk-dev] [PATCH v3 1/4] hash: add k32v64 hash library
2020-04-29 21:29 ` Honnappa Nagarahalli
@ 2020-05-08 20:38 ` Medvedkin, Vladimir
0 siblings, 0 replies; 56+ messages in thread
From: Medvedkin, Vladimir @ 2020-05-08 20:38 UTC (permalink / raw)
To: Honnappa Nagarahalli, dev
Cc: konstantin.ananyev, yipeng1.wang, sameh.gobriel, bruce.richardson, nd
Hi Honnappa,
Thanks for the comments.
On 29/04/2020 22:29, Honnappa Nagarahalli wrote:
> Hi Vladimir,
> I am not sure which way the decision is made, but few comments inline. Please use C11 built-ins for synchronization.
>
>> -----Original Message-----
>> From: dev<dev-bounces@dpdk.org> On Behalf Of Vladimir Medvedkin
>> Sent: Wednesday, April 15, 2020 1:17 PM
>> To:dev@dpdk.org
>> Cc:konstantin.ananyev@intel.com;yipeng1.wang@intel.com;
>> sameh.gobriel@intel.com;bruce.richardson@intel.com
>> Subject: [dpdk-dev] [PATCH v3 1/4] hash: add k32v64 hash library
>>
>> K32V64 hash is a hash table that supports 32 bit keys and 64 bit values.
>> This table is hash function agnostic so user must provide precalculated hash
>> signature for add/delete/lookup operations.
>>
>> Signed-off-by: Vladimir Medvedkin<vladimir.medvedkin@intel.com>
>> ---
>> lib/Makefile | 2 +-
>> lib/librte_hash/Makefile | 13 +-
>> lib/librte_hash/k32v64_hash_avx512vl.c | 56 ++++++
>> lib/librte_hash/meson.build | 17 +-
>> lib/librte_hash/rte_hash_version.map | 6 +-
>> lib/librte_hash/rte_k32v64_hash.c | 315
>> +++++++++++++++++++++++++++++++++
>> lib/librte_hash/rte_k32v64_hash.h | 211 ++++++++++++++++++++++
>> 7 files changed, 615 insertions(+), 5 deletions(-) create mode 100644
>> lib/librte_hash/k32v64_hash_avx512vl.c
>> create mode 100644 lib/librte_hash/rte_k32v64_hash.c create mode
>> 100644 lib/librte_hash/rte_k32v64_hash.h
>>
>> diff --git a/lib/Makefile b/lib/Makefile index 46b91ae..a8c02e4 100644
>> --- a/lib/Makefile
>> +++ b/lib/Makefile
>> @@ -48,7 +48,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
>> DEPDIRS-librte_vhost := librte_eal librte_mempool librte_mbuf librte_ethdev
>> \
>> librte_net librte_hash librte_cryptodev
>> DIRS-$(CONFIG_RTE_LIBRTE_HASH) += librte_hash -DEPDIRS-librte_hash :=
>> librte_eal librte_ring
>> +DEPDIRS-librte_hash := librte_eal librte_ring librte_mempool
>> DIRS-$(CONFIG_RTE_LIBRTE_EFD) += librte_efd DEPDIRS-librte_efd :=
>> librte_eal librte_ring librte_hash
>> DIRS-$(CONFIG_RTE_LIBRTE_RIB) += librte_rib diff --git
>> a/lib/librte_hash/Makefile b/lib/librte_hash/Makefile index
>> ec9f864..023689d 100644
>> --- a/lib/librte_hash/Makefile
>> +++ b/lib/librte_hash/Makefile
>> @@ -8,13 +8,14 @@ LIB = librte_hash.a
>>
>> CFLAGS += -O3
>> CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
>> -LDLIBS += -lrte_eal -lrte_ring
>> +LDLIBS += -lrte_eal -lrte_ring -lrte_mempool
>>
>> EXPORT_MAP := rte_hash_version.map
>>
>> # all source are stored in SRCS-y
>> SRCS-$(CONFIG_RTE_LIBRTE_HASH) := rte_cuckoo_hash.c
>> SRCS-$(CONFIG_RTE_LIBRTE_HASH) += rte_fbk_hash.c
>> +SRCS-$(CONFIG_RTE_LIBRTE_HASH) += rte_k32v64_hash.c
>>
>> # install this header file
>> SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include := rte_hash.h @@ -27,5
>> +28,15 @@ endif SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include +=
>> rte_jhash.h SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_thash.h
>> SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_fbk_hash.h
>> +SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_k32v64_hash.h
>> +
>> +CC_AVX512VL_SUPPORT=$(shell $(CC) -mavx512vl -dM -E - </dev/null 2>&1 | \
>> +grep -q __AVX512VL__ && echo 1)
>> +
>> +ifeq ($(CC_AVX512VL_SUPPORT), 1)
>> + SRCS-$(CONFIG_RTE_LIBRTE_HASH) += k32v64_hash_avx512vl.c
>> + CFLAGS_k32v64_hash_avx512vl.o += -mavx512vl
>> + CFLAGS_rte_k32v64_hash.o += -DCC_AVX512VL_SUPPORT endif
>>
>> include $(RTE_SDK)/mk/rte.lib.mk
>> diff --git a/lib/librte_hash/k32v64_hash_avx512vl.c
>> b/lib/librte_hash/k32v64_hash_avx512vl.c
>> new file mode 100644
>> index 0000000..7c70dd2
>> --- /dev/null
>> +++ b/lib/librte_hash/k32v64_hash_avx512vl.c
>> @@ -0,0 +1,56 @@
>> +/* SPDX-License-Identifier: BSD-3-Clause
>> + * Copyright(c) 2020 Intel Corporation
>> + */
>> +
>> +#include <rte_k32v64_hash.h>
>> +
>> +int
>> +k32v64_hash_bulk_lookup_avx512vl(struct rte_k32v64_hash_table *table,
>> + uint32_t *keys, uint32_t *hashes, uint64_t *values, unsigned int n);
>> +
>> +static inline int
>> +k32v64_cmp_keys_avx512vl(struct rte_k32v64_hash_bucket *bucket,
>> uint32_t key,
>> + uint64_t *val)
>> +{
>> + __m128i keys, srch_key;
>> + __mmask8 msk;
>> +
>> + keys = _mm_load_si128((void *)bucket);
>> + srch_key = _mm_set1_epi32(key);
>> +
>> + msk = _mm_mask_cmpeq_epi32_mask(bucket->key_mask, keys,
>> srch_key);
>> + if (msk) {
>> + *val = bucket->val[__builtin_ctz(msk)];
>> + return 1;
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +static inline int
>> +k32v64_hash_lookup_avx512vl(struct rte_k32v64_hash_table *table,
>> uint32_t key,
>> + uint32_t hash, uint64_t *value)
>> +{
>> + return __k32v64_hash_lookup(table, key, hash, value,
>> + k32v64_cmp_keys_avx512vl);
>> +}
>> +
>> +int
>> +k32v64_hash_bulk_lookup_avx512vl(struct rte_k32v64_hash_table *table,
>> + uint32_t *keys, uint32_t *hashes, uint64_t *values, unsigned int n) {
>> + int ret, cnt = 0;
>> + unsigned int i;
>> +
>> + if (unlikely((table == NULL) || (keys == NULL) || (hashes == NULL) ||
>> + (values == NULL)))
>> + return -EINVAL;
>> +
>> + for (i = 0; i < n; i++) {
>> + ret = k32v64_hash_lookup_avx512vl(table, keys[i], hashes[i],
>> + &values[i]);
>> + if (ret == 0)
>> + cnt++;
>> + }
>> + return cnt;
>> +}
>> diff --git a/lib/librte_hash/meson.build b/lib/librte_hash/meson.build index
>> 6ab46ae..8a37cc4 100644
>> --- a/lib/librte_hash/meson.build
>> +++ b/lib/librte_hash/meson.build
>> @@ -3,10 +3,23 @@
>>
>> headers = files('rte_crc_arm64.h',
>> 'rte_fbk_hash.h',
>> + 'rte_k32v64_hash.h',
>> 'rte_hash_crc.h',
>> 'rte_hash.h',
>> 'rte_jhash.h',
>> 'rte_thash.h')
>>
>> -sources = files('rte_cuckoo_hash.c', 'rte_fbk_hash.c') -deps += ['ring']
>> +sources = files('rte_cuckoo_hash.c', 'rte_fbk_hash.c',
>> +'rte_k32v64_hash.c') deps += ['ring', 'mempool']
>> +
>> +if dpdk_conf.has('RTE_ARCH_X86')
>> + if cc.has_argument('-mavx512vl')
>> + avx512_tmplib = static_library('avx512_tmp',
>> + 'k32v64_hash_avx512vl.c',
>> + dependencies: static_rte_mempool,
>> + c_args: cflags + ['-mavx512vl'])
>> + objs += avx512_tmplib.extract_objects('k32v64_hash_avx512vl.c')
>> + cflags += '-DCC_AVX512VL_SUPPORT'
>> +
>> + endif
>> +endif
>> diff --git a/lib/librte_hash/rte_hash_version.map
>> b/lib/librte_hash/rte_hash_version.map
>> index a8fbbc3..9a4f2f6 100644
>> --- a/lib/librte_hash/rte_hash_version.map
>> +++ b/lib/librte_hash/rte_hash_version.map
>> @@ -34,5 +34,9 @@ EXPERIMENTAL {
>>
>> rte_hash_free_key_with_position;
>> rte_hash_max_key_id;
>> -
>> + rte_k32v64_hash_create;
>> + rte_k32v64_hash_find_existing;
>> + rte_k32v64_hash_free;
>> + rte_k32v64_hash_add;
>> + rte_k32v64_hash_delete;
>> };
>> diff --git a/lib/librte_hash/rte_k32v64_hash.c
>> b/lib/librte_hash/rte_k32v64_hash.c
>> new file mode 100644
>> index 0000000..7ed94b4
>> --- /dev/null
>> +++ b/lib/librte_hash/rte_k32v64_hash.c
>> @@ -0,0 +1,315 @@
>> +/* SPDX-License-Identifier: BSD-3-Clause
>> + * Copyright(c) 2020 Intel Corporation
>> + */
>> +
>> +#include <string.h>
>> +
>> +#include <rte_eal_memconfig.h>
>> +#include <rte_errno.h>
>> +#include <rte_malloc.h>
>> +#include <rte_memory.h>
>> +#include <rte_tailq.h>
>> +
>> +#include <rte_k32v64_hash.h>
>> +
>> +TAILQ_HEAD(rte_k32v64_hash_list, rte_tailq_entry);
>> +
>> +static struct rte_tailq_elem rte_k32v64_hash_tailq = {
>> + .name = "RTE_K32V64_HASH",
>> +};
>> +
>> +EAL_REGISTER_TAILQ(rte_k32v64_hash_tailq);
>> +
>> +#define VALID_KEY_MSK ((1 << RTE_K32V64_KEYS_PER_BUCKET) - 1)
>> +
>> +#ifdef CC_AVX512VL_SUPPORT
>> +int
>> +k32v64_hash_bulk_lookup_avx512vl(struct rte_k32v64_hash_table *table,
>> + uint32_t *keys, uint32_t *hashes, uint64_t *values, unsigned int n);
>> +#endif
>> +
>> +static int
>> +k32v64_hash_bulk_lookup(struct rte_k32v64_hash_table *table, uint32_t
>> *keys,
>> + uint32_t *hashes, uint64_t *values, unsigned int n) {
>> + int ret, cnt = 0;
>> + unsigned int i;
>> +
>> + if (unlikely((table == NULL) || (keys == NULL) || (hashes == NULL) ||
>> + (values == NULL)))
>> + return -EINVAL;
>> +
>> + for (i = 0; i < n; i++) {
>> + ret = rte_k32v64_hash_lookup(table, keys[i], hashes[i],
>> + &values[i]);
>> + if (ret == 0)
>> + cnt++;
>> + }
>> + return cnt;
>> +}
>> +
>> +static rte_k32v64_hash_bulk_lookup_t
>> +get_lookup_bulk_fn(void)
>> +{
>> +#ifdef CC_AVX512VL_SUPPORT
>> + if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
>> + return k32v64_hash_bulk_lookup_avx512vl; #endif
>> + return k32v64_hash_bulk_lookup;
>> +}
>> +
>> +int
>> +rte_k32v64_hash_add(struct rte_k32v64_hash_table *table, uint32_t key,
>> + uint32_t hash, uint64_t value)
>> +{
>> + uint32_t bucket;
>> + int i, idx, ret;
>> + uint8_t msk;
>> + struct rte_k32v64_ext_ent *tmp, *ent, *prev = NULL;
>> +
>> + if (table == NULL)
>> + return -EINVAL;
>> +
>> + bucket = hash & table->bucket_msk;
>> + /* Search key in table. Update value if exists */
>> + for (i = 0; i < RTE_K32V64_KEYS_PER_BUCKET; i++) {
>> + if ((key == table->t[bucket].key[i]) &&
>> + (table->t[bucket].key_mask & (1 << i))) {
>> + table->t[bucket].val[i] = value;
> The old value of val[i] might be getting used in the reader. It needs to be returned to the caller so that it can be put on the RCU defer queue to free later.
> You might also want to use an atomic store to ensure the stores are not split.
Agree, I will change the API.
>> + return 0;
>> + }
>> + }
>> +
>> + if (!SLIST_EMPTY(&table->t[bucket].head)) {
>> + SLIST_FOREACH(ent, &table->t[bucket].head, next) {
>> + if (ent->key == key) {
>> + ent->val = value;
> Same here, need to return the old value.
>
>> + return 0;
>> + }
>> + }
>> + }
>> +
>> + msk = ~table->t[bucket].key_mask & VALID_KEY_MSK;
>> + if (msk) {
>> + idx = __builtin_ctz(msk);
>> + table->t[bucket].key[idx] = key;
>> + table->t[bucket].val[idx] = value;
>> + rte_smp_wmb();
>> + table->t[bucket].key_mask |= 1 << idx;
> The barrier and the store can be replaced with a store-release in C11.
Agree.
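E.g. replacing the rte_smp_wmb() + store pair with (sketch):

	table->t[bucket].key[idx] = key;
	table->t[bucket].val[idx] = value;
	__atomic_store_n(&table->t[bucket].key_mask,
		table->t[bucket].key_mask | (1 << idx), __ATOMIC_RELEASE);

The release store orders the key/val writes before the new key_mask becomes
visible to readers.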
>
>> + table->nb_ent++;
>> + return 0;
>> + }
>> +
>> + ret = rte_mempool_get(table->ext_ent_pool, (void **)&ent);
>> + if (ret < 0)
>> + return ret;
>> +
>> + SLIST_NEXT(ent, next) = NULL;
>> + ent->key = key;
>> + ent->val = value;
>> + rte_smp_wmb();
>> + SLIST_FOREACH(tmp, &table->t[bucket].head, next)
>> + prev = tmp;
>> +
>> + if (prev == NULL)
>> + SLIST_INSERT_HEAD(&table->t[bucket].head, ent, next);
>> + else
>> + SLIST_INSERT_AFTER(prev, ent, next);
> Need C11 SLIST
Please clarify
>> +
>> + table->nb_ent++;
>> + table->nb_ext_ent++;
>> + return 0;
>> +}
>> +
>> +int
>> +rte_k32v64_hash_delete(struct rte_k32v64_hash_table *table, uint32_t key,
>> + uint32_t hash)
> This should return the value corresponding to the deleted key
Agree.
>
>> +{
>> + uint32_t bucket;
>> + int i;
>> + struct rte_k32v64_ext_ent *ent;
>> +
>> + if (table == NULL)
>> + return -EINVAL;
>> +
>> + bucket = hash & table->bucket_msk;
>> +
>> + for (i = 0; i < RTE_K32V64_KEYS_PER_BUCKET; i++) {
>> + if ((key == table->t[bucket].key[i]) &&
>> + (table->t[bucket].key_mask & (1 << i))) {
>> + ent = SLIST_FIRST(&table->t[bucket].head);
>> + if (ent) {
>> + rte_atomic32_inc(&table->t[bucket].cnt);
>> + table->t[bucket].key[i] = ent->key;
> In this case, both key and value are changing and it is not atomic. There is a possibility that the lookup function will receive an incorrect value of 'val[i]'. Suggest following the method described below.
>
>> + table->t[bucket].val[i] = ent->val;
>> + SLIST_REMOVE_HEAD(&table->t[bucket].head,
>> next);
>> + rte_atomic32_inc(&table->t[bucket].cnt);
>> + table->nb_ext_ent--;
>> + } else
> Suggest changing this into a 2 step process:
Thanks, it is a good idea, but
> 1) Delete the entry from the fixed bucket (update key_mask)
> 2) Move the head of extended bucket to the fixed bucket
> 2a) Insert the key/value from the head into the deleted index and update the key_mask (this will ensure that the reader has the entry available while it is being removed from the extended bucket)
With the reader logic you suggested for __k32v64_cmp_keys() there could be
a possible race:

reader                                   writer
------                                   ------
read bucket.cnt
load_acquire -> key_mask
if (key == key[i]) <- true
----interrupted----
                                         1) delete the entry from the fixed
                                            bucket (update key_mask)
                                         2a) insert the head key/value into
                                             the deleted index
----wakes up----
reads the updated (wrong) value
cmp bucket.cnt
returns success, but with an invalid value

Please correct me if I'm wrong.
> 2b) increment the counter indicating the extended bucket is changing
> 2c) remove the head
> 2d) increment the counter indicating the extended bucket change is done
>
> The above procedure will result in removing the spinning (resulting from step 2b) in the lookup function (comment is added in the lookup function), readers will not be blocked if the writer is scheduled out.
As far as I can see we must somehow indicate a write_in_progress state for
a bucket, at least because we cannot guarantee an atomic write of a uint64_t
on a 32-bit arch.
> This logic is implemented in rte_hash using cuckoo hash, suggest to take a look for required memory orderings.
>
>> + table->t[bucket].key_mask &= ~(1 << i);
>> + if (ent)
>> + rte_mempool_put(table->ext_ent_pool, ent);
> Entry cannot be put to free list immediately as the readers are still using it.
With the current transaction logic, if a reader is using ent while the writer
puts it back in the pool, I don't see any problem, because the reader will
detect the bucket.cnt update and rerun the search.
>
>> + table->nb_ent--;
>> + return 0;
>> + }
>> + }
>> +
>> + SLIST_FOREACH(ent, &table->t[bucket].head, next)
>> + if (ent->key == key)
>> + break;
>> +
>> + if (ent == NULL)
>> + return -ENOENT;
>> +
>> + rte_atomic32_inc(&table->t[bucket].cnt);
>> + SLIST_REMOVE(&table->t[bucket].head, ent, rte_k32v64_ext_ent,
>> next);
>> + rte_atomic32_inc(&table->t[bucket].cnt);
>> + rte_mempool_put(table->ext_ent_pool, ent);
> Entry might be getting used by the readers still.
>
>> +
>> + table->nb_ext_ent--;
>> + table->nb_ent--;
>> +
>> + return 0;
>> +}
>> +
>> +struct rte_k32v64_hash_table *
>> +rte_k32v64_hash_find_existing(const char *name) {
>> + struct rte_k32v64_hash_table *h = NULL;
>> + struct rte_tailq_entry *te;
>> + struct rte_k32v64_hash_list *k32v64_hash_list;
>> +
>> + k32v64_hash_list = RTE_TAILQ_CAST(rte_k32v64_hash_tailq.head,
>> + rte_k32v64_hash_list);
>> +
>> + rte_mcfg_tailq_read_lock();
>> + TAILQ_FOREACH(te, k32v64_hash_list, next) {
>> + h = (struct rte_k32v64_hash_table *) te->data;
>> + if (strncmp(name, h->name, RTE_K32V64_HASH_NAMESIZE)
>> == 0)
>> + break;
>> + }
>> + rte_mcfg_tailq_read_unlock();
>> + if (te == NULL) {
>> + rte_errno = ENOENT;
>> + return NULL;
>> + }
>> + return h;
>> +}
>> +
>> +struct rte_k32v64_hash_table *
>> +rte_k32v64_hash_create(const struct rte_k32v64_hash_params *params) {
>> + char hash_name[RTE_K32V64_HASH_NAMESIZE];
>> + struct rte_k32v64_hash_table *ht = NULL;
>> + struct rte_tailq_entry *te;
>> + struct rte_k32v64_hash_list *k32v64_hash_list;
>> + uint32_t mem_size, nb_buckets, max_ent;
>> + int ret;
>> + struct rte_mempool *mp;
>> +
>> + if ((params == NULL) || (params->name == NULL) ||
>> + (params->entries == 0)) {
>> + rte_errno = EINVAL;
>> + return NULL;
>> + }
>> +
>> + k32v64_hash_list = RTE_TAILQ_CAST(rte_k32v64_hash_tailq.head,
>> + rte_k32v64_hash_list);
>> +
>> + ret = snprintf(hash_name, sizeof(hash_name), "K32V64_%s", params->name);
>> + if (ret < 0 || ret >= RTE_K32V64_HASH_NAMESIZE) {
>> + rte_errno = ENAMETOOLONG;
>> + return NULL;
>> + }
>> +
>> + max_ent = rte_align32pow2(params->entries);
>> + nb_buckets = max_ent / RTE_K32V64_KEYS_PER_BUCKET;
>> + mem_size = sizeof(struct rte_k32v64_hash_table) +
>> + sizeof(struct rte_k32v64_hash_bucket) * nb_buckets;
>> +
>> + mp = rte_mempool_create(hash_name, max_ent,
>> + sizeof(struct rte_k32v64_ext_ent), 0, 0, NULL, NULL, NULL,
>> NULL,
>> + params->socket_id, 0);
>> +
>> + if (mp == NULL)
>> + return NULL;
>> +
>> + rte_mcfg_tailq_write_lock();
>> + TAILQ_FOREACH(te, k32v64_hash_list, next) {
>> + ht = (struct rte_k32v64_hash_table *) te->data;
>> + if (strncmp(params->name, ht->name,
>> + RTE_K32V64_HASH_NAMESIZE) == 0)
>> + break;
>> + }
>> + ht = NULL;
>> + if (te != NULL) {
>> + rte_errno = EEXIST;
>> + rte_mempool_free(mp);
>> + goto exit;
>> + }
>> +
>> + te = rte_zmalloc("K32V64_HASH_TAILQ_ENTRY", sizeof(*te), 0);
>> + if (te == NULL) {
>> + RTE_LOG(ERR, HASH, "Failed to allocate tailq entry\n");
>> + rte_mempool_free(mp);
>> + goto exit;
>> + }
>> +
>> + ht = rte_zmalloc_socket(hash_name, mem_size,
>> + RTE_CACHE_LINE_SIZE, params->socket_id);
>> + if (ht == NULL) {
>> + RTE_LOG(ERR, HASH, "Failed to allocate fbk hash table\n");
>> + rte_free(te);
>> + rte_mempool_free(mp);
>> + goto exit;
>> + }
>> +
>> + memcpy(ht->name, hash_name, sizeof(ht->name));
>> + ht->max_ent = max_ent;
>> + ht->bucket_msk = nb_buckets - 1;
>> + ht->ext_ent_pool = mp;
>> + ht->lookup = get_lookup_bulk_fn();
>> +
>> + te->data = (void *)ht;
>> + TAILQ_INSERT_TAIL(k32v64_hash_list, te, next);
>> +
>> +exit:
>> + rte_mcfg_tailq_write_unlock();
>> +
>> + return ht;
>> +}
>> +
>> +void
>> +rte_k32v64_hash_free(struct rte_k32v64_hash_table *ht) {
>> + struct rte_tailq_entry *te;
>> + struct rte_k32v64_hash_list *k32v64_hash_list;
>> +
>> + if (ht == NULL)
>> + return;
>> +
>> + k32v64_hash_list = RTE_TAILQ_CAST(rte_k32v64_hash_tailq.head,
>> + rte_k32v64_hash_list);
>> +
>> + rte_mcfg_tailq_write_lock();
>> +
>> + /* find out tailq entry */
>> + TAILQ_FOREACH(te, k32v64_hash_list, next) {
>> + if (te->data == (void *) ht)
>> + break;
>> + }
>> +
>> +
>> + if (te == NULL) {
>> + rte_mcfg_tailq_write_unlock();
>> + return;
>> + }
>> +
>> + TAILQ_REMOVE(k32v64_hash_list, te, next);
>> +
>> + rte_mcfg_tailq_write_unlock();
>> +
>> + rte_mempool_free(ht->ext_ent_pool);
>> + rte_free(ht);
>> + rte_free(te);
>> +}
>> diff --git a/lib/librte_hash/rte_k32v64_hash.h
>> b/lib/librte_hash/rte_k32v64_hash.h
>> new file mode 100644
>> index 0000000..b2c52e9
>> --- /dev/null
>> +++ b/lib/librte_hash/rte_k32v64_hash.h
>> @@ -0,0 +1,211 @@
>> +/* SPDX-License-Identifier: BSD-3-Clause
>> + * Copyright(c) 2020 Intel Corporation
>> + */
>> +
>> +#ifndef _RTE_K32V64_HASH_H_
>> +#define _RTE_K32V64_HASH_H_
>> +
>> +#ifdef __cplusplus
>> +extern "C" {
>> +#endif
>> +
>> +#include <rte_compat.h>
>> +#include <rte_atomic.h>
>> +#include <rte_mempool.h>
>> +
>> +#define RTE_K32V64_HASH_NAMESIZE 32
>> +#define RTE_K32V64_KEYS_PER_BUCKET 4
>> +#define RTE_K32V64_WRITE_IN_PROGRESS 1
>> +
>> +struct rte_k32v64_hash_params {
>> + const char *name;
>> + uint32_t entries;
>> + int socket_id;
>> +};
>> +
>> +struct rte_k32v64_ext_ent {
>> + SLIST_ENTRY(rte_k32v64_ext_ent) next;
>> + uint32_t key;
>> + uint64_t val;
>> +};
>> +
>> +struct rte_k32v64_hash_bucket {
>> + uint32_t key[RTE_K32V64_KEYS_PER_BUCKET];
>> + uint64_t val[RTE_K32V64_KEYS_PER_BUCKET];
>> + uint8_t key_mask;
>> + rte_atomic32_t cnt;
>> + SLIST_HEAD(rte_k32v64_list_head, rte_k32v64_ext_ent) head; }
>> +__rte_cache_aligned;
>> +
>> +struct rte_k32v64_hash_table;
>> +
>> +typedef int (*rte_k32v64_hash_bulk_lookup_t) (struct
>> +rte_k32v64_hash_table *table, uint32_t *keys, uint32_t *hashes,
>> + uint64_t *values, unsigned int n);
>> +
>> +struct rte_k32v64_hash_table {
>> + char name[RTE_K32V64_HASH_NAMESIZE]; /**< Name of the
>> hash. */
>> + uint32_t nb_ent; /**< Number of entities in the table*/
>> + uint32_t nb_ext_ent; /**< Number of extended entities */
>> + uint32_t max_ent; /**< Maximum number of entities */
>> + uint32_t bucket_msk;
>> + struct rte_mempool *ext_ent_pool;
>> + rte_k32v64_hash_bulk_lookup_t lookup;
>> + __extension__ struct rte_k32v64_hash_bucket t[0];
>> +};
>> +
>> +typedef int (*rte_k32v64_cmp_fn_t)
>> +(struct rte_k32v64_hash_bucket *bucket, uint32_t key, uint64_t *val);
>> +
>> +static inline int
>> +__k32v64_cmp_keys(struct rte_k32v64_hash_bucket *bucket, uint32_t key,
>> + uint64_t *val)
>> +{
>> + int i;
>> +
>> + for (i = 0; i < RTE_K32V64_KEYS_PER_BUCKET; i++) {
>> + if ((key == bucket->key[i]) &&
>> + (bucket->key_mask & (1 << i))) {
>> + *val = bucket->val[i];
> This load can happen speculatively.
> You have to use the guard variable and payload concept here for reader-writer concurrency.
> On the writer:
> store -> key
> store -> val
> store_release -> key_mask (or there will be a rte_smp_wmb before this) 'key_mask' acts as the guard variable
>
> On the reader:
> load_acquire -> key_mask (or there will be a rte_smp_rmb after this)
> if statement etc.....
>
>> + return 1;
>> + }
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +static inline int
>> +__k32v64_hash_lookup(struct rte_k32v64_hash_table *table, uint32_t key,
>> + uint32_t hash, uint64_t *value, rte_k32v64_cmp_fn_t cmp_f) {
>> + uint64_t val = 0;
>> + struct rte_k32v64_ext_ent *ent;
>> + int32_t cnt;
>> + int found = 0;
>> + uint32_t bucket = hash & table->bucket_msk;
>> +
>> + do {
>> + do
>> + cnt = rte_atomic32_read(&table->t[bucket].cnt);
>> + while (unlikely(cnt & RTE_K32V64_WRITE_IN_PROGRESS));
> The reader can hang here if the writer increments the counter and gets scheduled out. Using the method suggested above, you can remove this code.
Agree, but we tried to make the transaction critical section as small as we
can, so the probability is very low.
>
>> +
>> + found = cmp_f(&table->t[bucket], key, &val);
>> + if (unlikely((found == 0) &&
>> + (!SLIST_EMPTY(&table->t[bucket].head)))) {
>> + SLIST_FOREACH(ent, &table->t[bucket].head, next) {
>> + if (ent->key == key) {
>> + val = ent->val;
>> + found = 1;
>> + break;
>> + }
>> + }
>> + }
>> +
>> + } while (unlikely(cnt != rte_atomic32_read(&table->t[bucket].cnt)));
> With the logic mentioned in delete function, the counter needs to be read at the beginning of the loop. Suggest looking at rte_hash-cuckoo hash for the memory ordering required for the counter.
>
>> +
>> + if (found == 1) {
>> + *value = val;
>> + return 0;
>> + } else
>> + return -ENOENT;
>> +}
>> +
>> +static inline int
>> +rte_k32v64_hash_lookup(struct rte_k32v64_hash_table *table, uint32_t key,
>> + uint32_t hash, uint64_t *value)
>> +{
>> + return __k32v64_hash_lookup(table, key, hash, value,
>> +__k32v64_cmp_keys); }
>> +
>> +static inline int
>> +rte_k32v64_hash_bulk_lookup(struct rte_k32v64_hash_table *table,
>> + uint32_t *keys, uint32_t *hashes, uint64_t *values, unsigned int n) {
>> + return table->lookup(table, keys, hashes, values, n); }
>> +
>> +/**
>> + * Add a key to an existing hash table with hash value.
>> + * This operation is not multi-thread safe
>> + * and should only be called from one thread.
>> + *
>> + * @param ht
>> + * Hash table to add the key to.
>> + * @param key
>> + * Key to add to the hash table.
>> + * @param value
>> + * Value to associate with key.
>> + * @param hash
>> + * Hash value associated with key.
>> + * @return
>> + * 0 if ok, or negative value on error.
>> + */
>> +__rte_experimental
>> +int
>> +rte_k32v64_hash_add(struct rte_k32v64_hash_table *table, uint32_t key,
>> + uint32_t hash, uint64_t value);
>> +
>> +/**
>> + * Remove a key with a given hash value from an existing hash table.
>> + * This operation is not multi-thread
>> + * safe and should only be called from one thread.
>> + *
>> + * @param ht
>> + * Hash table to remove the key from.
>> + * @param key
>> + * Key to remove from the hash table.
>> + * @param hash
>> + * hash value associated with key.
>> + * @return
>> + * 0 if ok, or negative value on error.
>> + */
>> +__rte_experimental
>> +int
>> +rte_k32v64_hash_delete(struct rte_k32v64_hash_table *table, uint32_t key,
>> + uint32_t hash);
>> +
>> +
>> +/**
>> + * Performs a lookup for an existing hash table, and returns a pointer
>> +to
>> + * the table if found.
>> + *
>> + * @param name
>> + * Name of the hash table to find
>> + *
>> + * @return
>> + * pointer to hash table structure or NULL on error with rte_errno
>> + * set appropriately.
>> + */
>> +__rte_experimental
>> +struct rte_k32v64_hash_table *
>> +rte_k32v64_hash_find_existing(const char *name);
>> +
>> +/**
>> + * Create a new hash table for use with four byte keys.
>> + *
>> + * @param params
>> + * Parameters used in creation of hash table.
>> + *
>> + * @return
>> + * Pointer to hash table structure that is used in future hash table
>> + * operations, or NULL on error with rte_errno set appropriately.
>> + */
>> +__rte_experimental
>> +struct rte_k32v64_hash_table *
>> +rte_k32v64_hash_create(const struct rte_k32v64_hash_params *params);
>> +
>> +/**
>> + * Free all memory used by a hash table.
>> + *
>> + * @param table
>> + * Hash table to deallocate.
>> + */
>> +__rte_experimental
>> +void
>> +rte_k32v64_hash_free(struct rte_k32v64_hash_table *table);
>> +
>> +#ifdef __cplusplus
>> +}
>> +#endif
>> +
>> +#endif /* _RTE_K32V64_HASH_H_ */
>> --
>> 2.7.4
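For illustration, the API above is meant to be used along these lines (a
sketch only; the rte_k32v64_hash_params fields are assumed to follow the
usual DPDK create-params shape, and error handling is elided):

	struct rte_k32v64_hash_params p = {
		.name = "k32v64_test",
		.entries = 1 << 16,
		.socket_id = SOCKET_ID_ANY,
	};
	struct rte_k32v64_hash_table *t = rte_k32v64_hash_create(&p);
	uint32_t key = 42;
	uint32_t sig = rte_jhash_1word(key, 0); /* table is hash agnostic */
	uint64_t value;
	int ret;

	rte_k32v64_hash_add(t, key, sig, 100);
	ret = rte_k32v64_hash_lookup(t, key, sig, &value); /* ret == 0, value == 100 */
	rte_k32v64_hash_delete(t, key, sig);
	rte_k32v64_hash_free(t);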
--
Regards,
Vladimir
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [dpdk-dev] [PATCH v4 0/4] add new kv hash table
2020-05-08 19:58 ` [dpdk-dev] [PATCH v4 0/4] add new kv " Vladimir Medvedkin
@ 2020-06-16 16:37 ` Thomas Monjalon
2021-03-24 21:28 ` Thomas Monjalon
1 sibling, 0 replies; 56+ messages in thread
From: Thomas Monjalon @ 2020-06-16 16:37 UTC (permalink / raw)
To: dev
Cc: konstantin.ananyev, yipeng1.wang, sameh.gobriel,
bruce.richardson, Vladimir Medvedkin
Waiting for reviews please.
08/05/2020 21:58, Vladimir Medvedkin:
> Currently DPDK has a special implementation of a hash table for
> 4 byte keys which is called FBK hash. Unfortunately its main drawback
> is that it only supports 2 byte values.
> The new implementation called KV hash
> supports 4 byte keys and 8 byte associated values,
> which is enough to store a pointer.
>
> v4:
> - internal implementation is hidden behind a universal API for any key and value sizes
> - add and delete API now return old value
> - add transaction counter modification to _add()
> - transaction counter is now modified with C11 atomics
>
> v3:
> - added bulk lookup
> - avx512 key comparison is removed from .h
>
> v2:
> - renamed from rte_dwk to rte_k32v64 as was suggested
> - reworked lookup function, added inlined subroutines
> - added avx512 key comparison routine
> - added documentation
> - added statistics counters for total entries and extended entries (linked list)
>
> Vladimir Medvedkin (4):
> hash: add kv hash library
> hash: add documentation for kv hash library
> test: add kv hash autotests
> test: add kv perf tests
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [dpdk-dev] [PATCH v4 1/4] hash: add kv hash library
2020-05-08 19:58 ` [dpdk-dev] [PATCH v4 1/4] hash: add kv hash library Vladimir Medvedkin
@ 2020-06-23 15:44 ` Ananyev, Konstantin
2020-06-23 23:06 ` Ananyev, Konstantin
2020-06-25 19:49 ` Medvedkin, Vladimir
2020-06-24 1:19 ` Wang, Yipeng1
2020-06-25 4:27 ` Honnappa Nagarahalli
2 siblings, 2 replies; 56+ messages in thread
From: Ananyev, Konstantin @ 2020-06-23 15:44 UTC (permalink / raw)
To: Medvedkin, Vladimir, dev; +Cc: Wang, Yipeng1, Gobriel, Sameh, Richardson, Bruce
Hi Vladimir,
> --- /dev/null
> +++ b/lib/librte_hash/k32v64_hash.c
> @@ -0,0 +1,277 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Intel Corporation
> + */
> +
> +#include <string.h>
> +
> +#include <rte_errno.h>
> +#include <rte_malloc.h>
> +#include <rte_memory.h>
> +
> +#include "k32v64_hash.h"
> +
> +static inline int
> +k32v64_hash_lookup(struct k32v64_hash_table *table, uint32_t key,
> + uint32_t hash, uint64_t *value)
> +{
> + return __k32v64_hash_lookup(table, key, hash, value, __kv_cmp_keys);
> +}
> +
> +static int
> +k32v64_hash_bulk_lookup(struct rte_kv_hash_table *ht, void *keys_p,
> + uint32_t *hashes, void *values_p, unsigned int n)
> +{
> + struct k32v64_hash_table *table = (struct k32v64_hash_table *)ht;
> + uint32_t *keys = keys_p;
> + uint64_t *values = values_p;
> + int ret, cnt = 0;
> + unsigned int i;
> +
> + if (unlikely((table == NULL) || (keys == NULL) || (hashes == NULL) ||
> + (values == NULL)))
> + return -EINVAL;
> +
> + for (i = 0; i < n; i++) {
> + ret = k32v64_hash_lookup(table, keys[i], hashes[i],
> + &values[i]);
> + if (ret == 0)
> + cnt++;
> + }
> + return cnt;
> +}
> +
> +#ifdef CC_AVX512VL_SUPPORT
> +int
> +k32v64_hash_bulk_lookup_avx512vl(struct rte_kv_hash_table *ht,
> + void *keys_p, uint32_t *hashes, void *values_p, unsigned int n);
> +#endif
> +
> +static rte_kv_hash_bulk_lookup_t
> +get_lookup_bulk_fn(void)
> +{
> +#ifdef CC_AVX512VL_SUPPORT
> + if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512VL))
> + return k32v64_hash_bulk_lookup_avx512vl;
> +#endif
> + return k32v64_hash_bulk_lookup;
> +}
> +
> +static int
> +k32v64_hash_add(struct k32v64_hash_table *table, uint32_t key,
> + uint32_t hash, uint64_t value, uint64_t *old_value, int *found)
> +{
> + uint32_t bucket;
> + int i, idx, ret;
> + uint8_t msk;
> + struct k32v64_ext_ent *tmp, *ent, *prev = NULL;
> +
> + if (table == NULL)
> + return -EINVAL;
> +
> + bucket = hash & table->bucket_msk;
> + /* Search key in table. Update value if exists */
> + for (i = 0; i < K32V64_KEYS_PER_BUCKET; i++) {
> + if ((key == table->t[bucket].key[i]) &&
> + (table->t[bucket].key_mask & (1 << i))) {
> + *old_value = table->t[bucket].val[i];
> + *found = 1;
> + __atomic_fetch_add(&table->t[bucket].cnt, 1,
> + __ATOMIC_ACQUIRE);
> + table->t[bucket].val[i] = value;
> + __atomic_fetch_add(&table->t[bucket].cnt, 1,
> + __ATOMIC_RELEASE);
> + return 0;
> + }
> + }
> +
> + if (!SLIST_EMPTY(&table->t[bucket].head)) {
> + SLIST_FOREACH(ent, &table->t[bucket].head, next) {
> + if (ent->key == key) {
> + *old_value = ent->val;
> + *found = 1;
> + __atomic_fetch_add(&table->t[bucket].cnt, 1,
> + __ATOMIC_ACQUIRE);
> + ent->val = value;
> + __atomic_fetch_add(&table->t[bucket].cnt, 1,
> + __ATOMIC_RELEASE);
> + return 0;
> + }
> + }
> + }
> +
> + msk = ~table->t[bucket].key_mask & VALID_KEY_MSK;
> + if (msk) {
> + idx = __builtin_ctz(msk);
> + table->t[bucket].key[idx] = key;
> + table->t[bucket].val[idx] = value;
> + __atomic_or_fetch(&table->t[bucket].key_mask, 1 << idx,
> + __ATOMIC_RELEASE);
I think this case also has to be guarded with table->t[bucket].cnt updates.
> + table->nb_ent++;
> + *found = 0;
> + return 0;
> + }
> +
> + ret = rte_mempool_get(table->ext_ent_pool, (void **)&ent);
> + if (ret < 0)
> + return ret;
> +
> + SLIST_NEXT(ent, next) = NULL;
> + ent->key = key;
> + ent->val = value;
> + rte_smp_wmb();
__atomic_thread_fence(__ATOMIC_RELEASE);
?
> + SLIST_FOREACH(tmp, &table->t[bucket].head, next)
> + prev = tmp;
> +
> + if (prev == NULL)
> + SLIST_INSERT_HEAD(&table->t[bucket].head, ent, next);
> + else
> + SLIST_INSERT_AFTER(prev, ent, next);
> +
> + table->nb_ent++;
> + table->nb_ext_ent++;
> + *found = 0;
> + return 0;
> +}
> +
> +static int
> +k32v64_hash_delete(struct k32v64_hash_table *table, uint32_t key,
> + uint32_t hash, uint64_t *old_value)
> +{
> + uint32_t bucket;
> + int i;
> + struct k32v64_ext_ent *ent;
> +
> + if (table == NULL)
> + return -EINVAL;
> +
> + bucket = hash & table->bucket_msk;
> +
> + for (i = 0; i < K32V64_KEYS_PER_BUCKET; i++) {
> + if ((key == table->t[bucket].key[i]) &&
> + (table->t[bucket].key_mask & (1 << i))) {
> + *old_value = table->t[bucket].val[i];
> + ent = SLIST_FIRST(&table->t[bucket].head);
> + if (ent) {
> + __atomic_fetch_add(&table->t[bucket].cnt, 1,
> + __ATOMIC_ACQUIRE);
> + table->t[bucket].key[i] = ent->key;
> + table->t[bucket].val[i] = ent->val;
> + SLIST_REMOVE_HEAD(&table->t[bucket].head, next);
> + __atomic_fetch_add(&table->t[bucket].cnt, 1,
> + __ATOMIC_RELEASE);
> + table->nb_ext_ent--;
> + } else
> + __atomic_and_fetch(&table->t[bucket].key_mask,
> + ~(1 << i), __ATOMIC_RELEASE);
Same thought as above - it is safer to guard this mask update with a cnt update.
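For example (a sketch):

	__atomic_fetch_add(&table->t[bucket].cnt, 1, __ATOMIC_ACQUIRE);
	__atomic_and_fetch(&table->t[bucket].key_mask, ~(1 << i),
		__ATOMIC_RELAXED);
	__atomic_fetch_add(&table->t[bucket].cnt, 1, __ATOMIC_RELEASE);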
> + if (ent)
> + rte_mempool_put(table->ext_ent_pool, ent);
Can't this 'if(ent)' be merged with previous 'if (ent) {...}' above?
> + table->nb_ent--;
> + return 0;
> + }
> + }
> +
> + SLIST_FOREACH(ent, &table->t[bucket].head, next)
> + if (ent->key == key)
> + break;
> +
> + if (ent == NULL)
> + return -ENOENT;
> +
> + *old_value = ent->val;
> +
> + __atomic_fetch_add(&table->t[bucket].cnt, 1, __ATOMIC_ACQUIRE);
> + SLIST_REMOVE(&table->t[bucket].head, ent, k32v64_ext_ent, next);
> + __atomic_fetch_add(&table->t[bucket].cnt, 1, __ATOMIC_RELEASE);
> + rte_mempool_put(table->ext_ent_pool, ent);
> +
> + table->nb_ext_ent--;
> + table->nb_ent--;
> +
> + return 0;
> +}
> +
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [dpdk-dev] [PATCH v4 1/4] hash: add kv hash library
2020-06-23 15:44 ` Ananyev, Konstantin
@ 2020-06-23 23:06 ` Ananyev, Konstantin
2020-06-25 19:56 ` Medvedkin, Vladimir
2020-06-25 19:49 ` Medvedkin, Vladimir
1 sibling, 1 reply; 56+ messages in thread
From: Ananyev, Konstantin @ 2020-06-23 23:06 UTC (permalink / raw)
To: Medvedkin, Vladimir, dev; +Cc: Wang, Yipeng1, Gobriel, Sameh, Richardson, Bruce
> Hi Vladimir,
>
> > --- /dev/null
> > +++ b/lib/librte_hash/k32v64_hash.c
> > @@ -0,0 +1,277 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(c) 2020 Intel Corporation
> > + */
> > +
> > +#include <string.h>
> > +
> > +#include <rte_errno.h>
> > +#include <rte_malloc.h>
> > +#include <rte_memory.h>
> > +
> > +#include "k32v64_hash.h"
> > +
> > +static inline int
> > +k32v64_hash_lookup(struct k32v64_hash_table *table, uint32_t key,
> > + uint32_t hash, uint64_t *value)
> > +{
> > + return __k32v64_hash_lookup(table, key, hash, value, __kv_cmp_keys);
> > +}
> > +
> > +static int
> > +k32v64_hash_bulk_lookup(struct rte_kv_hash_table *ht, void *keys_p,
> > + uint32_t *hashes, void *values_p, unsigned int n)
> > +{
> > + struct k32v64_hash_table *table = (struct k32v64_hash_table *)ht;
> > + uint32_t *keys = keys_p;
> > + uint64_t *values = values_p;
> > + int ret, cnt = 0;
> > + unsigned int i;
> > +
> > + if (unlikely((table == NULL) || (keys == NULL) || (hashes == NULL) ||
> > + (values == NULL)))
> > + return -EINVAL;
As a nit - this formal parameter checking is better done in the public function
(rte_kv_hash_bulk_lookup) before dereferencing the table and calling the actual lookup().
Same story for modify() - formal parameter checking can be done at the top of the public function.
BTW, why unite add/delete into modify(), if internally you have 2 different functions
(for add/delete) anyway?
> > +
> > + for (i = 0; i < n; i++) {
> > + ret = k32v64_hash_lookup(table, keys[i], hashes[i],
> > + &values[i]);
> > + if (ret == 0)
> > + cnt++;
> > + }
> > + return cnt;
> > +}
> > +
> > +#ifdef CC_AVX512VL_SUPPORT
> > +int
> > +k32v64_hash_bulk_lookup_avx512vl(struct rte_kv_hash_table *ht,
> > + void *keys_p, uint32_t *hashes, void *values_p, unsigned int n);
> > +#endif
> > +
> > +static rte_kv_hash_bulk_lookup_t
> > +get_lookup_bulk_fn(void)
> > +{
> > +#ifdef CC_AVX512VL_SUPPORT
> > + if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512VL))
> > + return k32v64_hash_bulk_lookup_avx512vl;
> > +#endif
> > + return k32v64_hash_bulk_lookup;
> > +}
> > +
> > +static int
> > +k32v64_hash_add(struct k32v64_hash_table *table, uint32_t key,
> > + uint32_t hash, uint64_t value, uint64_t *old_value, int *found)
> > +{
> > + uint32_t bucket;
> > + int i, idx, ret;
> > + uint8_t msk;
> > + struct k32v64_ext_ent *tmp, *ent, *prev = NULL;
> > +
> > + if (table == NULL)
> > + return -EINVAL;
> > +
> > + bucket = hash & table->bucket_msk;
> > + /* Search key in table. Update value if exists */
> > + for (i = 0; i < K32V64_KEYS_PER_BUCKET; i++) {
> > + if ((key == table->t[bucket].key[i]) &&
> > + (table->t[bucket].key_mask & (1 << i))) {
> > + *old_value = table->t[bucket].val[i];
> > + *found = 1;
> > + __atomic_fetch_add(&table->t[bucket].cnt, 1,
> > + __ATOMIC_ACQUIRE);
On second thought - an atomic add is probably overkill here.
Something like:
void update_start(struct k32v64_hash_bucket *bkt)
{
	bkt->cnt++;
	__atomic_thread_fence(__ATOMIC_ACQ_REL);
}
void update_finish(struct k32v64_hash_bucket *bkt)
{
	__atomic_thread_fence(__ATOMIC_ACQ_REL);
	bkt->cnt++;
}
I think this should be sufficient here.
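With those helpers, the value update above would read (sketch):

	update_start(&table->t[bucket]);
	table->t[bucket].val[i] = value;
	update_finish(&table->t[bucket]);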
> > + table->t[bucket].val[i] = value;
> > + __atomic_fetch_add(&table->t[bucket].cnt, 1,
> > + __ATOMIC_RELEASE);
> > + return 0;
> > + }
> > + }
> > +
> > + if (!SLIST_EMPTY(&table->t[bucket].head)) {
> > + SLIST_FOREACH(ent, &table->t[bucket].head, next) {
> > + if (ent->key == key) {
> > + *old_value = ent->val;
> > + *found = 1;
> > + __atomic_fetch_add(&table->t[bucket].cnt, 1,
> > + __ATOMIC_ACQUIRE);
> > + ent->val = value;
> > + __atomic_fetch_add(&table->t[bucket].cnt, 1,
> > + __ATOMIC_RELEASE);
> > + return 0;
> > + }
> > + }
> > + }
> > +
> > + msk = ~table->t[bucket].key_mask & VALID_KEY_MSK;
> > + if (msk) {
> > + idx = __builtin_ctz(msk);
> > + table->t[bucket].key[idx] = key;
> > + table->t[bucket].val[idx] = value;
> > + __atomic_or_fetch(&table->t[bucket].key_mask, 1 << idx,
> > + __ATOMIC_RELEASE);
>
> I think this case also has to guarded with table->t[bucket].cnt updates.
>
> > + table->nb_ent++;
> > + *found = 0;
> > + return 0;
> > + }
> > +
> > + ret = rte_mempool_get(table->ext_ent_pool, (void **)&ent);
> > + if (ret < 0)
> > + return ret;
> > +
> > + SLIST_NEXT(ent, next) = NULL;
> > + ent->key = key;
> > + ent->val = value;
> > + rte_smp_wmb();
>
> __atomic_thread_fence(__ATOMIC_RELEASE);
> ?
>
> > + SLIST_FOREACH(tmp, &table->t[bucket].head, next)
> > + prev = tmp;
> > +
> > + if (prev == NULL)
> > + SLIST_INSERT_HEAD(&table->t[bucket].head, ent, next);
> > + else
> > + SLIST_INSERT_AFTER(prev, ent, next);
> > +
> > + table->nb_ent++;
> > + table->nb_ext_ent++;
> > + *found = 0;
> > + return 0;
> > +}
> > +
> > +static int
> > +k32v64_hash_delete(struct k32v64_hash_table *table, uint32_t key,
> > + uint32_t hash, uint64_t *old_value)
> > +{
> > + uint32_t bucket;
> > + int i;
> > + struct k32v64_ext_ent *ent;
> > +
> > + if (table == NULL)
> > + return -EINVAL;
> > +
> > + bucket = hash & table->bucket_msk;
> > +
> > + for (i = 0; i < K32V64_KEYS_PER_BUCKET; i++) {
> > + if ((key == table->t[bucket].key[i]) &&
> > + (table->t[bucket].key_mask & (1 << i))) {
> > + *old_value = table->t[bucket].val[i];
> > + ent = SLIST_FIRST(&table->t[bucket].head);
> > + if (ent) {
> > + __atomic_fetch_add(&table->t[bucket].cnt, 1,
> > + __ATOMIC_ACQUIRE);
> > + table->t[bucket].key[i] = ent->key;
> > + table->t[bucket].val[i] = ent->val;
> > + SLIST_REMOVE_HEAD(&table->t[bucket].head, next);
> > + __atomic_fetch_add(&table->t[bucket].cnt, 1,
> > + __ATOMIC_RELEASE);
> > + table->nb_ext_ent--;
> > + } else
> > + __atomic_and_fetch(&table->t[bucket].key_mask,
> > + ~(1 << i), __ATOMIC_RELEASE);
>
> Same thought as above - safer to guard this mask update with cnt update.
>
> > + if (ent)
> > + rte_mempool_put(table->ext_ent_pool, ent);
>
> Can't this 'if(ent)' be merged with previous 'if (ent) {...}' above?
>
> > + table->nb_ent--;
> > + return 0;
> > + }
> > + }
> > +
> > + SLIST_FOREACH(ent, &table->t[bucket].head, next)
> > + if (ent->key == key)
> > + break;
> > +
> > + if (ent == NULL)
> > + return -ENOENT;
> > +
> > + *old_value = ent->val;
> > +
> > + __atomic_fetch_add(&table->t[bucket].cnt, 1, __ATOMIC_ACQUIRE);
> > + SLIST_REMOVE(&table->t[bucket].head, ent, k32v64_ext_ent, next);
> > + __atomic_fetch_add(&table->t[bucket].cnt, 1, __ATOMIC_RELEASE);
> > + rte_mempool_put(table->ext_ent_pool, ent);
> > +
> > + table->nb_ext_ent--;
> > + table->nb_ent--;
> > +
> > + return 0;
> > +}
> > +
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [dpdk-dev] [PATCH v4 1/4] hash: add kv hash library
2020-05-08 19:58 ` [dpdk-dev] [PATCH v4 1/4] hash: add kv hash library Vladimir Medvedkin
2020-06-23 15:44 ` Ananyev, Konstantin
@ 2020-06-24 1:19 ` Wang, Yipeng1
2020-06-25 20:26 ` Medvedkin, Vladimir
2020-06-25 4:27 ` Honnappa Nagarahalli
2 siblings, 1 reply; 56+ messages in thread
From: Wang, Yipeng1 @ 2020-06-24 1:19 UTC (permalink / raw)
To: Medvedkin, Vladimir, dev
Cc: Ananyev, Konstantin, Gobriel, Sameh, Richardson, Bruce
> -----Original Message-----
> From: Medvedkin, Vladimir <vladimir.medvedkin@intel.com>
> Sent: Friday, May 8, 2020 12:59 PM
> To: dev@dpdk.org
> Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Wang, Yipeng1
> <yipeng1.wang@intel.com>; Gobriel, Sameh <sameh.gobriel@intel.com>;
> Richardson, Bruce <bruce.richardson@intel.com>
> Subject: [PATCH v4 1/4] hash: add kv hash library
>
> KV hash is a special optimized key-value storage for fixed key and value sizes.
> At the moment it supports 32 bit keys and 64 bit values. This table is hash
> function agnostic so user must provide precalculated hash signature for
> add/delete/lookup operations.
>
> Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
> ---
> lib/Makefile | 2 +-
> lib/librte_hash/Makefile | 14 +-
> lib/librte_hash/k32v64_hash.c | 277
> +++++++++++++++++++++++++++++++++
> lib/librte_hash/k32v64_hash.h | 98 ++++++++++++
> lib/librte_hash/k32v64_hash_avx512vl.c | 59 +++++++
> lib/librte_hash/meson.build | 17 +-
> lib/librte_hash/rte_hash_version.map | 6 +-
> lib/librte_hash/rte_kv_hash.c | 184 ++++++++++++++++++++++
> lib/librte_hash/rte_kv_hash.h | 169 ++++++++++++++++++++
> 9 files changed, 821 insertions(+), 5 deletions(-)
> create mode 100644 lib/librte_hash/k32v64_hash.c
> create mode 100644 lib/librte_hash/k32v64_hash.h
> create mode 100644 lib/librte_hash/k32v64_hash_avx512vl.c
> create mode 100644 lib/librte_hash/rte_kv_hash.c
> create mode 100644 lib/librte_hash/rte_kv_hash.h
>
> diff --git a/lib/Makefile b/lib/Makefile
> index 9d24609..42769e9 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -48,7 +48,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
> DEPDIRS-librte_vhost := librte_eal librte_mempool librte_mbuf librte_ethdev \
> 	librte_net librte_hash librte_cryptodev
> DIRS-$(CONFIG_RTE_LIBRTE_HASH) += librte_hash
> -DEPDIRS-librte_hash := librte_eal librte_ring
> +DEPDIRS-librte_hash := librte_eal librte_ring librte_mempool
> DIRS-$(CONFIG_RTE_LIBRTE_EFD) += librte_efd
> DEPDIRS-librte_efd := librte_eal librte_ring librte_hash
> DIRS-$(CONFIG_RTE_LIBRTE_RIB) += librte_rib
> diff --git a/lib/librte_hash/Makefile b/lib/librte_hash/Makefile
> index ec9f864..a0cdee9 100644
> --- a/lib/librte_hash/Makefile
> +++ b/lib/librte_hash/Makefile
> @@ -8,13 +8,15 @@ LIB = librte_hash.a
>
> CFLAGS += -O3
> CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
> -LDLIBS += -lrte_eal -lrte_ring
> +LDLIBS += -lrte_eal -lrte_ring -lrte_mempool
>
> EXPORT_MAP := rte_hash_version.map
>
> # all source are stored in SRCS-y
> SRCS-$(CONFIG_RTE_LIBRTE_HASH) := rte_cuckoo_hash.c
> SRCS-$(CONFIG_RTE_LIBRTE_HASH) += rte_fbk_hash.c
> +SRCS-$(CONFIG_RTE_LIBRTE_HASH) += rte_kv_hash.c
> +SRCS-$(CONFIG_RTE_LIBRTE_HASH) += k32v64_hash.c
>
> # install this header file
> SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include := rte_hash.h
> @@ -27,5 +29,15 @@ endif
> SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_jhash.h
> SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_thash.h
> SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_fbk_hash.h
> +SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_kv_hash.h
> +
> +CC_AVX512VL_SUPPORT=$(shell $(CC) -mavx512vl -dM -E - </dev/null 2>&1 | \
> +grep -q __AVX512VL__ && echo 1)
> +
> +ifeq ($(CC_AVX512VL_SUPPORT), 1)
> + SRCS-$(CONFIG_RTE_LIBRTE_HASH) += k32v64_hash_avx512vl.c
> + CFLAGS_k32v64_hash_avx512vl.o += -mavx512vl
> + CFLAGS_k32v64_hash.o += -DCC_AVX512VL_SUPPORT
> +endif
>
> include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/lib/librte_hash/k32v64_hash.c b/lib/librte_hash/k32v64_hash.c
> new file mode 100644
> index 0000000..24cd63a
> --- /dev/null
> +++ b/lib/librte_hash/k32v64_hash.c
> @@ -0,0 +1,277 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Intel Corporation
> + */
> +
> +#include <string.h>
> +
> +#include <rte_errno.h>
> +#include <rte_malloc.h>
> +#include <rte_memory.h>
> +
> +#include "k32v64_hash.h"
> +
> +static inline int
> +k32v64_hash_lookup(struct k32v64_hash_table *table, uint32_t key,
> + uint32_t hash, uint64_t *value)
> +{
> + return __k32v64_hash_lookup(table, key, hash, value,
> + __kv_cmp_keys);
> +}
> +
> +static int
> +k32v64_hash_bulk_lookup(struct rte_kv_hash_table *ht, void *keys_p,
> + uint32_t *hashes, void *values_p, unsigned int n) {
> + struct k32v64_hash_table *table = (struct k32v64_hash_table *)ht;
> + uint32_t *keys = keys_p;
> + uint64_t *values = values_p;
> + int ret, cnt = 0;
> + unsigned int i;
> +
> + if (unlikely((table == NULL) || (keys == NULL) || (hashes == NULL) ||
> + (values == NULL)))
> + return -EINVAL;
> +
> + for (i = 0; i < n; i++) {
> + ret = k32v64_hash_lookup(table, keys[i], hashes[i],
> + &values[i]);
[Wang, Yipeng] You don't need to start a new line for values.
> + if (ret == 0)
> + cnt++;
> + }
> + return cnt;
> +}
> +
> +#ifdef CC_AVX512VL_SUPPORT
[Wang, Yipeng] Why not use the already provided, e.g. #if defined(RTE_MACHINE_CPUFLAG_SSE2) like rte_hash does?
> +int
> +k32v64_hash_bulk_lookup_avx512vl(struct rte_kv_hash_table *ht,
> + void *keys_p, uint32_t *hashes, void *values_p, unsigned int n);
> +#endif
> +
> +static rte_kv_hash_bulk_lookup_t
> +get_lookup_bulk_fn(void)
> +{
> +#ifdef CC_AVX512VL_SUPPORT
> + if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512VL))
> + return k32v64_hash_bulk_lookup_avx512vl;
> +#endif
> + return k32v64_hash_bulk_lookup;
> +}
> +
> +static int
> +k32v64_hash_add(struct k32v64_hash_table *table, uint32_t key,
> + uint32_t hash, uint64_t value, uint64_t *old_value, int *found) {
> + uint32_t bucket;
> + int i, idx, ret;
> + uint8_t msk;
> + struct k32v64_ext_ent *tmp, *ent, *prev = NULL;
> +
> + if (table == NULL)
> + return -EINVAL;
> +
> + bucket = hash & table->bucket_msk;
> + /* Search key in table. Update value if exists */
> + for (i = 0; i < K32V64_KEYS_PER_BUCKET; i++) {
> + if ((key == table->t[bucket].key[i]) &&
> + (table->t[bucket].key_mask & (1 << i))) {
> + *old_value = table->t[bucket].val[i];
> + *found = 1;
> + __atomic_fetch_add(&table->t[bucket].cnt, 1,
> + __ATOMIC_ACQUIRE);
> + table->t[bucket].val[i] = value;
> + __atomic_fetch_add(&table->t[bucket].cnt, 1,
> + __ATOMIC_RELEASE);
> + return 0;
> + }
> + }
> +
> + if (!SLIST_EMPTY(&table->t[bucket].head)) {
> + SLIST_FOREACH(ent, &table->t[bucket].head, next) {
> + if (ent->key == key) {
> + *old_value = ent->val;
> + *found = 1;
> + __atomic_fetch_add(&table->t[bucket].cnt,
> 1,
> + __ATOMIC_ACQUIRE);
> + ent->val = value;
> + __atomic_fetch_add(&table->t[bucket].cnt,
> 1,
> + __ATOMIC_RELEASE);
> + return 0;
> + }
> + }
> + }
> +
> + msk = ~table->t[bucket].key_mask & VALID_KEY_MSK;
> + if (msk) {
> + idx = __builtin_ctz(msk);
> + table->t[bucket].key[idx] = key;
> + table->t[bucket].val[idx] = value;
> + __atomic_or_fetch(&table->t[bucket].key_mask, 1 << idx,
> + __ATOMIC_RELEASE);
> + table->nb_ent++;
> + *found = 0;
> + return 0;
> + }
> +
> + ret = rte_mempool_get(table->ext_ent_pool, (void **)&ent);
> + if (ret < 0)
> + return ret;
> +
> + SLIST_NEXT(ent, next) = NULL;
> + ent->key = key;
> + ent->val = value;
> + rte_smp_wmb();
> + SLIST_FOREACH(tmp, &table->t[bucket].head, next)
> + prev = tmp;
> +
> + if (prev == NULL)
> + SLIST_INSERT_HEAD(&table->t[bucket].head, ent, next);
> + else
> + SLIST_INSERT_AFTER(prev, ent, next);
> +
> + table->nb_ent++;
> + table->nb_ext_ent++;
> + *found = 0;
> + return 0;
> +}
> +
> +static int
> +k32v64_hash_delete(struct k32v64_hash_table *table, uint32_t key,
> + uint32_t hash, uint64_t *old_value)
> +{
> + uint32_t bucket;
> + int i;
> + struct k32v64_ext_ent *ent;
> +
> + if (table == NULL)
> + return -EINVAL;
> +
> + bucket = hash & table->bucket_msk;
> +
> + for (i = 0; i < K32V64_KEYS_PER_BUCKET; i++) {
> + if ((key == table->t[bucket].key[i]) &&
> + (table->t[bucket].key_mask & (1 << i))) {
> + *old_value = table->t[bucket].val[i];
> + ent = SLIST_FIRST(&table->t[bucket].head);
> + if (ent) {
> + __atomic_fetch_add(&table->t[bucket].cnt,
> 1,
> + __ATOMIC_ACQUIRE);
> + table->t[bucket].key[i] = ent->key;
> + table->t[bucket].val[i] = ent->val;
> + SLIST_REMOVE_HEAD(&table->t[bucket].head, next);
> + __atomic_fetch_add(&table->t[bucket].cnt,
> 1,
> + __ATOMIC_RELEASE);
> + table->nb_ext_ent--;
> + } else
> + __atomic_and_fetch(&table->t[bucket].key_mask,
> + ~(1 << i), __ATOMIC_RELEASE);
> + if (ent)
> + rte_mempool_put(table->ext_ent_pool,
> ent);
> + table->nb_ent--;
> + return 0;
> + }
> + }
> +
> + SLIST_FOREACH(ent, &table->t[bucket].head, next)
> + if (ent->key == key)
> + break;
> +
> + if (ent == NULL)
> + return -ENOENT;
> +
> + *old_value = ent->val;
> +
> + __atomic_fetch_add(&table->t[bucket].cnt, 1,
> __ATOMIC_ACQUIRE);
> + SLIST_REMOVE(&table->t[bucket].head, ent, k32v64_ext_ent, next);
> + __atomic_fetch_add(&table->t[bucket].cnt, 1, __ATOMIC_RELEASE);
> + rte_mempool_put(table->ext_ent_pool, ent);
> +
[Wang, Yipeng] I am not sure delete can be called safely while other lookup threads are running.
The item could be recycled and reused by another add while a lookup thread is traversing the linked list.
This is similar to why we need the RTE_HASH_EXTRA_FLAGS_NO_FREE_ON_DEL flag and related API in rte_hash.
> + table->nb_ext_ent--;
> + table->nb_ent--;
> +
> + return 0;
> +}
> +
> +static int
> +k32v64_modify(struct rte_kv_hash_table *table, void *key_p, uint32_t
> hash,
> + enum rte_kv_modify_op op, void *value_p, int *found) {
> + struct k32v64_hash_table *ht = (struct k32v64_hash_table *)table;
> + uint32_t *key = key_p;
> + uint64_t value;
> +
> + if ((ht == NULL) || (key == NULL) || (value_p == NULL) ||
> + (found == NULL) || (op >=
> RTE_KV_MODIFY_OP_MAX)) {
> + return -EINVAL;
> + }
> +
> + value = *(uint64_t *)value_p;
> + switch (op) {
> + case RTE_KV_MODIFY_ADD:
> + return k32v64_hash_add(ht, *key, hash, value, value_p,
> found);
> + case RTE_KV_MODIFY_DEL:
> + return k32v64_hash_delete(ht, *key, hash, value_p);
> + default:
> + break;
> + }
[Wang, Yipeng] A question would be: why put del and add inside another modify wrapper?
If we didn't wrap del and add like this, we could get rid of this branch.
> +
> + return -EINVAL;
> +}
> +
> +struct rte_kv_hash_table *
> +k32v64_hash_create(const struct rte_kv_hash_params *params) {
> + char hash_name[RTE_KV_HASH_NAMESIZE];
> + struct k32v64_hash_table *ht = NULL;
> + uint32_t mem_size, nb_buckets, max_ent;
> + int ret;
> + struct rte_mempool *mp;
> +
> + if ((params == NULL) || (params->name == NULL) ||
> + (params->entries == 0)) {
[Wang, Yipeng] Should we also check whether the entry count is larger than keys_per_bucket?
> + rte_errno = EINVAL;
> + return NULL;
> + }
> +
> + ret = snprintf(hash_name, sizeof(hash_name), "KV_%s", params->name);
> + if (ret < 0 || ret >= RTE_KV_HASH_NAMESIZE) {
> + rte_errno = ENAMETOOLONG;
> + return NULL;
> + }
> +
> + max_ent = rte_align32pow2(params->entries);
> + nb_buckets = max_ent / K32V64_KEYS_PER_BUCKET;
[Wang, Yipeng] Add a macro/check that keys_per_bucket is a power of 2?
> + mem_size = sizeof(struct k32v64_hash_table) +
> + sizeof(struct k32v64_hash_bucket) * nb_buckets;
> +
> + mp = rte_mempool_create(hash_name, max_ent,
> + sizeof(struct k32v64_ext_ent), 0, 0, NULL, NULL, NULL, NULL,
> + params->socket_id, 0);
> +
> + if (mp == NULL)
> + return NULL;
> +
> + ht = rte_zmalloc_socket(hash_name, mem_size,
> + RTE_CACHE_LINE_SIZE, params->socket_id);
> + if (ht == NULL) {
> + rte_mempool_free(mp);
> + return NULL;
> + }
> +
> + memcpy(ht->pub.name, hash_name, sizeof(ht->pub.name));
> + ht->max_ent = max_ent;
> + ht->bucket_msk = nb_buckets - 1;
> + ht->ext_ent_pool = mp;
> + ht->pub.lookup = get_lookup_bulk_fn();
[Wang, Yipeng] Inside the function, we also need to check the CPUID at runtime to decide if AVX can be used, not only at compile time.
You could refer to example from rte_hash: ... if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_SSE2)) ...
> + ht->pub.modify = k32v64_modify;
> +
> + return (struct rte_kv_hash_table *)ht;
> +}
> +
> +void
> +k32v64_hash_free(struct rte_kv_hash_table *ht) {
> + if (ht == NULL)
> + return;
> +
> + rte_mempool_free(((struct k32v64_hash_table *)ht)->ext_ent_pool);
> + rte_free(ht);
> +}
> diff --git a/lib/librte_hash/k32v64_hash.h b/lib/librte_hash/k32v64_hash.h
> new file mode 100644
> index 0000000..10061a5
> --- /dev/null
> +++ b/lib/librte_hash/k32v64_hash.h
> @@ -0,0 +1,98 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Intel Corporation
> + */
> +
> +#include <rte_kv_hash.h>
> +
> +#define K32V64_KEYS_PER_BUCKET 4
> +#define K32V64_WRITE_IN_PROGRESS 1
> +#define VALID_KEY_MSK ((1 << K32V64_KEYS_PER_BUCKET) - 1)
> +
> +struct k32v64_ext_ent {
> + SLIST_ENTRY(k32v64_ext_ent) next;
> + uint32_t key;
> + uint64_t val;
> +};
> +
> +struct k32v64_hash_bucket {
> + uint32_t key[K32V64_KEYS_PER_BUCKET];
> + uint64_t val[K32V64_KEYS_PER_BUCKET];
> + uint8_t key_mask;
> + uint32_t cnt;
> + SLIST_HEAD(k32v64_list_head, k32v64_ext_ent) head; }
> +__rte_cache_aligned;
> +
> +struct k32v64_hash_table {
> + struct rte_kv_hash_table pub; /**< Public part */
[Wang, Yipeng] Do we need to have the pub field? Could we just init the public part in the public create function?
> + uint32_t nb_ent; /**< Number of entities in
> the table*/
> + uint32_t nb_ext_ent; /**< Number of extended entities */
> + uint32_t max_ent; /**< Maximum number of entities */
> + uint32_t bucket_msk;
> + struct rte_mempool *ext_ent_pool;
> + __extension__ struct k32v64_hash_bucket t[0];
> +};
> +
> +typedef int (*k32v64_cmp_fn_t)
> +(struct k32v64_hash_bucket *bucket, uint32_t key, uint64_t *val);
> +
> +static inline int
> +__kv_cmp_keys(struct k32v64_hash_bucket *bucket, uint32_t key,
> + uint64_t *val)
> +{
> + int i;
> +
> + for (i = 0; i < K32V64_KEYS_PER_BUCKET; i++) {
> + if ((key == bucket->key[i]) &&
> + (bucket->key_mask & (1 << i))) {
> + *val = bucket->val[i];
> + return 1;
> + }
> + }
> +
> + return 0;
> +}
> +
> +static inline int
> +__k32v64_hash_lookup(struct k32v64_hash_table *table, uint32_t key,
> + uint32_t hash, uint64_t *value, k32v64_cmp_fn_t cmp_f) {
> + uint64_t val = 0;
> + struct k32v64_ext_ent *ent;
> + uint32_t cnt;
> + int found = 0;
> + uint32_t bucket = hash & table->bucket_msk;
> +
> + do {
> +
> + do {
> + cnt = __atomic_load_n(&table->t[bucket].cnt,
> + __ATOMIC_ACQUIRE);
> + } while (unlikely(cnt & K32V64_WRITE_IN_PROGRESS));
> +
> + found = cmp_f(&table->t[bucket], key, &val);
> + if (unlikely((found == 0) &&
> + (!SLIST_EMPTY(&table->t[bucket].head)))) {
> + SLIST_FOREACH(ent, &table->t[bucket].head, next) {
> + if (ent->key == key) {
> + val = ent->val;
> + found = 1;
> + break;
> + }
> + }
> + }
> + __atomic_thread_fence(__ATOMIC_RELEASE);
> + } while (unlikely(cnt != __atomic_load_n(&table->t[bucket].cnt,
> + __ATOMIC_RELAXED)));
> +
> + if (found == 1) {
> + *value = val;
> + return 0;
> + } else
> + return -ENOENT;
> +}
> +
> +struct rte_kv_hash_table *
> +k32v64_hash_create(const struct rte_kv_hash_params *params);
> +
> +void
> +k32v64_hash_free(struct rte_kv_hash_table *ht);
> diff --git a/lib/librte_hash/k32v64_hash_avx512vl.c
> b/lib/librte_hash/k32v64_hash_avx512vl.c
> new file mode 100644
> index 0000000..75cede5
> --- /dev/null
> +++ b/lib/librte_hash/k32v64_hash_avx512vl.c
> @@ -0,0 +1,59 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Intel Corporation
> + */
> +
> +#include "k32v64_hash.h"
> +
> +int
> +k32v64_hash_bulk_lookup_avx512vl(struct rte_kv_hash_table *ht, void
> *keys_p,
> + uint32_t *hashes, void *values_p, unsigned int n);
> +
> +static inline int
> +k32v64_cmp_keys_avx512vl(struct k32v64_hash_bucket *bucket, uint32_t
> key,
> + uint64_t *val)
> +{
> + __m128i keys, srch_key;
> + __mmask8 msk;
> +
> + keys = _mm_load_si128((void *)bucket);
> + srch_key = _mm_set1_epi32(key);
> +
> + msk = _mm_mask_cmpeq_epi32_mask(bucket->key_mask, keys,
> srch_key);
> + if (msk) {
> + *val = bucket->val[__builtin_ctz(msk)];
> + return 1;
> + }
> +
> + return 0;
> +}
> +
> +static inline int
> +k32v64_hash_lookup_avx512vl(struct k32v64_hash_table *table, uint32_t
> key,
> + uint32_t hash, uint64_t *value)
> +{
> + return __k32v64_hash_lookup(table, key, hash, value,
> + k32v64_cmp_keys_avx512vl);
> +}
> +
> +int
> +k32v64_hash_bulk_lookup_avx512vl(struct rte_kv_hash_table *ht, void
> *keys_p,
> + uint32_t *hashes, void *values_p, unsigned int n) {
> + struct k32v64_hash_table *table = (struct k32v64_hash_table *)ht;
> + uint32_t *keys = keys_p;
> + uint64_t *values = values_p;
> + int ret, cnt = 0;
> + unsigned int i;
> +
> + if (unlikely((table == NULL) || (keys == NULL) || (hashes == NULL) ||
> + (values == NULL)))
> + return -EINVAL;
> +
> + for (i = 0; i < n; i++) {
> + ret = k32v64_hash_lookup_avx512vl(table, keys[i], hashes[i],
> + &values[i]);
> + if (ret == 0)
> + cnt++;
> + }
> + return cnt;
> +}
> diff --git a/lib/librte_hash/meson.build b/lib/librte_hash/meson.build index
> 6ab46ae..0d014ea 100644
> --- a/lib/librte_hash/meson.build
> +++ b/lib/librte_hash/meson.build
> @@ -3,10 +3,23 @@
>
> headers = files('rte_crc_arm64.h',
> 'rte_fbk_hash.h',
> + 'rte_kv_hash.h',
> 'rte_hash_crc.h',
> 'rte_hash.h',
> 'rte_jhash.h',
> 'rte_thash.h')
>
> -sources = files('rte_cuckoo_hash.c', 'rte_fbk_hash.c')
> -deps += ['ring']
> +sources = files('rte_cuckoo_hash.c', 'rte_fbk_hash.c', 'rte_kv_hash.c',
> +	'k32v64_hash.c')
> +deps += ['ring', 'mempool']
> +
> +if dpdk_conf.has('RTE_ARCH_X86')
> + if cc.has_argument('-mavx512vl')
> + avx512_tmplib = static_library('avx512_tmp',
> + 'k32v64_hash_avx512vl.c',
> + dependencies: static_rte_mempool,
> + c_args: cflags + ['-mavx512vl'])
> + objs += avx512_tmplib.extract_objects('k32v64_hash_avx512vl.c')
> + cflags += '-DCC_AVX512VL_SUPPORT'
> +
> + endif
> +endif
> diff --git a/lib/librte_hash/rte_hash_version.map b/lib/librte_hash/rte_hash_version.map
> index c2a9094..614e0a5 100644
> --- a/lib/librte_hash/rte_hash_version.map
> +++ b/lib/librte_hash/rte_hash_version.map
> @@ -36,5 +36,9 @@ EXPERIMENTAL {
> rte_hash_lookup_with_hash_bulk;
> rte_hash_lookup_with_hash_bulk_data;
> rte_hash_max_key_id;
> -
> + rte_kv_hash_create;
> + rte_kv_hash_find_existing;
> + rte_kv_hash_free;
> + rte_kv_hash_add;
> + rte_kv_hash_delete;
> };
> diff --git a/lib/librte_hash/rte_kv_hash.c b/lib/librte_hash/rte_kv_hash.c
> new file mode 100644
> index 0000000..03df8db
> --- /dev/null
> +++ b/lib/librte_hash/rte_kv_hash.c
> @@ -0,0 +1,184 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Intel Corporation
> + */
> +
> +#include <string.h>
> +
> +#include <rte_eal_memconfig.h>
> +#include <rte_errno.h>
> +#include <rte_malloc.h>
> +#include <rte_memory.h>
> +#include <rte_tailq.h>
> +
> +#include <rte_kv_hash.h>
[Wang, Yipeng] I think for this header we should use quotes ""
> +#include "k32v64_hash.h"
> +
> +TAILQ_HEAD(rte_kv_hash_list, rte_tailq_entry);
> +
> +static struct rte_tailq_elem rte_kv_hash_tailq = {
> + .name = "RTE_KV_HASH",
> +};
> +
> +EAL_REGISTER_TAILQ(rte_kv_hash_tailq);
> +
> +int
> +rte_kv_hash_add(struct rte_kv_hash_table *table, void *key,
> + uint32_t hash, void *value, int *found) {
> + if (table == NULL)
[Wang, Yipeng] To be more consistent,
I think we should either also check key/value/found here, or not check at all and leave it to the next functions.
> + return -EINVAL;
> +
> + return table->modify(table, key, hash, RTE_KV_MODIFY_ADD,
> + value, found);
> +}
> +
> +int
> +rte_kv_hash_delete(struct rte_kv_hash_table *table, void *key,
> + uint32_t hash, void *value)
> +{
> + int found;
> +
> + if (table == NULL)
> + return -EINVAL;
> +
> + return table->modify(table, key, hash, RTE_KV_MODIFY_DEL,
> + value, &found);
> +}
> +
> +struct rte_kv_hash_table *
> +rte_kv_hash_find_existing(const char *name) {
> + struct rte_kv_hash_table *h = NULL;
> + struct rte_tailq_entry *te;
> + struct rte_kv_hash_list *kv_hash_list;
> +
> + kv_hash_list = RTE_TAILQ_CAST(rte_kv_hash_tailq.head,
> + rte_kv_hash_list);
> +
> + rte_mcfg_tailq_read_lock();
> + TAILQ_FOREACH(te, kv_hash_list, next) {
> + h = (struct rte_kv_hash_table *) te->data;
> + if (strncmp(name, h->name, RTE_KV_HASH_NAMESIZE) == 0)
> + break;
> + }
> + rte_mcfg_tailq_read_unlock();
> + if (te == NULL) {
> + rte_errno = ENOENT;
> + return NULL;
> + }
> + return h;
> +}
> +
> +struct rte_kv_hash_table *
> +rte_kv_hash_create(const struct rte_kv_hash_params *params) {
> + char hash_name[RTE_KV_HASH_NAMESIZE];
> + struct rte_kv_hash_table *ht, *tmp_ht = NULL;
> + struct rte_tailq_entry *te;
> + struct rte_kv_hash_list *kv_hash_list;
> + int ret;
> +
> + if ((params == NULL) || (params->name == NULL) ||
> + (params->entries == 0) ||
> + (params->type >= RTE_KV_HASH_MAX)) {
> + rte_errno = EINVAL;
> + return NULL;
> + }
> +
> + kv_hash_list = RTE_TAILQ_CAST(rte_kv_hash_tailq.head,
> + rte_kv_hash_list);
> +
> + ret = snprintf(hash_name, sizeof(hash_name), "KV_%s", params->name);
> + if (ret < 0 || ret >= RTE_KV_HASH_NAMESIZE) {
> + rte_errno = ENAMETOOLONG;
> + return NULL;
> + }
> +
> + switch (params->type) {
> + case RTE_KV_HASH_K32V64:
> + ht = k32v64_hash_create(params);
> + break;
> + default:
> + rte_errno = EINVAL;
> + return NULL;
> + }
> + if (ht == NULL)
> + return ht;
> +
> + rte_mcfg_tailq_write_lock();
> + TAILQ_FOREACH(te, kv_hash_list, next) {
> + tmp_ht = (struct rte_kv_hash_table *) te->data;
> + if (strncmp(params->name, tmp_ht->name,
> + RTE_KV_HASH_NAMESIZE) == 0)
> + break;
> + }
> + if (te != NULL) {
> + rte_errno = EEXIST;
> + goto exit;
> + }
> +
> + te = rte_zmalloc("KV_HASH_TAILQ_ENTRY", sizeof(*te), 0);
> + if (te == NULL) {
> + RTE_LOG(ERR, HASH, "Failed to allocate tailq entry\n");
> + goto exit;
> + }
> +
> + ht->type = params->type;
> + te->data = (void *)ht;
> + TAILQ_INSERT_TAIL(kv_hash_list, te, next);
> +
> + rte_mcfg_tailq_write_unlock();
> +
> + return ht;
> +
> +exit:
> + rte_mcfg_tailq_write_unlock();
> + switch (params->type) {
> + case RTE_KV_HASH_K32V64:
> + k32v64_hash_free(ht);
> + break;
> + default:
> + break;
> + }
> + return NULL;
> +}
> +
> +void
> +rte_kv_hash_free(struct rte_kv_hash_table *ht) {
> + struct rte_tailq_entry *te;
> + struct rte_kv_hash_list *kv_hash_list;
> +
> + if (ht == NULL)
> + return;
> +
> + kv_hash_list = RTE_TAILQ_CAST(rte_kv_hash_tailq.head,
> + rte_kv_hash_list);
> +
> + rte_mcfg_tailq_write_lock();
> +
> + /* find out tailq entry */
> + TAILQ_FOREACH(te, kv_hash_list, next) {
> + if (te->data == (void *) ht)
> + break;
> + }
> +
> +
> + if (te == NULL) {
> + rte_mcfg_tailq_write_unlock();
> + return;
> + }
> +
> + TAILQ_REMOVE(kv_hash_list, te, next);
> +
> + rte_mcfg_tailq_write_unlock();
> +
> + switch (ht->type) {
> + case RTE_KV_HASH_K32V64:
> + k32v64_hash_free(ht);
> + break;
> + default:
> + break;
> + }
> + rte_free(te);
> +}
> diff --git a/lib/librte_hash/rte_kv_hash.h b/lib/librte_hash/rte_kv_hash.h
> new file mode 100644
> index 0000000..c0375d1
> --- /dev/null
> +++ b/lib/librte_hash/rte_kv_hash.h
> @@ -0,0 +1,169 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Intel Corporation
> + */
> +
> +#ifndef _RTE_KV_HASH_H_
> +#define _RTE_KV_HASH_H_
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include <rte_compat.h>
> +#include <rte_atomic.h>
> +#include <rte_mempool.h>
> +
> +#define RTE_KV_HASH_NAMESIZE 32
> +
> +enum rte_kv_hash_type {
> + RTE_KV_HASH_K32V64,
> + RTE_KV_HASH_MAX
> +};
> +
> +enum rte_kv_modify_op {
> + RTE_KV_MODIFY_ADD,
> + RTE_KV_MODIFY_DEL,
> + RTE_KV_MODIFY_OP_MAX
> +};
[Wang, Yipeng] Again, any particular reason you combine add and del into an additional modify op?
> +
> +struct rte_kv_hash_params {
> + const char *name;
> + uint32_t entries;
> + int socket_id;
> + enum rte_kv_hash_type type;
> +};
> +
> +struct rte_kv_hash_table;
> +
> +typedef int (*rte_kv_hash_bulk_lookup_t) (struct rte_kv_hash_table
> +*table, void *keys, uint32_t *hashes,
> + void *values, unsigned int n);
> +
> +typedef int (*rte_kv_hash_modify_t)
> +(struct rte_kv_hash_table *table, void *key, uint32_t hash,
> + enum rte_kv_modify_op op, void *value, int *found);
> +
> +struct rte_kv_hash_table {
> + char name[RTE_KV_HASH_NAMESIZE]; /**< Name of the
> hash. */
> + rte_kv_hash_bulk_lookup_t lookup;
> + rte_kv_hash_modify_t modify;
> + enum rte_kv_hash_type type;
> +};
> +
> +/**
> + * Lookup bulk of keys.
> + * This function is multi-thread safe with regarding to other lookup threads.
> + *
> + * @param table
> + * Hash table to add the key to.
> + * @param keys
> + * Pointer to array of keys
> + * @param hashes
> + * Pointer to array of hash values associated with keys.
> + * @param values
> + * Pointer to array of value corresponded to keys
> + * If the key was not found the corresponding value remains intact.
> + * @param n
> + * Number of keys to lookup in batch.
> + * @return
> + * -EINVAL if there's an error, otherwise number of successful lookups.
> + */
[Wang, Yipeng] experimental tag
> +static inline int
> +rte_kv_hash_bulk_lookup(struct rte_kv_hash_table *table,
> + void *keys, uint32_t *hashes, void *values, unsigned int n) {
> + return table->lookup(table, keys, hashes, values, n); }
> +
> +/**
> + * Add a key to an existing hash table with hash value.
> + * This operation is not multi-thread safe regarding to add/delete
> +functions
> + * and should only be called from one thread.
> + * However it is safe to call it along with lookup.
> + *
> + * @param table
> + * Hash table to add the key to.
> + * @param key
> + * Key to add to the hash table.
> + * @param value
> + * Value to associate with key.
> + * @param hash
> + * Hash value associated with key.
> + * @found
> + * 0 if no previously added key was found
> + * 1 previously added key was found, old value associated with the key
> + * was written to *value
> + * @return
> + * 0 if ok, or negative value on error.
> + */
> +__rte_experimental
> +int
> +rte_kv_hash_add(struct rte_kv_hash_table *table, void *key,
> + uint32_t hash, void *value, int *found);
> +
> +/**
> + * Remove a key with a given hash value from an existing hash table.
> + * This operation is not multi-thread safe regarding to add/delete
> +functions
> + * and should only be called from one thread.
> + * However it is safe to call it along with lookup.
> + *
> + * @param table
> + * Hash table to remove the key from.
> + * @param key
> + * Key to remove from the hash table.
> + * @param hash
> + * hash value associated with key.
> + * @param value
> + * pointer to memory where the old value will be written to on success
> + * @return
> + * 0 if ok, or negative value on error.
> + */
> +__rte_experimental
> +int
> +rte_kv_hash_delete(struct rte_kv_hash_table *table, void *key,
> + uint32_t hash, void *value);
> +
> +/**
> + * Performs a lookup for an existing hash table, and returns a pointer
> +to
> + * the table if found.
> + *
> + * @param name
> + * Name of the hash table to find
> + *
> + * @return
> + * pointer to hash table structure or NULL on error with rte_errno
> + * set appropriately.
> + */
> +__rte_experimental
> +struct rte_kv_hash_table *
> +rte_kv_hash_find_existing(const char *name);
> +
> +/**
> + * Create a new hash table for use with four byte keys.
[Wang, Yipeng] inaccurate comment, not four byte keys
> + *
> + * @param params
> + * Parameters used in creation of hash table.
> + *
> + * @return
> + * Pointer to hash table structure that is used in future hash table
> + * operations, or NULL on error with rte_errno set appropriately.
> + */
> +__rte_experimental
> +struct rte_kv_hash_table *
> +rte_kv_hash_create(const struct rte_kv_hash_params *params);
> +
> +/**
> + * Free all memory used by a hash table.
> + *
> + * @param table
> + * Hash table to deallocate.
> + */
> +__rte_experimental
> +void
> +rte_kv_hash_free(struct rte_kv_hash_table *table);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_KV_HASH_H_ */
> --
> 2.7.4
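For reference, a minimal usage sketch of the public API quoted above (hash
signature computed by the caller, e.g. with rte_jhash_1word(); error handling
elided):

	struct rte_kv_hash_params p = {
		.name = "kv_test",
		.entries = 1 << 16,
		.socket_id = SOCKET_ID_ANY,
		.type = RTE_KV_HASH_K32V64,
	};
	struct rte_kv_hash_table *t = rte_kv_hash_create(&p);
	uint32_t key = 42;
	uint32_t hash = rte_jhash_1word(key, 0);
	uint64_t value = 100, old;
	int found;

	rte_kv_hash_add(t, &key, hash, &value, &found); /* old value lands in value if found == 1 */
	rte_kv_hash_bulk_lookup(t, &key, &hash, &value, 1);
	rte_kv_hash_delete(t, &key, hash, &old);
	rte_kv_hash_free(t);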
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [dpdk-dev] [PATCH v4 1/4] hash: add kv hash library
2020-05-08 19:58 ` [dpdk-dev] [PATCH v4 1/4] hash: add kv hash library Vladimir Medvedkin
2020-06-23 15:44 ` Ananyev, Konstantin
2020-06-24 1:19 ` Wang, Yipeng1
@ 2020-06-25 4:27 ` Honnappa Nagarahalli
2 siblings, 0 replies; 56+ messages in thread
From: Honnappa Nagarahalli @ 2020-06-25 4:27 UTC (permalink / raw)
To: Vladimir Medvedkin, dev
Cc: konstantin.ananyev, yipeng1.wang, sameh.gobriel,
bruce.richardson, nd, Honnappa Nagarahalli, nd
Hi Vladimir,
Few comments inline.
> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Vladimir Medvedkin
> Sent: Friday, May 8, 2020 2:59 PM
> To: dev@dpdk.org
> Cc: konstantin.ananyev@intel.com; yipeng1.wang@intel.com;
> sameh.gobriel@intel.com; bruce.richardson@intel.com
> Subject: [dpdk-dev] [PATCH v4 1/4] hash: add kv hash library
>
> KV hash is a special optimized key-value storage for fixed key and value sizes.
> At the moment it supports 32 bit keys and 64 bit values. This table is hash
> function agnostic so user must provide precalculated hash signature for
> add/delete/lookup operations.
>
> Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
> ---
> lib/Makefile | 2 +-
> lib/librte_hash/Makefile | 14 +-
> lib/librte_hash/k32v64_hash.c | 277
> +++++++++++++++++++++++++++++++++
> lib/librte_hash/k32v64_hash.h | 98 ++++++++++++
> lib/librte_hash/k32v64_hash_avx512vl.c | 59 +++++++
> lib/librte_hash/meson.build | 17 +-
> lib/librte_hash/rte_hash_version.map | 6 +-
> lib/librte_hash/rte_kv_hash.c | 184 ++++++++++++++++++++++
> lib/librte_hash/rte_kv_hash.h | 169 ++++++++++++++++++++
> 9 files changed, 821 insertions(+), 5 deletions(-)
> create mode 100644 lib/librte_hash/k32v64_hash.c
> create mode 100644 lib/librte_hash/k32v64_hash.h
> create mode 100644 lib/librte_hash/k32v64_hash_avx512vl.c
> create mode 100644 lib/librte_hash/rte_kv_hash.c
> create mode 100644 lib/librte_hash/rte_kv_hash.h
>
> diff --git a/lib/Makefile b/lib/Makefile
> index 9d24609..42769e9 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -48,7 +48,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
> DEPDIRS-librte_vhost := librte_eal librte_mempool librte_mbuf librte_ethdev \
> librte_net librte_hash librte_cryptodev
> DIRS-$(CONFIG_RTE_LIBRTE_HASH) += librte_hash
> -DEPDIRS-librte_hash := librte_eal librte_ring
> +DEPDIRS-librte_hash := librte_eal librte_ring librte_mempool
> DIRS-$(CONFIG_RTE_LIBRTE_EFD) += librte_efd
> DEPDIRS-librte_efd := librte_eal librte_ring librte_hash
> DIRS-$(CONFIG_RTE_LIBRTE_RIB) += librte_rib
> diff --git a/lib/librte_hash/Makefile b/lib/librte_hash/Makefile
> index ec9f864..a0cdee9 100644
> --- a/lib/librte_hash/Makefile
> +++ b/lib/librte_hash/Makefile
> @@ -8,13 +8,15 @@ LIB = librte_hash.a
>
> CFLAGS += -O3
> CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
> -LDLIBS += -lrte_eal -lrte_ring
> +LDLIBS += -lrte_eal -lrte_ring -lrte_mempool
>
> EXPORT_MAP := rte_hash_version.map
>
> # all source are stored in SRCS-y
> SRCS-$(CONFIG_RTE_LIBRTE_HASH) := rte_cuckoo_hash.c
> SRCS-$(CONFIG_RTE_LIBRTE_HASH) += rte_fbk_hash.c
> +SRCS-$(CONFIG_RTE_LIBRTE_HASH) += rte_kv_hash.c
> +SRCS-$(CONFIG_RTE_LIBRTE_HASH) += k32v64_hash.c
>
> # install this header file
> SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include := rte_hash.h @@ -27,5
> +29,15 @@ endif SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include +=
> rte_jhash.h SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_thash.h
> SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_fbk_hash.h
> +SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_kv_hash.h
> +
> +CC_AVX512VL_SUPPORT=$(shell $(CC) -mavx512vl -dM -E - </dev/null 2>&1 | \
> +grep -q __AVX512VL__ && echo 1)
> +
> +ifeq ($(CC_AVX512VL_SUPPORT), 1)
> + SRCS-$(CONFIG_RTE_LIBRTE_HASH) += k32v64_hash_avx512vl.c
> + CFLAGS_k32v64_hash_avx512vl.o += -mavx512vl
> + CFLAGS_k32v64_hash.o += -DCC_AVX512VL_SUPPORT
> +endif
>
> include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/lib/librte_hash/k32v64_hash.c b/lib/librte_hash/k32v64_hash.c
> new file mode 100644
> index 0000000..24cd63a
> --- /dev/null
> +++ b/lib/librte_hash/k32v64_hash.c
> @@ -0,0 +1,277 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Intel Corporation
> + */
> +
> +#include <string.h>
> +
> +#include <rte_errno.h>
> +#include <rte_malloc.h>
> +#include <rte_memory.h>
> +
> +#include "k32v64_hash.h"
> +
> +static inline int
> +k32v64_hash_lookup(struct k32v64_hash_table *table, uint32_t key,
> + uint32_t hash, uint64_t *value)
> +{
> + return __k32v64_hash_lookup(table, key, hash, value,
> + __kv_cmp_keys);
> +}
Since this is an inline function, would it be better to push it to the header file?
> +
> +static int
> +k32v64_hash_bulk_lookup(struct rte_kv_hash_table *ht, void *keys_p,
> + uint32_t *hashes, void *values_p, unsigned int n) {
> + struct k32v64_hash_table *table = (struct k32v64_hash_table *)ht;
> + uint32_t *keys = keys_p;
> + uint64_t *values = values_p;
> + int ret, cnt = 0;
> + unsigned int i;
> +
> + if (unlikely((table == NULL) || (keys == NULL) || (hashes == NULL) ||
> + (values == NULL)))
> + return -EINVAL;
> +
> + for (i = 0; i < n; i++) {
> + ret = k32v64_hash_lookup(table, keys[i], hashes[i],
> + &values[i]);
> + if (ret == 0)
> + cnt++;
> + }
> + return cnt;
> +}
> +
> +#ifdef CC_AVX512VL_SUPPORT
> +int
> +k32v64_hash_bulk_lookup_avx512vl(struct rte_kv_hash_table *ht,
> + void *keys_p, uint32_t *hashes, void *values_p, unsigned int n);
> +#endif
> +
> +static rte_kv_hash_bulk_lookup_t
> +get_lookup_bulk_fn(void)
> +{
> +#ifdef CC_AVX512VL_SUPPORT
> + if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512VL))
> + return k32v64_hash_bulk_lookup_avx512vl;
> +#endif
> + return k32v64_hash_bulk_lookup;
> +}
> +
> +static int
> +k32v64_hash_add(struct k32v64_hash_table *table, uint32_t key,
> + uint32_t hash, uint64_t value, uint64_t *old_value, int *found) {
> + uint32_t bucket;
> + int i, idx, ret;
> + uint8_t msk;
> + struct k32v64_ext_ent *tmp, *ent, *prev = NULL;
> +
> + if (table == NULL)
> + return -EINVAL;
> +
> + bucket = hash & table->bucket_msk;
> + /* Search key in table. Update value if exists */
> + for (i = 0; i < K32V64_KEYS_PER_BUCKET; i++) {
> + if ((key == table->t[bucket].key[i]) &&
> + (table->t[bucket].key_mask & (1 << i))) {
> + *old_value = table->t[bucket].val[i];
> + *found = 1;
> + __atomic_fetch_add(&table->t[bucket].cnt, 1,
> + __ATOMIC_ACQUIRE);
> + table->t[bucket].val[i] = value;
Suggest using a C11 builtin to store the value. As far as I know, all the supported platforms in DPDK have a 64b atomic store in 32b mode.
With this we will be able to avoid incrementing the counter. The reader will get either the old or the new value.
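I.e. something like (sketch):

	__atomic_store_n(&table->t[bucket].val[i], value, __ATOMIC_RELAXED);

so a concurrent reader observes either the old or the new 64b value, never a
torn one.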
> + __atomic_fetch_add(&table->t[bucket].cnt, 1,
> + __ATOMIC_RELEASE);
> + return 0;
> + }
> + }
> +
> + if (!SLIST_EMPTY(&table->t[bucket].head)) {
> + SLIST_FOREACH(ent, &table->t[bucket].head, next) {
> + if (ent->key == key) {
> + *old_value = ent->val;
> + *found = 1;
> + __atomic_fetch_add(&table->t[bucket].cnt, 1,
> + __ATOMIC_ACQUIRE);
> + ent->val = value;
Same here: __atomic_store_n(&ent->val, value, __ATOMIC_RELAXED).
> + __atomic_fetch_add(&table->t[bucket].cnt, 1,
> + __ATOMIC_RELEASE);
> + return 0;
> + }
> + }
> + }
> +
> + msk = ~table->t[bucket].key_mask & VALID_KEY_MSK;
> + if (msk) {
> + idx = __builtin_ctz(msk);
> + table->t[bucket].key[idx] = key;
> + table->t[bucket].val[idx] = value;
> + __atomic_or_fetch(&table->t[bucket].key_mask, 1 << idx,
> + __ATOMIC_RELEASE);
> + table->nb_ent++;
Looks like bucket counter logic is needed for this case too. Please see the comments below in k32v64_hash_delete.
> + *found = 0;
> + return 0;
> + }
> +
> + ret = rte_mempool_get(table->ext_ent_pool, (void **)&ent);
> + if (ret < 0)
> + return ret;
> +
> + SLIST_NEXT(ent, next) = NULL;
> + ent->key = key;
> + ent->val = value;
> + rte_smp_wmb();
We need to avoid using rte_smp_* barriers as we are adopting C11 built-in atomics. See the below comment.
> + SLIST_FOREACH(tmp, &table->t[bucket].head, next)
> + prev = tmp;
> +
> + if (prev == NULL)
> + SLIST_INSERT_HEAD(&table->t[bucket].head, ent, next);
> + else
> + SLIST_INSERT_AFTER(prev, ent, next);
Both inserts need to use release order when 'ent' is linked in. I am not sure where the SLIST implementation is being picked up from, but looking at the one in 'lib/librte_eal/windows/include/sys/queue.h', it is not implemented using C11. I think we could move queue.h from the windows directory to a common directory and change the SLIST implementation.
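For example, the publish step of the insert could become (a sketch; sle_next
per the BSD queue.h entry layout):

	ent->next.sle_next = prev->next.sle_next;
	/* release: node contents become visible before the link does */
	__atomic_store_n(&prev->next.sle_next, ent, __ATOMIC_RELEASE);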
> +
> + table->nb_ent++;
> + table->nb_ext_ent++;
> + *found = 0;
> + return 0;
> +}
> +
> +static int
> +k32v64_hash_delete(struct k32v64_hash_table *table, uint32_t key,
> + uint32_t hash, uint64_t *old_value)
> +{
> + uint32_t bucket;
> + int i;
> + struct k32v64_ext_ent *ent;
> +
> + if (table == NULL)
> + return -EINVAL;
> +
> + bucket = hash & table->bucket_msk;
> +
> + for (i = 0; i < K32V64_KEYS_PER_BUCKET; i++) {
> + if ((key == table->t[bucket].key[i]) &&
> + (table->t[bucket].key_mask & (1 << i))) {
> + *old_value = table->t[bucket].val[i];
> + ent = SLIST_FIRST(&table->t[bucket].head);
> + if (ent) {
> + __atomic_fetch_add(&table->t[bucket].cnt, 1,
> + __ATOMIC_ACQUIRE);
> + table->t[bucket].key[i] = ent->key;
> + table->t[bucket].val[i] = ent->val;
> + SLIST_REMOVE_HEAD(&table->t[bucket].head,
> next);
> + __atomic_fetch_add(&table->t[bucket].cnt, 1,
> + __ATOMIC_RELEASE);
> + table->nb_ext_ent--;
> + } else
> + __atomic_and_fetch(&table->t[bucket].key_mask,
> + ~(1 << i), __ATOMIC_RELEASE);
(taking note from your responses to my comments in v3)
It is possible that the reader might match the old key but get the new value:
1) Reader: successful key match
2) Writer: k32v64_hash_delete followed by k32v64_hash_add
3) Reader: reads the value
IMO, there are 2 ways to solve this issue:
1) Use the bucket count logic while adding an entry to the non-extended bucket (marked in k32v64_hash_add above).
2) Do not reuse the entry in the bucket till the readers indicate that they have stopped using it.
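Option 1 would look roughly like this in the empty-slot branch of
k32v64_hash_add() (sketch):

	__atomic_fetch_add(&table->t[bucket].cnt, 1, __ATOMIC_ACQUIRE);
	table->t[bucket].key[idx] = key;
	table->t[bucket].val[idx] = value;
	table->t[bucket].key_mask |= 1 << idx;
	__atomic_fetch_add(&table->t[bucket].cnt, 1, __ATOMIC_RELEASE);

so a reader racing with the delete/add pair sees an odd or changed counter
and retries.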
> + if (ent)
> + rte_mempool_put(table->ext_ent_pool, ent);
> + table->nb_ent--;
> + return 0;
> + }
> + }
> +
> + SLIST_FOREACH(ent, &table->t[bucket].head, next)
> + if (ent->key == key)
> + break;
> +
> + if (ent == NULL)
> + return -ENOENT;
> +
> + *old_value = ent->val;
> +
> + __atomic_fetch_add(&table->t[bucket].cnt, 1, __ATOMIC_ACQUIRE);
> + SLIST_REMOVE(&table->t[bucket].head, ent, k32v64_ext_ent, next);
> + __atomic_fetch_add(&table->t[bucket].cnt, 1, __ATOMIC_RELEASE);
> + rte_mempool_put(table->ext_ent_pool, ent);
> +
> + table->nb_ext_ent--;
> + table->nb_ent--;
> +
> + return 0;
> +}
> +
> +static int
> +k32v64_modify(struct rte_kv_hash_table *table, void *key_p, uint32_t hash,
> + enum rte_kv_modify_op op, void *value_p, int *found) {
> + struct k32v64_hash_table *ht = (struct k32v64_hash_table *)table;
> + uint32_t *key = key_p;
> + uint64_t value;
> +
> + if ((ht == NULL) || (key == NULL) || (value_p == NULL) ||
> + (found == NULL) || (op >= RTE_KV_MODIFY_OP_MAX))
> {
> + return -EINVAL;
> + }
Suggest doing this check in rte_kv_hash_add/rte_kv_hash_delete, so that every implementation does not have to repeat it.
> +
> + value = *(uint64_t *)value_p;
In the API 'rte_kv_hash_add', value_p is 'void *' which does not convey that it is a pointer to 64b data. What happens when running on 32b systems?
> + switch (op) {
> + case RTE_KV_MODIFY_ADD:
> + return k32v64_hash_add(ht, *key, hash, value, value_p,
> found);
> + case RTE_KV_MODIFY_DEL:
> + return k32v64_hash_delete(ht, *key, hash, value_p);
> + default:
> + break;
> + }
> +
> + return -EINVAL;
> +}
> +
> +struct rte_kv_hash_table *
> +k32v64_hash_create(const struct rte_kv_hash_params *params) {
This is a private symbol; I think it needs to have an '__rte' prefix?
> + char hash_name[RTE_KV_HASH_NAMESIZE];
> + struct k32v64_hash_table *ht = NULL;
> + uint32_t mem_size, nb_buckets, max_ent;
> + int ret;
> + struct rte_mempool *mp;
> +
> + if ((params == NULL) || (params->name == NULL) ||
> + (params->entries == 0)) {
> + rte_errno = EINVAL;
> + return NULL;
> + }
nit: these checks were already done in 'rte_kv_hash_create', so they can be skipped here.
> +
> + ret = snprintf(hash_name, sizeof(hash_name), "KV_%s", params->name);
> + if (ret < 0 || ret >= RTE_KV_HASH_NAMESIZE) {
> + rte_errno = ENAMETOOLONG;
> + return NULL;
> + }
Same here, this is checked in the calling function.
> +
> + max_ent = rte_align32pow2(params->entries);
> + nb_buckets = max_ent / K32V64_KEYS_PER_BUCKET;
> + mem_size = sizeof(struct k32v64_hash_table) +
> + sizeof(struct k32v64_hash_bucket) * nb_buckets;
> +
> + mp = rte_mempool_create(hash_name, max_ent,
> + sizeof(struct k32v64_ext_ent), 0, 0, NULL, NULL, NULL, NULL,
> + params->socket_id, 0);
> +
> + if (mp == NULL)
> + return NULL;
> +
> + ht = rte_zmalloc_socket(hash_name, mem_size,
> + RTE_CACHE_LINE_SIZE, params->socket_id);
> + if (ht == NULL) {
> + rte_mempool_free(mp);
> + return NULL;
> + }
> +
> + memcpy(ht->pub.name, hash_name, sizeof(ht->pub.name));
> + ht->max_ent = max_ent;
> + ht->bucket_msk = nb_buckets - 1;
> + ht->ext_ent_pool = mp;
> + ht->pub.lookup = get_lookup_bulk_fn();
> + ht->pub.modify = k32v64_modify;
> +
> + return (struct rte_kv_hash_table *)ht;
> +}
> +
> +void
> +k32v64_hash_free(struct rte_kv_hash_table *ht) {
This is a private symbol, I think it needs to have a '__rte' prefix?
> + if (ht == NULL)
> + return;
> +
> + rte_mempool_free(((struct k32v64_hash_table *)ht)->ext_ent_pool);
> + rte_free(ht);
> +}
> diff --git a/lib/librte_hash/k32v64_hash.h b/lib/librte_hash/k32v64_hash.h
> new file mode 100644
> index 0000000..10061a5
> --- /dev/null
> +++ b/lib/librte_hash/k32v64_hash.h
> @@ -0,0 +1,98 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Intel Corporation
> + */
> +
> +#include <rte_kv_hash.h>
> +
> +#define K32V64_KEYS_PER_BUCKET 4
> +#define K32V64_WRITE_IN_PROGRESS 1
> +#define VALID_KEY_MSK ((1 << K32V64_KEYS_PER_BUCKET) - 1)
> +
> +struct k32v64_ext_ent {
> + SLIST_ENTRY(k32v64_ext_ent) next;
> + uint32_t key;
> + uint64_t val;
> +};
> +
> +struct k32v64_hash_bucket {
> + uint32_t key[K32V64_KEYS_PER_BUCKET];
> + uint64_t val[K32V64_KEYS_PER_BUCKET];
> + uint8_t key_mask;
> + uint32_t cnt;
> + SLIST_HEAD(k32v64_list_head, k32v64_ext_ent) head; }
> +__rte_cache_aligned;
> +
> +struct k32v64_hash_table {
> + struct rte_kv_hash_table pub; /**< Public part */
> + uint32_t nb_ent; /**< Number of entities in the table*/
> + uint32_t nb_ext_ent; /**< Number of extended entities */
> + uint32_t max_ent; /**< Maximum number of entities */
> + uint32_t bucket_msk;
> + struct rte_mempool *ext_ent_pool;
> + __extension__ struct k32v64_hash_bucket t[0];
> +};
> +
> +typedef int (*k32v64_cmp_fn_t)
> +(struct k32v64_hash_bucket *bucket, uint32_t key, uint64_t *val);
> +
> +static inline int
> +__kv_cmp_keys(struct k32v64_hash_bucket *bucket, uint32_t key,
> + uint64_t *val)
Changing __kv_cmp_keys to __k32v64_cmp_keys would be more consistent.
> +{
> + int i;
> +
> + for (i = 0; i < K32V64_KEYS_PER_BUCKET; i++) {
> + if ((key == bucket->key[i]) &&
> + (bucket->key_mask & (1 << i))) {
> + *val = bucket->val[i];
> + return 1;
> + }
> + }
You have to load-acquire 'key_mask' (corresponding to the store-release on 'key_mask' in add). Suggest changing this as follows:

    uint8_t key_mask;

    __atomic_load(&bucket->key_mask, &key_mask, __ATOMIC_ACQUIRE);
    for (i = 0; i < K32V64_KEYS_PER_BUCKET; i++) {
        if ((key == bucket->key[i]) && (key_mask & (1 << i))) {
            *val = bucket->val[i];
            return 1;
        }
    }
> +
> + return 0;
> +}
> +
> +static inline int
> +__k32v64_hash_lookup(struct k32v64_hash_table *table, uint32_t key,
> + uint32_t hash, uint64_t *value, k32v64_cmp_fn_t cmp_f) {
> + uint64_t val = 0;
> + struct k32v64_ext_ent *ent;
> + uint32_t cnt;
> + int found = 0;
> + uint32_t bucket = hash & table->bucket_msk;
> +
> + do {
> +
> + do {
> + cnt = __atomic_load_n(&table->t[bucket].cnt,
> + __ATOMIC_ACQUIRE);
> + } while (unlikely(cnt & K32V64_WRITE_IN_PROGRESS));
Agree that it is a small section, but the issue can happen, and it makes the algorithm unacceptable in many use cases. IMHO, since we have identified the issue, we should fix it.
The issue is mainly due to not following the reader-writer concurrency design principles, i.e. the data that we want to communicate from writer to reader (key and value in this case) is not communicated atomically. For example, in the rte_hash/cuckoo hash library, you can see that the key and pData are communicated atomically using the key store index.
I might be wrong, but I do not think we can make this block-free (readers move forward even when the writer is not scheduled) using a bucket counter.
This problem does not exist in the existing rte_hash library. It might not perform as well as this one, but it is block-free.
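For reference, a much simplified sketch of the key-store-index idea (names are illustrative, slot allocation/reclamation elided):

    #include <stdint.h>

    struct kv_slot {
        uint32_t key;
        uint64_t val;
    };

    /* writer: fill the slot first, then publish its index atomically */
    static void
    publish_slot(uint32_t *bucket_slot, struct kv_slot *store,
        uint32_t idx, uint32_t key, uint64_t val)
    {
        store[idx].key = key;
        store[idx].val = val;
        __atomic_store_n(bucket_slot, idx, __ATOMIC_RELEASE);
    }

    /* reader: a single acquire load yields a consistent key/value pair */
    static int
    read_slot(const uint32_t *bucket_slot, const struct kv_slot *store,
        uint32_t key, uint64_t *val)
    {
        uint32_t idx = __atomic_load_n(bucket_slot, __ATOMIC_ACQUIRE);

        if (store[idx].key != key)
            return -1;
        *val = store[idx].val;
        return 0;
    }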
> +
> + found = cmp_f(&table->t[bucket], key, &val);
> + if (unlikely((found == 0) &&
> + (!SLIST_EMPTY(&table->t[bucket].head)))) {
> + SLIST_FOREACH(ent, &table->t[bucket].head, next) {
> + if (ent->key == key) {
> + val = ent->val;
> + found = 1;
> + break;
> + }
> + }
> + }
> + __atomic_thread_fence(__ATOMIC_RELEASE);
> + } while (unlikely(cnt != __atomic_load_n(&table->t[bucket].cnt,
> + __ATOMIC_RELAXED)));
> +
> + if (found == 1) {
> + *value = val;
> + return 0;
> + } else
> + return -ENOENT;
> +}
> +
> +struct rte_kv_hash_table *
> +k32v64_hash_create(const struct rte_kv_hash_params *params);
> +
> +void
> +k32v64_hash_free(struct rte_kv_hash_table *ht);
> diff --git a/lib/librte_hash/k32v64_hash_avx512vl.c b/lib/librte_hash/k32v64_hash_avx512vl.c
> new file mode 100644
> index 0000000..75cede5
> --- /dev/null
> +++ b/lib/librte_hash/k32v64_hash_avx512vl.c
> @@ -0,0 +1,59 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Intel Corporation
> + */
> +
> +#include "k32v64_hash.h"
> +
> +int
> +k32v64_hash_bulk_lookup_avx512vl(struct rte_kv_hash_table *ht, void *keys_p,
> + uint32_t *hashes, void *values_p, unsigned int n);
> +
> +static inline int
> +k32v64_cmp_keys_avx512vl(struct k32v64_hash_bucket *bucket, uint32_t key,
> + uint64_t *val)
> +{
> + __m128i keys, srch_key;
> + __mmask8 msk;
> +
> + keys = _mm_load_si128((void *)bucket);
> + srch_key = _mm_set1_epi32(key);
> +
> + msk = _mm_mask_cmpeq_epi32_mask(bucket->key_mask, keys, srch_key);
> + if (msk) {
> + *val = bucket->val[__builtin_ctz(msk)];
> + return 1;
> + }
> +
> + return 0;
> +}
> +
> +static inline int
> +k32v64_hash_lookup_avx512vl(struct k32v64_hash_table *table, uint32_t key,
> + uint32_t hash, uint64_t *value)
> +{
> + return __k32v64_hash_lookup(table, key, hash, value,
> + k32v64_cmp_keys_avx512vl);
> +}
> +
> +int
> +k32v64_hash_bulk_lookup_avx512vl(struct rte_kv_hash_table *ht, void *keys_p,
> + uint32_t *hashes, void *values_p, unsigned int n) {
> + struct k32v64_hash_table *table = (struct k32v64_hash_table *)ht;
> + uint32_t *keys = keys_p;
> + uint64_t *values = values_p;
> + int ret, cnt = 0;
> + unsigned int i;
> +
> + if (unlikely((table == NULL) || (keys == NULL) || (hashes == NULL) ||
> + (values == NULL)))
> + return -EINVAL;
> +
> + for (i = 0; i < n; i++) {
> + ret = k32v64_hash_lookup_avx512vl(table, keys[i], hashes[i],
> + &values[i]);
> + if (ret == 0)
> + cnt++;
> + }
> + return cnt;
> +}
> diff --git a/lib/librte_hash/meson.build b/lib/librte_hash/meson.build
> index 6ab46ae..0d014ea 100644
> --- a/lib/librte_hash/meson.build
> +++ b/lib/librte_hash/meson.build
> @@ -3,10 +3,23 @@
>
> headers = files('rte_crc_arm64.h',
> 'rte_fbk_hash.h',
> + 'rte_kv_hash.h',
> 'rte_hash_crc.h',
> 'rte_hash.h',
> 'rte_jhash.h',
> 'rte_thash.h')
>
> -sources = files('rte_cuckoo_hash.c', 'rte_fbk_hash.c')
> -deps += ['ring']
> +sources = files('rte_cuckoo_hash.c', 'rte_fbk_hash.c', 'rte_kv_hash.c',
> +	'k32v64_hash.c')
> +deps += ['ring', 'mempool']
> +
> +if dpdk_conf.has('RTE_ARCH_X86')
> + if cc.has_argument('-mavx512vl')
> + avx512_tmplib = static_library('avx512_tmp',
> + 'k32v64_hash_avx512vl.c',
> + dependencies: static_rte_mempool,
> + c_args: cflags + ['-mavx512vl'])
> + objs += avx512_tmplib.extract_objects('k32v64_hash_avx512vl.c')
> + cflags += '-DCC_AVX512VL_SUPPORT'
> +
> + endif
> +endif
> diff --git a/lib/librte_hash/rte_hash_version.map b/lib/librte_hash/rte_hash_version.map
> index c2a9094..614e0a5 100644
> --- a/lib/librte_hash/rte_hash_version.map
> +++ b/lib/librte_hash/rte_hash_version.map
> @@ -36,5 +36,9 @@ EXPERIMENTAL {
> rte_hash_lookup_with_hash_bulk;
> rte_hash_lookup_with_hash_bulk_data;
> rte_hash_max_key_id;
> -
> + rte_kv_hash_create;
> + rte_kv_hash_find_existing;
> + rte_kv_hash_free;
> + rte_kv_hash_add;
> + rte_kv_hash_delete;
> };
> diff --git a/lib/librte_hash/rte_kv_hash.c b/lib/librte_hash/rte_kv_hash.c
> new file mode 100644
> index 0000000..03df8db
> --- /dev/null
> +++ b/lib/librte_hash/rte_kv_hash.c
> @@ -0,0 +1,184 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Intel Corporation
> + */
> +
> +#include <string.h>
> +
> +#include <rte_eal_memconfig.h>
> +#include <rte_errno.h>
> +#include <rte_malloc.h>
> +#include <rte_memory.h>
> +#include <rte_tailq.h>
> +
> +#include <rte_kv_hash.h>
> +#include "k32v64_hash.h"
> +
> +TAILQ_HEAD(rte_kv_hash_list, rte_tailq_entry);
> +
> +static struct rte_tailq_elem rte_kv_hash_tailq = {
> + .name = "RTE_KV_HASH",
> +};
> +
> +EAL_REGISTER_TAILQ(rte_kv_hash_tailq);
> +
> +int
> +rte_kv_hash_add(struct rte_kv_hash_table *table, void *key,
> + uint32_t hash, void *value, int *found) {
> + if (table == NULL)
> + return -EINVAL;
> +
> + return table->modify(table, key, hash, RTE_KV_MODIFY_ADD,
> + value, found);
> +}
> +
> +int
> +rte_kv_hash_delete(struct rte_kv_hash_table *table, void *key,
> + uint32_t hash, void *value)
> +{
> + int found;
> +
> + if (table == NULL)
> + return -EINVAL;
> +
> + return table->modify(table, key, hash, RTE_KV_MODIFY_DEL,
> + value, &found);
> +}
> +
> +struct rte_kv_hash_table *
> +rte_kv_hash_find_existing(const char *name) {
I did not see a test case for this. Please add a test case for 'rte_kv_hash_find_existing'.
> + struct rte_kv_hash_table *h = NULL;
> + struct rte_tailq_entry *te;
> + struct rte_kv_hash_list *kv_hash_list;
> +
> + kv_hash_list = RTE_TAILQ_CAST(rte_kv_hash_tailq.head,
> + rte_kv_hash_list);
> +
> + rte_mcfg_tailq_read_lock();
> + TAILQ_FOREACH(te, kv_hash_list, next) {
> + h = (struct rte_kv_hash_table *) te->data;
> + if (strncmp(name, h->name, RTE_KV_HASH_NAMESIZE) == 0)
> + break;
> + }
> + rte_mcfg_tailq_read_unlock();
> + if (te == NULL) {
> + rte_errno = ENOENT;
> + return NULL;
> + }
> + return h;
> +}
> +
> +struct rte_kv_hash_table *
> +rte_kv_hash_create(const struct rte_kv_hash_params *params) {
> + char hash_name[RTE_KV_HASH_NAMESIZE];
> + struct rte_kv_hash_table *ht, *tmp_ht = NULL;
> + struct rte_tailq_entry *te;
> + struct rte_kv_hash_list *kv_hash_list;
> + int ret;
> +
> + if ((params == NULL) || (params->name == NULL) ||
> + (params->entries == 0) ||
> + (params->type >= RTE_KV_HASH_MAX)) {
> + rte_errno = EINVAL;
> + return NULL;
> + }
> +
> + kv_hash_list = RTE_TAILQ_CAST(rte_kv_hash_tailq.head,
> + rte_kv_hash_list);
> +
> + ret = snprintf(hash_name, sizeof(hash_name), "KV_%s", params->name);
RTE_KV_HASH_NAMESIZE is a public #define, so it is natural for the user to use it to size the hash table name. Now we are taking 3 characters from that space. I think it is better to increase the size of 'hash_name' by 3 characters or to skip adding the 'KV_' prefix.
This also has an impact on 'rte_kv_hash_find_existing', where the user-passed string is used as-is without adding 'KV_'. Again, suggest adding a test case for 'rte_kv_hash_find_existing'.
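A concrete illustration of the mismatch (hypothetical caller code):

    struct rte_kv_hash_params p = {
        .name = "flows", .entries = 1024,
        .socket_id = 0, .type = RTE_KV_HASH_K32V64,
    };
    struct rte_kv_hash_table *t = rte_kv_hash_create(&p);
    /* the table is stored under "KV_flows", but find compares the raw
     * user string, so this likely fails with rte_errno == ENOENT */
    struct rte_kv_hash_table *f = rte_kv_hash_find_existing("flows");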
> + if (ret < 0 || ret >= RTE_KV_HASH_NAMESIZE) {
> + rte_errno = ENAMETOOLONG;
> + return NULL;
> + }
> +
> + switch (params->type) {
> + case RTE_KV_HASH_K32V64:
> + ht = k32v64_hash_create(params);
> + break;
> + default:
> + rte_errno = EINVAL;
> + return NULL;
> + }
> + if (ht == NULL)
> + return ht;
> +
> + rte_mcfg_tailq_write_lock();
> + TAILQ_FOREACH(te, kv_hash_list, next) {
> + tmp_ht = (struct rte_kv_hash_table *) te->data;
> + if (strncmp(params->name, tmp_ht->name,
> + RTE_KV_HASH_NAMESIZE) == 0)
> + break;
> + }
> + if (te != NULL) {
> + rte_errno = EEXIST;
> + goto exit;
> + }
> +
> + te = rte_zmalloc("KV_HASH_TAILQ_ENTRY", sizeof(*te), 0);
> + if (te == NULL) {
> + RTE_LOG(ERR, HASH, "Failed to allocate tailq entry\n");
> + goto exit;
> + }
> +
> + ht->type = params->type;
> + te->data = (void *)ht;
> + TAILQ_INSERT_TAIL(kv_hash_list, te, next);
> +
> + rte_mcfg_tailq_write_unlock();
> +
> + return ht;
> +
> +exit:
> + rte_mcfg_tailq_write_unlock();
> + switch (params->type) {
> + case RTE_KV_HASH_K32V64:
> + k32v64_hash_free(ht);
> + break;
> + default:
> + break;
> + }
> + return NULL;
> +}
> +
> +void
> +rte_kv_hash_free(struct rte_kv_hash_table *ht) {
> + struct rte_tailq_entry *te;
> + struct rte_kv_hash_list *kv_hash_list;
> +
> + if (ht == NULL)
> + return;
> +
> + kv_hash_list = RTE_TAILQ_CAST(rte_kv_hash_tailq.head,
> + rte_kv_hash_list);
> +
> + rte_mcfg_tailq_write_lock();
> +
> + /* find out tailq entry */
> + TAILQ_FOREACH(te, kv_hash_list, next) {
> + if (te->data == (void *) ht)
> + break;
> + }
> +
> +
> + if (te == NULL) {
> + rte_mcfg_tailq_write_unlock();
> + return;
> + }
> +
> + TAILQ_REMOVE(kv_hash_list, te, next);
> +
> + rte_mcfg_tailq_write_unlock();
I understand that the free is not thread safe, but it might be safer if the unlock happened after the call to 'k32v64_hash_free' (see the sketch below).
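i.e., roughly (a sketch reordering the code quoted above):

    TAILQ_REMOVE(kv_hash_list, te, next);

    switch (ht->type) {
    case RTE_KV_HASH_K32V64:
        k32v64_hash_free(ht);
        break;
    default:
        break;
    }

    rte_mcfg_tailq_write_unlock();
    rte_free(te);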
> +
> + switch (ht->type) {
> + case RTE_KV_HASH_K32V64:
> + k32v64_hash_free(ht);
> + break;
> + default:
> + break;
> + }
> + rte_free(te);
> +}
> diff --git a/lib/librte_hash/rte_kv_hash.h b/lib/librte_hash/rte_kv_hash.h
> new file mode 100644
> index 0000000..c0375d1
> --- /dev/null
> +++ b/lib/librte_hash/rte_kv_hash.h
> @@ -0,0 +1,169 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Intel Corporation
> + */
> +
> +#ifndef _RTE_KV_HASH_H_
> +#define _RTE_KV_HASH_H_
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include <rte_compat.h>
> +#include <rte_atomic.h>
> +#include <rte_mempool.h>
> +
> +#define RTE_KV_HASH_NAMESIZE 32
> +
> +enum rte_kv_hash_type {
> + RTE_KV_HASH_K32V64,
> + RTE_KV_HASH_MAX
> +};
> +
> +enum rte_kv_modify_op {
> + RTE_KV_MODIFY_ADD,
> + RTE_KV_MODIFY_DEL,
> + RTE_KV_MODIFY_OP_MAX
> +};
This could be in a private header file.
> +
> +struct rte_kv_hash_params {
> + const char *name;
> + uint32_t entries;
> + int socket_id;
> + enum rte_kv_hash_type type;
> +};
Since this is a public structure, suggest adding some reserved or flags fields, ignored for now, that could be used in the future for enhancements.
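For example (hypothetical field names), something along these lines would leave room to grow without an ABI break, provided callers zero the structure:

    struct rte_kv_hash_params {
        const char *name;
        uint32_t entries;
        int socket_id;
        enum rte_kv_hash_type type;
        uint64_t flags;       /* must be 0 for now */
        uint64_t reserved[4]; /* must be 0, reserved for extensions */
    };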
> +
> +struct rte_kv_hash_table;
> +
> +typedef int (*rte_kv_hash_bulk_lookup_t) (struct rte_kv_hash_table
> +*table, void *keys, uint32_t *hashes,
> + void *values, unsigned int n);
> +
> +typedef int (*rte_kv_hash_modify_t)
> +(struct rte_kv_hash_table *table, void *key, uint32_t hash,
> + enum rte_kv_modify_op op, void *value, int *found);
> +
> +struct rte_kv_hash_table {
> + char name[RTE_KV_HASH_NAMESIZE]; /**< Name of the hash. */
> + rte_kv_hash_bulk_lookup_t lookup;
> + rte_kv_hash_modify_t modify;
There are separate APIs provided for add and delete. Is there any advantage in combining add/delete into a single function pointer in the backend?
If we keep 2 separate pointers, we can get rid of 'enum rte_kv_modify_op', and it is simpler to understand as well (see the sketch after the quoted struct below).
> + enum rte_kv_hash_type type;
> +};
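A sketch of the two-pointer alternative (hypothetical typedef names, reusing rte_kv_hash_bulk_lookup_t from the quoted header):

    typedef int (*rte_kv_hash_add_t)(struct rte_kv_hash_table *table,
        void *key, uint32_t hash, void *value, int *found);
    typedef int (*rte_kv_hash_delete_t)(struct rte_kv_hash_table *table,
        void *key, uint32_t hash, void *value);

    struct rte_kv_hash_table {
        char name[RTE_KV_HASH_NAMESIZE];
        rte_kv_hash_bulk_lookup_t lookup;
        rte_kv_hash_add_t add;    /* instead of modify(RTE_KV_MODIFY_ADD) */
        rte_kv_hash_delete_t del; /* instead of modify(RTE_KV_MODIFY_DEL) */
        enum rte_kv_hash_type type;
    };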
> +
> +/**
> + * Lookup bulk of keys.
> + * This function is multi-thread safe with regarding to other lookup threads.
It is also safe with respect to the writer (reader-writer concurrency); please capture this in the comment.
> + *
> + * @param table
> + * Hash table to add the key to.
nit, 'Hash table to lookup the keys'
> + * @param keys
> + * Pointer to array of keys
> + * @param hashes
> + * Pointer to array of hash values associated with keys.
> + * @param values
> + * Pointer to array of value corresponded to keys
> + * If the key was not found the corresponding value remains intact.
> + * @param n
> + * Number of keys to lookup in batch.
> + * @return
> + * -EINVAL if there's an error, otherwise number of successful lookups.
> + */
> +static inline int
> +rte_kv_hash_bulk_lookup(struct rte_kv_hash_table *table,
> + void *keys, uint32_t *hashes, void *values, unsigned int n) {
> + return table->lookup(table, keys, hashes, values, n);
> +}
Consider making this a non-inline function. This is a bulk lookup, so the cost of the function call should be amortized well.
This will also allow hiding 'struct rte_kv_hash_table' from the user, which is better for ABI. You can move the definition of 'struct rte_kv_hash_table' and the function pointer declarations to a private header file (see the sketch below).
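A sketch of that arrangement (assuming the parameter checks also move here, as discussed elsewhere in this thread):

    /* rte_kv_hash.h: only a forward declaration stays public */
    struct rte_kv_hash_table;

    __rte_experimental
    int
    rte_kv_hash_bulk_lookup(struct rte_kv_hash_table *table,
        void *keys, uint32_t *hashes, void *values, unsigned int n);

    /* rte_kv_hash.c */
    int
    rte_kv_hash_bulk_lookup(struct rte_kv_hash_table *table,
        void *keys, uint32_t *hashes, void *values, unsigned int n)
    {
        if ((table == NULL) || (keys == NULL) || (hashes == NULL) ||
                (values == NULL))
            return -EINVAL;

        return table->lookup(table, keys, hashes, values, n);
    }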
> +
> +/**
> + * Add a key to an existing hash table with hash value.
> + * This operation is not multi-thread safe regarding to add/delete
> +functions
> + * and should only be called from one thread.
> + * However it is safe to call it along with lookup.
> + *
> + * @param table
> + * Hash table to add the key to.
> + * @param key
> + * Key to add to the hash table.
> + * @param value
> + * Value to associate with key.
I think it needs to be called out here that the data is of size 64 bits (even on a 32-bit system) because of the implementation in 'k32v64_modify'. Why not make 'value' of type 'uint64_t *'?
> + * @param hash
> + * Hash value associated with key.
> + * @found
> + * 0 if no previously added key was found
> + * 1 previously added key was found, old value associated with the key
> + * was written to *value
> + * @return
> + * 0 if ok, or negative value on error.
> + */
> +__rte_experimental
> +int
> +rte_kv_hash_add(struct rte_kv_hash_table *table, void *key,
> + uint32_t hash, void *value, int *found);
> +
> +/**
> + * Remove a key with a given hash value from an existing hash table.
> + * This operation is not multi-thread safe regarding to add/delete
> +functions
> + * and should only be called from one thread.
> + * However it is safe to call it along with lookup.
> + *
> + * @param table
> + * Hash table to remove the key from.
> + * @param key
> + * Key to remove from the hash table.
> + * @param hash
> + * hash value associated with key.
> + * @param value
> + * pointer to memory where the old value will be written to on success
> + * @return
> + * 0 if ok, or negative value on error.
> + */
> +__rte_experimental
> +int
> +rte_kv_hash_delete(struct rte_kv_hash_table *table, void *key,
> + uint32_t hash, void *value);
> +
> +/**
> + * Performs a lookup for an existing hash table, and returns a pointer
> +to
> + * the table if found.
> + *
> + * @param name
> + * Name of the hash table to find
> + *
> + * @return
> + * pointer to hash table structure or NULL on error with rte_errno
> + * set appropriately.
> + */
> +__rte_experimental
> +struct rte_kv_hash_table *
> +rte_kv_hash_find_existing(const char *name);
> +
> +/**
> + * Create a new hash table for use with four byte keys.
> + *
> + * @param params
> + * Parameters used in creation of hash table.
> + *
> + * @return
> + * Pointer to hash table structure that is used in future hash table
> + * operations, or NULL on error with rte_errno set appropriately.
> + */
> +__rte_experimental
> +struct rte_kv_hash_table *
> +rte_kv_hash_create(const struct rte_kv_hash_params *params);
> +
> +/**
> + * Free all memory used by a hash table.
> + *
> + * @param table
> + * Hash table to deallocate.
> + */
> +__rte_experimental
> +void
> +rte_kv_hash_free(struct rte_kv_hash_table *table);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_KV_HASH_H_ */
> --
> 2.7.4
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [dpdk-dev] [PATCH v4 1/4] hash: add kv hash library
2020-06-23 15:44 ` Ananyev, Konstantin
2020-06-23 23:06 ` Ananyev, Konstantin
@ 2020-06-25 19:49 ` Medvedkin, Vladimir
1 sibling, 0 replies; 56+ messages in thread
From: Medvedkin, Vladimir @ 2020-06-25 19:49 UTC (permalink / raw)
To: Ananyev, Konstantin, dev; +Cc: Wang, Yipeng1, Gobriel, Sameh, Richardson, Bruce
Hi Konstantin,
Thanks for the review. See below
On 23/06/2020 16:44, Ananyev, Konstantin wrote:
> Hi Vladimir,
>
>> --- /dev/null
>> +++ b/lib/librte_hash/k32v64_hash.c
>> @@ -0,0 +1,277 @@
>> +/* SPDX-License-Identifier: BSD-3-Clause
>> + * Copyright(c) 2020 Intel Corporation
>> + */
>> +
>> +#include <string.h>
>> +
>> +#include <rte_errno.h>
>> +#include <rte_malloc.h>
>> +#include <rte_memory.h>
>> +
>> +#include "k32v64_hash.h"
>> +
>> +static inline int
>> +k32v64_hash_lookup(struct k32v64_hash_table *table, uint32_t key,
>> +uint32_t hash, uint64_t *value)
>> +{
>> +return __k32v64_hash_lookup(table, key, hash, value, __kv_cmp_keys);
>> +}
>> +
>> +static int
>> +k32v64_hash_bulk_lookup(struct rte_kv_hash_table *ht, void *keys_p,
>> +uint32_t *hashes, void *values_p, unsigned int n)
>> +{
>> +struct k32v64_hash_table *table = (struct k32v64_hash_table *)ht;
>> +uint32_t *keys = keys_p;
>> +uint64_t *values = values_p;
>> +int ret, cnt = 0;
>> +unsigned int i;
>> +
>> +if (unlikely((table == NULL) || (keys == NULL) || (hashes == NULL) ||
>> +(values == NULL)))
>> +return -EINVAL;
>> +
>> +for (i = 0; i < n; i++) {
>> +ret = k32v64_hash_lookup(table, keys[i], hashes[i],
>> +&values[i]);
>> +if (ret == 0)
>> +cnt++;
>> +}
>> +return cnt;
>> +}
>> +
>> +#ifdef CC_AVX512VL_SUPPORT
>> +int
>> +k32v64_hash_bulk_lookup_avx512vl(struct rte_kv_hash_table *ht,
>> +void *keys_p, uint32_t *hashes, void *values_p, unsigned int n);
>> +#endif
>> +
>> +static rte_kv_hash_bulk_lookup_t
>> +get_lookup_bulk_fn(void)
>> +{
>> +#ifdef CC_AVX512VL_SUPPORT
>> +if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512VL))
>> +return k32v64_hash_bulk_lookup_avx512vl;
>> +#endif
>> +return k32v64_hash_bulk_lookup;
>> +}
>> +
>> +static int
>> +k32v64_hash_add(struct k32v64_hash_table *table, uint32_t key,
>> +uint32_t hash, uint64_t value, uint64_t *old_value, int *found)
>> +{
>> +uint32_t bucket;
>> +int i, idx, ret;
>> +uint8_t msk;
>> +struct k32v64_ext_ent *tmp, *ent, *prev = NULL;
>> +
>> +if (table == NULL)
>> +return -EINVAL;
>> +
>> +bucket = hash & table->bucket_msk;
>> +/* Search key in table. Update value if exists */
>> +for (i = 0; i < K32V64_KEYS_PER_BUCKET; i++) {
>> +if ((key == table->t[bucket].key[i]) &&
>> +(table->t[bucket].key_mask & (1 << i))) {
>> +*old_value = table->t[bucket].val[i];
>> +*found = 1;
>> +__atomic_fetch_add(&table->t[bucket].cnt, 1,
>> +__ATOMIC_ACQUIRE);
>> +table->t[bucket].val[i] = value;
>> +__atomic_fetch_add(&table->t[bucket].cnt, 1,
>> +__ATOMIC_RELEASE);
>> +return 0;
>> +}
>> +}
>> +
>> +if (!SLIST_EMPTY(&table->t[bucket].head)) {
>> +SLIST_FOREACH(ent, &table->t[bucket].head, next) {
>> +if (ent->key == key) {
>> +*old_value = ent->val;
>> +*found = 1;
>> +__atomic_fetch_add(&table->t[bucket].cnt, 1,
>> +__ATOMIC_ACQUIRE);
>> +ent->val = value;
>> +__atomic_fetch_add(&table->t[bucket].cnt, 1,
>> +__ATOMIC_RELEASE);
>> +return 0;
>> +}
>> +}
>> +}
>> +
>> +msk = ~table->t[bucket].key_mask & VALID_KEY_MSK;
>> +if (msk) {
>> +idx = __builtin_ctz(msk);
>> +table->t[bucket].key[idx] = key;
>> +table->t[bucket].val[idx] = value;
>> +__atomic_or_fetch(&table->t[bucket].key_mask, 1 << idx,
>> +__ATOMIC_RELEASE);
> I think this case also has to be guarded with table->t[bucket].cnt updates.
Agree, will change it.
>
>> +table->nb_ent++;
>> +*found = 0;
>> +return 0;
>> +}
>> +
>> +ret = rte_mempool_get(table->ext_ent_pool, (void **)&ent);
>> +if (ret < 0)
>> +return ret;
>> +
>> +SLIST_NEXT(ent, next) = NULL;
>> +ent->key = key;
>> +ent->val = value;
>> +rte_smp_wmb();
> __atomic_thread_fence(__ATOMIC_RELEASE);
> ?
Sure, looks like I missed it, will fix
>
>> +SLIST_FOREACH(tmp, &table->t[bucket].head, next)
>> +prev = tmp;
>> +
>> +if (prev == NULL)
>> +SLIST_INSERT_HEAD(&table->t[bucket].head, ent, next);
>> +else
>> +SLIST_INSERT_AFTER(prev, ent, next);
>> +
>> +table->nb_ent++;
>> +table->nb_ext_ent++;
>> +*found = 0;
>> +return 0;
>> +}
>> +
>> +static int
>> +k32v64_hash_delete(struct k32v64_hash_table *table, uint32_t key,
>> +uint32_t hash, uint64_t *old_value)
>> +{
>> +uint32_t bucket;
>> +int i;
>> +struct k32v64_ext_ent *ent;
>> +
>> +if (table == NULL)
>> +return -EINVAL;
>> +
>> +bucket = hash & table->bucket_msk;
>> +
>> +for (i = 0; i < K32V64_KEYS_PER_BUCKET; i++) {
>> +if ((key == table->t[bucket].key[i]) &&
>> +(table->t[bucket].key_mask & (1 << i))) {
>> +*old_value = table->t[bucket].val[i];
>> +ent = SLIST_FIRST(&table->t[bucket].head);
>> +if (ent) {
>> +__atomic_fetch_add(&table->t[bucket].cnt, 1,
>> +__ATOMIC_ACQUIRE);
>> +table->t[bucket].key[i] = ent->key;
>> +table->t[bucket].val[i] = ent->val;
>> +SLIST_REMOVE_HEAD(&table->t[bucket].head, next);
>> +__atomic_fetch_add(&table->t[bucket].cnt, 1,
>> +__ATOMIC_RELEASE);
>> +table->nb_ext_ent--;
>> +} else
>> +__atomic_and_fetch(&table->t[bucket].key_mask,
>> +~(1 << i), __ATOMIC_RELEASE);
> Same thought as above - safer to guard this mask update with cnt update.
Agree
>
>> +if (ent)
>> +rte_mempool_put(table->ext_ent_pool, ent);
> Can't this 'if(ent)' be merged with previous 'if (ent) {...}' above?
Yes, will merge
>
>> +table->nb_ent--;
>> +return 0;
>> +}
>> +}
>> +
>> +SLIST_FOREACH(ent, &table->t[bucket].head, next)
>> +if (ent->key == key)
>> +break;
>> +
>> +if (ent == NULL)
>> +return -ENOENT;
>> +
>> +*old_value = ent->val;
>> +
>> +__atomic_fetch_add(&table->t[bucket].cnt, 1, __ATOMIC_ACQUIRE);
>> +SLIST_REMOVE(&table->t[bucket].head, ent, k32v64_ext_ent, next);
>> +__atomic_fetch_add(&table->t[bucket].cnt, 1, __ATOMIC_RELEASE);
>> +rte_mempool_put(table->ext_ent_pool, ent);
>> +
>> +table->nb_ext_ent--;
>> +table->nb_ent--;
>> +
>> +return 0;
>> +}
>> +
--
Regards,
Vladimir
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [dpdk-dev] [PATCH v4 1/4] hash: add kv hash library
2020-06-23 23:06 ` Ananyev, Konstantin
@ 2020-06-25 19:56 ` Medvedkin, Vladimir
0 siblings, 0 replies; 56+ messages in thread
From: Medvedkin, Vladimir @ 2020-06-25 19:56 UTC (permalink / raw)
To: Ananyev, Konstantin, dev; +Cc: Wang, Yipeng1, Gobriel, Sameh, Richardson, Bruce
On 24/06/2020 00:06, Ananyev, Konstantin wrote:
>> Hi Vladimir,
>>
>>> --- /dev/null
>>> +++ b/lib/librte_hash/k32v64_hash.c
>>> @@ -0,0 +1,277 @@
>>> +/* SPDX-License-Identifier: BSD-3-Clause
>>> + * Copyright(c) 2020 Intel Corporation
>>> + */
>>> +
>>> +#include <string.h>
>>> +
>>> +#include <rte_errno.h>
>>> +#include <rte_malloc.h>
>>> +#include <rte_memory.h>
>>> +
>>> +#include "k32v64_hash.h"
>>> +
>>> +static inline int
>>> +k32v64_hash_lookup(struct k32v64_hash_table *table, uint32_t key,
>>> +uint32_t hash, uint64_t *value)
>>> +{
>>> +return __k32v64_hash_lookup(table, key, hash, value, __kv_cmp_keys);
>>> +}
>>> +
>>> +static int
>>> +k32v64_hash_bulk_lookup(struct rte_kv_hash_table *ht, void *keys_p,
>>> +uint32_t *hashes, void *values_p, unsigned int n)
>>> +{
>>> +struct k32v64_hash_table *table = (struct k32v64_hash_table *)ht;
>>> +uint32_t *keys = keys_p;
>>> +uint64_t *values = values_p;
>>> +int ret, cnt = 0;
>>> +unsigned int i;
>>> +
>>> +if (unlikely((table == NULL) || (keys == NULL) || (hashes == NULL) ||
>>> +(values == NULL)))
>>> +return -EINVAL;
>
> As a nit - this formal parameter checking is better done in the public function
> (rte_kv_hash_bulk_lookup) before dereferencing the table and calling the actual lookup().
> Same story for modify() - formal parameter checking can be done at the top of the public function.
Agree, will move the checks into the public functions.
> BTW, why unite add/delete into modify(), if internally you have 2 different functions
> (for add/delete) anyway?
This was done for future extensibility, in case we want to add some
additional control plane features other than add or delete. If you think
it's unnecessary I'll replace it with an add()/del() pair.
>
>>> +
>>> +for (i = 0; i < n; i++) {
>>> +ret = k32v64_hash_lookup(table, keys[i], hashes[i],
>>> +&values[i]);
>>> +if (ret == 0)
>>> +cnt++;
>>> +}
>>> +return cnt;
>>> +}
>>> +
>>> +#ifdef CC_AVX512VL_SUPPORT
>>> +int
>>> +k32v64_hash_bulk_lookup_avx512vl(struct rte_kv_hash_table *ht,
>>> +void *keys_p, uint32_t *hashes, void *values_p, unsigned int n);
>>> +#endif
>>> +
>>> +static rte_kv_hash_bulk_lookup_t
>>> +get_lookup_bulk_fn(void)
>>> +{
>>> +#ifdef CC_AVX512VL_SUPPORT
>>> +if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512VL))
>>> +return k32v64_hash_bulk_lookup_avx512vl;
>>> +#endif
>>> +return k32v64_hash_bulk_lookup;
>>> +}
>>> +
>>> +static int
>>> +k32v64_hash_add(struct k32v64_hash_table *table, uint32_t key,
>>> +uint32_t hash, uint64_t value, uint64_t *old_value, int *found)
>>> +{
>>> +uint32_t bucket;
>>> +int i, idx, ret;
>>> +uint8_t msk;
>>> +struct k32v64_ext_ent *tmp, *ent, *prev = NULL;
>>> +
>>> +if (table == NULL)
>>> +return -EINVAL;
>>> +
>>> +bucket = hash & table->bucket_msk;
>>> +/* Search key in table. Update value if exists */
>>> +for (i = 0; i < K32V64_KEYS_PER_BUCKET; i++) {
>>> +if ((key == table->t[bucket].key[i]) &&
>>> +(table->t[bucket].key_mask & (1 << i))) {
>>> +*old_value = table->t[bucket].val[i];
>>> +*found = 1;
>>> +__atomic_fetch_add(&table->t[bucket].cnt, 1,
>>> +__ATOMIC_ACQUIRE);
> After another thought - atomic add is probably an overkill here.
> Something like:
> void update_start(struct k32v64_hash_bucket *bkt)
> {
> 	bkt->cnt++;
> 	__atomic_thread_fence(__ATOMIC_ACQ_REL);
> }
>
> void update_finish(struct k32v64_hash_bucket *bkt)
> {
> 	__atomic_thread_fence(__ATOMIC_ACQ_REL);
> 	bkt->cnt++;
> }
>
> I think that should be sufficient here.
Good point
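For illustration, the suggested pair would wrap each in-place update, e.g. in k32v64_hash_add():

    update_start(&table->t[bucket]);
    table->t[bucket].val[i] = value;
    update_finish(&table->t[bucket]);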
>
>>> +table->t[bucket].val[i] = value;
>>> +__atomic_fetch_add(&table->t[bucket].cnt, 1,
>>> +__ATOMIC_RELEASE);
>>> +return 0;
>>> +}
>>> +}
>>> +
>>> +if (!SLIST_EMPTY(&table->t[bucket].head)) {
>>> +SLIST_FOREACH(ent, &table->t[bucket].head, next) {
>>> +if (ent->key == key) {
>>> +*old_value = ent->val;
>>> +*found = 1;
>>> +__atomic_fetch_add(&table->t[bucket].cnt, 1,
>>> +__ATOMIC_ACQUIRE);
>>> +ent->val = value;
>>> +__atomic_fetch_add(&table->t[bucket].cnt, 1,
>>> +__ATOMIC_RELEASE);
>>> +return 0;
>>> +}
>>> +}
>>> +}
>>> +
>>> +msk = ~table->t[bucket].key_mask & VALID_KEY_MSK;
>>> +if (msk) {
>>> +idx = __builtin_ctz(msk);
>>> +table->t[bucket].key[idx] = key;
>>> +table->t[bucket].val[idx] = value;
>>> +__atomic_or_fetch(&table->t[bucket].key_mask, 1 << idx,
>>> +__ATOMIC_RELEASE);
>> I think this case also has to be guarded with table->t[bucket].cnt updates.
>>
>>> +table->nb_ent++;
>>> +*found = 0;
>>> +return 0;
>>> +}
>>> +
>>> +ret = rte_mempool_get(table->ext_ent_pool, (void **)&ent);
>>> +if (ret < 0)
>>> +return ret;
>>> +
>>> +SLIST_NEXT(ent, next) = NULL;
>>> +ent->key = key;
>>> +ent->val = value;
>>> +rte_smp_wmb();
>> __atomic_thread_fence(__ATOMIC_RELEASE);
>> ?
>>
>>> +SLIST_FOREACH(tmp, &table->t[bucket].head, next)
>>> +prev = tmp;
>>> +
>>> +if (prev == NULL)
>>> +SLIST_INSERT_HEAD(&table->t[bucket].head, ent, next);
>>> +else
>>> +SLIST_INSERT_AFTER(prev, ent, next);
>>> +
>>> +table->nb_ent++;
>>> +table->nb_ext_ent++;
>>> +*found = 0;
>>> +return 0;
>>> +}
>>> +
>>> +static int
>>> +k32v64_hash_delete(struct k32v64_hash_table *table, uint32_t key,
>>> +uint32_t hash, uint64_t *old_value)
>>> +{
>>> +uint32_t bucket;
>>> +int i;
>>> +struct k32v64_ext_ent *ent;
>>> +
>>> +if (table == NULL)
>>> +return -EINVAL;
>>> +
>>> +bucket = hash & table->bucket_msk;
>>> +
>>> +for (i = 0; i < K32V64_KEYS_PER_BUCKET; i++) {
>>> +if ((key == table->t[bucket].key[i]) &&
>>> +(table->t[bucket].key_mask & (1 << i))) {
>>> +*old_value = table->t[bucket].val[i];
>>> +ent = SLIST_FIRST(&table->t[bucket].head);
>>> +if (ent) {
>>> +__atomic_fetch_add(&table->t[bucket].cnt, 1,
>>> +__ATOMIC_ACQUIRE);
>>> +table->t[bucket].key[i] = ent->key;
>>> +table->t[bucket].val[i] = ent->val;
>>> +SLIST_REMOVE_HEAD(&table->t[bucket].head, next);
>>> +__atomic_fetch_add(&table->t[bucket].cnt, 1,
>>> +__ATOMIC_RELEASE);
>>> +table->nb_ext_ent--;
>>> +} else
>>> +__atomic_and_fetch(&table->t[bucket].key_mask,
>>> +~(1 << i), __ATOMIC_RELEASE);
>> Same thought as above - safer to guard this mask update with cnt update.
>>
>>> +if (ent)
>>> +rte_mempool_put(table->ext_ent_pool, ent);
>> Can't this 'if(ent)' be merged with previous 'if (ent) {...}' above?
>>
>>> +table->nb_ent--;
>>> +return 0;
>>> +}
>>> +}
>>> +
>>> +SLIST_FOREACH(ent, &table->t[bucket].head, next)
>>> +if (ent->key == key)
>>> +break;
>>> +
>>> +if (ent == NULL)
>>> +return -ENOENT;
>>> +
>>> +*old_value = ent->val;
>>> +
>>> +__atomic_fetch_add(&table->t[bucket].cnt, 1, __ATOMIC_ACQUIRE);
>>> +SLIST_REMOVE(&table->t[bucket].head, ent, k32v64_ext_ent, next);
>>> +__atomic_fetch_add(&table->t[bucket].cnt, 1, __ATOMIC_RELEASE);
>>> +rte_mempool_put(table->ext_ent_pool, ent);
>>> +
>>> +table->nb_ext_ent--;
>>> +table->nb_ent--;
>>> +
>>> +return 0;
>>> +}
>>> +
--
Regards,
Vladimir
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [dpdk-dev] [PATCH v4 1/4] hash: add kv hash library
2020-06-24 1:19 ` Wang, Yipeng1
@ 2020-06-25 20:26 ` Medvedkin, Vladimir
0 siblings, 0 replies; 56+ messages in thread
From: Medvedkin, Vladimir @ 2020-06-25 20:26 UTC (permalink / raw)
To: Wang, Yipeng1, dev; +Cc: Ananyev, Konstantin, Gobriel, Sameh, Richardson, Bruce
Hi Yipeng,
Thanks for the review. See below
On 24/06/2020 02:19, Wang, Yipeng1 wrote:
>> -----Original Message-----
>> From: Medvedkin, Vladimir <vladimir.medvedkin@intel.com>
>> Sent: Friday, May 8, 2020 12:59 PM
>> To: dev@dpdk.org
>> Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Wang, Yipeng1
>> <yipeng1.wang@intel.com>; Gobriel, Sameh <sameh.gobriel@intel.com>;
>> Richardson, Bruce <bruce.richardson@intel.com>
>> Subject: [PATCH v4 1/4] hash: add kv hash library
>>
>> KV hash is a special optimized key-value storage for fixed key and value sizes.
>> At the moment it supports 32 bit keys and 64 bit values. This table is hash
>> function agnostic so user must provide precalculated hash signature for
>> add/delete/lookup operations.
>>
>> Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
>> ---
>> lib/Makefile | 2 +-
>> lib/librte_hash/Makefile | 14 +-
>> lib/librte_hash/k32v64_hash.c | 277
>> +++++++++++++++++++++++++++++++++
>> lib/librte_hash/k32v64_hash.h | 98 ++++++++++++
>> lib/librte_hash/k32v64_hash_avx512vl.c | 59 +++++++
>> lib/librte_hash/meson.build | 17 +-
>> lib/librte_hash/rte_hash_version.map | 6 +-
>> lib/librte_hash/rte_kv_hash.c | 184 ++++++++++++++++++++++
>> lib/librte_hash/rte_kv_hash.h | 169 ++++++++++++++++++++
>> 9 files changed, 821 insertions(+), 5 deletions(-)
>> create mode 100644 lib/librte_hash/k32v64_hash.c
>> create mode 100644 lib/librte_hash/k32v64_hash.h
>> create mode 100644 lib/librte_hash/k32v64_hash_avx512vl.c
>> create mode 100644 lib/librte_hash/rte_kv_hash.c
>> create mode 100644 lib/librte_hash/rte_kv_hash.h
>>
>> diff --git a/lib/Makefile b/lib/Makefile
>> index 9d24609..42769e9 100644
>> --- a/lib/Makefile
>> +++ b/lib/Makefile
>> @@ -48,7 +48,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
>> DEPDIRS-librte_vhost := librte_eal librte_mempool librte_mbuf
>> librte_ethdev \
>> librte_net librte_hash librte_cryptodev
>> DIRS-$(CONFIG_RTE_LIBRTE_HASH) += librte_hash
>> -DEPDIRS-librte_hash := librte_eal librte_ring
>> +DEPDIRS-librte_hash := librte_eal librte_ring librte_mempool
>> DIRS-$(CONFIG_RTE_LIBRTE_EFD) += librte_efd
>> DEPDIRS-librte_efd := librte_eal librte_ring librte_hash
>> DIRS-$(CONFIG_RTE_LIBRTE_RIB) += librte_rib
>> diff --git a/lib/librte_hash/Makefile b/lib/librte_hash/Makefile
>> index ec9f864..a0cdee9 100644
>> --- a/lib/librte_hash/Makefile
>> +++ b/lib/librte_hash/Makefile
>> @@ -8,13 +8,15 @@ LIB = librte_hash.a
>>
>> CFLAGS += -O3
>> CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
>> -LDLIBS += -lrte_eal -lrte_ring
>> +LDLIBS += -lrte_eal -lrte_ring -lrte_mempool
>>
>> EXPORT_MAP := rte_hash_version.map
>>
>> # all source are stored in SRCS-y
>> SRCS-$(CONFIG_RTE_LIBRTE_HASH) := rte_cuckoo_hash.c
>> SRCS-$(CONFIG_RTE_LIBRTE_HASH) += rte_fbk_hash.c
>> +SRCS-$(CONFIG_RTE_LIBRTE_HASH) += rte_kv_hash.c
>> +SRCS-$(CONFIG_RTE_LIBRTE_HASH) += k32v64_hash.c
>>
>> # install this header file
>> SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include := rte_hash.h
>> @@ -27,5 +29,15 @@ endif
>> SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_jhash.h
>> SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_thash.h
>> SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_fbk_hash.h
>> +SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_kv_hash.h
>> +
>> +CC_AVX512VL_SUPPORT=$(shell $(CC) -mavx512vl -dM -E - </dev/null 2>&1 | \
>> +grep -q __AVX512VL__ && echo 1)
>> +
>> +ifeq ($(CC_AVX512VL_SUPPORT), 1)
>> +SRCS-$(CONFIG_RTE_LIBRTE_HASH) += k32v64_hash_avx512vl.c
>> +CFLAGS_k32v64_hash_avx512vl.o += -mavx512vl
>> +CFLAGS_k32v64_hash.o += -DCC_AVX512VL_SUPPORT
>> +endif
>>
>> include $(RTE_SDK)/mk/rte.lib.mk
>> diff --git a/lib/librte_hash/k32v64_hash.c b/lib/librte_hash/k32v64_hash.c
>> new file mode 100644
>> index 0000000..24cd63a
>> --- /dev/null
>> +++ b/lib/librte_hash/k32v64_hash.c
>> @@ -0,0 +1,277 @@
>> +/* SPDX-License-Identifier: BSD-3-Clause
>> + * Copyright(c) 2020 Intel Corporation
>> + */
>> +
>> +#include <string.h>
>> +
>> +#include <rte_errno.h>
>> +#include <rte_malloc.h>
>> +#include <rte_memory.h>
>> +
>> +#include "k32v64_hash.h"
>> +
>> +static inline int
>> +k32v64_hash_lookup(struct k32v64_hash_table *table, uint32_t key,
>> +uint32_t hash, uint64_t *value)
>> +{
>> +return __k32v64_hash_lookup(table, key, hash, value, __kv_cmp_keys);
>> +}
>> +
>> +static int
>> +k32v64_hash_bulk_lookup(struct rte_kv_hash_table *ht, void *keys_p,
>> +uint32_t *hashes, void *values_p, unsigned int n) {
>> +struct k32v64_hash_table *table = (struct k32v64_hash_table *)ht;
>> +uint32_t *keys = keys_p;
>> +uint64_t *values = values_p;
>> +int ret, cnt = 0;
>> +unsigned int i;
>> +
>> +if (unlikely((table == NULL) || (keys == NULL) || (hashes == NULL) ||
>> +(values == NULL)))
>> +return -EINVAL;
>> +
>> +for (i = 0; i < n; i++) {
>> +ret = k32v64_hash_lookup(table, keys[i], hashes[i],
>> +&values[i]);
> [Wang, Yipeng] You don't need to start a new line for values.
Then the line would be too long (81 characters); checkpatch gives a warning here.
>> +if (ret == 0)
>> +cnt++;
>> +}
>> +return cnt;
>> +}
>> +
>> +#ifdef CC_AVX512VL_SUPPORT
> [Wang, Yipeng] Why not use the already provided, e.g. #if defined(RTE_MACHINE_CPUFLAG_SSE2) like rte_hash does?
For the AVX512 implementation we are using AVX512VL intrinsics, which are not
supported by some compilers. In this case support for AVX512F alone is not
enough (there is only an RTE_MACHINE_CPUFLAG_AVX512F flag).
>> +int
>> +k32v64_hash_bulk_lookup_avx512vl(struct rte_kv_hash_table *ht,
>> +void *keys_p, uint32_t *hashes, void *values_p, unsigned int n);
>> +#endif
>> +
>> +static rte_kv_hash_bulk_lookup_t
>> +get_lookup_bulk_fn(void)
>> +{
>> +#ifdef CC_AVX512VL_SUPPORT
>> +if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512VL))
>> +return k32v64_hash_bulk_lookup_avx512vl;
>> +#endif
>> +return k32v64_hash_bulk_lookup;
>> +}
>> +
>> +static int
>> +k32v64_hash_add(struct k32v64_hash_table *table, uint32_t key,
>> +uint32_t hash, uint64_t value, uint64_t *old_value, int *found) {
>> +uint32_t bucket;
>> +int i, idx, ret;
>> +uint8_t msk;
>> +struct k32v64_ext_ent *tmp, *ent, *prev = NULL;
>> +
>> +if (table == NULL)
>> +return -EINVAL;
>> +
>> +bucket = hash & table->bucket_msk;
>> +/* Search key in table. Update value if exists */
>> +for (i = 0; i < K32V64_KEYS_PER_BUCKET; i++) {
>> +if ((key == table->t[bucket].key[i]) &&
>> +(table->t[bucket].key_mask & (1 << i))) {
>> +*old_value = table->t[bucket].val[i];
>> +*found = 1;
>> +__atomic_fetch_add(&table->t[bucket].cnt, 1,
>> +__ATOMIC_ACQUIRE);
>> +table->t[bucket].val[i] = value;
>> +__atomic_fetch_add(&table->t[bucket].cnt, 1,
>> +__ATOMIC_RELEASE);
>> +return 0;
>> +}
>> +}
>> +
>> +if (!SLIST_EMPTY(&table->t[bucket].head)) {
>> +SLIST_FOREACH(ent, &table->t[bucket].head, next) {
>> +if (ent->key == key) {
>> +*old_value = ent->val;
>> +*found = 1;
>> +__atomic_fetch_add(&table->t[bucket].cnt, 1, __ATOMIC_ACQUIRE);
>> +ent->val = value;
>> +__atomic_fetch_add(&table->t[bucket].cnt, 1, __ATOMIC_RELEASE);
>> +return 0;
>> +}
>> +}
>> +}
>> +
>> +msk = ~table->t[bucket].key_mask & VALID_KEY_MSK;
>> +if (msk) {
>> +idx = __builtin_ctz(msk);
>> +table->t[bucket].key[idx] = key;
>> +table->t[bucket].val[idx] = value;
>> +__atomic_or_fetch(&table->t[bucket].key_mask, 1 << idx,
>> +__ATOMIC_RELEASE);
>> +table->nb_ent++;
>> +*found = 0;
>> +return 0;
>> +}
>> +
>> +ret = rte_mempool_get(table->ext_ent_pool, (void **)&ent);
>> +if (ret < 0)
>> +return ret;
>> +
>> +SLIST_NEXT(ent, next) = NULL;
>> +ent->key = key;
>> +ent->val = value;
>> +rte_smp_wmb();
>> +SLIST_FOREACH(tmp, &table->t[bucket].head, next)
>> +prev = tmp;
>> +
>> +if (prev == NULL)
>> +SLIST_INSERT_HEAD(&table->t[bucket].head, ent, next);
>> +else
>> +SLIST_INSERT_AFTER(prev, ent, next);
>> +
>> +table->nb_ent++;
>> +table->nb_ext_ent++;
>> +*found = 0;
>> +return 0;
>> +}
>> +
>> +static int
>> +k32v64_hash_delete(struct k32v64_hash_table *table, uint32_t key,
>> +uint32_t hash, uint64_t *old_value)
>> +{
>> +uint32_t bucket;
>> +int i;
>> +struct k32v64_ext_ent *ent;
>> +
>> +if (table == NULL)
>> +return -EINVAL;
>> +
>> +bucket = hash & table->bucket_msk;
>> +
>> +for (i = 0; i < K32V64_KEYS_PER_BUCKET; i++) {
>> +if ((key == table->t[bucket].key[i]) &&
>> +(table->t[bucket].key_mask & (1 << i))) {
>> +*old_value = table->t[bucket].val[i];
>> +ent = SLIST_FIRST(&table->t[bucket].head);
>> +if (ent) {
>> +__atomic_fetch_add(&table->t[bucket].cnt, 1, __ATOMIC_ACQUIRE);
>> +table->t[bucket].key[i] = ent->key;
>> +table->t[bucket].val[i] = ent->val;
>> +SLIST_REMOVE_HEAD(&table->t[bucket].head, next);
>> +__atomic_fetch_add(&table->t[bucket].cnt, 1, __ATOMIC_RELEASE);
>> +table->nb_ext_ent--;
>> +} else
>> +__atomic_and_fetch(&table->t[bucket].key_mask, ~(1 << i), __ATOMIC_RELEASE);
>> +if (ent)
>> +rte_mempool_put(table->ext_ent_pool, ent);
>> +table->nb_ent--;
>> +return 0;
>> +}
>> +}
>> +
>> +SLIST_FOREACH(ent, &table->t[bucket].head, next)
>> +if (ent->key == key)
>> +break;
>> +
>> +if (ent == NULL)
>> +return -ENOENT;
>> +
>> +*old_value = ent->val;
>> +
>> +__atomic_fetch_add(&table->t[bucket].cnt, 1, __ATOMIC_ACQUIRE);
>> +SLIST_REMOVE(&table->t[bucket].head, ent, k32v64_ext_ent, next);
>> +__atomic_fetch_add(&table->t[bucket].cnt, 1, __ATOMIC_RELEASE);
>> +rte_mempool_put(table->ext_ent_pool, ent);
>> +
> [Wang, Yipeng] I am not sure if delete can be called safely with other lookup threads.
> The item could be recycled to be used by another add while a lookup thread traversing the linked list.
> This is similar to why we need the RTE_HASH_EXTRA_FLAGS_NO_FREE_ON_DEL flag and related API in rte_hash.
We discussed this with Konstantin previously. In this (very rare) case the
lookup thread will traverse a linked list which belongs to a different
bucket, won't find the proper key, and will restart the lookup (because the
bucket cnt was changed). So it will never see a linked list entry with an
inconsistent next pointer that does not point to some other list entry.
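In other words, the retry loop in __k32v64_hash_lookup() provides the guarantee. A sketch of the check it relies on (assuming the k32v64 structures from this patch):

    static int
    bucket_changed(const struct k32v64_hash_bucket *bkt, uint32_t snapshot)
    {
        /* order the reads of the bucket before re-reading cnt; pairs
         * with the writer's two cnt increments around its update */
        __atomic_thread_fence(__ATOMIC_RELEASE);
        return snapshot != __atomic_load_n(&bkt->cnt, __ATOMIC_RELAXED);
    }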
>> +table->nb_ext_ent--;
>> +table->nb_ent--;
>> +
>> +return 0;
>> +}
>> +
>> +static int
>> +k32v64_modify(struct rte_kv_hash_table *table, void *key_p, uint32_t hash,
>> +enum rte_kv_modify_op op, void *value_p, int *found) {
>> +struct k32v64_hash_table *ht = (struct k32v64_hash_table *)table;
>> +uint32_t *key = key_p;
>> +uint64_t value;
>> +
>> +if ((ht == NULL) || (key == NULL) || (value_p == NULL) ||
>> +(found == NULL) || (op >= RTE_KV_MODIFY_OP_MAX)) {
>> +return -EINVAL;
>> +}
>> +
>> +value = *(uint64_t *)value_p;
>> +switch (op) {
>> +case RTE_KV_MODIFY_ADD:
>> +return k32v64_hash_add(ht, *key, hash, value, value_p, found);
>> +case RTE_KV_MODIFY_DEL:
>> +return k32v64_hash_delete(ht, *key, hash, value_p);
>> +default:
>> +break;
>> +}
> [Wang, Yipeng] A question would be why put del and add inside another mod wrapper.
> If we don't wrap del and add like this, we could get rid of this branch.
This was done for future extensibility, in case we want to add some
additional control plane features other than add or delete. It is a control
plane operation, so a branch here doesn't make any difference. If you think
it's unnecessary I'll replace it with an add()/del() pair.
>> +
>> +return -EINVAL;
>> +}
>> +
>> +struct rte_kv_hash_table *
>> +k32v64_hash_create(const struct rte_kv_hash_params *params) {
>> +char hash_name[RTE_KV_HASH_NAMESIZE];
>> +struct k32v64_hash_table *ht = NULL;
>> +uint32_t mem_size, nb_buckets, max_ent;
>> +int ret;
>> +struct rte_mempool *mp;
>> +
>> +if ((params == NULL) || (params->name == NULL) ||
>> +(params->entries == 0)) {
> [Wang, Yipeng] Should we check if the entry count is larger than keys_per_bucket as well?
No, because in general we are not limited by the number of keys in a bucket.
keys_per_bucket reflects the number of keys in the open addressing part of a
bucket.
>> +rte_errno = EINVAL;
>> +return NULL;
>> +}
>> +
>> +ret = snprintf(hash_name, sizeof(hash_name), "KV_%s", params->name);
>> +if (ret < 0 || ret >= RTE_KV_HASH_NAMESIZE) {
>> +rte_errno = ENAMETOOLONG;
>> +return NULL;
>> +}
>> +
>> +max_ent = rte_align32pow2(params->entries);
>> +nb_buckets = max_ent / K32V64_KEYS_PER_BUCKET;
> [Wang, Yipeng] Macro to check if keys_per_bucket needs to be a power of 2
In general it doesn't have to be a power of 2.
>> +mem_size = sizeof(struct k32v64_hash_table) +
>> +sizeof(struct k32v64_hash_bucket) * nb_buckets;
>> +
>> +mp = rte_mempool_create(hash_name, max_ent,
>> +sizeof(struct k32v64_ext_ent), 0, 0, NULL, NULL, NULL, NULL,
>> +params->socket_id, 0);
>> +
>> +if (mp == NULL)
>> +return NULL;
>> +
>> +ht = rte_zmalloc_socket(hash_name, mem_size,
>> +RTE_CACHE_LINE_SIZE, params->socket_id);
>> +if (ht == NULL) {
>> +rte_mempool_free(mp);
>> +return NULL;
>> +}
>> +
>> +memcpy(ht->pub.name, hash_name, sizeof(ht->pub.name));
>> +ht->max_ent = max_ent;
>> +ht->bucket_msk = nb_buckets - 1;
>> +ht->ext_ent_pool = mp;
>> +ht->pub.lookup = get_lookup_bulk_fn();
> [Wang, Yipeng] Inside the function, we also need to check the CPUID at runtime to decide if AVX can be used, not only at compile time.
> You could refer to example from rte_hash: ... if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_SSE2)) ...
The same is done here. In get_lookup_bulk_fn():
...
if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512VL))
...
>
>> +ht->pub.modify = k32v64_modify;
>> +
>> +return (struct rte_kv_hash_table *)ht;
>> +}
>> +
>> +void
>> +k32v64_hash_free(struct rte_kv_hash_table *ht) {
>> +if (ht == NULL)
>> +return;
>> +
>> +rte_mempool_free(((struct k32v64_hash_table *)ht)-
>>> ext_ent_pool);
>> +rte_free(ht);
>> +}
>> diff --git a/lib/librte_hash/k32v64_hash.h b/lib/librte_hash/k32v64_hash.h
>> new file mode 100644
>> index 0000000..10061a5
>> --- /dev/null
>> +++ b/lib/librte_hash/k32v64_hash.h
>> @@ -0,0 +1,98 @@
>> +/* SPDX-License-Identifier: BSD-3-Clause
>> + * Copyright(c) 2020 Intel Corporation
>> + */
>> +
>> +#include <rte_kv_hash.h>
>> +
>> +#define K32V64_KEYS_PER_BUCKET 4
>> +#define K32V64_WRITE_IN_PROGRESS 1
>> +#define VALID_KEY_MSK ((1 << K32V64_KEYS_PER_BUCKET) - 1)
>> +
>> +struct k32v64_ext_ent {
>> +SLIST_ENTRY(k32v64_ext_ent) next;
>> +uint32_t key;
>> +uint64_t val;
>> +};
>> +
>> +struct k32v64_hash_bucket {
>> +uint32_t key[K32V64_KEYS_PER_BUCKET];
>> +uint64_t val[K32V64_KEYS_PER_BUCKET];
>> +uint8_t key_mask;
>> +uint32_t cnt;
>> +SLIST_HEAD(k32v64_list_head, k32v64_ext_ent) head; }
>> +__rte_cache_aligned;
>> +
>> +struct k32v64_hash_table {
>> +struct rte_kv_hash_table pub; /**< Public part */
> [Wang, Yipeng] Do we need to have the pub field? Could we just init the public part in the public create function?
In my opinion it is better to init it inside the algorithm-specific part.
Otherwise we would need to expose our internal algorithm-specific
lookup_bulk and modify functions to the kv_hash layer.
>> +uint32_t nb_ent; /**< Number of entities in the table */
>> +uint32_t nb_ext_ent; /**< Number of extended entities */
>> +uint32_t max_ent; /**< Maximum number of entities */
>> +uint32_t bucket_msk;
>> +struct rte_mempool *ext_ent_pool;
>> +__extension__ struct k32v64_hash_bucket t[0];
>> +};
>> +
>> +typedef int (*k32v64_cmp_fn_t)
>> +(struct k32v64_hash_bucket *bucket, uint32_t key, uint64_t *val);
>> +
>> +static inline int
>> +__kv_cmp_keys(struct k32v64_hash_bucket *bucket, uint32_t key,
>> +uint64_t *val)
>> +{
>> +int i;
>> +
>> +for (i = 0; i < K32V64_KEYS_PER_BUCKET; i++) {
>> +if ((key == bucket->key[i]) &&
>> +(bucket->key_mask & (1 << i))) {
>> +*val = bucket->val[i];
>> +return 1;
>> +}
>> +}
>> +
>> +return 0;
>> +}
>> +
>> +static inline int
>> +__k32v64_hash_lookup(struct k32v64_hash_table *table, uint32_t key,
>> +uint32_t hash, uint64_t *value, k32v64_cmp_fn_t cmp_f) {
>> +uint64_t val = 0;
>> +struct k32v64_ext_ent *ent;
>> +uint32_t cnt;
>> +int found = 0;
>> +uint32_t bucket = hash & table->bucket_msk;
>> +
>> +do {
>> +
>> +do {
>> +cnt = __atomic_load_n(&table->t[bucket].cnt,
>> +__ATOMIC_ACQUIRE);
>> +} while (unlikely(cnt & K32V64_WRITE_IN_PROGRESS));
>> +
>> +found = cmp_f(&table->t[bucket], key, &val);
>> +if (unlikely((found == 0) &&
>> +(!SLIST_EMPTY(&table->t[bucket].head)))) {
>> +SLIST_FOREACH(ent, &table->t[bucket].head, next) {
>> +if (ent->key == key) {
>> +val = ent->val;
>> +found = 1;
>> +break;
>> +}
>> +}
>> +}
>> +__atomic_thread_fence(__ATOMIC_RELEASE);
>> +} while (unlikely(cnt != __atomic_load_n(&table->t[bucket].cnt,
>> + __ATOMIC_RELAXED)));
>> +
>> +if (found == 1) {
>> +*value = val;
>> +return 0;
>> +} else
>> +return -ENOENT;
>> +}
>> +
>> +struct rte_kv_hash_table *
>> +k32v64_hash_create(const struct rte_kv_hash_params *params);
>> +
>> +void
>> +k32v64_hash_free(struct rte_kv_hash_table *ht);
>> diff --git a/lib/librte_hash/k32v64_hash_avx512vl.c b/lib/librte_hash/k32v64_hash_avx512vl.c
>> new file mode 100644
>> index 0000000..75cede5
>> --- /dev/null
>> +++ b/lib/librte_hash/k32v64_hash_avx512vl.c
>> @@ -0,0 +1,59 @@
>> +/* SPDX-License-Identifier: BSD-3-Clause
>> + * Copyright(c) 2020 Intel Corporation
>> + */
>> +
>> +#include "k32v64_hash.h"
>> +
>> +int
>> +k32v64_hash_bulk_lookup_avx512vl(struct rte_kv_hash_table *ht, void *keys_p,
>> +uint32_t *hashes, void *values_p, unsigned int n);
>> +
>> +static inline int
>> +k32v64_cmp_keys_avx512vl(struct k32v64_hash_bucket *bucket, uint32_t key,
>> +uint64_t *val)
>> +{
>> +__m128i keys, srch_key;
>> +__mmask8 msk;
>> +
>> +keys = _mm_load_si128((void *)bucket);
>> +srch_key = _mm_set1_epi32(key);
>> +
>> +msk = _mm_mask_cmpeq_epi32_mask(bucket->key_mask, keys, srch_key);
>> +if (msk) {
>> +*val = bucket->val[__builtin_ctz(msk)];
>> +return 1;
>> +}
>> +
>> +return 0;
>> +}
>> +
>> +static inline int
>> +k32v64_hash_lookup_avx512vl(struct k32v64_hash_table *table, uint32_t key,
>> +uint32_t hash, uint64_t *value)
>> +{
>> +return __k32v64_hash_lookup(table, key, hash, value,
>> +k32v64_cmp_keys_avx512vl);
>> +}
>> +
>> +int
>> +k32v64_hash_bulk_lookup_avx512vl(struct rte_kv_hash_table *ht, void *keys_p,
>> +uint32_t *hashes, void *values_p, unsigned int n) {
>> +struct k32v64_hash_table *table = (struct k32v64_hash_table *)ht;
>> +uint32_t *keys = keys_p;
>> +uint64_t *values = values_p;
>> +int ret, cnt = 0;
>> +unsigned int i;
>> +
>> +if (unlikely((table == NULL) || (keys == NULL) || (hashes == NULL) ||
>> +(values == NULL)))
>> +return -EINVAL;
>> +
>> +for (i = 0; i < n; i++) {
>> +ret = k32v64_hash_lookup_avx512vl(table, keys[i], hashes[i],
>> +&values[i]);
>> +if (ret == 0)
>> +cnt++;
>> +}
>> +return cnt;
>> +}
>> diff --git a/lib/librte_hash/meson.build b/lib/librte_hash/meson.build
>> index 6ab46ae..0d014ea 100644
>> --- a/lib/librte_hash/meson.build
>> +++ b/lib/librte_hash/meson.build
>> @@ -3,10 +3,23 @@
>>
>> headers = files('rte_crc_arm64.h',
>> 'rte_fbk_hash.h',
>> +'rte_kv_hash.h',
>> 'rte_hash_crc.h',
>> 'rte_hash.h',
>> 'rte_jhash.h',
>> 'rte_thash.h')
>>
>> -sources = files('rte_cuckoo_hash.c', 'rte_fbk_hash.c')
>> -deps += ['ring']
>> +sources = files('rte_cuckoo_hash.c', 'rte_fbk_hash.c', 'rte_kv_hash.c',
>> +	'k32v64_hash.c')
>> +deps += ['ring', 'mempool']
>> +
>> +if dpdk_conf.has('RTE_ARCH_X86')
>> +if cc.has_argument('-mavx512vl')
>> +avx512_tmplib = static_library('avx512_tmp',
>> + 'k32v64_hash_avx512vl.c',
>> + dependencies: static_rte_mempool,
>> + c_args: cflags + ['-mavx512vl'])
>> + objs += avx512_tmplib.extract_objects('k32v64_hash_avx512vl.c')
>> + cflags += '-DCC_AVX512VL_SUPPORT'
>> +
>> +endif
>> +endif
>> diff --git a/lib/librte_hash/rte_hash_version.map b/lib/librte_hash/rte_hash_version.map
>> index c2a9094..614e0a5 100644
>> --- a/lib/librte_hash/rte_hash_version.map
>> +++ b/lib/librte_hash/rte_hash_version.map
>> @@ -36,5 +36,9 @@ EXPERIMENTAL {
>> rte_hash_lookup_with_hash_bulk;
>> rte_hash_lookup_with_hash_bulk_data;
>> rte_hash_max_key_id;
>> -
>> +rte_kv_hash_create;
>> +rte_kv_hash_find_existing;
>> +rte_kv_hash_free;
>> +rte_kv_hash_add;
>> +rte_kv_hash_delete;
>> };
>> diff --git a/lib/librte_hash/rte_kv_hash.c b/lib/librte_hash/rte_kv_hash.c
>> new file mode 100644
>> index 0000000..03df8db
>> --- /dev/null
>> +++ b/lib/librte_hash/rte_kv_hash.c
>> @@ -0,0 +1,184 @@
>> +/* SPDX-License-Identifier: BSD-3-Clause
>> + * Copyright(c) 2020 Intel Corporation
>> + */
>> +
>> +#include <string.h>
>> +
>> +#include <rte_eal_memconfig.h>
>> +#include <rte_errno.h>
>> +#include <rte_malloc.h>
>> +#include <rte_memory.h>
>> +#include <rte_tailq.h>
>> +
>> +#include <rte_kv_hash.h>
> [Wang, Yipeng] I think for this header we should use quotes ""
rte_kv_hash.h is a public header file, why should we do that?
>> +#include "k32v64_hash.h"
>> +
>> +TAILQ_HEAD(rte_kv_hash_list, rte_tailq_entry);
>> +
>> +static struct rte_tailq_elem rte_kv_hash_tailq = {
>> +.name = "RTE_KV_HASH",
>> +};
>> +
>> +EAL_REGISTER_TAILQ(rte_kv_hash_tailq);
>> +
>> +int
>> +rte_kv_hash_add(struct rte_kv_hash_table *table, void *key,
>> +uint32_t hash, void *value, int *found) {
>> +if (table == NULL)
> [Wang, Yipeng] To be more consistent,
> I think we should either also check key/value/found here, or not check at all and leave it to the next functions.
Agree, I will change it in the next version
>> +return -EINVAL;
>> +
>> +return table->modify(table, key, hash, RTE_KV_MODIFY_ADD,
>> +value, found);
>> +}
>> +
>> +int
>> +rte_kv_hash_delete(struct rte_kv_hash_table *table, void *key,
>> +uint32_t hash, void *value)
>> +{
>> +int found;
>> +
>> +if (table == NULL)
>> +return -EINVAL;
>> +
>> +return table->modify(table, key, hash, RTE_KV_MODIFY_DEL,
>> +value, &found);
>> +}
>> +
>> +struct rte_kv_hash_table *
>> +rte_kv_hash_find_existing(const char *name) {
>> +struct rte_kv_hash_table *h = NULL;
>> +struct rte_tailq_entry *te;
>> +struct rte_kv_hash_list *kv_hash_list;
>> +
>> +kv_hash_list = RTE_TAILQ_CAST(rte_kv_hash_tailq.head,
>> +rte_kv_hash_list);
>> +
>> +rte_mcfg_tailq_read_lock();
>> +TAILQ_FOREACH(te, kv_hash_list, next) {
>> +h = (struct rte_kv_hash_table *) te->data;
>> +if (strncmp(name, h->name, RTE_KV_HASH_NAMESIZE) == 0)
>> +break;
>> +}
>> +rte_mcfg_tailq_read_unlock();
>> +if (te == NULL) {
>> +rte_errno = ENOENT;
>> +return NULL;
>> +}
>> +return h;
>> +}
>> +
>> +struct rte_kv_hash_table *
>> +rte_kv_hash_create(const struct rte_kv_hash_params *params) {
>> +char hash_name[RTE_KV_HASH_NAMESIZE];
>> +struct rte_kv_hash_table *ht, *tmp_ht = NULL;
>> +struct rte_tailq_entry *te;
>> +struct rte_kv_hash_list *kv_hash_list;
>> +int ret;
>> +
>> +if ((params == NULL) || (params->name == NULL) ||
>> +(params->entries == 0) ||
>> +(params->type >= RTE_KV_HASH_MAX)) {
>> +rte_errno = EINVAL;
>> +return NULL;
>> +}
>> +
>> +kv_hash_list = RTE_TAILQ_CAST(rte_kv_hash_tailq.head,
>> +rte_kv_hash_list);
>> +
>> +ret = snprintf(hash_name, sizeof(hash_name), "KV_%s", params->name);
>> +if (ret < 0 || ret >= RTE_KV_HASH_NAMESIZE) {
>> +rte_errno = ENAMETOOLONG;
>> +return NULL;
>> +}
>> +
>> +switch (params->type) {
>> +case RTE_KV_HASH_K32V64:
>> +ht = k32v64_hash_create(params);
>> +break;
>> +default:
>> +rte_errno = EINVAL;
>> +return NULL;
>> +}
>> +if (ht == NULL)
>> +return ht;
>> +
>> +rte_mcfg_tailq_write_lock();
>> +TAILQ_FOREACH(te, kv_hash_list, next) {
>> +tmp_ht = (struct rte_kv_hash_table *) te->data;
>> +if (strncmp(params->name, tmp_ht->name,
>> +RTE_KV_HASH_NAMESIZE) == 0)
>> +break;
>> +}
>> +if (te != NULL) {
>> +rte_errno = EEXIST;
>> +goto exit;
>> +}
>> +
>> +te = rte_zmalloc("KV_HASH_TAILQ_ENTRY", sizeof(*te), 0);
>> +if (te == NULL) {
>> +RTE_LOG(ERR, HASH, "Failed to allocate tailq entry\n");
>> +goto exit;
>> +}
>> +
>> +ht->type = params->type;
>> +te->data = (void *)ht;
>> +TAILQ_INSERT_TAIL(kv_hash_list, te, next);
>> +
>> +rte_mcfg_tailq_write_unlock();
>> +
>> +return ht;
>> +
>> +exit:
>> +rte_mcfg_tailq_write_unlock();
>> +switch (params->type) {
>> +case RTE_KV_HASH_K32V64:
>> +k32v64_hash_free(ht);
>> +break;
>> +default:
>> +break;
>> +}
>> +return NULL;
>> +}
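For reference, a minimal creation sketch against the parameters struct
declared in rte_kv_hash.h below; note that the name must still fit within
RTE_KV_HASH_NAMESIZE once the internal "KV_" prefix is applied:

struct rte_kv_hash_params params = {
	.name = "flow_table",
	.entries = 1 << 16,
	.socket_id = rte_socket_id(),
	.type = RTE_KV_HASH_K32V64,
};
struct rte_kv_hash_table *ht = rte_kv_hash_create(&params);

if (ht == NULL)
	/* rte_errno indicates the failure reason:
	 * EINVAL, ENAMETOOLONG, EEXIST, etc. */
	rte_exit(EXIT_FAILURE, "kv hash create failed: %d\n", rte_errno);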
>> +
>> +void
>> +rte_kv_hash_free(struct rte_kv_hash_table *ht)
>> +{
>> +struct rte_tailq_entry *te;
>> +struct rte_kv_hash_list *kv_hash_list;
>> +
>> +if (ht == NULL)
>> +return;
>> +
>> +kv_hash_list = RTE_TAILQ_CAST(rte_kv_hash_tailq.head,
>> +rte_kv_hash_list);
>> +
>> +rte_mcfg_tailq_write_lock();
>> +
>> +/* find out tailq entry */
>> +TAILQ_FOREACH(te, kv_hash_list, next) {
>> +if (te->data == (void *) ht)
>> +break;
>> +}
>> +
>> +
>> +if (te == NULL) {
>> +rte_mcfg_tailq_write_unlock();
>> +return;
>> +}
>> +
>> +TAILQ_REMOVE(kv_hash_list, te, next);
>> +
>> +rte_mcfg_tailq_write_unlock();
>> +
>> +switch (ht->type) {
>> +case RTE_KV_HASH_K32V64:
>> +k32v64_hash_free(ht);
>> +break;
>> +default:
>> +break;
>> +}
>> +rte_free(te);
>> +}
>> diff --git a/lib/librte_hash/rte_kv_hash.h b/lib/librte_hash/rte_kv_hash.h
>> new file mode 100644
>> index 0000000..c0375d1
>> --- /dev/null
>> +++ b/lib/librte_hash/rte_kv_hash.h
>> @@ -0,0 +1,169 @@
>> +/* SPDX-License-Identifier: BSD-3-Clause
>> + * Copyright(c) 2020 Intel Corporation
>> + */
>> +
>> +#ifndef _RTE_KV_HASH_H_
>> +#define _RTE_KV_HASH_H_
>> +
>> +#ifdef __cplusplus
>> +extern "C" {
>> +#endif
>> +
>> +#include <rte_compat.h>
>> +#include <rte_atomic.h>
>> +#include <rte_mempool.h>
>> +
>> +#define RTE_KV_HASH_NAMESIZE	32
>> +
>> +enum rte_kv_hash_type {
>> +RTE_KV_HASH_K32V64,
>> +RTE_KV_HASH_MAX
>> +};
>> +
>> +enum rte_kv_modify_op {
>> +RTE_KV_MODIFY_ADD,
>> +RTE_KV_MODIFY_DEL,
>> +RTE_KV_MODIFY_OP_MAX
>> +};
> [Wang, Yipeng] Again, any particular reason that you combine add and del into an additional struct mod?
>> +
>> +struct rte_kv_hash_params {
>> +const char *name;
>> +uint32_t entries;
>> +int socket_id;
>> +enum rte_kv_hash_type type;
>> +};
>> +
>> +struct rte_kv_hash_table;
>> +
>> +typedef int (*rte_kv_hash_bulk_lookup_t)
>> +(struct rte_kv_hash_table *table, void *keys, uint32_t *hashes,
>> +void *values, unsigned int n);
>> +
>> +typedef int (*rte_kv_hash_modify_t)
>> +(struct rte_kv_hash_table *table, void *key, uint32_t hash,
>> +enum rte_kv_modify_op op, void *value, int *found);
>> +
>> +struct rte_kv_hash_table {
>> +char name[RTE_KV_HASH_NAMESIZE];	/**< Name of the hash. */
>> +rte_kv_hash_bulk_lookup_t	lookup;
>> +rte_kv_hash_modify_t	modify;
>> +enum rte_kv_hash_type	type;
>> +};
>> +
>> +/**
>> + * Lookup bulk of keys.
>> + * This function is multi-thread safe with regard to other lookup threads.
>> + *
>> + * @param table
>> + * Hash table to add the key to.
>> + * @param keys
>> + * Pointer to array of keys
>> + * @param hashes
>> + * Pointer to array of hash values associated with keys.
>> + * @param values
>> + * Pointer to array of values corresponding to keys.
>> + * If the key was not found the corresponding value remains intact.
>> + * @param n
>> + * Number of keys to lookup in batch.
>> + * @return
>> + * -EINVAL if there's an error, otherwise number of successful lookups.
>> + */
> [Wang, Yipeng] experimental tag
Yes, will add
>> +static inline int
>> +rte_kv_hash_bulk_lookup(struct rte_kv_hash_table *table,
>> +void *keys, uint32_t *hashes, void *values, unsigned int n)
>> +{
>> +return table->lookup(table, keys, hashes, values, n);
>> +}
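The table is hash function agnostic, so callers supply precomputed
signatures; a usage sketch with rte_hash_crc() standing in as one possible
32-bit hash (any hash function works; 'ht' is a table created earlier):

uint32_t keys[4] = {10, 20, 30, 40};
uint32_t hashes[4];
uint64_t values[4];
unsigned int i;
int hits;

for (i = 0; i < 4; i++)
	hashes[i] = rte_hash_crc(&keys[i], sizeof(keys[i]), 0);

/* returns the number of hits; values[] slots for missing keys
 * are left intact, per the contract above */
hits = rte_kv_hash_bulk_lookup(ht, keys, hashes, values, 4);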
>> +
>> +/**
>> + * Add a key to an existing hash table with a given hash value.
>> + * This operation is not multi-thread safe with regard to add/delete
>> + * functions and should only be called from one thread.
>> + * However it is safe to call it along with lookup.
>> + *
>> + * @param table
>> + * Hash table to add the key to.
>> + * @param key
>> + * Key to add to the hash table.
>> + * @param value
>> + * Value to associate with key.
>> + * @param hash
>> + * Hash value associated with key.
>> + * @param found
>> + * 0 if no previously added key was found
>> + * 1 if a previously added key was found; the old value associated with
>> + * the key was written to *value
>> + * @return
>> + * 0 if ok, or negative value on error.
>> + */
>> +__rte_experimental
>> +int
>> +rte_kv_hash_add(struct rte_kv_hash_table *table, void *key,
>> +uint32_t hash, void *value, int *found);
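A short sketch of the found/old-value contract documented above ('ht' again
being a previously created table):

uint32_t key = 5;
uint64_t value = 100;
uint32_t hash = rte_hash_crc(&key, sizeof(key), 0);
int found;

if (rte_kv_hash_add(ht, &key, hash, &value, &found) == 0 && found) {
	/* the key already existed; its previous value has now been
	 * written back into 'value' */
}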
>> +
>> +/**
>> + * Remove a key with a given hash value from an existing hash table.
>> + * This operation is not multi-thread safe with regard to add/delete
>> + * functions and should only be called from one thread.
>> + * However it is safe to call it along with lookup.
>> + *
>> + * @param table
>> + * Hash table to remove the key from.
>> + * @param key
>> + * Key to remove from the hash table.
>> + * @param hash
>> + * Hash value associated with key.
>> + * @param value
>> + * Pointer to memory where the old value will be written on success.
>> + * @return
>> + * 0 if ok, or negative value on error.
>> + */
>> +__rte_experimental
>> +int
>> +rte_kv_hash_delete(struct rte_kv_hash_table *table, void *key,
>> +uint32_t hash, void *value);
>> +
>> +/**
>> + * Performs a lookup for an existing hash table, and returns a pointer
>> + * to the table if found.
>> + *
>> + * @param name
>> + * Name of the hash table to find
>> + *
>> + * @return
>> + * pointer to hash table structure or NULL on error with rte_errno
>> + * set appropriately.
>> + */
>> +__rte_experimental
>> +struct rte_kv_hash_table *
>> +rte_kv_hash_find_existing(const char *name);
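This mirrors the other find-existing lookups in DPDK, for example when one
part of an application attaches to a table created elsewhere; a brief
sketch, reusing the params from the creation sketch above:

struct rte_kv_hash_table *ht = rte_kv_hash_find_existing("flow_table");

if (ht == NULL && rte_errno == ENOENT)
	/* no table with this name has been created yet */
	ht = rte_kv_hash_create(&params);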
>> +
>> +/**
>> + * Create a new hash table for use with four byte keys.
> [Wang, Yipeng] inaccurate comment, not four byte keys
Yes, artifact from previous implementation
>> + *
>> + * @param params
>> + * Parameters used in creation of hash table.
>> + *
>> + * @return
>> + * Pointer to hash table structure that is used in future hash table
>> + * operations, or NULL on error with rte_errno set appropriately.
>> + */
>> +__rte_experimental
>> +struct rte_kv_hash_table *
>> +rte_kv_hash_create(const struct rte_kv_hash_params *params);
>> +
>> +/**
>> + * Free all memory used by a hash table.
>> + *
>> + * @param table
>> + * Hash table to deallocate.
>> + */
>> +__rte_experimental
>> +void
>> +rte_kv_hash_free(struct rte_kv_hash_table *table);
>> +
>> +#ifdef __cplusplus
>> +}
>> +#endif
>> +
>> +#endif /* _RTE_KV_HASH_H_ */
>> --
>> 2.7.4
--
Regards,
Vladimir
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [dpdk-dev] [PATCH v4 0/4] add new kv hash table
2020-05-08 19:58 ` [dpdk-dev] [PATCH v4 0/4] add new kv " Vladimir Medvedkin
2020-06-16 16:37 ` Thomas Monjalon
@ 2021-03-24 21:28 ` Thomas Monjalon
2021-03-25 12:03 ` Medvedkin, Vladimir
1 sibling, 1 reply; 56+ messages in thread
From: Thomas Monjalon @ 2021-03-24 21:28 UTC (permalink / raw)
To: Vladimir Medvedkin
Cc: dev, konstantin.ananyev, yipeng1.wang, sameh.gobriel, bruce.richardson
08/05/2020 21:58, Vladimir Medvedkin:
> Currently DPDK has a special implementation of a hash table for
> 4 byte keys which is called FBK hash. Unfortunately its main drawback
> is that it only supports 2 byte values.
> The new implementation called KV hash
> supports 4 byte keys and 8 byte associated values,
> which is enough to store a pointer.
Waiting for a v5.
Is it abandoned?
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [dpdk-dev] [PATCH v4 0/4] add new kv hash table
2021-03-24 21:28 ` Thomas Monjalon
@ 2021-03-25 12:03 ` Medvedkin, Vladimir
2023-06-12 16:11 ` Stephen Hemminger
0 siblings, 1 reply; 56+ messages in thread
From: Medvedkin, Vladimir @ 2021-03-25 12:03 UTC (permalink / raw)
To: Thomas Monjalon
Cc: dev, konstantin.ananyev, yipeng1.wang, sameh.gobriel, bruce.richardson
Hi Thomas,
On 25/03/2021 00:28, Thomas Monjalon wrote:
> 08/05/2020 21:58, Vladimir Medvedkin:
>> Currently DPDK has a special implementation of a hash table for
>> 4 byte keys which is called FBK hash. Unfortunately its main drawback
>> is that it only supports 2 byte values.
>> The new implementation called KV hash
>> supports 4 byte keys and 8 byte associated values,
>> which is enough to store a pointer.
>
> Waiting for a v5.
> Is it abandoned?
It is suspended till further rework.
>
>
--
Regards,
Vladimir
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [dpdk-dev] [PATCH v4 0/4] add new kv hash table
2021-03-25 12:03 ` Medvedkin, Vladimir
@ 2023-06-12 16:11 ` Stephen Hemminger
0 siblings, 0 replies; 56+ messages in thread
From: Stephen Hemminger @ 2023-06-12 16:11 UTC (permalink / raw)
To: Medvedkin, Vladimir
Cc: Thomas Monjalon, dev, konstantin.ananyev, yipeng1.wang,
sameh.gobriel, bruce.richardson
On Thu, 25 Mar 2021 15:03:24 +0300
"Medvedkin, Vladimir" <vladimir.medvedkin@intel.com> wrote:
> Hi Thomas,
>
> On 25/03/2021 00:28, Thomas Monjalon wrote:
> > 08/05/2020 21:58, Vladimir Medvedkin:
> >> Currently DPDK has a special implementation of a hash table for
> >> 4 byte keys which is called FBK hash. Unfortunately its main drawback
> >> is that it only supports 2 byte values.
> >> The new implementation called KV hash
> >> supports 4 byte keys and 8 byte associated values,
> >> which is enough to store a pointer.
> >
> > Waiting for a v5.
> > Is it abandoned?
>
> It is suspended till further rework.
>
> >
Since nothing more has arrived in 2 years,
marking the original patchset as Changes requested.
^ permalink raw reply [flat|nested] 56+ messages in thread